What is meant by data preprocessing, and why is it needed?

December 20, 2017

Data preprocessing is a set of techniques used to improve the quality of data before mining is applied, so that the data lead to high-quality mining results. Preprocessing can substantially improve the overall quality of the patterns mined and/or reduce the time required for the actual mining. It includes data cleaning, data integration, data transformation, and data reduction.

Data cleaning: Data cleaning can be applied to remove noise and correct inconsistencies in the data.
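
As a rough illustration of data cleaning, the sketch below fills missing numeric values with the median of the observed values and maps inconsistent spellings of a category to one canonical label. The function names, the sample data, and the choice of the median as the fill value are assumptions made for this example, not something prescribed above.

```python
from statistics import median

def fill_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # nothing to infer a fill value from
    fill = median(observed)
    return [fill if v is None else v for v in values]

def standardize_labels(labels, canonical):
    """Map inconsistent spellings to a canonical label (case-insensitive)."""
    cleaned = []
    for label in labels:
        key = label.strip().lower()
        cleaned.append(canonical.get(key, label.strip()))
    return cleaned

# Hypothetical sample data with missing ages and inconsistent city names.
ages = [23, None, 31, 27, None]
cities = ["NYC", "new york", " New York City ", "Boston"]
canonical = {
    "nyc": "New York",
    "new york": "New York",
    "new york city": "New York",
}

clean_ages = fill_missing(ages)
clean_cities = standardize_labels(cities, canonical)
print(clean_ages)
print(clean_cities)
```

Other fill strategies (mean, most frequent value, a model-based estimate) fit the same shape; the median is used here only because it is less sensitive to outliers.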

Data integration: Data integration merges data from multiple sources into a coherent data store, such as a data warehouse.
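
A minimal sketch of data integration, assuming two hypothetical sources (a customer list and an order list) that share a `customer_id` key. The record layout and the decision to skip orders without a matching customer are illustrative assumptions.

```python
def integrate(customers, orders):
    """Join order records to customer records on customer_id."""
    by_id = {c["customer_id"]: c for c in customers}
    merged = []
    for order in orders:
        customer = by_id.get(order["customer_id"])
        if customer is None:
            continue  # assumption: drop orders with no matching customer
        merged.append({**customer, **order})
    return merged

# Hypothetical records from two separate sources.
customers = [
    {"customer_id": 1, "name": "Asha"},
    {"customer_id": 2, "name": "Ben"},
]
orders = [
    {"customer_id": 1, "amount": 250.0},
    {"customer_id": 3, "amount": 90.0},
    {"customer_id": 2, "amount": 40.0},
]

combined = integrate(customers, orders)
for row in combined:
    print(row)
```

In practice the unmatched order for customer 3 would probably be logged or reviewed rather than silently dropped, since it signals an inconsistency between the sources.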

Data transformation: Data transformations, such as normalization, may be applied. For example, normalization may improve the accuracy and efficiency of mining algorithms that involve distance measurements.
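
To show why normalization matters for distance measurements, the sketch below applies min-max normalization (one common choice, assumed here) to two hypothetical attributes with very different scales and compares the Euclidean distance between two records before and after scaling.

```python
import math

def min_max_normalize(column):
    """Rescale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant column carries no spread
    return [(v - lo) / (hi - lo) for v in column]

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical attributes: income is on a much larger scale than age.
incomes = [30000, 60000, 90000]
ages = [25, 45, 35]

norm_incomes = min_max_normalize(incomes)
norm_ages = min_max_normalize(ages)

# Distance between the first two records, before and after scaling.
raw_dist = euclidean((incomes[0], ages[0]), (incomes[1], ages[1]))
norm_dist = euclidean((norm_incomes[0], norm_ages[0]),
                      (norm_incomes[1], norm_ages[1]))
print(raw_dist, norm_dist)
```

Before scaling, the distance is almost entirely determined by income; after scaling, both attributes contribute on a comparable footing.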

Data reduction: Data reduction can reduce the data size by, for instance, aggregating, eliminating redundant features, or clustering.

These techniques are not mutually exclusive; they may work together.
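
Two of the data reduction approaches mentioned above, aggregation and eliminating redundant features, can be sketched as follows. The data layout (date strings in `YYYY-MM-DD` form, a table stored as a dictionary of columns) and the rule that a feature is redundant only when it exactly duplicates an earlier column are simplifying assumptions for this example.

```python
from collections import defaultdict

def aggregate_by_month(daily_sales):
    """Sum (date, amount) pairs into monthly totals keyed by 'YYYY-MM'."""
    totals = defaultdict(float)
    for date, amount in daily_sales:
        totals[date[:7]] += amount
    return dict(totals)

def drop_duplicate_columns(table):
    """Keep only the first of any columns whose values are identical."""
    seen = set()
    kept = {}
    for name, values in table.items():
        key = tuple(values)
        if key not in seen:
            seen.add(key)
            kept[name] = values
    return kept

# Hypothetical daily sales records.
daily_sales = [
    ("2017-11-03", 120.0),
    ("2017-11-17", 80.0),
    ("2017-12-01", 50.0),
]
monthly = aggregate_by_month(daily_sales)

# Hypothetical table in which one column is an exact copy of another.
table = {
    "price_usd": [10, 20, 30],
    "price_copy": [10, 20, 30],
    "qty": [1, 2, 3],
}
reduced = drop_duplicate_columns(table)
print(monthly)
print(list(reduced))
```

Real redundancy is rarely an exact copy; correlation analysis or similar checks would usually be needed to find features that are redundant in a looser sense.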

Why is it needed?
Incomplete, noisy, and inconsistent data are commonplace properties of large real-world databases and data warehouses. Incomplete data can occur for a number of reasons. Attributes of interest may not always be available, such as customer information for sales transaction data, or may not have been considered important at the time of entry. Relevant data may not be recorded due to a misunderstanding or because of equipment malfunctions. Data that were inconsistent with other recorded data may have been deleted. Furthermore, the recording of the history or modifications to the data may have been overlooked. Missing data, particularly tuples with missing values, can affect the mining results. Therefore, to improve the quality of the data and, consequently, of the mining results, data preprocessing is needed.
