Data reduction strategies in data mining

Data reduction strategies are applied to very large data sets. Complex data analysis and mining on huge amounts of data can take a long time, making such analysis impractical or infeasible.

Data reduction techniques can be applied to obtain a reduced representation of the data set. Mining on the reduced data should be more efficient yet produce the same (or almost the same) analytical results.

Strategies for data reduction include the following:

1 Data cube aggregation, where aggregation operations are applied to the data in the construction of a data cube.

2 Attribute subset selection, where irrelevant, weakly relevant or redundant attributes or dimensions may be detected and removed.

3 Dimensionality reduction, where encoding mechanisms are used to reduce the data set size.

4 Numerosity reduction, where the data are replaced or estimated by alternative, smaller data representations such as parametric models or nonparametric methods such as clustering, sampling, and the use of histograms.

5 Discretization and concept hierarchy generation, where raw data values for attributes are replaced by ranges or higher conceptual levels. Data discretization is a form of numerosity reduction that is very useful for the automatic generation of concept hierarchies. Discretization and concept hierarchy generation are powerful tools for data mining, in that they allow the mining of data at multiple levels of abstraction.
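
Several of the strategies above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the toy sales records and all function names are hypothetical and do not come from any particular library. Attribute subset selection is shown only as the removal step, assuming the irrelevant attributes have already been identified by some other means, and equal-width binning is just one simple way to discretize.

```python
import random
from collections import defaultdict

# Hypothetical toy records: (year, quarter, branch, sales).
records = [
    (2023, "Q1", "A", 100.0),
    (2023, "Q2", "A", 150.0),
    (2023, "Q1", "B", 80.0),
    (2024, "Q1", "A", 120.0),
    (2024, "Q2", "B", 90.0),
    (2024, "Q3", "B", 60.0),
]

def aggregate_by_year(rows):
    """Data cube style aggregation: roll quarterly sales up to yearly totals."""
    totals = defaultdict(float)
    for year, _quarter, _branch, sales in rows:
        totals[year] += sales
    return dict(totals)

def drop_attributes(rows, keep):
    """Attribute subset selection (removal step only): keep the column
    positions listed in `keep`, assumed to be the relevant ones."""
    return [tuple(row[i] for i in keep) for row in rows]

def simple_random_sample(rows, n, seed=0):
    """Numerosity reduction: simple random sample without replacement."""
    rng = random.Random(seed)
    return rng.sample(rows, n)

def equal_width_bins(values, k):
    """Discretization: replace each value by the index of one of k
    equal-width ranges spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0  # fall back to 1.0 if all values are equal
    return [min(int((v - lo) / width), k - 1) for v in values]

yearly = aggregate_by_year(records)           # {2023: 330.0, 2024: 270.0}
reduced = drop_attributes(records, keep=(0, 3))
sample = simple_random_sample(records, 3)
bins = equal_width_bins([r[3] for r in records], k=3)  # [1, 2, 0, 2, 1, 0]
```

In each case the result is smaller or coarser than the original records: six rows become two yearly totals, four attributes become two, six rows become a sample of three, and six distinct sales values become three range labels.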
