Impacts of Dirty Data: and Experimental Evaluation

03/16/2018
by Zhixin Qi, et al.

Data quality issues have attracted widespread attention because dirty data degrades data mining and machine learning results. Knowing how data quality relates to the accuracy of results can guide two decisions: which algorithm to select for data of a given quality, and what share of the data to clean. However, little research has explored this relationship. Motivated by this gap, this paper experimentally compares the effects of missing, inconsistent, and conflicting data on classification, clustering, and regression algorithms. Based on the experimental findings, we provide guidelines for algorithm selection and data cleaning.


