A New Approach to Completing Missing Values in Databases (Une nouvelle approche de complétion des valeurs manquantes dans les bases de données)

01/03/2019
by Leila Ben Othman, et al.

Real-life datasets commonly contain scattered missing values. Considered 'dirty data', such data is usually removed during a pre-processing step. Starting from the premise that 'making up this missing data is better than throwing it away', we present a new approach to completing missing data. The main distinguishing feature of the approach is that it highlights a fruitful synergy between generic bases of association rules and missing-value handling. Beyond their attractive compactness, generic association rules considerably reduce the number of conflicts during the completion step. We also introduce a new metric, called 'Robustness', which selects the most robust association rule for completing a missing value whenever a conflict arises. Experiments carried out on benchmark datasets confirm the soundness of our approach: it reduces conflicts during the completion step while achieving a high percentage of correct completions.
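
To make the completion step concrete, the following is a minimal sketch of rule-based completion with conflict resolution. It is not the authors' implementation: the abstract does not define the 'Robustness' metric or how the generic basis is extracted, so this sketch assumes the rules are already given, uses a hypothetical rule format (premise, conclusion, score), and uses a plain per-rule score (for example, rule confidence) as a stand-in for Robustness. The function name `complete_row` and the toy data are illustrative assumptions.

```python
from typing import Dict, List, Optional, Tuple

# A row maps attribute names to values; None marks a missing value.
Row = Dict[str, Optional[str]]
# Hypothetical rule format: (premise, (target attribute, value), score).
# The score is a stand-in for the paper's Robustness metric, which the
# abstract does not define.
Rule = Tuple[Dict[str, str], Tuple[str, str], float]


def complete_row(row: Row, rules: List[Rule]) -> Row:
    """Return a copy of row with missing values filled by matching rules."""
    completed = dict(row)
    for attr, value in row.items():
        if value is not None:
            continue
        # Keep rules that conclude on this attribute and whose premise is
        # fully satisfied by the known values of the original row.
        candidates = [
            (score, concl_value)
            for premise, (concl_attr, concl_value), score in rules
            if concl_attr == attr
            and all(row.get(a) == v for a, v in premise.items())
        ]
        if candidates:
            # Conflict resolution: several rules may propose different
            # values; the highest-scoring rule wins.
            completed[attr] = max(candidates)[1]
    return completed


# Toy data (assumed for illustration only).
rules: List[Rule] = [
    ({"outlook": "sunny"}, ("play", "no"), 0.8),
    ({"windy": "false"}, ("play", "yes"), 0.6),
    ({"outlook": "rainy"}, ("play", "yes"), 0.9),
]
row: Row = {"outlook": "sunny", "windy": "false", "play": None}
result = complete_row(row, rules)
```

Here the first two rules both match the row and propose conflicting values for `play`; the sketch resolves the conflict in favor of the higher-scoring rule. An attribute with no matching rule is left missing rather than guessed.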
