Optimized Linear Imputation

11/17/2015
by Yehezkel S. Resheff, et al.

Real-world datasets, especially high-dimensional ones, often have missing feature values. Since most data analysis and statistical methods do not handle missing values gracefully, the first step in the analysis is the imputation of those values. Indeed, there has been long-standing interest in imputation methods as a pre-processing step. One recent and effective approach, the IRMI stepwise regression imputation method, fits a linear regression model for each real-valued feature on the basis of all other features in the dataset. However, the proposed iterative formulation lacks a convergence guarantee. Here we propose a closely related method, stated as a single optimization problem, together with a block coordinate-descent solution that is guaranteed to converge to a local minimum. Experiments on both synthetic and benchmark datasets show results comparable to those of the IRMI method whenever it converges. However, in the set of experiments described here IRMI often does not converge, whereas the performance of our methods is shown to be markedly superior to that of other methods.
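
To make the general idea of regression-based imputation concrete, here is a minimal Python sketch. It is not the authors' optimization formulation and not the IRMI implementation; it is a generic illustration that assumes NumPy, marks missing entries as NaN, starts from column means, and runs a fixed number of sweeps. The function name `iterative_linear_impute` and the parameter `n_sweeps` are hypothetical, and the sketch assumes every feature has at least one observed value.

```python
import numpy as np


def iterative_linear_impute(X, n_sweeps=10):
    """Fill NaNs by repeatedly regressing each feature on all the others.

    Illustrative sketch only: a fixed sweep count stands in for a proper
    stopping rule, and no convergence guarantee is claimed.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Initial guess: column means computed over the observed entries.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    n_rows, n_cols = X.shape

    for _ in range(n_sweeps):
        for j in range(n_cols):
            miss_j = missing[:, j]
            # Skip features with nothing to impute or nothing to fit on.
            if not miss_j.any() or miss_j.all():
                continue
            # Design matrix: every other feature plus an intercept column.
            others = np.delete(filled, j, axis=1)
            A = np.column_stack([others, np.ones(n_rows)])
            # Least-squares fit on the rows where feature j is observed.
            coef, *_ = np.linalg.lstsq(A[~miss_j], filled[~miss_j, j], rcond=None)
            # Overwrite only the missing entries with the model's predictions.
            filled[miss_j, j] = A[miss_j] @ coef
    return filled
```

Observed entries are never modified; only the positions that were NaN in the input are updated on each sweep. A practical version would replace the fixed sweep count with a tolerance on the change between sweeps.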

research
06/26/2021

FCMI: Feature Correlation based Missing Data Imputation

Processed data are insightful, and crude data are obtuse. A serious thre...
research
09/01/2021

RIFLE: Robust Inference from Low Order Marginals

The ubiquity of missing values in real-world datasets poses a challenge ...
research
11/09/2019

Missing Features Reconstruction and Its Impact on Classification Accuracy

In real-world applications, we can encounter situations when a well-trai...
research
04/08/2020

Fast and Reliable Missing Data Contingency Analysis with Predicate-Constraints

Today, data analysts largely rely on intuition to determine whether miss...
research
01/03/2019

Une nouvelle approche de complétion des valeurs manquantes dans les bases de données

When tackling real-life datasets, it is common to face the existence of ...
research
04/07/2021

Prediction with Missing Data

Missing information is inevitable in real-world data sets. While imputat...
research
04/07/2020

Learning Individual Models for Imputation (Technical Report)

Missing numerical values are prevalent, e.g., owing to unreliable sensor...
