Evaluation of imputation techniques with varying percentage of missing data

09/09/2021
by Seema Sangari, et al.

Missing data is a common problem that has consistently plagued statisticians and applied analytical researchers. While replacement methods such as mean-based or hot deck imputation have been well researched, emerging imputation techniques enabled by improved computational resources have had limited formal assessment. This study formally considers five more recently developed imputation methods (Amelia, Mice, mi, Hmisc and missForest) and compares their performance, measured as RMSE against the actual values, with that of the well-established mean-based replacement approach. The RMSE measure was consolidated by method using a ranking approach. Our results indicate that the missForest algorithm performed best and the mi algorithm performed worst.
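The evaluation idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`mean_impute`, `rmse_on_missing`, `average_rank`) are hypothetical, only the mean-based baseline is implemented, and the use of an average rank across scenarios is an assumption, since the abstract does not specify how the ranking was consolidated.

```python
import numpy as np


def mean_impute(data_with_nan):
    """Replace each NaN with the mean of the observed values in its column."""
    col_means = np.nanmean(data_with_nan, axis=0)
    filled = data_with_nan.copy()
    rows, cols = np.where(np.isnan(filled))
    filled[rows, cols] = col_means[cols]
    return filled


def rmse_on_missing(actual, imputed, mask):
    """RMSE between actual and imputed values, restricted to the masked cells."""
    diff = actual[mask] - imputed[mask]
    return float(np.sqrt(np.mean(diff ** 2)))


def average_rank(rmse_table):
    """Consolidate RMSE values into one mean rank per method.

    rmse_table maps a method name to a list of RMSE values, one per scenario
    (for example, one per missing-data percentage). Rank 1 is the lowest RMSE.
    Assumption: ties are broken by position rather than averaged.
    """
    methods = list(rmse_table)
    scores = np.array([rmse_table[m] for m in methods])  # (methods, scenarios)
    ranks = scores.argsort(axis=0).argsort(axis=0) + 1   # rank within each scenario
    return {m: float(r) for m, r in zip(methods, ranks.mean(axis=1))}


# Toy illustration on synthetic data (not the study's data).
rng = np.random.default_rng(0)
actual = rng.normal(size=(200, 3))
mask = rng.random(actual.shape) < 0.2   # hide roughly 20% of the cells
observed = actual.copy()
observed[mask] = np.nan

baseline_rmse = rmse_on_missing(actual, mean_impute(observed), mask)
print(f"mean-imputation RMSE at 20% missing: {baseline_rmse:.3f}")
```

Under these assumptions, each imputation method would be run on the same masked data at each missing-data percentage, its RMSE recorded, and the resulting table passed to `average_rank` to obtain a single ordering of methods.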


