On the Relation between Prediction and Imputation Accuracy under Missing Covariates

12/09/2021
by Burim Ramosaj, et al.

Missing covariates in regression or classification problems can prevent the direct use of advanced tools for further analysis. Recent research shows a growing trend towards using modern Machine Learning algorithms for imputation, motivated by their favourable prediction accuracy in different learning problems. In this work, we analyze through simulation the interaction between imputation accuracy and prediction accuracy in regression learning problems with missing covariates when Machine Learning based methods are used for both imputation and prediction. In addition, we explore imputation performance when statistical inference procedures are applied in prediction settings, for example through coverage rates of (valid) prediction intervals. Our analysis is based on empirical datasets provided by the UCI Machine Learning repository and an extensive simulation study.
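
To make the two quantities in the abstract concrete, the following is a minimal toy sketch, not the authors' simulation design. It assumes a single covariate `x` missing completely at random, a fully observed auxiliary variable `z`, and simple least-squares fits standing in for the Machine Learning methods studied in the paper. All names (`fit_line`, `run_experiment`, `miss_rate`) and all parameter values are hypothetical choices made for illustration. It reports imputation accuracy as RMSE on the missing entries and prediction accuracy as test-set MSE after imputation.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares intercept and slope for y ≈ a + b * x."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def run_experiment(n=2000, miss_rate=0.3, seed=0):
    """Return {imputer name: (imputation RMSE, prediction MSE)}."""
    rng = random.Random(seed)
    # Toy data-generating process (an assumption, not taken from the paper).
    x = [rng.gauss(0, 1) for _ in range(n)]
    z = [xi + rng.gauss(0, 0.5) for xi in x]      # auxiliary variable, always observed
    y = [2.0 * xi + rng.gauss(0, 1) for xi in x]  # response
    missing = [rng.random() < miss_rate for _ in range(n)]  # MCAR mask on x
    half = n // 2  # first half: training set, second half: test set

    # Imputation models use only observed training cases.
    obs_train = [i for i in range(half) if not missing[i]]
    mean_x = sum(x[i] for i in obs_train) / len(obs_train)
    a, b = fit_line([z[i] for i in obs_train], [x[i] for i in obs_train])

    imputers = {
        "mean": lambda i: mean_x,
        "regression": lambda i: a + b * z[i],
    }

    miss_idx = [i for i in range(n) if missing[i]]
    results = {}
    for name, impute in imputers.items():
        x_imp = [impute(i) if missing[i] else x[i] for i in range(n)]
        # Imputation accuracy: RMSE between imputed and true values.
        imp_rmse = math.sqrt(
            sum((x_imp[i] - x[i]) ** 2 for i in miss_idx) / len(miss_idx)
        )
        # Prediction accuracy: fit y on imputed x (train), score on imputed test.
        c, d = fit_line(x_imp[:half], y[:half])
        pred_mse = sum(
            (y[i] - (c + d * x_imp[i])) ** 2 for i in range(half, n)
        ) / (n - half)
        results[name] = (imp_rmse, pred_mse)
    return results


if __name__ == "__main__":
    for name, (imp_rmse, pred_mse) in run_experiment().items():
        print(f"{name:>10}: imputation RMSE={imp_rmse:.3f}, prediction MSE={pred_mse:.3f}")
```

In this toy setting the more accurate imputer also yields the lower prediction error. Whether that relation holds in general, and how it carries over to prediction interval coverage, is what the paper examines.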


research
06/04/2017

Evolving imputation strategies for missing data in classification problems with TPOT

Missing data has a ubiquitous presence in real-life applications of mach...
research
06/18/2018

A cautionary tale on using imputation methods for inference in matched pairs design

Imputation procedures in biomedical fields have turned into statistical ...
research
11/22/2019

Bootstrap Inference for Multiple Imputation under Uncongeniality and Misspecification

Multiple imputation has become one of the most popular approaches for ha...
research
11/16/2020

Imputation techniques on missing values in breast cancer treatment and fertility data

Clinical decision support using data mining techniques offers more intel...
research
04/23/2020

Influence of parallel computing strategies of iterative imputation of missing data: a case study on missForest

Machine learning iterative imputation methods have been well accepted by...
research
01/11/2023

Multiple-level Point Embedding for Solving Human Trajectory Imputation with Prediction

Sparsity is a common issue in many trajectory datasets, including human ...
research
02/20/2023

Conformal Prediction for Network-Assisted Regression

An important problem in network analysis is predicting a node attribute ...
