
On the Relation between Prediction and Imputation Accuracy under Missing Covariates

12/09/2021
by   Burim Ramosaj, et al.

Missing covariates in regression or classification problems can prohibit the direct use of advanced tools for further analysis. Recent research shows an increasing trend towards using modern Machine Learning algorithms for imputation, motivated by their favourable prediction accuracy in a range of learning problems. In this work, we analyze through simulation the interaction between imputation accuracy and prediction accuracy in regression learning problems with missing covariates when Machine Learning based methods are used for both imputation and prediction. In addition, we explore imputation performance when using statistical inference procedures in prediction settings, such as coverage rates of (valid) prediction intervals. Our analysis is based on empirical datasets provided by the UCI Machine Learning repository and an extensive simulation study.
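
To make the kind of comparison described above concrete, the following is a minimal sketch of one simulation run. It is not the authors' setup: the data-generating model, the missing-completely-at-random mechanism, the 30% missing rate, and every function name are assumptions chosen for illustration, and simple mean and linear-regression imputation stand in for the Machine Learning based imputers the paper studies. It reports, for each imputation method, the imputation error on the masked entries and the prediction error of a regression model trained on the imputed data.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def rmse(pred, truth):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred))


def simulate(n=2000, miss_rate=0.3, seed=0):
    """Return {method: (imputation RMSE, prediction RMSE)} for one run.

    Hypothetical setup: z is always observed, x depends on z and is
    partly missing in the training half, and y depends on x.
    """
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [0.9 * zi + rng.gauss(0, 0.4) for zi in z]
    y = [2.0 * xi + rng.gauss(0, 1) for xi in x]

    half = n // 2  # first half: training data, second half: complete test data
    missing = [rng.random() < miss_rate for _ in range(half)]  # MCAR mask
    obs = [i for i in range(half) if not missing[i]]
    mis = [i for i in range(half) if missing[i]]

    # Mean imputation: replace each missing x by the observed mean.
    mean_x = sum(x[i] for i in obs) / len(obs)
    x_mean = [mean_x if missing[i] else x[i] for i in range(half)]

    # Regression imputation: predict missing x from z, fitted on observed rows.
    a, b = fit_line([z[i] for i in obs], [x[i] for i in obs])
    x_reg = [a + b * z[i] if missing[i] else x[i] for i in range(half)]

    results = {}
    for name, x_imp in (("mean", x_mean), ("regression", x_reg)):
        # Imputation accuracy: error on the entries that were masked.
        imp_err = rmse([x_imp[i] for i in mis], [x[i] for i in mis])
        # Prediction accuracy: train on imputed data, test on complete data.
        a_y, b_y = fit_line(x_imp, y[:half])
        preds = [a_y + b_y * xi for xi in x[half:]]
        results[name] = (imp_err, rmse(preds, y[half:]))
    return results


if __name__ == "__main__":
    for method, (imp_err, pred_err) in simulate().items():
        print(f"{method}: imputation RMSE={imp_err:.3f}, prediction RMSE={pred_err:.3f}")
```

Calling `simulate()` with different seeds or missing rates and tabulating both error columns side by side is one simple way to inspect whether a more accurate imputation also yields a more accurate prediction in a given setting.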


06/04/2017

Evolving imputation strategies for missing data in classification problems with TPOT

Missing data has a ubiquitous presence in real-life applications of mach...
06/18/2018

A cautionary tale on using imputation methods for inference in matched pairs design

Imputation procedures in biomedical fields have turned into statistical ...
11/22/2019

Bootstrap Inference for Multiple Imputation under Uncongeniality and Misspecification

Multiple imputation has become one of the most popular approaches for ha...
01/11/2023

Multiple-level Point Embedding for Solving Human Trajectory Imputation with Prediction

Sparsity is a common issue in many trajectory datasets, including human ...
04/23/2020

Influence of parallel computing strategies of iterative imputation of missing data: a case study on missForest

Machine learning iterative imputation methods have been well accepted by...
10/17/2022

Efficient surrogate-assisted inference for patient-reported outcome measures with complex missing mechanism

Patient-reported outcome (PRO) measures are increasingly collected as a ...
07/13/2020

Imputation procedures in surveys using nonparametric and machine learning methods: an empirical comparison

Nonparametric and machine learning methods are flexible methods for obta...