To Impute or not to Impute? – Missing Data in Treatment Effect Estimation

02/04/2022
by Jeroen Berrevoets, et al.

Missing data is a systemic problem in practical scenarios, causing noise and bias when estimating treatment effects. This makes treatment effect estimation from data with missingness particularly difficult. A key reason is that standard assumptions on missingness become insufficient in the presence of an additional variable, the treatment, besides the individual and the outcome. A treatment variable adds complexity to the question of why some variables are missing, which previous work has not fully explored. We identify a new missingness mechanism, which we term mixed confounded missingness (MCM), where some missingness determines treatment selection and other missingness is determined by treatment selection. Given MCM, we show that naively imputing all data leads to poorly performing treatment effect models, because imputation effectively removes information needed for unbiased estimates. However, performing no imputation at all also leads to biased estimates, because missingness determined by treatment divides the population into distinct subpopulations, and estimates across these subpopulations will be biased. Our solution is selective imputation, where we use insights from MCM to decide precisely which variables should be imputed and which should not. We empirically demonstrate how various learners benefit from selective imputation compared to other solutions for missing data.
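
To make the idea of selective imputation concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes the analyst already knows which variables have missingness caused by treatment, and it reads the abstract as implying that those are the ones to impute, while variables whose missingness informs treatment selection keep their gaps. Mean imputation is a stand-in for whatever imputer is actually used, and the names `selective_impute`, `x_pre`, and `x_post` are hypothetical.

```python
from statistics import mean


def selective_impute(rows, impute_cols):
    """Mean-impute only the columns in `impute_cols`.

    Missing values are represented as None. Columns not listed in
    `impute_cols` are left untouched, so their missingness pattern
    remains available to a downstream treatment effect learner.
    """
    # Column means are computed from observed (non-None) values only.
    means = {}
    for col in impute_cols:
        observed = [r[col] for r in rows if r[col] is not None]
        if observed:  # skip columns with no observed values at all
            means[col] = mean(observed)

    imputed = []
    for r in rows:
        new_row = dict(r)  # copy, so the input rows are not mutated
        for col, col_mean in means.items():
            if new_row[col] is None:
                new_row[col] = col_mean
        imputed.append(new_row)
    return imputed


# Toy data (made up for illustration): `x_pre` is assumed to be a covariate
# whose missingness influences treatment selection, and `x_post` one whose
# missingness is caused by treatment.
rows = [
    {"x_pre": 1.0, "x_post": 2.0, "treatment": 1},
    {"x_pre": None, "x_post": 4.0, "treatment": 0},
    {"x_pre": 3.0, "x_post": None, "treatment": 1},
]

result = selective_impute(rows, impute_cols=["x_post"])
```

In this toy example the gap in `x_post` is filled with the observed column mean, while the gap in `x_pre` is preserved. A downstream learner would still need some way to handle the remaining missing values, for example a missingness indicator, which this sketch does not cover.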

research
02/02/2022

Application of Multiple Imputation When Using Propensity Score Methods to Generalize Clinical Trials to Target Populations of Interest

When the distribution of treatment effect modifiers differs between the ...
research
08/02/2019

Identifying Treatment Effects using Trimmed Means when Data are Missing Not at Random

Patients often discontinue treatment in a clinical trial because their h...
research
08/24/2023

Estimating hypothetical estimands with causal inference and missing data estimators in a diabetes trial

The recently published ICH E9 addendum on estimands in clinical trials p...
research
09/02/2022

Assessing treatment effect heterogeneity in the presence of missing effect modifier data in cluster-randomized trials

Understanding whether and how treatment effects vary across individuals ...
research
12/20/2018

Accounting for selection bias due to death in estimating the effect of wealth shock on cognition for the Health and Retirement Study

The Health and Retirement Study is a longitudinal study of US adults enr...
research
12/04/2018

Local average treatment effects estimation via substantive model compatible multiple imputation

Non-adherence to assigned treatment is common in randomised controlled t...
