Multiple imputation with missing data indicators

03/02/2021
by Lauren J Beesley, et al.

Multiple imputation is a well-established general technique for analyzing data with missing values. A convenient way to implement multiple imputation is sequential regression multiple imputation (SRMI), also called chained equations multiple imputation. In this approach, we impute missing values using a regression model for each variable, conditional on the other variables in the data. This approach, however, assumes that the data are missing at random (MAR), and it is not well-justified when data are missing not at random (MNAR) without additional modification. In this paper, we describe how the SRMI procedure can be generalized to handle MNAR missingness in the setting where missingness may depend on other variables that are also missing. We provide algebraic justification for several generalizations of standard SRMI using Taylor series and other approximations of the target imputation distribution under MNAR. The resulting regression model approximations include indicators for missingness, interactions, or other functions of the MNAR missingness model and the observed data. In a simulation study, we demonstrate that the proposed SRMI modifications reduce bias in the final analysis compared to standard SRMI, with an approximation strategy that includes an offset in the imputation model performing best overall. The method is illustrated in a breast cancer study, where the goal is to estimate the prevalence of a specific genetic pathogenic variant.
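To make the idea of adding missingness indicators to a chained-equations imputation model concrete, the following is a minimal sketch, not the authors' implementation. It assumes continuous variables, linear regression imputation models, and a single noise draw per imputed value with no parameter uncertainty. The function name `srmi_impute` and its parameters are hypothetical. The offset-based strategy mentioned in the abstract is not shown.

```python
import numpy as np

def srmi_impute(data, n_iter=10, use_indicators=True, seed=0):
    """Return one completed copy of `data` (2-D float array, NaN = missing).

    Simplified chained-equations sketch: each incomplete variable is regressed
    on the other variables and, optionally, on the missingness indicators of
    the other incomplete variables (an assumption made for illustration).
    """
    rng = np.random.default_rng(seed)
    X = np.array(data, dtype=float)
    miss = np.isnan(X)                      # missingness indicators
    n, p = X.shape

    # Initialise missing entries with the observed column means.
    col_means = np.nanmean(X, axis=0)
    X[miss] = np.take(col_means, np.where(miss)[1])

    incomplete = [j for j in range(p) if miss[:, j].any()]
    for _ in range(n_iter):
        for j in incomplete:
            # Design matrix: intercept, current values of the other variables.
            cols = [np.ones(n)] + [X[:, k] for k in range(p) if k != j]
            if use_indicators:
                # Add indicators for whether the other variables were missing.
                cols += [miss[:, k].astype(float) for k in incomplete if k != j]
            Z = np.column_stack(cols)

            # Fit by least squares on rows where variable j was observed.
            obs = ~miss[:, j]
            beta, *_ = np.linalg.lstsq(Z[obs], X[obs, j], rcond=None)
            resid = X[obs, j] - Z[obs] @ beta
            dof = max(int(obs.sum()) - Z.shape[1], 1)
            sigma = np.sqrt(resid @ resid / dof)

            # Impute: fitted mean plus normal residual noise.
            n_mis = int(miss[:, j].sum())
            X[miss[:, j], j] = Z[miss[:, j]] @ beta + rng.normal(0.0, sigma, n_mis)
    return X

# Illustrative synthetic data (not from the paper): three variables, two incomplete.
rng = np.random.default_rng(1)
full = rng.normal(size=(200, 3))
full[:, 1] += 0.5 * full[:, 0]
data = full.copy()
data[rng.random(200) < 0.3, 1] = np.nan
data[rng.random(200) < 0.3, 2] = np.nan

# Multiple imputation: repeat with different seeds to obtain several completed data sets.
completed = [srmi_impute(data, seed=m) for m in range(5)]
```

Setting `use_indicators=False` gives a plain chained-equations sketch for comparison. In practice, each completed data set would be analyzed separately and the results combined across imputations.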


