Construction and assessment of prediction rules for binary outcome in the presence of missing predictor data using multiple imputation: theoretical perspective and data-based e

10/11/2018
by   B. J. A. Mertens, et al.

We investigate the calibration and assessment of prediction rules in prognostic designs when predictor values are missing. The paper has two intertwined objectives. The first is to investigate how calibration of the prediction rule can be combined with multiple imputation to account for missing predictor observations. The second is to propose methods that can be implemented with current multiple imputation software while allowing unbiased predictive assessment through validation on new observations for which the outcome is not yet available. To inform the methodology, we begin with a review of the theoretical background of multiple imputation as a model estimation approach, as opposed to a purely algorithmic description. We specifically contrast the use of multiple imputation for parameter (effect) estimation with its use for predictive calibration. Based on this review, we formulate two approaches. The first averages probabilities from models fitted on single imputations to directly approximate the predictive density for future observations. The second applies the classical Rubin's rules for parameter estimation. We present implementations using current software that allow validatory or cross-validatory estimation of performance measures, as well as imputation of missing predictor data in the future observations, where the outcome is by definition as yet unobserved. We restrict discussion to binary outcomes and logistic regression throughout, though the principles are generally applicable. We present two data sets from our regular consultative practice as examples. Results show little difference between the methods in accuracy, but substantial reductions in the variation of calibrated probabilities when using the first approach.
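The contrast between the two approaches can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, every function name (`fit_logistic`, `crude_impute`, `average_probabilities`, `pooled_coefficients`) is hypothetical, and `crude_impute` is only a stand-in for real multiple imputation software. For the pooled-coefficient approach, only the Rubin's rules point estimate (the mean of the coefficients) is used, and averaging its predictions over the imputed test sets is an assumption made here for comparability.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25, ridge=1e-8):
    """Logistic regression by Newton-Raphson; X must contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        hessian = X.T @ (X * w[:, None]) + ridge * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(hessian, X.T @ (y - p))
    return beta

def predict_proba(X, beta):
    return 1.0 / (1.0 + np.exp(-X @ beta))

def crude_impute(X, rng):
    """Hypothetical stand-in imputer: draw each missing value from a normal
    distribution with the observed mean and standard deviation of its column.
    The outcome is never used, so it also applies to future observations."""
    X = X.copy()
    for j in range(X.shape[1]):
        miss = np.isnan(X[:, j])
        obs = X[~miss, j]
        X[miss, j] = rng.normal(obs.mean(), obs.std(), miss.sum())
    return np.column_stack([np.ones(len(X)), X])  # prepend intercept

def average_probabilities(train_sets, y, test_sets):
    """Fit one model per imputation and average the predicted probabilities."""
    probs = [predict_proba(Xte, fit_logistic(Xtr, y))
             for Xtr, Xte in zip(train_sets, test_sets)]
    return np.mean(probs, axis=0)

def pooled_coefficients(train_sets, y, test_sets):
    """Pool coefficients across imputations (Rubin's rules point estimate),
    then predict with the single pooled model on each imputed test set."""
    beta_bar = np.mean([fit_logistic(Xtr, y) for Xtr in train_sets], axis=0)
    return np.mean([predict_proba(Xte, beta_bar) for Xte in test_sets], axis=0)

# Synthetic example: two predictors, about 20% of predictor values missing.
rng = np.random.default_rng(0)
n_train, n_test, n_imputations = 200, 50, 10
X_full = rng.normal(size=(n_train + n_test, 2))
logit = -0.5 + 1.0 * X_full[:, 0] - 1.0 * X_full[:, 1]
y_all = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))
X_miss = X_full.copy()
X_miss[rng.random(X_miss.shape) < 0.2] = np.nan

y_train = y_all[:n_train]
train_sets = [crude_impute(X_miss[:n_train], rng) for _ in range(n_imputations)]
test_sets = [crude_impute(X_miss[n_train:], rng) for _ in range(n_imputations)]

p_avg = average_probabilities(train_sets, y_train, test_sets)
p_pool = pooled_coefficients(train_sets, y_train, test_sets)
print(np.abs(p_avg - p_pool).max())  # largest disagreement between the two rules
```

In practice the imputed data sets would come from dedicated multiple imputation software, and the outcome of the held-out observations would be used only to evaluate the resulting probabilities, never to impute or fit.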

research
06/07/2021

Proper Scoring Rules for Missing Value Imputation

Given the prevalence of missing data in modern statistical research, a b...
research
12/02/2020

Real-time imputation of missing predictor values in clinical practice

Use of prediction models is widely recommended by clinical guidelines, b...
research
11/02/2022

Small area estimation using multiple imputation in three-parameter logistic models

We propose a novel methodology relating item response theory methods wit...
research
05/04/2022

The Effect of Multiple Imputation of Routine Pathology Variables on Laboratory Diagnosis of Hepatitis C Infection

Pathology tests are central to modern healthcare in terms of diagnosis a...
research
06/03/2022

Estimation of Over-parameterized Models via Fitting to Future Observations

From a model-building perspective, in this paper we propose a paradigm s...
research
02/24/2023

Multiple Imputation for Non-Monotone Missing Not at Random Binary Data using the No Self-Censoring Model

Although approaches for handling missing data from longitudinal studies ...
