Missing data analysis and imputation via latent Gaussian Markov random fields

12/23/2019
by Virgilio Gomez-Rubio, et al.

In this paper we recast the problem of missing values in the covariates of a regression model as a latent Gaussian Markov random field (GMRF) model in a fully Bayesian framework. Our approach defines the covariate imputation sub-model as a latent effect with a GMRF structure. We show how this formulation works for continuous covariates and give some insight into how it could be extended to categorical covariates. The resulting Bayesian hierarchical model fits naturally within the integrated nested Laplace approximation (INLA) framework, which we use for model fitting. Our work therefore fills an important gap in the INLA methodology, as it allows models with missing values in the covariates to be fitted. As in any other fully Bayesian framework, relying on INLA for model fitting makes it possible to formulate a joint model for the data, the imputed covariates and their missingness mechanism. In this way, we are able to tackle the more general problem of assessing the missingness mechanism by conducting a sensitivity analysis over the alternative models for the non-observed covariates. Finally, we illustrate the proposed approach with two examples, one on modeling health risk factors and one on disease mapping. These rely on two different imputation mechanisms, based on a typical multiple linear regression and on a spatial model, respectively. Because model fitting with INLA is fast, we are able to fit joint models in a short time and to conduct sensitivity analyses easily.
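To make the structure of the joint model described above more concrete, one possible way of writing it down is sketched here. The notation is an assumption made for illustration and is not taken from the paper: y_i is the response, x_i a continuous covariate with some values missing, z_i fully observed covariates, r_i an indicator that x_i is missing, and Q_x a sparse precision matrix.

```latex
% Illustrative sketch only; all symbols are assumed notation, not the paper's.
\begin{align*}
  % Model of interest: response given the (partly imputed) covariate
  y_i \mid \eta_i &\sim f(y_i \mid \eta_i), \qquad
  \eta_i = \beta_0 + \beta_1 x_i + \boldsymbol{\beta}_z^{\top} \mathbf{z}_i, \\
  % Imputation sub-model: the covariate vector is a latent GMRF
  \mathbf{x} = (\mathbf{x}_{\mathrm{obs}}, \mathbf{x}_{\mathrm{mis}}) \mid \boldsymbol{\theta}_x
  &\sim \mathcal{N}\!\left(\boldsymbol{\mu}_x,\; \mathbf{Q}_x^{-1}\right), \\
  % Missingness mechanism: probability that x_i is not observed
  r_i \mid p_i &\sim \mathrm{Bernoulli}(p_i), \qquad
  \operatorname{logit}(p_i) = \gamma_0 + \gamma_1 x_i .
\end{align*}
```

Under this sketch, a linear-regression imputation model would correspond to a mean vector built from the observed covariates with a diagonal Q_x, while a spatial imputation model would correspond to a Q_x that encodes neighbourhood structure. Setting gamma_1 = 0 makes the missingness independent of the unobserved value, and comparing fits across different choices of gamma_1 and of the imputation sub-model is one form the sensitivity analysis could take.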

research
03/27/2023

A joint Bayesian framework for missing data and measurement error using integrated nested Laplace approximations

Measurement error (ME) and missing values in covariates are often unavoi...
research
06/16/2022

Modeling rates of disease with missing categorical data

Covariates like age, sex, and race/ethnicity provide invaluable insight ...
research
04/06/2021

Variable selection with missing data in both covariates and outcomes: Imputation and machine learning

The missing data issue is ubiquitous in health studies. Variable selecti...
research
01/19/2022

Bayesian Prediction with Covariates Subject to Detection Limits

Missing values in covariates due to censoring by signal interference or ...
research
10/30/2019

Fully Bayesian imputation model for non-random missing data in qPCR

We propose a new statistical approach to obtain differential gene expres...
research
06/12/2019

A Bayesian Hierarchical Model for Evaluating Forensic Footwear Evidence

When a latent shoeprint is discovered at a crime scene, forensic analyst...
research
06/28/2019

missSBM: An R Package for Handling Missing Values in the Stochastic Block Model

The Stochastic Block Model (SBM) is a popular probabilistic model for ra...
