Variable Selection for Multiply-imputed Data: A Bayesian Framework

10/31/2022
by Jungang Zou, et al.
Multiple imputation is a widely used technique for handling missing data in large observational studies. Variable selection on multiply-imputed datasets is problematic, however: if selection is performed on each imputed dataset separately, different datasets may yield different sets of important variables. MI-LASSO, one of the most popular solutions to this problem, treats the same variable across all imputed datasets as a group of variables and applies Group-LASSO to yield a consistent variable selection across all the multiply-imputed datasets. In this paper, we extend the MI-LASSO model into a Bayesian framework and use five different Bayesian MI-LASSO models to perform variable selection on multiply-imputed data. Three of these models are based on shrinkage priors and two on discrete mixture priors. We conduct a simulation study investigating the practical characteristics of each model across various settings. We further demonstrate these methods via a case study using the multiply-imputed data from the University of Michigan Dioxin Exposure Study. The Python package BMIselect is hosted on GitHub under an Apache-2.0 license: https://github.com/zjg540066169/Bmiselect.
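The grouping idea behind MI-LASSO can be illustrated with a minimal NumPy sketch. It is not the BMIselect API and not any of the Bayesian models from the paper: the function names `group_soft_threshold` and `mi_group_lasso` are hypothetical, the objective is assumed to be 0.5 * sum_d ||y_d - X_d b_d||^2 + lam * sum_j ||(b_1j, ..., b_Dj)||_2, and plain proximal gradient descent is used here purely for illustration. Because the penalty acts on the whole group of coefficients that one variable has across the D imputed datasets, a variable is either dropped from every dataset or kept in every dataset.

```python
import numpy as np

def group_soft_threshold(B, t):
    """Shrink each column of B (one group per variable) by t in L2 norm."""
    norms = np.linalg.norm(B, axis=0)  # one norm per variable, taken across imputations
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return B * scale  # columns whose norm is <= t become exactly zero

def mi_group_lasso(Xs, ys, lam, n_iter=500):
    """Proximal-gradient sketch; Xs and ys hold one (X, y) pair per imputed dataset."""
    D, p = len(Xs), Xs[0].shape[1]
    B = np.zeros((D, p))  # row d holds the coefficients fitted on imputed dataset d
    # Step size from the largest Lipschitz constant of the per-dataset gradients.
    step = 1.0 / max(np.linalg.norm(X, 2) ** 2 for X in Xs)
    for _ in range(n_iter):
        # Gradient of the least-squares term, computed dataset by dataset.
        grad = np.stack([X.T @ (X @ b - y) for X, y, b in zip(Xs, ys, B)])
        # The group-wise proximal step ties selection together across datasets.
        B = group_soft_threshold(B - step * grad, step * lam)
    return B
```

Calling `mi_group_lasso(Xs, ys, lam)` with lists of imputed design matrices and outcome vectors returns a D-by-p coefficient matrix; the selected variables are the columns that are nonzero, and under this penalty that set is shared by all imputed datasets.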


