In Nonparametric and High-Dimensional Models, Bayesian Ignorability is an Informative Prior

11/06/2021
by Antonio R. Linero, et al.

In problems with large amounts of missing data, one must model two distinct data-generating processes: the outcome process, which generates the response, and the missing data mechanism, which determines the data we observe. Under the ignorability condition of Rubin (1976), however, likelihood-based inference for the outcome process does not depend on the missing data mechanism, so only the former needs to be estimated; partly because of this simplification, ignorability is often used as a baseline assumption. We study the implications of Bayesian ignorability in the presence of high-dimensional nuisance parameters and argue that ignorability is typically incompatible with sensible prior beliefs about the amount of selection bias. We show that, for many problems, ignorability directly implies that the prior on the selection bias is tightly concentrated around zero. We demonstrate this on several models of practical interest and characterize the effect of ignorability on the posterior distribution for high-dimensional linear models with a ridge regression prior. We then show how to build high-dimensional models that encode sensible beliefs about the selection bias, and show that under certain narrow circumstances ignorability is less problematic.
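The concentration claim can be illustrated with a small prior simulation. The sketch below is not the paper's construction; it assumes a toy setup chosen only because the selection bias has a closed form: covariates X ~ N(0, I_p), outcome Y = X'beta + noise, a probit missingness mechanism P(R = 1 | X) = Phi(X'gamma), and independent ridge-type priors beta, gamma ~ N(0, I_p / p). Under these assumptions the selection bias E[Y | R = 1] - E[Y] equals sqrt(2/pi) * beta'gamma / sqrt(1 + |gamma|^2). The function name `selection_bias_draws` and all parameter names are hypothetical.

```python
import numpy as np


def selection_bias_draws(p, n_draws=2000, seed=0):
    """Draw the implied selection bias from the prior in a toy model.

    Assumed (illustrative) model, not taken from the paper:
      X ~ N(0, I_p),  Y = X'beta + noise,  P(R = 1 | X) = Phi(X'gamma),
      beta and gamma independent a priori, each N(0, I_p / p).
    """
    rng = np.random.default_rng(seed)
    scale = 1.0 / np.sqrt(p)
    beta = rng.normal(scale=scale, size=(n_draws, p))
    gamma = rng.normal(scale=scale, size=(n_draws, p))

    # Row-wise inner products beta'gamma and squared norms |gamma|^2.
    inner = np.einsum("ij,ij->i", beta, gamma)
    gamma_norm_sq = np.sum(gamma**2, axis=1)

    # Closed form for E[Y | R = 1] - E[Y] under the assumptions above.
    return np.sqrt(2.0 / np.pi) * inner / np.sqrt(1.0 + gamma_norm_sq)


for p in (10, 100, 1000):
    sd = selection_bias_draws(p).std()
    print(f"p = {p:5d}   prior sd of selection bias = {sd:.4f}")
```

In this toy model the prior standard deviation of the selection bias shrinks roughly like 1/sqrt(p), even though the priors on beta and gamma each keep a total signal size of order one. The independence of the two priors is what drives the shrinkage: two independent random directions in high dimensions are nearly orthogonal, so beta'gamma is close to zero with high prior probability.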

