Challenges of the inconsistency regime: Novel debiasing methods for missing data models

09/04/2023
by Michael Celentano, et al.

We study semi-parametric estimation of the population mean when data are missing at random (MAR) in the n < p "inconsistency regime", in which neither the outcome model nor the propensity/missingness model can be estimated consistently. We consider a high-dimensional linear-GLM specification in which the number of confounders is proportional to the sample size. In the case n > p, past work has developed theory for the classical AIPW estimator in this model and established its variance inflation and asymptotic normality when the outcome model is fit by ordinary least squares. Ordinary least squares is no longer feasible in the case n < p studied here, and we also demonstrate that a number of classical debiasing procedures become inconsistent. This challenge motivates our development and analysis of a novel procedure: we establish that it is consistent for the population mean under proportional asymptotics allowing for n < p, and we also provide confidence intervals for the linear model coefficients. Providing such guarantees in the inconsistency regime requires a new debiasing approach that combines penalized M-estimates of both the outcome and propensity/missingness models in a non-standard way.
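
For orientation, the classical AIPW estimator mentioned above can be sketched in the easy n > p setting, where ordinary least squares is still feasible. This is only an illustration under stated assumptions, not the novel procedure of the paper: the simulated data, the logistic form of the missingness model, the Newton solver, the propensity clipping, and all function names (`simulate`, `fit_logistic`, `aipw_mean`) are hypothetical choices made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, a, n_iter=25, ridge=1e-8):
    """Unpenalized logistic regression by Newton's method.
    X is assumed to already contain an intercept column; the tiny
    ridge term only stabilizes the linear solve."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ theta)
        grad = X.T @ (a - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X + ridge * np.eye(X.shape[1])
        theta = theta + np.linalg.solve(hess, grad)
    return theta

def aipw_mean(X, y, a):
    """Classical AIPW estimate of E[y] when y is seen only where a == 1.
    Outcome model: OLS on the observed units (requires n_obs > p).
    Propensity model: logistic regression on all units (an assumption)."""
    n = X.shape[0]
    X1 = np.column_stack([np.ones(n), X])
    obs = a == 1
    beta_hat, *_ = np.linalg.lstsq(X1[obs], y[obs], rcond=None)
    m_hat = X1 @ beta_hat                      # fitted outcome for every unit
    theta_hat = fit_logistic(X1, a)
    # Clip estimated propensities away from zero (illustrative safeguard).
    pi_hat = np.clip(sigmoid(X1 @ theta_hat), 1e-3, 1.0)
    # Residual correction applies only to units whose outcome was observed.
    resid = np.where(obs, y - m_hat, 0.0)
    return np.mean(m_hat + resid / pi_hat)

def simulate(n=2000, p=5, seed=0):
    """Hypothetical MAR data with a linear outcome and logistic missingness.
    The true population mean of y is 1.0 by construction."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, p))
    y_full = 1.0 + X @ np.full(p, 0.5) + rng.normal(size=n)
    pi = sigmoid(0.5 + X @ np.full(p, 0.3))
    a = (rng.uniform(size=n) < pi).astype(float)
    y = np.where(a == 1, y_full, np.nan)       # unobserved outcomes are NaN
    return X, y, a

X, y, a = simulate()
print("AIPW estimate of the mean:", aipw_mean(X, y, a))
```

With p fixed and small relative to n, as in this toy simulation, both nuisance models are estimated well and the estimate should land near the true mean of 1.0. The regime studied in the paper is different: with n < p the OLS step above is not even defined, which is what motivates replacing it with penalized M-estimates.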


research
10/11/2021

Two-stage least squares with a randomly right censored outcome

This note develops a simple two-stage least squares (2SLS) procedure to ...
research
02/09/2017

Rate Optimal Estimation and Confidence Intervals for High-dimensional Regression with Missing Covariates

Although a majority of the theoretical literature in high-dimensional st...
research
02/02/2019

Learning Linear Dynamical Systems with Semi-Parametric Least Squares

We analyze a simple prefiltered variation of the least squares estimator...
research
06/20/2018

Regression adjustment in randomized experiments with a diverging number of covariates

Extending R. A. Fisher and D. A. Freedman's results on the analysis of c...
research
06/23/2016

Semi-supervised Inference: General Theory and Estimation of Means

We propose a general semi-supervised inference framework focused on the ...
research
02/02/2019

High-dimensional semi-supervised learning: in search for optimal inference of the mean

We provide a high-dimensional semi-supervised inference framework focuse...
research
12/17/2021

The Effect of Sample Size and Missingness on Inference with Missing Data

When are inferences (whether Direct-Likelihood, Bayesian, or Frequentist...
