Missing at Random or Not: A Semiparametric Testing Approach

03/25/2020
by Rui Duan, et al.

Missing data are common in practice, and many statistical methods have been developed to address the validity and efficiency of procedures applied to incomplete data. A central and longstanding concern is the mechanism governing missingness: identifying the appropriate mechanism is crucial for conducting a sound analysis. Three classes of mechanism are conventionally distinguished: missing completely at random, missing at random, and missing not at random. In this paper, we present a new hypothesis testing approach for deciding between missing at random and missing not at random. Since the alternatives to missing at random are broad, we focus on a general class of models with instrumental variables for data missing not at random. Our setting is broadly applicable because the model for the missing data mechanism is nonparametric and requires no explicit specification. The foundational idea is to develop appropriate discrepancy measures between estimators whose properties differ significantly only when missing at random does not hold. We show that the new approach provides an objective, data-driven choice between missing at random and missing not at random. We demonstrate the feasibility, validity, and efficacy of the new test through theoretical analysis, simulation studies, and a real data analysis.
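
The discrepancy idea in the abstract can be illustrated with a toy simulation. The sketch below is not the test proposed in the paper: it is a minimal illustration, under assumed data-generating models, of how an estimator built on the missing-at-random assumption (here an inverse-probability-weighted mean) stays close to a reference value when missingness depends only on an observed covariate, and drifts away when missingness also depends on the outcome itself. The reference value is the full-data mean, which is available only in simulation and stands in for an estimator that remains valid when data are missing not at random. All function names, coefficients, and sample sizes are hypothetical.

```python
import numpy as np


def simulate(n, mnar, rng):
    """Draw (x, y, r); r is True where y is observed. Assumed toy models."""
    x = rng.normal(size=n)
    y = 1.0 + x + rng.normal(size=n)
    # Response depends on x only (MAR) or on x and y (MNAR).
    lin = 0.5 + x + (y if mnar else 0.0)
    r = rng.random(n) < 1.0 / (1.0 + np.exp(-lin))
    return x, y, r


def fit_propensity(x, r, iters=25):
    """Logistic regression of r on (1, x) by Newton-Raphson; returns fitted P(r=1 | x)."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (r - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, grad)
    return 1.0 / (1.0 + np.exp(-X @ beta))


def ipw_mean(x, y, r):
    """Normalized inverse-probability-weighted mean of y, valid under MAR given x."""
    r = r.astype(float)
    w = r / fit_propensity(x, r)
    return np.sum(w * y) / np.sum(w)


def discrepancy(mnar, n=20000, seed=0):
    """MAR-based estimate minus the full-data mean (an oracle used only in simulation)."""
    rng = np.random.default_rng(seed)
    x, y, r = simulate(n, mnar, rng)
    return ipw_mean(x, y, r) - y.mean()


if __name__ == "__main__":
    print("discrepancy under MAR: ", discrepancy(mnar=False))
    print("discrepancy under MNAR:", discrepancy(mnar=True))
```

In this toy setting the discrepancy is close to zero under missing at random and clearly positive under missing not at random, because larger outcomes are more likely to be observed. A formal test would additionally need a reference estimator computable from the observed data and a null distribution for the discrepancy, which is what the paper develops.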


research
08/15/2022

An integrated approach to test for missingness not at random

Missing data can lead to inefficiencies and biases in analyses, in parti...
research
06/03/2022

Hypothesis testing for matched pairs with missing data by maximum mean discrepancy: An application to continuous glucose monitoring

A frequent problem in statistical science is how to properly handle miss...
research
05/27/2021

Score test for missing at random or not

Missing data are frequently encountered in various disciplines and can b...
research
11/10/2021

Variable selection and missing data imputation in categorical genomic data analysis by integrated ridge regression and random forest

Genomic data arising from a genome-wide association study (GWAS) are oft...
research
02/28/2022

On Testability and Goodness of Fit Tests in Missing Data Models

Significant progress has been made in developing identification and esti...
research
03/11/2021

Likelihood-based missing data analysis in multivariate crossover trials

For gene expression data measured in a crossover trial, a multivariate m...
research
04/01/2018

Missing Data as Part of the Social Behavior in Real-World Financial Complex Systems

Many real-world networks are known to exhibit facts that counter our kno...
