Pretest estimation in combining probability and non-probability samples

05/28/2023
by Chenyin Gao, et al.

Multiple heterogeneous data sources are becoming increasingly available for statistical analyses in the era of big data. As an important example in finite-population inference, we develop a unified framework for the test-and-pool approach to general parameter estimation, which combines a gold-standard probability sample with a non-probability sample. We focus on the case where the study variable used to estimate the target parameters is observed in both datasets, each of which also contains auxiliary variables. Utilizing the probability design, we conduct a pretest to determine whether the non-probability data are comparable with the probability data, and thereby decide whether to leverage the non-probability data in a pooled analysis. When the two datasets are comparable, our approach combines them for efficient estimation. Otherwise, we retain only the probability data for estimation. We also characterize the asymptotic distribution of the proposed test-and-pool estimator under a local alternative and provide a data-adaptive procedure for selecting the critical tuning parameters so as to minimize the mean squared error of the test-and-pool estimator. Lastly, to deal with the non-regularity of the test-and-pool estimator, we construct a robust confidence interval with good finite-sample coverage.
