Integration of survey data and big observational data for finite population inference using mass imputation

07/08/2018
by Shu Yang, et al.

Multiple data sources are becoming increasingly available for statistical analyses in the era of big data. As an important example in finite-population inference, we consider an imputation approach to combining a probability sample with big observational data. Unlike the usual imputation for missing-data analysis, we create imputed values for all elements in the probability sample. Such mass imputation is attractive in the context of survey data integration (Kim and Rao, 2012). We extend mass imputation as a tool for integrating survey data with big non-survey data. The mass imputation methods and their statistical properties are presented. The matching estimator of Rivers (2007) is also covered as a special case. Variance estimation with mass-imputed data is discussed. Simulation results demonstrate that the proposed estimators outperform existing competitors in terms of robustness and efficiency.
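
To make the idea of mass imputation concrete, here is a minimal sketch of a nearest-neighbor version, in the spirit of a matching estimator. It is an illustration under stated assumptions, not the paper's implementation: we assume the big observational data record both covariates and the outcome, the probability sample supplies covariates and design weights only, and the population mean is estimated by a weight-normalized mean of the imputed outcomes. The function name `nn_mass_imputation_mean` and its arguments are hypothetical.

```python
def nn_mass_imputation_mean(x_big, y_big, x_sample, weights):
    """Estimate a population mean by nearest-neighbor mass imputation.

    Assumptions (not taken from the paper's text):
      x_big, y_big : covariate vectors and outcomes observed in the big data
      x_sample     : covariate vectors for every unit in the probability sample
      weights      : design weights of the probability sample units
    """
    def sq_dist(u, v):
        # Squared Euclidean distance between two covariate vectors.
        return sum((a - b) ** 2 for a, b in zip(u, v))

    imputed = []
    for x in x_sample:
        # Index of the big-data unit closest to this sample unit.
        j = min(range(len(x_big)), key=lambda k: sq_dist(x, x_big[k]))
        # Impute the outcome for EVERY sample unit (mass imputation).
        imputed.append(y_big[j])

    # Weight-normalized mean of the imputed outcomes.
    total_weight = sum(weights)
    return sum(w * y for w, y in zip(weights, imputed)) / total_weight


# Toy usage: four big-data units, two probability-sample units.
x_big = [(0.0,), (1.0,), (2.0,), (3.0,)]
y_big = [10.0, 20.0, 30.0, 40.0]
x_sample = [(0.1,), (2.9,)]
weights = [1.0, 3.0]
estimate = nn_mass_imputation_mean(x_big, y_big, x_sample, weights)
```

In this sketch the big data contribute only the imputed outcome values, while the design weights of the probability sample carry the representativeness. The linear scan over the big data is for readability only and would be replaced by an indexed search at realistic sizes.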

research
12/27/2018

Combining Non-probability and Probability Survey Samples Through Mass Imputation

This paper presents theoretical results on combining non-probability and...
research
03/26/2020

Data Integration by combining big data and survey sample data for finite population inference

The statistical challenges in using big data for making valid statistica...
research
01/09/2020

Statistical Data Integration in Survey Sampling: A Review

Finite population inference is a central goal in survey sampling. Probab...
research
05/28/2023

Pretest estimation in combining probability and non-probability samples

Multiple heterogeneous data sources are becoming increasingly available ...
research
09/03/2022

Estimating Demand for Online Delivery using Limited Historical Observations

Driven in part by the COVID-19 pandemic, the pace of online purchases fo...
research
04/25/2019

A Preferential Attachment Model for the Stellar Initial Mass Function

Accurate specification of a likelihood function is becoming increasingly...
research
01/02/2018

Combining multiple observational data sources to estimate causal effects

The era of big data has witnessed an increasing availability of multiple...
