On application of a response propensity model to estimation from web samples

06/20/2019
by Vladislav Beresovsky, et al.

Increasing nonresponse rates and the cost of data collection are two pressing problems encountered in traditional randomized surveys. The proliferation of inexpensive data from web surveys stimulates interest in statistical techniques for valid inference from web samples. We consider estimation of population and domain means in the two-sample setup, where the web sample contains the variables of interest and covariates that are shared with an auxiliary random sample. First, we propose an estimator of the population mean based on the estimated propensity of response to a web survey, a.k.a. web response propensity. This makes inference from web samples similar to well-established techniques used for observational studies and missing data problems. Second, we propose an "implicit logistic" regression for estimating the parameters of the web response model in the two-sample setup. In addition to random sample design information, it utilizes random sample inclusion probabilities, nominally assigned to web sample units, and the size of the subpopulation of web responders. A simulation study confirms the validity of the proposed estimator in comparison with alternative approximate estimators. We illustrate our method by estimating the prevalence of chronic health conditions and related medication use for the U.S. population of adults, using web and random samples from an experimental web survey and the National Health Interview Survey.
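
To make the idea of a propensity-based estimator of the population mean concrete, the sketch below shows a generic inverse-propensity-weighted (Hajek-type) mean. It is an illustration under stated assumptions, not the estimator proposed in the paper: it assumes the web response propensities have already been estimated by some model, and it does not implement the "implicit logistic" regression. The function name `ipw_mean` and the toy numbers are hypothetical.

```python
def ipw_mean(y, propensity):
    """Inverse-propensity-weighted (Hajek-type) estimate of a population mean.

    y          -- outcomes observed for web sample units
    propensity -- estimated web response propensities for the same units
                  (assumed to be supplied by a separately fitted model)
    """
    if len(y) != len(propensity):
        raise ValueError("y and propensity must have the same length")
    if len(y) == 0:
        raise ValueError("at least one observation is required")
    if any(p <= 0.0 or p > 1.0 for p in propensity):
        raise ValueError("propensities must lie in (0, 1]")

    # Each responder stands in for 1 / propensity population units.
    weights = [1.0 / p for p in propensity]
    weighted_total = sum(w * yi for w, yi in zip(weights, y))
    return weighted_total / sum(weights)


# Toy data (made up for illustration): units with y = 0 respond less often,
# so the unweighted web sample mean overstates the prevalence.
y = [1, 0, 1, 0]
propensity = [0.5, 0.25, 0.5, 0.25]

naive_estimate = sum(y) / len(y)
ipw_estimate = ipw_mean(y, propensity)
```

In this toy case the weighted estimate is lower than the unweighted sample mean, because units with lower response propensity receive larger weights. A domain mean could be obtained the same way by restricting `y` and `propensity` to the units in the domain.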
