Automated Selection of Post-Strata using a Model-Assisted Regression Tree Estimator

12/15/2017
by Kelly S. McConville, et al.

Auxiliary information can increase the efficiency of survey estimators through an assisting model when the model captures some of the relationship between the auxiliary data and the study variables. Despite their superior properties, model-assisted estimators are rarely used by statistical agencies in anything but their simplest form to produce official statistics. This is because the more complicated models that have been used in model-assisted estimation are often ill-suited to the available auxiliary data. Under a model-assisted framework, we propose a regression tree estimator for a finite population total. Regression tree models are adept at handling the type of auxiliary data usually available in the sampling frame and provide a model that is easy to explain and justify. The estimator can be viewed as a post-stratification estimator where the post-strata are automatically selected by the recursive partitioning algorithm of the regression tree. We establish consistency of the regression tree estimator and compare its performance to other survey estimators using the US Bureau of Labor Statistics Occupational Employment Statistics Survey.
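To make the post-stratification view concrete, here is a minimal sketch in Python. It is not the authors' implementation: it assumes a single numeric auxiliary variable known for every population unit, grows only one split (a "stump") by minimizing the design-weighted within-group sum of squares, and then applies the standard post-stratified estimator, in which each post-stratum's population count is multiplied by the weighted sample mean in that post-stratum. The function names `best_split` and `post_stratified_total`, the `min_leaf` rule, and the toy data are all hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def best_split(x, y, w, min_leaf=2):
    """Pick the threshold on x that minimizes the weighted within-group
    sum of squared errors (a one-split regression tree)."""
    best_thr, best_sse = None, float("inf")
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2.0
        sse, ok = 0.0, True
        for side in (True, False):
            idx = [i for i in range(len(x)) if (x[i] <= thr) == side]
            if len(idx) < min_leaf:  # refuse splits that leave a tiny group
                ok = False
                break
            wsum = sum(w[i] for i in idx)
            mean = sum(w[i] * y[i] for i in idx) / wsum
            sse += sum(w[i] * (y[i] - mean) ** 2 for i in idx)
        if ok and sse < best_sse:
            best_thr, best_sse = thr, sse
    return best_thr


def post_stratified_total(y, w, strata, pop_counts):
    """Post-stratified estimate of the population total:
    sum over post-strata of N_g times the weighted sample mean in g.
    Assumes every post-stratum in pop_counts contains sampled units."""
    num, den = defaultdict(float), defaultdict(float)
    for yi, wi, g in zip(y, w, strata):
        num[g] += wi * yi
        den[g] += wi
    return sum(n_g * num[g] / den[g] for g, n_g in pop_counts.items())


# Toy data (illustrative only): auxiliary variable known for all N = 10 units.
x_pop = [1, 2, 3, 4, 10, 11, 12, 13, 14, 15]
# Simple random sample of n = 5, so each design weight is N / n = 2.
x_s = [1, 3, 10, 12, 15]
y_s = [2.0, 4.0, 20.0, 22.0, 24.0]
w_s = [2.0] * 5

thr = best_split(x_s, y_s, w_s)

# Post-strata are the two sides of the split; counts come from the frame.
strata_s = [xi <= thr for xi in x_s]
pop_counts = defaultdict(int)
for xi in x_pop:
    pop_counts[xi <= thr] += 1

total = post_stratified_total(y_s, w_s, strata_s, dict(pop_counts))
```

A full regression tree would apply the split search recursively within each group, and the resulting terminal nodes would play the role of the post-strata; the estimator step stays the same.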

research
12/14/2020

Model-assisted estimation in high-dimensional settings for survey data

Model-assisted estimators have attracted a lot of attention in the last ...
research
05/19/2020

Generalised regression estimation given imperfectly matched auxiliary data

Generalised regression estimation allows one to make use of available au...
research
08/09/2022

Model-Assisted Estimators under Nonresponse in Sample Surveys

In the presence of auxiliary information, model-assisted estimators use ...
research
03/25/2020

Design-unbiased statistical learning in survey sampling

Design-consistent model-assisted estimation has become the standard prac...
research
03/12/2018

Adaptive two-stage sequential double sampling

In many surveys inexpensive auxiliary variables are available that can h...
research
02/22/2020

Model-assisted estimation through random forests in finite population sampling

Surveys are used to collect data on a subset of a finite population. Mos...
research
05/09/2019

Double-calibration estimators accounting for under-coverage and nonresponse in socio-economic surveys

Under-coverage and nonresponse problems are jointly present in most soci...
