Bayesian Uncertainty Estimation Under Complex Sampling

07/31/2018
by Matthew R. Williams, et al.

Multistage sampling designs used by federal statistical agencies are typically constructed to maximize the efficiency of the target domain-level estimator (e.g., indexed by geographic area) within the cost constraints of administering survey instruments. Sampling designs are usually constructed to be informative, whereby inclusion probabilities are correlated with the response variable of interest in order to minimize the variance of the resulting estimator. Multistage sampling designs may also induce dependence between the sampled units; for example, a sampling step may select geographically indexed clusters of units in order to manage the cost of collection efficiently. A data analyst may use a sampling-weighted pseudo-posterior distribution to estimate the population model on the observed sample. The dependence induced between co-clustered units inflates the scale of the resulting pseudo-posterior covariance matrix, which has been shown to induce undercoverage of the credibility sets. While the pseudo-posterior distribution contracts on the true population model parameters, we demonstrate that the scale and shape of the asymptotic distributions differ among the MLE, the pseudo-posterior, and the MLE under simple random sampling. Motivated by the different forms of the asymptotic covariance matrices and by the within-cluster dependence, we devise a correction that is applied as a simple and fast post-processing step to the MCMC draws from the pseudo-posterior distribution. Our updating step projects the pseudo-posterior covariance matrix such that nominal coverage is approximately achieved, with credibility sets that account for both the distribution for population generation, $P_{\theta_0}$, and the distribution for the multistage, informative sampling, $P_{\nu}$. We demonstrate the efficacy of our procedure on synthetic data and apply it to the National Survey on Drug Use and Health.
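The abstract gives no formulas for the post-processing step, so the following is only a minimal sketch of the general idea of rescaling MCMC draws so that their covariance matches a target matrix. The function name `rescale_draws` and the argument `target_cov` are hypothetical, and the target covariance (for instance, a design-based estimate of the asymptotic covariance) is assumed to be supplied by the analyst; this should not be read as the authors' exact procedure.

```python
import numpy as np


def rescale_draws(draws, target_cov):
    """Affinely map draws so that their sample covariance equals target_cov.

    draws      : (n_draws, d) array of MCMC draws (hypothetical input).
    target_cov : (d, d) symmetric positive-definite matrix, assumed to be
                 supplied by the analyst; it is not computed here.
    """
    draws = np.asarray(draws, dtype=float)
    center = draws.mean(axis=0)

    # Sample covariance of the unadjusted draws.
    current_cov = np.atleast_2d(np.cov(draws, rowvar=False))
    target_cov = np.atleast_2d(np.asarray(target_cov, dtype=float))

    # Lower Cholesky factors: current_cov = L2 L2^T, target_cov = L1 L1^T.
    L2 = np.linalg.cholesky(current_cov)
    L1 = np.linalg.cholesky(target_cov)

    # A = L1 L2^{-1} satisfies A current_cov A^T = target_cov.
    # Solve for A^T directly rather than forming an explicit inverse.
    A_T = np.linalg.solve(L2.T, L1.T)

    # Rescale around the mean so the point estimate is unchanged.
    return (draws - center) @ A_T + center


# Illustration on synthetic draws (all values here are made up).
rng = np.random.default_rng(0)
raw = rng.normal(size=(2000, 2))
target = np.array([[2.0, 0.5], [0.5, 1.0]])
adjusted = rescale_draws(raw, target)
```

Because the map is affine and centered on the draws' mean, it leaves the posterior mean in place and changes only the scale and shape of the resulting credibility sets.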

