Assessing small area estimates via artificial populations from KBAABB: a kNN-based approximation to ABB

06/27/2023
by Jerzy A. Wieczorek, et al.

Comparing and evaluating small area estimation (SAE) models for a given application is inherently difficult. Typically, we do not have enough data in many areas to check unit-level modeling assumptions or to assess unit-level predictions empirically, and there is no ground truth available for checking area-level estimates. Design-based simulation from artificial populations can help with each of these issues, but only if the artificial populations (a) realistically represent the application at hand and (b) are not built using assumptions that could inherently favor one SAE model over another. In this paper, we borrow ideas from random hot deck, approximate Bayesian bootstrap (ABB), and k-nearest-neighbor (kNN) imputation methods, which are often used for multiple imputation of missing data. We propose a kNN-based approximation to ABB (KBAABB) for a different purpose: generating an artificial population when rich unit-level auxiliary data is available. We introduce diagnostic checks on the process of building the artificial population itself, and we demonstrate how to use such an artificial population for design-based simulation studies to compare and evaluate SAE models, using real data from the Forest Inventory and Analysis (FIA) program of the US Forest Service. We illustrate how such simulation studies may be disseminated and explored interactively through an online R Shiny application.
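
The general idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes one plausible reading of a kNN-based hot deck, in which each population unit receives the response of a donor drawn uniformly at random from its k nearest sampled units in auxiliary-variable space. It then runs a toy design-based simulation that compares a direct per-area sample mean against the known area means of the artificial population. The function names, the Euclidean distance, the simple random sampling design, and the toy data are all assumptions made for illustration.

```python
import math
import random
from statistics import mean


def knn_hot_deck_population(population_x, sample_x, sample_y, k=10, seed=0):
    """Assign each population unit the response of a random donor chosen
    among its k nearest sampled units (hypothetical helper, not from the paper)."""
    rng = random.Random(seed)
    k = min(k, len(sample_x))
    artificial_y = []
    for x in population_x:
        # Rank sampled units by Euclidean distance in auxiliary-variable space.
        order = sorted(range(len(sample_x)), key=lambda i: math.dist(x, sample_x[i]))
        # Draw one donor uniformly at random from the k closest sampled units.
        donor = rng.choice(order[:k])
        artificial_y.append(sample_y[donor])
    return artificial_y


def simulate_direct_estimator(artificial_y, area_ids, n, reps=200, seed=1):
    """Repeatedly draw simple random samples of size n from the artificial
    population and return the per-area RMSE of the direct sample mean."""
    rng = random.Random(seed)
    areas = sorted(set(area_ids))
    # The artificial population is fully known, so area-level truth is available.
    truth = {a: mean(y for y, g in zip(artificial_y, area_ids) if g == a) for a in areas}
    sq_err = {a: [] for a in areas}
    units = range(len(artificial_y))
    for _ in range(reps):
        drawn = rng.sample(units, n)
        for a in areas:
            ys = [artificial_y[i] for i in drawn if area_ids[i] == a]
            if ys:  # The direct estimate is undefined when an area has no sampled units.
                sq_err[a].append((mean(ys) - truth[a]) ** 2)
    return {a: math.sqrt(mean(e)) if e else float("nan") for a, e in sq_err.items()}


# Toy data (invented for illustration): three sampled units, four population units.
sample_x = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
sample_y = [10.0, 12.0, 40.0]
population_x = [(0.1, 0.0), (0.9, 1.2), (4.8, 5.1), (5.2, 4.9)]
area_ids = ["A", "A", "B", "B"]

artificial_y = knn_hot_deck_population(population_x, sample_x, sample_y, k=2)
rmse_by_area = simulate_direct_estimator(artificial_y, area_ids, n=2, reps=100)
print(artificial_y, rmse_by_area)
```

In a realistic study, the direct estimator would be replaced by each candidate SAE model, and the sampling step would mimic the actual survey design rather than simple random sampling.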


