Nearest neighbor ratio imputation with incomplete multi-nomial outcome in survey sampling

02/23/2022
by Chenyin Gao, et al.

Nonresponse is a common problem in survey sampling. Appropriate treatment can be challenging, especially when dealing with detailed breakdowns of totals. Often, the nearest neighbor imputation method is used to handle such incomplete multinomial data. In this article, we investigate the nearest neighbor ratio imputation estimator, in which auxiliary variables are used to identify the closest donor, and the donor's vector of proportions is then applied to the recipient's total to implement ratio imputation. To estimate the asymptotic variance, we first treat nearest neighbor ratio imputation as a special case of predictive matching imputation and apply the linearization method of <cit.>. To account for non-negligible sampling fractions, parametric and generalized additive models are employed to incorporate the smoothness of the imputation estimator, which results in a valid variance estimator. We apply the proposed method to estimate expenditure detail items based on empirical data from the 2018 collection of the Service Annual Survey, conducted by the United States Census Bureau. Our simulation results demonstrate the validity of the proposed estimators and confirm that the derived variance estimators perform well even when the sampling fraction is non-negligible.
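The imputation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `nn_ratio_impute`, the use of Euclidean distance on the auxiliary variables, the assumption that totals are observed for every unit, and the toy data are all assumptions made for illustration. Survey weights and variance estimation are left out.

```python
import numpy as np


def nn_ratio_impute(x, totals, details, responded):
    """Hypothetical helper: nearest neighbor ratio imputation of detail items.

    x         : (n,) or (n, p) auxiliary variables, observed for all units
    totals    : (n,) totals, assumed observed for all units
    details   : (n, k) detail items; rows of nonrespondents are ignored
    responded : (n,) boolean, True where the detail items were reported
    """
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    totals = np.asarray(totals, dtype=float)
    details = np.asarray(details, dtype=float)
    responded = np.asarray(responded, dtype=bool)

    donors = np.flatnonzero(responded)
    recipients = np.flatnonzero(~responded)
    imputed = details.copy()
    if recipients.size == 0:
        return imputed

    # Each donor's vector of proportions (detail items divided by their sum).
    donor_details = details[donors]
    props = donor_details / donor_details.sum(axis=1, keepdims=True)

    # Euclidean distance from every recipient to every donor (an assumption;
    # the abstract only says auxiliary variables identify the closest donor).
    diffs = x[recipients][:, None, :] - x[donors][None, :, :]
    dist = np.linalg.norm(diffs, axis=2)
    nearest = dist.argmin(axis=1)

    # Ratio imputation: donor proportions applied to the recipient's total.
    imputed[recipients] = totals[recipients][:, None] * props[nearest]
    return imputed


# Toy data (made up): three respondents and one nonrespondent.
x = np.array([1.0, 2.0, 5.0, 1.2])
totals = np.array([100.0, 200.0, 300.0, 50.0])
details = np.array([[20.0, 80.0],
                    [100.0, 100.0],
                    [90.0, 210.0],
                    [np.nan, np.nan]])
responded = np.array([True, True, True, False])

out = nn_ratio_impute(x, totals, details, responded)
print(out[3])
```

In this toy example the nonrespondent's closest donor is the first unit, whose proportions (0.2, 0.8) are applied to the recipient's total of 50. The imputed detail items therefore sum to the recipient's own total, which is the property that makes ratio imputation suitable for detailed breakdowns of totals.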


