Reducing Seed Bias in Respondent-Driven Sampling by Estimating Block Transition Probabilities

12/04/2018
by Yilin Zhang, et al.

Respondent-driven sampling (RDS) is a popular approach to studying marginalized or hard-to-reach populations. It collects samples from a networked population by incentivizing participants to refer their friends into the study. One major challenge in analyzing RDS samples is seed bias. Seed bias arises when the social network is divided into multiple communities (or blocks): the RDS sample might not provide a balanced representation of the different communities in the population, and this imbalance is correlated with the block of the initial participant (the seed). In this case, the distributions of estimators are typically non-trivial mixtures, which are determined (1) by the seed and (2) by how the referrals transition from one block to another. This paper shows that (1) block-transition probabilities are easy to estimate with high accuracy, and (2) these estimated block-transition probabilities can be used to estimate the stationary distribution over blocks and, from it, the block proportions. This stationary distribution on blocks has previously been used in the RDS literature to evaluate whether the sampling process appears to have "mixed". We use the estimated block proportions in a simple post-stratified (PS) estimator that greatly diminishes seed bias. By aggregating over the blocks (strata) in this way, we prove that the PS estimator is √n-consistent under a Markov model, even when other estimators are not. Simulations show that the PS estimator has a smaller root mean square error (RMSE) than state-of-the-art estimators.
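The pipeline described in the abstract (count block-to-block referrals, estimate the transition matrix, take its stationary distribution, then reweight block means) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the toy data, and the use of NumPy are all assumptions made for illustration. In particular, the sketch uses the stationary distribution directly as the block weights, which is only reasonable if blocks have similar average degree; the abstract does not spell out how the paper converts the stationary distribution into block proportions.

```python
import numpy as np

def estimate_block_transitions(parent_blocks, child_blocks, n_blocks):
    """Row-normalized counts of referrals from a parent's block to a child's block."""
    counts = np.zeros((n_blocks, n_blocks))
    for a, b in zip(parent_blocks, child_blocks):
        counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("every block needs at least one outgoing referral")
    return counts / row_sums

def stationary_distribution(P):
    """Left eigenvector of P for the eigenvalue closest to 1, normalized to sum to 1."""
    vals, vecs = np.linalg.eig(P.T)
    idx = np.argmin(np.abs(vals - 1.0))
    pi = np.real(vecs[:, idx])
    return pi / pi.sum()

def post_stratified_mean(y, blocks, weights):
    """Weighted sum of within-block sample means (assumes every block is observed)."""
    y = np.asarray(y, dtype=float)
    blocks = np.asarray(blocks)
    block_means = np.array([y[blocks == k].mean() for k in range(len(weights))])
    return float(np.dot(weights, block_means))

# Hypothetical toy data: block labels of each referrer and of the person referred.
parents = [0, 0, 0, 0, 1, 1, 1, 1]
children = [0, 0, 0, 1, 1, 1, 0, 0]
P_hat = estimate_block_transitions(parents, children, n_blocks=2)
pi_hat = stationary_distribution(P_hat)

# Hypothetical outcomes and block labels for six sampled participants.
y = [1, 1, 0, 0, 0, 1]
sample_blocks = [0, 0, 1, 1, 1, 1]
ps_estimate = post_stratified_mean(y, sample_blocks, pi_hat)
naive_estimate = float(np.mean(y))
```

In this toy sample, block 1 is over-represented relative to its estimated stationary weight, so the post-stratified estimate gives more weight to block 0's mean than the plain sample mean does.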


