Asymptotic Seed Bias in Respondent-driven Sampling

08/31/2018
by Yuling Yan, et al.

Respondent-driven sampling (RDS) collects a sample of individuals in a networked population by incentivizing the sampled individuals to refer their contacts into the sample. This iterative process is initialized from one or more seed nodes. The choice of seed sometimes creates a large amount of seed bias; at other times the seed bias is small. This paper develops a deeper understanding of this bias by characterizing its effect on the limiting distribution of various RDS estimators. Using classical tools and results from multi-type branching processes (Kesten and Stigum, 1966), we show that the seed bias is negligible for the Generalized Least Squares (GLS) estimator and non-negligible for both the inverse probability weighted (IPW) and Volz-Heckathorn (VH) estimators. In particular, we show that (i) above a critical threshold, VH converges to a non-trivial mixture distribution, where the mixture component depends on the seed node, and the mixture distribution is possibly multi-modal. Moreover, (ii) GLS converges to a Gaussian distribution independent of the seed node, under a certain condition on the Markov process. Numerical experiments with both simulated data and empirical social networks suggest that these results hold beyond the Markov conditions of the theorems.
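The referral process and the VH estimator described above can be illustrated with a small simulation. The sketch below is not the paper's code: the two-block random graph, the function names (`make_two_block_graph`, `rds_sample`, `vh_estimate`), the fixed number of referrals per participant, and referral with replacement are all assumptions made for illustration. The VH estimate is computed in its usual form, a mean of the outcome weighted by inverse degree.

```python
import random


def make_two_block_graph(n_per_block=50, p_in=0.2, p_out=0.02, rng=None):
    """Hypothetical test network: two blocks, dense within, sparse between."""
    rng = rng or random.Random(0)
    n = 2 * n_per_block
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            same_block = (i < n_per_block) == (j < n_per_block)
            if rng.random() < (p_in if same_block else p_out):
                adj[i].add(j)
                adj[j].add(i)
    # Sorted neighbor lists keep the simulation reproducible.
    return {i: sorted(nbrs) for i, nbrs in adj.items()}


def rds_sample(adj, seed, waves=4, referrals=2, rng=None):
    """Assumed referral model: every participant refers `referrals`
    contacts drawn uniformly at random, with replacement."""
    rng = rng or random.Random(0)
    if not adj[seed]:
        raise ValueError("seed node has no contacts to refer")
    sample, frontier = [seed], [seed]
    for _ in range(waves):
        next_frontier = [
            rng.choice(adj[node]) for node in frontier for _ in range(referrals)
        ]
        sample.extend(next_frontier)
        frontier = next_frontier
    return sample


def vh_estimate(sample, adj, y):
    """Volz-Heckathorn estimate: mean of y weighted by 1 / degree."""
    num = sum(y[i] / len(adj[i]) for i in sample)
    den = sum(1 / len(adj[i]) for i in sample)
    return num / den


if __name__ == "__main__":
    adj = make_two_block_graph()
    y = {i: 1 if i < 50 else 0 for i in adj}  # outcome = membership in block 0
    for seed in (0, 99):  # one seed from each block
        estimates = [
            vh_estimate(rds_sample(adj, seed, rng=random.Random(r)), adj, y)
            for r in range(200)
        ]
        print(seed, sum(estimates) / len(estimates))
```

Comparing the average estimate across the two seeds gives a rough sense of how strongly the starting block can pull the estimate in this toy setting; it does not reproduce the paper's experiments or its threshold condition.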

research
12/04/2018

Reducing Seed Bias in Respondent-Driven Sampling by Estimating Block Transition Probabilities

Respondent-driven sampling (RDS) is a popular approach to study marginal...
research
05/20/2015

Network driven sampling; a critical threshold for design effects

Web crawling, snowball sampling, and respondent-driven sampling (RDS) ar...
research
11/18/2020

A Deterministic Hitting-Time Moment Approach to Seed-set Expansion over a Graph

We introduce HITMIX, a new technique for network seed-set expansion, i.e...
research
04/06/2018

On the sample autocovariance of a Lévy driven moving average process when sampled at a renewal sequence

We consider a Lévy driven continuous time moving average process X sampl...
research
05/11/2023

Sampling distributions and estimation for multi-type Branching Processes

Consider a multi-dimensional supercritical branching process with offspr...
research
10/26/2018

Robust Inference Using Inverse Probability Weighting

Inverse Probability Weighting (IPW) is widely used in program evaluation...
