Sampling from Networks: Respondent-Driven Sampling
Respondent-Driven Sampling (RDS) is a variant of link-tracing, a sampling technique for surveying hard-to-reach communities that uses community members' social networks to reach potential participants. While the RDS sampling mechanism and the associated methods of adjusting for it at the analysis stage are well documented in the statistical literature, methodological focus has largely been restricted to estimating population means and proportions (e.g., prevalence). As a network-based sampling method, RDS faces the fundamental problem of sampling from population networks in which features such as homophily and differential activity (two measures of the tendency for individuals with similar traits to share social links) are sensitive to the choice of simulation and sampling method. Although they are not clearly described in the RDS literature, many simple methods exist for generating simulated RDS data with a small number of covariates, where the focus is on simple estimands. A comprehensive framework for simulating realistic RDS samples, as needed to study multivariate analytic approaches such as regression, is largely lacking. In this paper, we present strategies for simulating RDS samples with known network and sample characteristics, providing a foundation for extending the study of RDS analyses beyond the univariate framework. We then assess the accuracy of the simulated RDS samples in terms of their ability to reproduce the desired levels of homophily, differential activity, and relationships between covariates.
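To make the simulation setting concrete, the following is a minimal Python sketch of one simple way to simulate an RDS sample. It is not the strategy proposed in the paper. It assumes a single binary covariate, a population network in which same-trait pairs are linked with a higher probability than different-trait pairs, and recruitment without replacement with a fixed number of coupons per respondent. All function names and parameter values (`simulate_population`, `rds_sample`, `p_same`, `p_diff`, and so on) are hypothetical and chosen for illustration only.

```python
import random


def simulate_population(n, p_trait, p_same, p_diff, rng):
    """Draw a binary trait and an undirected network.

    Same-trait pairs are linked with probability p_same and
    different-trait pairs with probability p_diff (assumed model).
    """
    trait = [1 if rng.random() < p_trait else 0 for _ in range(n)]
    neighbors = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = p_same if trait[i] == trait[j] else p_diff
            if rng.random() < p:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return trait, neighbors


def rds_sample(neighbors, n_seeds, coupons, target, rng):
    """Link-tracing without replacement.

    Seeds are drawn at random among connected nodes; each respondent
    recruits up to `coupons` not-yet-sampled neighbors, wave by wave,
    until `target` respondents are reached or recruitment dies out.
    """
    eligible = [i for i, nb in enumerate(neighbors) if nb]
    seeds = rng.sample(eligible, n_seeds)
    sampled = set(seeds)
    records = [{"id": s, "recruiter": None, "wave": 0} for s in seeds]
    pos = 0  # index of the next respondent who gets to recruit
    while pos < len(records) and len(records) < target:
        current = records[pos]
        pos += 1
        candidates = [j for j in sorted(neighbors[current["id"]])
                      if j not in sampled]
        rng.shuffle(candidates)
        for j in candidates[:coupons]:
            if len(records) >= target:
                break
            sampled.add(j)
            records.append({"id": j,
                            "recruiter": current["id"],
                            "wave": current["wave"] + 1})
    return records


def recruitment_homophily(records, trait):
    """Share of recruiter-recruit pairs with the same trait (None if no pairs)."""
    pairs = [(r["recruiter"], r["id"]) for r in records
             if r["recruiter"] is not None]
    if not pairs:
        return None
    return sum(trait[a] == trait[b] for a, b in pairs) / len(pairs)


def differential_activity(trait, neighbors):
    """Ratio of mean degree in the trait=1 group to the trait=0 group."""
    deg1 = [len(nb) for t, nb in zip(trait, neighbors) if t == 1]
    deg0 = [len(nb) for t, nb in zip(trait, neighbors) if t == 0]
    if not deg1 or not deg0 or sum(deg0) == 0:
        return None
    return (sum(deg1) / len(deg1)) / (sum(deg0) / len(deg0))


rng = random.Random(0)  # fixed seed so the run is reproducible
trait, neighbors = simulate_population(
    n=300, p_trait=0.3, p_same=0.05, p_diff=0.01, rng=rng)
sample = rds_sample(neighbors, n_seeds=5, coupons=3, target=100, rng=rng)

print("sample size:", len(sample))
print("recruitment homophily:", recruitment_homophily(sample, trait))
print("differential activity:", differential_activity(trait, neighbors))
```

Comparing the summaries computed on the sample with the values implied by the population network is the kind of accuracy check the abstract refers to, although the measures used here are simplified stand-ins.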