Combining Cluster Sampling and Link-Tracing Sampling to Estimate Totals and Means of Hidden Populations in Presence of Heterogeneous Probabilities of Links
We propose Horvitz-Thompson-like and Hajek-like estimators of the total and mean of the values of a variable of interest associated with the elements of a hard-to-reach population sampled by the variant of link-tracing sampling proposed by Felix-Medina and Thompson (2004). Examples of this type of population are drug users, homeless people and sex workers. In this sampling variant, a frame of venues or places where the members of the population tend to gather, such as parks and bars, is constructed. The frame is not assumed to cover the whole population. An initial cluster sample of elements is selected from the frame, where the clusters are the venues, and the elements in the initial sample are asked to name their contacts who are also members of the population. The sample size is increased by including in the sample the named elements who are not in the initial sample. The proposed estimators do not use design-based inclusion probabilities; instead they use model-based inclusion probabilities, which are derived from a model proposed by Felix-Medina et al. (2015) and are estimated by maximum likelihood. The inclusion probabilities are assumed to be heterogeneous, that is, they depend on the sampled people. Variance estimates of the proposed estimators are obtained by bootstrap and are used to construct confidence intervals for the totals and means. The performance of the proposed estimators and confidence intervals is evaluated by two numerical studies, one of them based on real data, and the results show that their performance is acceptable.
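To make the two estimator forms named above concrete, the following is a minimal sketch of the generic Horvitz-Thompson total and Hajek mean, assuming the inclusion probabilities are already available as plain numbers. It is not the estimators proposed in the paper: there the probabilities are model-based and estimated by maximum likelihood, a step that is not reproduced here. The function names `ht_total` and `hajek_mean` and the toy data are hypothetical.

```python
from typing import Sequence


def _check(y: Sequence[float], pi: Sequence[float]) -> None:
    """Validate that values and inclusion probabilities are compatible."""
    if len(y) != len(pi) or len(y) == 0:
        raise ValueError("y and pi must be non-empty and of equal length")
    if any(not (0.0 < p <= 1.0) for p in pi):
        raise ValueError("inclusion probabilities must lie in (0, 1]")


def ht_total(y: Sequence[float], pi: Sequence[float]) -> float:
    """Horvitz-Thompson-type total: sum over sampled elements of y_i / pi_i."""
    _check(y, pi)
    return sum(yi / p for yi, p in zip(y, pi))


def hajek_mean(y: Sequence[float], pi: Sequence[float]) -> float:
    """Hajek-type mean: weighted mean of y_i with weights 1 / pi_i."""
    _check(y, pi)
    weights = [1.0 / p for p in pi]
    return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)


# Toy sample (made-up numbers): y_i is a 0/1 indicator for each sampled
# person, pi_i is that person's assumed inclusion probability.
y = [1, 0, 1, 1]
pi = [0.5, 0.25, 0.5, 1.0]

total_estimate = ht_total(y, pi)
mean_estimate = hajek_mean(y, pi)
```

The Hajek form divides by the sum of the weights rather than by a known population size, which is why it is usable when the size of the hidden population is itself unknown.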