Generalized Linear Models for Longitudinal Data with Biased Sampling Designs: A Sequential Offsetted Regressions Approach

01/13/2020
by   Lee S. McDaniel, et al.
0

Biased sampling designs can be highly efficient when studying rare (binary) or low variability (continuous) endpoints. We consider longitudinal data settings in which the probability of being sampled depends on a repeatedly measured response through an outcome-related, auxiliary variable. Such auxiliary variable- or outcome-dependent sampling improves observed response and possibly exposure variability over random sampling, even though the auxiliary variable is not of scientific interest. For analysis, we propose a generalized linear model based approach using a sequence of two offsetted regressions. The first estimates the relationship of the auxiliary variable to response and covariate data using an offsetted logistic regression model. The offset hinges on the (assumed) known ratio of sampling probabilities for different values of the auxiliary variable. Results from the auxiliary model are used to estimate observation-specific probabilities of being sampled conditional on the response and covariates, and these probabilities are then used to account for bias in the second, target population model. We provide asymptotic standard errors accounting for uncertainty in the estimation of the auxiliary model, and perform simulation studies demonstrating substantial bias reduction, correct coverage probability, and improved design efficiency over simple random sampling designs. We illustrate the approaches with two examples.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset