Designing Experiments Informed by Observational Studies
The increasing availability of passively observed data has yielded a growing methodological interest in "data fusion." These methods involve merging data from observational and experimental sources to draw causal conclusions – and they typically require a precarious tradeoff between the unknown bias in the observational dataset and the often-large variance in the experimental dataset. We propose an alternative approach to leveraging observational data, which avoids this tradeoff: rather than using observational data for inference, we use it to design a more efficient experiment. We consider the case of a stratified experiment with a binary outcome, and suppose pilot estimates for the stratum potential outcome variances can be obtained from the observational study. We extend results from Zhao et al. (2019) in order to generate confidence sets for these variances, while accounting for the possibility of unmeasured confounding. Then, we pose the experimental design problem as one of regret minimization, subject to the constraints imposed by our confidence sets. We show that this problem can be converted into a convex minimization and solved using conventional methods. Lastly, we demonstrate the practical utility of our methods using data from the Women's Health Initiative.
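To make the design idea concrete, the sketch below shows how pilot estimates of stratum potential outcome variances could drive sample allocation in a stratified experiment with a binary outcome. It implements classical Neyman allocation, not the regret-minimizing design described above: it treats the pilot estimates as exact and ignores the confidence sets that account for unmeasured confounding. The function names, the stratum weights, and the pilot event rates are all hypothetical and chosen only for illustration.

```python
import math


def binary_variance(p):
    """Variance of a Bernoulli outcome with event probability p."""
    return p * (1.0 - p)


def neyman_allocation(weights, p_treat, p_control, total_n):
    """Split total_n units across strata and arms by Neyman allocation.

    weights   : stratum population shares (assumed to sum to 1)
    p_treat   : pilot event rates under treatment, one per stratum
    p_control : pilot event rates under control, one per stratum

    Each arm in each stratum receives a sample size proportional to
    (stratum weight) * (pilot standard deviation of the outcome).
    Returned sizes are fractional; rounding is left to the caller.
    """
    scores_t = [w * math.sqrt(binary_variance(p)) for w, p in zip(weights, p_treat)]
    scores_c = [w * math.sqrt(binary_variance(p)) for w, p in zip(weights, p_control)]
    total = sum(scores_t) + sum(scores_c)
    if total == 0:
        raise ValueError("all pilot variances are zero; allocation is undefined")
    n_treat = [total_n * s / total for s in scores_t]
    n_control = [total_n * s / total for s in scores_c]
    return n_treat, n_control


def estimator_variance(weights, p_treat, p_control, n_treat, n_control):
    """Variance of the stratified difference-in-means estimator,
    sum_k w_k^2 * (var_tk / n_tk + var_ck / n_ck), under the pilot rates."""
    return sum(
        w ** 2 * (binary_variance(pt) / nt + binary_variance(pc) / nc)
        for w, pt, pc, nt, nc in zip(weights, p_treat, p_control, n_treat, n_control)
    )


if __name__ == "__main__":
    # Hypothetical pilot values for two equally sized strata.
    weights = [0.5, 0.5]
    p_treat = [0.5, 0.1]
    p_control = [0.5, 0.1]
    n_t, n_c = neyman_allocation(weights, p_treat, p_control, total_n=1000)
    equal = [250.0, 250.0]
    print("Neyman sizes (treatment):", n_t)
    print("Neyman sizes (control):  ", n_c)
    print("Variance, Neyman:", estimator_variance(weights, p_treat, p_control, n_t, n_c))
    print("Variance, equal: ", estimator_variance(weights, p_treat, p_control, equal, equal))
```

With these made-up inputs, the stratum with the more variable outcome receives more units, and the resulting estimator variance is lower than under an equal split. If the pilot rates are biased by unmeasured confounding, this allocation can be far from optimal, which is the concern the confidence sets and regret-minimization formulation are meant to address.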