Combining Observational and Experimental Data Using First-stage Covariates

by George Gui, et al.

Randomized controlled trials generate experimental variation that can credibly identify causal effects, but often suffer from limited scale, while observational datasets are large, but often violate desired identification assumptions. To improve estimation efficiency, I propose a method that combines experimental and observational datasets when 1) units from these two datasets are sampled from the same population and 2) some characteristics of these units are observed. I show that if these characteristics can partially explain treatment assignment in the observational data, they can be used to derive moment restrictions that, in combination with the experimental data, improve estimation efficiency. I outline three estimators (weighting, shrinkage, or GMM) for implementing this strategy, and show that my methods can reduce variance by up to 50%, so that only half of the experimental sample is required to attain the same statistical precision. If researchers are allowed to design experiments differently, I show that they can further improve the precision by directly leveraging this correlation between characteristics and assignment. I apply the method to a search listing dataset from Expedia that studies the causal effect of search rankings, and show that my method can substantially improve the precision.
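
The link between variance reduction and required sample size can be made explicit with a short worked calculation. This is only a sketch under assumptions not stated in the abstract: the estimator's variance is taken to scale as $\sigma^2 / n$ in the experimental sample size $n$ (as it does for a simple difference in means), the variance figure is read as a 50% reduction, and the symbols $\sigma^2$, $n$, $n'$, and $r$ are introduced here purely for illustration.

```latex
% Assumption: the experiment-only estimator has variance sigma^2 / n.
\[
  \operatorname{Var}\bigl(\hat{\tau}_{\text{exp}}\bigr) = \frac{\sigma^2}{n}
\]
% Assumption: combining with observational data removes a fraction r of that variance.
\[
  \operatorname{Var}\bigl(\hat{\tau}_{\text{comb}}\bigr) = (1 - r)\,\frac{\sigma^2}{n},
  \qquad 0 \le r < 1
\]
% Experimental sample size n' at which the combined estimator matches
% the precision of the experiment-only estimator run on n units:
\[
  (1 - r)\,\frac{\sigma^2}{n'} = \frac{\sigma^2}{n}
  \quad\Longrightarrow\quad
  n' = (1 - r)\,n
\]
% With r = 0.5 this gives n' = n / 2.
```

Under these assumptions, a reduction of $r = 0.5$ corresponds to matching the experiment-only precision with half as many experimental units.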

