Combining multiple observational data sources to estimate causal effects

01/02/2018
by Shu Yang, et al.

The era of big data has brought an increasing availability of multiple data sources for statistical analyses. As an important example in causal inference, we consider estimating causal effects by combining big main data, in which some confounders are unmeasured, with smaller validation data that carry supplementary information on these confounders. Under the unconfoundedness assumption with completely observed confounders, the smaller validation data allow for constructing consistent estimators of causal effects, whereas the big main data can in general give only error-prone estimators. However, by leveraging the information in the big main data in a principled way, we can improve estimation efficiency while preserving the consistency of the initial estimators based solely on the validation data. The proposed framework applies to asymptotically normal estimators, including the commonly used regression imputation, weighting, and matching estimators, and does not require a correct specification of the model relating the unmeasured confounders to the observed variables. Coupled with appropriate bootstrap procedures, our method is straightforward to implement using software routines for existing estimators.
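The abstract does not spell out how the two data sources are combined, so the following is only an illustrative sketch under stated assumptions, not the authors' procedure. It assumes a control-variate style adjustment: the same error-prone estimator (omitting the confounder `u`) is computed on both samples, and the difference between the two, which should be close to zero in large samples, is used to adjust the consistent validation-only estimator. The simulated data, the OLS-based estimators, the independent validation sample, and all function names (`ols_coef`, `simulate`, `combine`) are hypothetical choices made for illustration.

```python
import numpy as np

def ols_coef(y, a, covs):
    """Coefficient on treatment `a` from OLS of y on [1, a, covs]."""
    design = np.column_stack([np.ones_like(y), a, covs])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]

def simulate(n, rng):
    """Hypothetical data: x is always observed, u only in validation data."""
    x = rng.normal(size=n)
    u = rng.normal(size=n)
    a = (x + u + rng.normal(size=n) > 0).astype(float)
    y = 1.0 * a + x + u + rng.normal(size=n)  # true effect is 1.0
    return y, a, x, u

def combine(main, val, n_boot, rng):
    """Adjust the validation estimator using the error-prone estimators."""
    y1, a1, x1, _ = main          # u is treated as unmeasured in main data
    y2, a2, x2, u2 = val
    tau_val = ols_coef(y2, a2, np.column_stack([x2, u2]))  # consistent
    delta_val = ols_coef(y2, a2, x2)     # error-prone, validation data
    delta_main = ols_coef(y1, a1, x1)    # error-prone, main data

    # Nonparametric bootstrap (an assumed choice) for the covariance of
    # tau_val with the difference of the two error-prone estimators.
    taus = np.empty(n_boot)
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        i1 = rng.integers(0, len(y1), len(y1))
        i2 = rng.integers(0, len(y2), len(y2))
        taus[b] = ols_coef(y2[i2], a2[i2],
                           np.column_stack([x2[i2], u2[i2]]))
        diffs[b] = (ols_coef(y2[i2], a2[i2], x2[i2])
                    - ols_coef(y1[i1], a1[i1], x1[i1]))
    cov = np.cov(taus, diffs)
    gamma, v = cov[0, 1], cov[1, 1]

    # Control-variate adjustment: subtract the part of tau_val's noise
    # that is predictable from the mean-zero difference.
    tau_adj = tau_val - gamma / v * (delta_val - delta_main)
    return tau_val, delta_main, tau_adj

rng = np.random.default_rng(0)
main = simulate(5000, rng)   # big main data
val = simulate(500, rng)     # smaller, independent validation data
tau_val, delta_main, tau_adj = combine(main, val, n_boot=200, rng=rng)
print(tau_val, delta_main, tau_adj)
```

In this sketch the main-data estimator `delta_main` is biased because `u` is omitted, while `tau_adj` stays centered on the validation-only estimate and borrows precision from the larger sample. Whether and how much variance is reduced depends on how strongly the validation estimator and the error-prone estimator are correlated.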

research
11/12/2019

A Unified Framework for Causal Inference with Multiple Imputation Using Martingale

Multiple imputation is widely used to handle confounders missing at rand...
research
03/21/2020

Borrowing from Supplemental Sources to Estimate Causal Effects from a Primary Data Source

The increasing multiplicity of data sources offers exciting possibilitie...
research
05/03/2022

Three-phase generalized raking and multiple imputation estimators to address error-prone data

Validation studies are often used to obtain more reliable information in...
research
07/08/2018

Integration of survey data and big observational data for finite population inference using mass imputation

Multiple data sources are becoming increasingly available for statistica...
research
02/11/2021

Clarifying causal mediation analysis: From simple to more robust strategies for estimation of marginal natural (in)direct effects

This paper aims to contribute to helping practitioners of causal mediati...
research
08/11/2014

Optimum Statistical Estimation with Strategic Data Sources

We propose an optimum mechanism for providing monetary incentives to the...
research
03/26/2020

Estimating Treatment Effects with Observed Confounders and Mediators

Given a causal graph, the do-calculus can express treatment effects as f...