Causal Inference With Selectively-Deconfounded Data

02/25/2020
by   Kyra Gan, et al.
4

Given only data generated by a standard confounding graph with unobserved confounder, the Average Treatment Effect (ATE) is not identifiable. To estimate the ATE, a practitioner must then either (a) collect deconfounded data; (b) run a clinical trial; or (c) elucidate further properties of the causal graph that might render the ATE identifiable. In this paper, we consider the benefit of incorporating a (large) confounded observational dataset alongside a (small) deconfounded observational dataset when estimating the ATE. Our theoretical results show that the inclusion of confounded data can significantly reduce the quantity of deconfounded data required to estimate the ATE to within a desired accuracy level. Moreover, in some cases—say, genetics—we could imagine retrospectively selecting samples to deconfound. We demonstrate that by strategically selecting these examples based upon the (already observed) treatment and outcome, we can reduce our data dependence further. Our theoretical and empirical results establish that the worst-case relative performance of our approach (vs. a natural benchmark) is bounded while our best-case gains are unbounded. Next, we demonstrate the benefits of selective deconfounding using a large real-world dataset related to genetic mutation in cancer. Finally, we introduce an online version of the problem, proposing two adaptive heuristics.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/30/2021

Multi-Source Causal Inference Using Control Variates

While many areas of machine learning have benefited from the increasing ...
research
03/08/2021

Efficient Causal Inference from Combined Observational and Interventional Data through Causal Reductions

Unobserved confounding is one of the main challenges when estimating cau...
research
07/17/2019

Assessing Treatment Effect Variation in Observational Studies: Results from a Data Challenge

A growing number of methods aim to assess the challenging question of tr...
research
10/04/2021

Estimating Potential Outcome Distributions with Collaborating Causal Networks

Many causal inference approaches have focused on identifying an individu...
research
08/20/2022

Unit Selection with Nonbinary Treatment and Effect

The unit selection problem aims to identify a set of individuals who are...
research
09/20/2019

Applications of Generalized Maximum Likelihood Estimators to stratified sampling and post-stratification with many unobserved strata

Consider the problem of estimating a weighted average of the means of n ...

Please sign up or login with your details

Forgot password? Click here to reset