Learning causal effects from many randomized experiments using regularized instrumental variables

01/04/2017
by Alexander Peysakhovich, et al.

Scientific and business practices are increasingly resulting in large collections of randomized experiments. Analyzed together, these collections can tell us things that individual experiments in the collection cannot. We study how to learn causal relationships between variables from the kinds of collections faced by modern data scientists: the number of experiments is large, many experiments have very small effects, and the analyst lacks metadata (e.g., descriptions of the interventions). Here we use experimental groups as instrumental variables (IV) and show that a standard method (two-stage least squares) is biased even when the number of experiments is infinite. We show how a sparsity-inducing l0 regularization can reduce the bias (and thus the error) of interventional predictions, in a reversal of the standard bias-variance tradeoff in regularization. Because we are interested in interventional loss minimization, we also propose a modified cross-validation procedure (IVCV) to feasibly select the regularization parameter. We show, using a trick from Monte Carlo sampling, that IVCV can be done using summary statistics instead of raw data. This makes our full procedure simple to use in many real-world applications.
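To make the setup concrete, here is a minimal Python sketch of how experimental groups can act as instruments using only per-group summary statistics. It is an illustration under stated assumptions, not the authors' implementation. The function names (`group_means`, `tsls_from_means`, `l0_tsls_from_means`), the simulated data, and the hand-picked threshold are all hypothetical. In particular, the l0 step is written as a simple hard threshold on the centered first-stage group means, which is one plausible reading of a sparsity-inducing first stage, and the threshold is fixed by hand instead of being selected by IVCV.

```python
import numpy as np


def group_means(x, y, groups, n_groups):
    """Per-group means of x and y plus group sizes (the summary statistics)."""
    counts = np.bincount(groups, minlength=n_groups)
    x_bar = np.bincount(groups, weights=x, minlength=n_groups) / counts
    y_bar = np.bincount(groups, weights=y, minlength=n_groups) / counts
    return x_bar, y_bar, counts


def tsls_from_means(x_bar, y_bar, counts):
    """2SLS slope with group indicators as instruments.

    With a full set of group dummies, the first-stage fitted value is the
    group mean of x, so the second stage is a size-weighted regression of
    group means of y on group means of x.
    """
    w = counts / counts.sum()
    xc = x_bar - np.sum(w * x_bar)  # centered first-stage effects
    yc = y_bar - np.sum(w * y_bar)
    return np.sum(w * xc * yc) / np.sum(w * xc ** 2)


def l0_tsls_from_means(x_bar, y_bar, counts, threshold):
    """Assumed l0-style variant: zero out first-stage effects below threshold."""
    w = counts / counts.sum()
    xc = x_bar - np.sum(w * x_bar)
    yc = y_bar - np.sum(w * y_bar)
    xc = np.where(np.abs(xc) > threshold, xc, 0.0)  # hard thresholding
    denom = np.sum(w * xc ** 2)
    return np.sum(w * xc * yc) / denom if denom > 0 else float("nan")


if __name__ == "__main__":
    # Hypothetical simulation: most experiments barely move x, and an
    # unobserved confounder u affects both x and y.
    rng = np.random.default_rng(0)
    n_groups, n_per, beta = 500, 200, 1.0
    groups = np.repeat(np.arange(n_groups), n_per)
    n = groups.size
    is_strong = rng.random(n_groups) < 0.1
    effects = np.where(is_strong, rng.normal(0.0, 1.0, n_groups), 0.0)
    u = rng.normal(size=n)
    x = effects[groups] + u + rng.normal(size=n)
    y = beta * x + 2.0 * u + rng.normal(size=n)

    x_bar, y_bar, counts = group_means(x, y, groups, n_groups)
    print("true effect:     ", beta)
    print("plain 2SLS:      ", tsls_from_means(x_bar, y_bar, counts))
    print("thresholded 2SLS:", l0_tsls_from_means(x_bar, y_bar, counts, 0.3))
```

Both estimators take only group means and group sizes as input, which mirrors the abstract's point that the procedure can run on summary statistics instead of raw data. Choosing the threshold is exactly the step that the abstract assigns to IVCV, so the value 0.3 above should be read as a placeholder.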


