Efficiently Learning and Sampling Interventional Distributions from Observations

by Arnab Bhattacharyya, et al.

We study the problem of efficiently estimating the effect of an intervention on a single variable using observational samples in a causal Bayesian network. Our goal is to give algorithms that are efficient in both time and sample complexity in a non-parametric setting. Tian and Pearl (AAAI '02) have exactly characterized the class of causal graphs for which causal effects of atomic interventions can be identified from observational data. We make their result quantitative. Suppose 𝒫 is a causal model on a set V of n observable variables with respect to a given causal graph G, with observable distribution P. Let P_x denote the interventional distribution over the observables under an intervention that sets a designated variable X to x. We show that, assuming G has bounded in-degree and bounded c-components, and that the observational distribution is identifiable and satisfies a certain strong positivity condition:

1. [Evaluation] There is an algorithm that outputs with probability 2/3 an evaluator for a distribution P' that satisfies d_tv(P_x, P') ≤ ϵ using m = Õ(nϵ^-2) samples from P and O(mn) time. The evaluator can return in O(n) time the probability P'(v) for any assignment v to V.

2. [Generation] There is an algorithm that outputs with probability 2/3 a sampler for a distribution P̂ that satisfies d_tv(P_x, P̂) ≤ ϵ using m = Õ(nϵ^-2) samples from P and O(mn) time. The sampler returns an iid sample from P̂ with probability 1-δ in O(nϵ^-1 log δ^-1) time.

We extend our techniques to estimate marginals P_x|_Y over a given subset Y ⊂ V of interest. We also show lower bounds on the sample complexity, showing that our sample complexity has optimal dependence on the parameters n and ϵ as well as on the strong positivity parameter.
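To make the notion of an evaluator concrete, here is a minimal Python sketch. It is not the paper's algorithm. It assumes a hypothetical three-variable graph (Z -> X, Z -> Y, X -> Y) with no hidden variables, where the interventional distribution is given by the standard truncated factorization, and it uses made-up conditional probabilities only to simulate observational data. The names `PARENTS`, `TRUE_P1`, `learn_cpts`, `make_evaluator`, and `tv_distance` are all illustrative.

```python
import itertools
import random
from collections import Counter

# Hypothetical toy graph with no hidden variables: Z -> X, Z -> Y, X -> Y.
PARENTS = {"Z": (), "X": ("Z",), "Y": ("Z", "X")}
ORDER = ("Z", "X", "Y")  # a topological order of the graph

# Made-up ground truth P(variable = 1 | parent values), used only to simulate data.
TRUE_P1 = {
    "Z": {(): 0.4},
    "X": {(0,): 0.3, (1,): 0.7},
    "Y": {(0, 0): 0.2, (0, 1): 0.6, (1, 0): 0.5, (1, 1): 0.9},
}


def draw_observation(rng):
    """Draw one observational sample by ancestral sampling."""
    v = {}
    for name in ORDER:
        pa = tuple(v[p] for p in PARENTS[name])
        v[name] = int(rng.random() < TRUE_P1[name][pa])
    return v


def learn_cpts(samples):
    """Empirical estimate of P(variable = 1 | parents) with add-one smoothing."""
    ones, totals = Counter(), Counter()
    for v in samples:
        for name in ORDER:
            key = (name, tuple(v[p] for p in PARENTS[name]))
            totals[key] += 1
            ones[key] += v[name]
    cpts = {}
    for name in ORDER:
        cpts[name] = {}
        for pa in itertools.product((0, 1), repeat=len(PARENTS[name])):
            key = (name, pa)
            cpts[name][pa] = (ones[key] + 1) / (totals[key] + 2)
    return cpts


def make_evaluator(p1, target, value):
    """Return a function v -> P_x(v) for the intervention do(target = value)."""
    def evaluate(v):
        if v[target] != value:
            return 0.0  # the intervened variable is fixed to `value`
        prob = 1.0
        for name in ORDER:  # one table lookup per variable
            if name == target:
                continue  # truncated factorization: drop the intervened factor
            pa = tuple(v[p] for p in PARENTS[name])
            q = p1[name][pa]
            prob *= q if v[name] == 1 else 1.0 - q
        return prob
    return evaluate


def tv_distance(f, g):
    """Total variation distance, by brute force over all binary assignments."""
    assignments = [dict(zip(ORDER, bits))
                   for bits in itertools.product((0, 1), repeat=len(ORDER))]
    return 0.5 * sum(abs(f(v) - g(v)) for v in assignments)


rng = random.Random(0)
samples = [draw_observation(rng) for _ in range(20000)]
estimated = make_evaluator(learn_cpts(samples), "X", 1)
exact = make_evaluator(TRUE_P1, "X", 1)
print("estimated d_tv:", tv_distance(estimated, exact))
```

Each call to the evaluator multiplies one learned conditional probability per variable, which is where a per-query cost linear in n comes from when in-degree is bounded. The brute-force `tv_distance` is exponential in n and is included only to check this toy case. Graphs with hidden variables, which the abstract covers through the bounded c-component assumption, require a different identification formula that this sketch does not implement.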




Efficient inference of interventional distributions

We consider the problem of efficiently inferring interventional distribu...

Scalable Intervention Target Estimation in Linear Models

This paper considers the problem of estimating the unknown intervention ...

Active Structure Learning of Bayesian Networks in an Observational Setting

We study active structure learning of Bayesian networks in an observatio...

Learning linear structural equation models in polynomial time and sample complexity

The problem of learning structural equation models (SEMs) from data is a...

Efficient Causal Inference from Combined Observational and Interventional Data through Causal Reductions

Unobserved confounding is one of the main challenges when estimating cau...

Learning causal Bayes networks using interventional path queries in polynomial time and sample complexity

Causal discovery from empirical data is a fundamental problem in many sc...

Reproducibility in Learning

We introduce the notion of a reproducible algorithm in the context of le...