From controlled to undisciplined data: estimating causal effects in the era of data science using a potential outcome framework

by Francesca Dominici, et al.

This paper discusses the fundamental principles of causal inference, the area of statistics that estimates the effect of specific occurrences, treatments, interventions, and exposures on a given outcome from experimental and observational data. We explain the key assumptions required to identify causal effects and highlight the challenges associated with the use of observational data. We emphasize that experimental thinking is crucial in causal inference. The quality of the data (not necessarily the quantity), the study design, the degree to which the assumptions are met, and the rigor of the statistical analysis allow us to credibly infer causal effects. Although we advocate leveraging big data and applying machine learning (ML) algorithms to estimate causal effects, they are not a substitute for thoughtful study design. Concepts are illustrated via examples.






