Causal Estimation for Text Data with (Apparent) Overlap Violations

09/30/2022
by Lin Gui, et al.

Consider the problem of estimating the causal effect of some attribute of a text document. For example, what effect does writing a polite vs. rude email have on response time? To estimate a causal effect from observational data, we need to adjust for confounding aspects of the text that affect both the treatment and the outcome, e.g., the topic or writing level of the text. These confounding aspects are unknown a priori, so it seems natural to adjust for the entirety of the text (e.g., using a transformer). However, causal identification and estimation procedures rely on the assumption of overlap: at every level of the adjustment variables, enough randomness must remain that every unit could have received, or not received, the treatment. Since the treatment here is itself an attribute of the text, it is perfectly determined by the text, and overlap is apparently violated. The purpose of this paper is to show how to handle causal identification and obtain robust causal estimation in the presence of apparent overlap violations. In brief, the idea is to use supervised representation learning to produce a data representation that preserves confounding information while eliminating information that is only predictive of the treatment. This representation then suffices for adjustment and can satisfy overlap. Adapting results on non-parametric estimation, we find that this procedure is robust to misestimation of the conditional outcome, yielding a low-bias estimator with valid uncertainty quantification under weak conditions. Empirical results show strong improvements in bias and uncertainty quantification relative to the natural baseline.
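To make the role of overlap concrete, here is a minimal sketch of a generic augmented inverse-probability-weighted (AIPW) estimator, a standard estimator that tolerates some misestimation of the conditional outcome. It is an illustration only and is not taken from the paper: the function name `aipw_estimate`, its arguments, and the toy numbers are all assumptions. The sketch assumes the outcome predictions `q0`, `q1` and the treatment probabilities `g` have already been fitted elsewhere, for instance on a learned representation of the text.

```python
import math
from statistics import mean, stdev


def aipw_estimate(y, a, q0, q1, g):
    """Generic AIPW estimate of an average effect (illustrative sketch).

    y  : observed outcomes
    a  : binary treatments (0 or 1)
    q0 : predicted outcome under control, one value per unit
    q1 : predicted outcome under treatment, one value per unit
    g  : estimated probability of treatment, one value per unit
    Returns (point_estimate, standard_error).
    """
    # Overlap check: a propensity of exactly 0 or 1 means the treatment is
    # fully determined by the adjustment variables, so weighting is undefined.
    if any(not 0.0 < gi < 1.0 for gi in g):
        raise ValueError("overlap violated: propensities must lie strictly in (0, 1)")

    # Per-unit influence values: outcome-model contrast plus
    # inverse-probability-weighted residual corrections.
    phi = [
        q1i - q0i + ai * (yi - q1i) / gi - (1 - ai) * (yi - q0i) / (1.0 - gi)
        for yi, ai, q0i, q1i, gi in zip(y, a, q0, q1, g)
    ]
    estimate = mean(phi)
    std_error = stdev(phi) / math.sqrt(len(phi))
    return estimate, std_error


# Toy inputs (hypothetical numbers, for illustration only).
y = [3.0, 1.0, 4.0, 2.0]
a = [1, 0, 1, 0]
q0 = [1.0, 1.0, 2.0, 2.0]
q1 = [3.0, 3.0, 4.0, 4.0]
g = [0.5, 0.5, 0.5, 0.5]
estimate, std_error = aipw_estimate(y, a, q0, q1, g)
```

If the propensities are computed from the full text, they collapse to 0 or 1 and the check above raises an error. Adjusting for a representation that discards treatment-only information is what can keep them strictly inside (0, 1).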


research
05/15/2018

Confounding caused by causal-effect covariability

Confounding seriously impairs our ability to learn about causal relation...
research
10/24/2020

Causal Effects of Linguistic Properties

We consider the problem of estimating the causal effects of linguistic p...
research
11/24/2020

Invariant Representation Learning for Treatment Effect Estimation

The defining challenge for causal inference from observational data is t...
research
06/13/2022

Estimating Causal Effects Under Image Confounding Bias with an Application to Poverty in Africa

Observational studies of causal effects require adjustment for confoundi...
research
04/12/2021

Deconfounding Scores: Feature Representations for Causal Effect Estimation with Weak Overlap

A key condition for obtaining reliable estimates of the causal effect of...
research
12/03/2019

Confounding Adjustment Methods for Multi-level Treatment Comparisons Under Lack of Positivity and Unknown Model Specification

Imbalances in covariates between treatment groups are frequent in observ...
research
09/15/2023

To Predict or to Reject: Causal Effect Estimation with Uncertainty on Networked Data

Due to the imbalanced nature of networked observational data, the causal...
