Bayesian Topic Regression for Causal Inference

09/11/2021
by Maximilian Ahrens, et al.

Causal inference using observational text data is becoming increasingly popular in many research areas. This paper presents the Bayesian Topic Regression (BTR) model, which uses both text and numerical information to model an outcome variable. It allows estimation of both discrete and continuous treatment effects, and it allows additional numerical confounding factors to be included alongside the text data. To this end, we combine a supervised Bayesian topic model with a Bayesian regression framework and perform supervised representation learning for the text features jointly with the training of the regression parameters, respecting the Frisch-Waugh-Lovell theorem. Our paper makes two main contributions. First, we provide a regression framework that allows causal inference in settings where both text and numerical confounders are relevant. We show with synthetic and semi-synthetic datasets that our joint approach recovers the ground truth with lower bias than any benchmark model when text and numerical features are correlated. Second, experiments on two real-world datasets demonstrate that a joint and supervised learning strategy also yields better prediction results than strategies that estimate regression weights for text and non-text features separately, and is even competitive with more complex deep neural networks.
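The role of the Frisch-Waugh-Lovell (FWL) theorem can be illustrated with a small numerical sketch. This is not the BTR model: it uses plain least squares on simulated data, and all names (`t` for a numerical treatment, `z` as a scalar stand-in for a text-derived feature such as a topic share, and the coefficient values) are assumptions made for illustration only. It shows that a joint regression and the FWL partialling-out procedure give the same treatment coefficient, while a naive separate, two-stage fit is biased when `t` and `z` are correlated.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def residualize(v, z):
    """Residual of v after a no-intercept least-squares projection onto z."""
    coef = dot(v, z) / dot(z, z)
    return [vi - coef * zi for vi, zi in zip(v, z)]

def joint_ols_treatment_coef(y, t, z):
    """Coefficient on t when y is regressed on (t, z) jointly (2x2 normal equations)."""
    tt, tz, zz = dot(t, t), dot(t, z), dot(z, z)
    ty, zy = dot(t, y), dot(z, y)
    det = tt * zz - tz * tz
    return (zz * ty - tz * zy) / det

def fwl_treatment_coef(y, t, z):
    """FWL: partial z out of both y and t, then regress residual on residual."""
    y_res = residualize(y, z)
    t_res = residualize(t, z)
    return dot(t_res, y_res) / dot(t_res, t_res)

def naive_two_stage_coef(y, t, z):
    """Separate fit: regress y on z first, then the leftover on the raw t."""
    y_res = residualize(y, z)
    return dot(t, y_res) / dot(t, t)

# Simulated data (hypothetical): z confounds t, and the true effect of t on y is 2.0.
rng = random.Random(0)
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]
t = [0.8 * zi + rng.gauss(0, 1) for zi in z]
y = [2.0 * ti + 1.5 * zi + rng.gauss(0, 0.1) for ti, zi in zip(t, z)]

joint = joint_ols_treatment_coef(y, t, z)
fwl = fwl_treatment_coef(y, t, z)
naive = naive_two_stage_coef(y, t, z)

print(f"joint: {joint:.3f}  fwl: {fwl:.3f}  naive two-stage: {naive:.3f}")
```

In this toy setting the joint and FWL estimates coincide and lie close to the simulated effect, whereas the naive two-stage estimate is pulled away from it because the first stage absorbs part of the treatment's contribution through the correlated feature.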

research
10/09/2021

Deep Learning of Potential Outcomes

This review systematizes the emerging literature for causal inference us...
research
02/10/2021

Generating Synthetic Text Data to Evaluate Causal Inference Methods

Drawing causal conclusions from observational data requires making assum...
research
12/24/2018

Bayesian Causal Inference

We address the problem of two-variable causal inference. This task is to...
research
04/23/2022

Local Gaussian process extrapolation for BART models with applications to causal inference

Bayesian additive regression trees (BART) is a semi-parametric regressio...
research
08/08/2023

Generalization bound for estimating causal effects from observational network data

Estimating causal effects from observational network data is a significa...
research
05/21/2019

Slamming the sham: A Bayesian model for adaptive adjustment with noisy control data

It is not always clear how to adjust for control data in causal inferenc...
