Counterfactual Invariance to Spurious Correlations: Why and How to Pass Stress Tests

05/31/2021
by Victor Veitch, et al.

Informally, a "spurious correlation" is the dependence of a model on some aspect of the input data that an analyst thinks shouldn't matter. In machine learning, these have a know-it-when-you-see-it character; e.g., changing the gender of a sentence's subject changes a sentiment predictor's output. To check for spurious correlations, we can "stress test" models by perturbing irrelevant parts of the input data and checking whether model predictions change. In this paper, we study stress testing using the tools of causal inference. We introduce counterfactual invariance as a formalization of the requirement that changing irrelevant parts of the input shouldn't change model predictions. We connect counterfactual invariance to out-of-domain model performance, and provide practical schemes for learning (approximately) counterfactually invariant predictors without access to counterfactual examples. It turns out that both the means and the implications of counterfactual invariance depend fundamentally on the true underlying causal structure of the data. Distinct causal structures require distinct regularization schemes to induce counterfactual invariance. Similarly, counterfactual invariance implies different domain shift guarantees depending on the underlying causal structure. This theory is supported by empirical results on text classification.
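The stress-testing idea described in the abstract can be illustrated with a small sketch. This is not the paper's method or code: the word lists, the swap table, the deliberately flawed `toy_predictor`, and the `flip_rate` helper are all hypothetical names introduced here for illustration. The sketch perturbs a part of the input that should not matter (the gender of the subject) and measures how often the prediction changes.

```python
import re
from typing import Callable, Iterable

# Hypothetical swap table for the perturbation (illustrative, far from complete).
GENDER_SWAPS = {"he": "she", "she": "he", "man": "woman", "woman": "man"}

# Hypothetical sentiment lexicons for the toy predictors.
POSITIVE = {"great", "loved", "good"}
NEGATIVE = {"awful", "hated", "bad"}


def swap_gender(text: str) -> str:
    """Perturb the input by swapping gendered words, keeping capitalization."""
    def repl(match: re.Match) -> str:
        word = match.group(0)
        swapped = GENDER_SWAPS.get(word.lower())
        if swapped is None:
            return word
        return swapped.capitalize() if word[0].isupper() else swapped
    return re.sub(r"\b\w+\b", repl, text)


def _sentiment_score(text: str) -> int:
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def toy_predictor(text: str) -> int:
    """Toy classifier with a planted spurious dependence on the word 'he'."""
    tokens = re.findall(r"[a-z']+", text.lower())
    spurious_bonus = 1 if "he" in tokens else 0  # the flaw the stress test should expose
    return int(_sentiment_score(text) + spurious_bonus > 0)


def invariant_predictor(text: str) -> int:
    """Same toy classifier without the planted dependence."""
    return int(_sentiment_score(text) > 0)


def flip_rate(predict: Callable[[str], int],
              perturb: Callable[[str], str],
              inputs: Iterable[str]) -> float:
    """Fraction of inputs whose prediction changes under the perturbation."""
    inputs = list(inputs)
    flips = sum(predict(x) != predict(perturb(x)) for x in inputs)
    return flips / len(inputs)


sentences = [
    "He said the film was fine",
    "She loved the great acting",
    "The man hated the awful plot",
]
print(flip_rate(toy_predictor, swap_gender, sentences))
print(flip_rate(invariant_predictor, swap_gender, sentences))
```

A nonzero flip rate indicates that the predictor depends on the perturbed attribute. Note that a hand-written perturbation like `swap_gender` only approximates a true counterfactual; the abstract's point is that what counts as passing such a test, and how to train for it, depends on the underlying causal structure.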

research
07/17/2023

Results on Counterfactual Invariance

In this paper we provide a theoretical analysis of counterfactual invari...
research
09/12/2022

Bias Challenges in Counterfactual Data Augmentation

Deep learning models tend not to be out-of-distribution robust primarily...
research
07/20/2022

Learning Counterfactually Invariant Predictors

We propose a method to learn predictors that are invariant under counter...
research
01/06/2023

Evaluating counterfactual explanations using Pearl's counterfactual method

Counterfactual explanations (CEs) are methods for generating an alternat...
research
06/14/2011

From Causal Models To Counterfactual Structures

Galles and Pearl claimed that "for recursive models, the causal model fr...
research
04/09/2022

Uninformative Input Features and Counterfactual Invariance: Two Perspectives on Spurious Correlations in Natural Language

Spurious correlations are a threat to the trustworthiness of natural lan...
research
03/16/2022

Counterfactual Inference of Second Opinions

Automated decision support systems that are able to infer second opinion...
