Causally-motivated Shortcut Removal Using Auxiliary Labels

05/13/2021
by Maggie Makar, et al.

Robustness to certain distribution shifts is a key requirement in many ML applications. Often, relevant distribution shifts can be formulated in terms of interventions on the process that generates the input data. Here, we consider the problem of learning a predictor whose risk across such shifts is invariant. A key challenge to learning such risk-invariant predictors is shortcut learning, the tendency for models to rely on spurious correlations in practice, even when a predictor based on shift-invariant features could achieve optimal i.i.d. generalization in principle. We propose a flexible, causally-motivated approach to address this challenge. Specifically, we propose a regularization scheme that makes use of auxiliary labels for potential shortcut features, which are often available at training time. Drawing on the causal structure of the problem, we enforce a conditional independence between the representation used to predict the main label and the auxiliary labels. We show both theoretically and empirically that this causally-motivated regularization scheme yields robust predictors that generalize well both in-distribution and under distribution shifts, and does so with better sample efficiency than standard regularization or weighting approaches.
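To make the idea concrete, below is a minimal sketch of one way such a conditional-independence regularizer could look in practice. It is not the authors' released implementation: it assumes a binary main label y, a binary auxiliary (shortcut) label v observed at training time, a hypothetical encoder/head pair, and an MMD-style penalty that pushes the representation to have the same distribution across values of v within each value of y (a surrogate for representation ⊥ v | y).

```python
# Hedged sketch of a conditional-independence penalty on a learned
# representation; names (encoder, head, alpha) and the MMD surrogate are
# illustrative assumptions, not the paper's exact implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


def rbf_mmd2(x, y, sigma=1.0):
    """Biased estimate of squared MMD between samples x and y (RBF kernel)."""
    def k(a, b):
        d2 = torch.cdist(a, b).pow(2)
        return torch.exp(-d2 / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()


def conditional_mmd_penalty(z, y, v, sigma=1.0):
    """Sum over main-label classes c of MMD^2 between z|{y=c,v=0} and z|{y=c,v=1}."""
    penalty = z.new_zeros(())
    for c in torch.unique(y):
        z0 = z[(y == c) & (v == 0)]
        z1 = z[(y == c) & (v == 1)]
        if len(z0) > 1 and len(z1) > 1:
            penalty = penalty + rbf_mmd2(z0, z1, sigma)
    return penalty


# Toy training step on synthetic data with an assumed penalty weight alpha.
encoder = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 16))
head = nn.Linear(16, 2)
opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=1e-3)
alpha = 10.0  # regularization strength (assumed value)

x = torch.randn(256, 20)            # inputs
y = torch.randint(0, 2, (256,))     # main labels
v = torch.randint(0, 2, (256,))     # auxiliary (shortcut) labels

z = encoder(x)
loss = F.cross_entropy(head(z), y) + alpha * conditional_mmd_penalty(z, y, v)
opt.zero_grad()
loss.backward()
opt.step()
```

In this sketch the penalty only discourages the representation from encoding the auxiliary label beyond what the main label explains; the cross-entropy term still drives predictive accuracy, and the trade-off is controlled by the (assumed) weight alpha.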

