Counterfactual Supervision-based Information Bottleneck for Out-of-Distribution Generalization

08/16/2022
by Bin Deng, et al.

Learning invariant (causal) features for out-of-distribution (OOD) generalization has attracted extensive attention recently, and among the proposals, invariant risk minimization (IRM) (Arjovsky et al., 2019) is a notable solution. Despite its theoretical promise for linear regression, challenges remain in applying IRM to linear classification problems (Rosenfeld et al., 2020; Nagarajan et al., 2021). Along this line, a recent study (Ahuja et al., 2021) made a first step by proposing the learning principle of information bottleneck based invariant risk minimization (IB-IRM). In this paper, we first show that the key assumption of support overlap of invariant features used in (Ahuja et al., 2021) is rather strong for guaranteeing OOD generalization, and that it is still possible to achieve the optimal solution without this assumption. To further answer the question of whether IB-IRM is sufficient for learning invariant features in linear classification problems, we show that IB-IRM can still fail in two cases, whether or not the invariant features capture all information about the label. To address these failures, we propose a Counterfactual Supervision-based Information Bottleneck (CSIB) learning algorithm that provably recovers the invariant features. The proposed algorithm works even when it has access to data from only a single environment, and its theoretical results are consistent for both binary and multi-class problems. We present empirical experiments on three synthetic datasets that verify the efficacy of our proposed method.
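
To make the kind of objective discussed above more concrete, the following is a minimal sketch of a penalized loss in the spirit of IRM with an added bottleneck term. It is not the CSIB algorithm and not the authors' implementation. It assumes a scalar representation, a squared loss, a gradient-based invariance penalty evaluated at a dummy scale w = 1, and the variance of the representation as a simple stand-in for its entropy. The function names and the weights `lam` and `gamma` are hypothetical.

```python
from statistics import fmean, pvariance


def env_terms(features, labels):
    """Return (risk, invariance penalty, bottleneck penalty) for one environment.

    features: list of scalar outputs Phi(x) of a hypothetical featurizer
    labels:   list of real-valued targets (squared loss is an assumption)
    """
    # Risk of the predictor w * Phi(x), evaluated at the dummy scale w = 1.
    risk = fmean((f - y) ** 2 for f, y in zip(features, labels))
    # d/dw mean((w * f - y)^2) at w = 1 equals mean(2 * (f - y) * f).
    grad = fmean(2.0 * (f - y) * f for f, y in zip(features, labels))
    invariance_penalty = grad ** 2
    # Variance of the representation, used here as a proxy for its entropy.
    bottleneck_penalty = pvariance(features)
    return risk, invariance_penalty, bottleneck_penalty


def penalized_objective(envs, lam=1.0, gamma=0.1):
    """Average over environments of risk + lam * invariance + gamma * bottleneck."""
    total = 0.0
    for features, labels in envs:
        risk, inv, ib = env_terms(features, labels)
        total += risk + lam * inv + gamma * ib
    return total / len(envs)
```

In this sketch, setting `gamma = 0` leaves only the risk and the invariance penalty, while a positive `gamma` additionally favors representations with lower variance.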

