The Missing Invariance Principle Found – the Reciprocal Twin of Invariant Risk Minimization

05/29/2022
by Dongsung Huh, et al.

Machine learning models often generalize poorly to out-of-distribution (OOD) data because they rely on features that are spuriously correlated with the label during training. Recently, Invariant Risk Minimization (IRM) was proposed to learn predictors that use only invariant features by conserving the feature-conditioned class expectation 𝔼_e[y|f(x)] across environments. However, more recent studies have demonstrated that IRM can fail in various task settings. Here, we identify a fundamental flaw in the IRM formulation that causes this failure. We then introduce a complementary notion of invariance, MRI, which is based on conserving the class-conditioned feature expectation 𝔼_e[f(x)|y] across environments and which corrects the flaw in IRM. Further, we introduce a simplified, practical version of the MRI formulation called MRI-v1. We note that the MRI-v1 constraint is convex, which gives it an advantage over the practical version of IRM, IRM-v1, which imposes non-convex constraints. We prove that in a general linear problem setting, MRI-v1 can guarantee invariant predictors given sufficient environments. We also empirically demonstrate that MRI strongly outperforms IRM and consistently achieves near-optimal OOD generalization in image-based nonlinear problems.
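To make the quantity 𝔼_e[f(x)|y] concrete, the following is a minimal sketch of how one might measure how much the class-conditioned feature mean varies across environments. It is not the MRI-v1 objective from the paper, whose exact form is not given in this abstract. The function names (`class_conditioned_means`, `mean_gap_penalty`), the scalar feature values, and the use of a per-class variance as the measure are all illustrative assumptions.

```python
from collections import defaultdict


def class_conditioned_means(features, labels):
    """Return {y: mean of f(x) over samples with label y} for one environment.

    `features` holds scalar feature values f(x); `labels` holds class labels y.
    Both names are hypothetical and chosen for this illustration only.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for f, y in zip(features, labels):
        sums[y] += f
        counts[y] += 1
    return {y: sums[y] / counts[y] for y in sums}


def mean_gap_penalty(environments):
    """Sum over classes of the variance, across environments, of E_e[f(x)|y].

    `environments` is a list of (features, labels) pairs, one per environment.
    A value of zero means the class-conditioned feature mean is identical in
    every environment. This is an assumed stand-in, not the paper's penalty.
    """
    per_env = [class_conditioned_means(f, y) for f, y in environments]
    # Only compare classes that appear in every environment.
    classes = set.intersection(*(set(m) for m in per_env))
    penalty = 0.0
    for y in classes:
        values = [m[y] for m in per_env]
        center = sum(values) / len(values)
        penalty += sum((v - center) ** 2 for v in values) / len(values)
    return penalty


# A feature whose class-conditioned mean is the same in both environments.
invariant = [([1.0, 1.0, -1.0, -1.0], [1, 1, 0, 0]),
             ([1.0, -1.0], [1, 0])]
# A feature whose relation to the label flips sign between environments.
spurious = [([1.0, 1.0, -1.0, -1.0], [1, 1, 0, 0]),
            ([-1.0, -1.0, 1.0, 1.0], [1, 1, 0, 0])]

print(mean_gap_penalty(invariant))
print(mean_gap_penalty(spurious))
```

In this toy setup the first feature yields a zero penalty, while the sign-flipping feature yields a positive one, which is the kind of distinction a constraint on 𝔼_e[f(x)|y] is meant to capture.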

research
01/04/2021

Does Invariant Risk Minimization Capture Invariance?

We show that the Invariant Risk Minimization (IRM) formulation of Arjovs...
research
12/17/2021

Balancing Fairness and Robustness via Partial Invariance

The Invariant Risk Minimization (IRM) framework aims to learn invariant ...
research
05/18/2022

An Invariant Matching Property for Distribution Generalization under Intervened Response

The task of distribution generalization concerns making reliable predict...
research
06/17/2021

On Invariance Penalties for Risk Minimization

The Invariant Risk Minimization (IRM) principle was first proposed by Ar...
research
02/14/2022

Domain-Adjusted Regression or: ERM May Already Learn Features Sufficient for Out-of-Distribution Generalization

A common explanation for the failure of deep networks to generalize out-...
research
01/25/2022

Conditional entropy minimization principle for learning domain invariant representation features

Invariance principle-based methods, for example, Invariant Risk Minimiza...
research
06/11/2021

Invariance Principle Meets Information Bottleneck for Out-of-Distribution Generalization

The invariance principle from causality is at the heart of notable appro...
