Disentanglement and Generalization Under Correlation Shifts

12/29/2021
by Christina M. Funke, et al.

Correlations between factors of variation are prevalent in real-world data. Machine learning algorithms may benefit from exploiting such correlations, as they can increase predictive performance on noisy data. However, often such correlations are not robust (e.g., they may change between domains, datasets, or applications) and we wish to avoid exploiting them. Disentanglement methods aim to learn representations which capture different factors of variation in latent subspaces. A common approach involves minimizing the mutual information between latent subspaces, such that each encodes a single underlying attribute. However, this fails when attributes are correlated. We solve this problem by enforcing independence between subspaces conditioned on the available attributes, which allows us to remove only dependencies that are not due to the correlation structure present in the training data. We achieve this via an adversarial approach to minimize the conditional mutual information (CMI) between subspaces with respect to categorical variables. We first show theoretically that CMI minimization is a good objective for robust disentanglement on linear problems with Gaussian data. We then apply our method on real-world datasets based on MNIST and CelebA, and show that it yields models that are disentangled and robust under correlation shift, including in weakly supervised settings.
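The abstract's key point is that marginal mutual information between latent subspaces is the wrong target when attributes are correlated: two perfectly disentangled subspaces will still show nonzero mutual information simply because the attributes they encode are correlated in the training data, whereas their *conditional* mutual information given the attributes is zero. A minimal toy sketch (not the paper's code; the binary attributes `y1`, `y2` and the 90% agreement rate are illustrative assumptions) makes this concrete with empirical estimates:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: two binary attributes y1, y2 that are correlated in the
# training data; the latents z1, z2 each encode one attribute perfectly,
# i.e. the representation is ideally disentangled.
n = 100_000
y1 = rng.integers(0, 2, n)
y2 = np.where(rng.random(n) < 0.9, y1, 1 - y1)  # agrees with y1 90% of the time
z1, z2 = y1, y2

def mutual_info(a, b):
    """Empirical mutual information I(a; b) in nats for binary arrays."""
    joint = np.array([[np.mean((a == i) & (b == j)) for j in range(2)]
                      for i in range(2)])
    pa, pb = joint.sum(axis=1), joint.sum(axis=0)
    mask = joint > 0
    return float((joint[mask] * np.log(joint[mask] / np.outer(pa, pb)[mask])).sum())

def conditional_mi(a, b, c):
    """I(a; b | c) = sum_k p(c=k) * I(a; b | c=k) for a binary conditioner c."""
    return sum(np.mean(c == k) * mutual_info(a[c == k], b[c == k])
               for k in range(2))

# Marginal MI between the subspaces is large, purely because the
# attributes they encode are correlated...
print(mutual_info(z1, z2))          # clearly > 0 (about 0.37 nats here)
# ...but conditioned on the attribute, the dependence vanishes: the
# latents carry no dependence beyond what the correlated labels induce.
print(conditional_mi(z1, z2, y1))   # ~ 0
```

This is why minimizing marginal MI would wrongly penalize a correctly disentangled encoder on correlated data, while the paper's CMI objective removes only dependencies not explained by the attribute correlation. (The paper estimates and minimizes the CMI adversarially for learned continuous subspaces rather than computing it from counts as above.)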


Related research

- Conditional Mutual Information for Disentangled Representations in Reinforcement Learning (05/23/2023)
- Is Independence all you need? On the Generalization of Representations Learned from Correlated Data (06/14/2020)
- Bridging Disentanglement with Independence and Conditional Independence via Mutual Information for Representation Learning (11/25/2019)
- Disentanglement of Correlated Factors via Hausdorff Factorized Support (10/13/2022)
- Monitoring Shortcut Learning using Mutual Information (06/27/2022)
- Efficiently Learning Nonstationary Gaussian Processes for Real World Impact (04/27/2018)
- Improved OOD Generalization via Conditional Invariant Regularizer (07/14/2022)
