Monitoring Shortcut Learning using Mutual Information

06/27/2022
by Mohammed Adnan, et al.

The failure of deep neural networks to generalize to out-of-distribution data is a well-known problem and raises concerns about deploying trained networks in safety-critical domains such as healthcare, finance and autonomous vehicles. We study a particular kind of distribution shift: shortcuts, or spurious correlations, in the training data. Shortcut learning is often exposed only when models are evaluated on real-world data that does not contain the same spurious correlations, which makes it difficult for AI practitioners to properly assess how effective a trained model will be in real-world applications. In this work, we propose to use the mutual information (MI) between the learned representation and the input as a metric to find the point in training at which the network latches onto shortcuts. Experiments demonstrate that MI can be used as a domain-agnostic metric for monitoring shortcut learning.
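
To make the monitored quantity concrete, the following is a minimal sketch of how an estimate of the MI between inputs and a learned representation could be computed at a training checkpoint. The abstract does not say which MI estimator the authors use, so everything here is an assumption for illustration: the binning-based plug-in estimator, the function names `discretize` and `plug_in_mi_bits`, and the bin range are all hypothetical. Plug-in estimates of this kind are known to be biased for high-dimensional representations, so only the trend across checkpoints would be meaningful.

```python
import math
from collections import Counter

import numpy as np


def discretize(features, n_bins=4, lo=-1.0, hi=1.0):
    """Bin each activation into one of n_bins equal-width bins over [lo, hi]
    and return one hashable symbol (a tuple of bin indices) per sample.
    The bin count and range are illustrative assumptions."""
    edges = np.linspace(lo, hi, n_bins + 1)[1:-1]  # interior bin edges
    binned = np.digitize(np.asarray(features, dtype=float), edges)
    return [tuple(row) for row in binned.tolist()]


def plug_in_mi_bits(xs, ts):
    """Plug-in estimate of I(X; T) in bits from paired discrete symbols,
    using empirical joint and marginal frequencies."""
    if len(xs) != len(ts) or len(xs) == 0:
        raise ValueError("xs and ts must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ts))
    count_x = Counter(xs)
    count_t = Counter(ts)
    mi = 0.0
    for (x, t), c in joint.items():
        # p(x,t) * log2( p(x,t) / (p(x) p(t)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_t[t]))
    return mi


# Toy check: a representation that copies the input id carries 1 bit,
# while one that is independent of it carries 0 bits.
input_ids = [0, 0, 1, 1]
mi_copy = plug_in_mi_bits(input_ids, [0, 0, 1, 1])
mi_indep = plug_in_mi_bits(input_ids, [0, 1, 0, 1])
```

In a monitoring loop, one would call `discretize` on the activations of a chosen layer for a fixed probe set at each checkpoint, pass the resulting symbols together with the input identifiers to `plug_in_mi_bits`, and track how the estimate changes over the course of training.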

research
05/23/2023

Conditional Mutual Information for Disentangled Representations in Reinforcement Learning

Reinforcement Learning (RL) environments can produce training data with ...
research
08/27/2021

ProtoInfoMax: Prototypical Networks with Mutual Information Maximization for Out-of-Domain Detection

The ability to detect Out-of-Domain (OOD) inputs has been a critical req...
research
06/09/2022

DORA: Exploring outlier representations in Deep Neural Networks

Deep Neural Networks (DNNs) draw their power from the representations th...
research
12/29/2021

Disentanglement and Generalization Under Correlation Shifts

Correlations between factors of variation are prevalent in real-world da...
research
03/31/2022

Mutual information estimation for graph convolutional neural networks

Measuring model performance is a key issue for deep learning practitione...
research
06/28/2023

Individual and Structural Graph Information Bottlenecks for Out-of-Distribution Generalization

Out-of-distribution (OOD) graph generalization is critical for many rea...
research
09/30/2022

Information Removal at the bottleneck in Deep Neural Networks

Deep learning models are nowadays broadly deployed to solve an incredibl...
