Agree to Disagree: Diversity through Disagreement for Better Transferability

02/09/2022
by Matteo Pagliardini, et al.

Gradient-based learning algorithms have an implicit simplicity bias, which in effect can limit the diversity of predictors sampled by the learning procedure. This behavior can hinder the transferability of trained models by (i) favoring the learning of simpler but spurious features (present in the training data but absent from the test data) and (ii) leveraging only a small subset of predictive features. The effect is magnified when the test distribution does not exactly match the training distribution, a setting referred to as the out-of-distribution (OOD) generalization problem. However, given only the training data, it is not always possible to assess a priori whether a given feature is spurious or transferable. Instead, we advocate learning an ensemble of models that capture a diverse set of predictive features. To this end, we propose a new algorithm, D-BAT (Diversity-By-disAgreement Training), which enforces agreement among the models on the training data but disagreement on the OOD data. We show how D-BAT naturally emerges from the notion of generalized discrepancy, and demonstrate in multiple experiments how the proposed method can mitigate shortcut learning, enhance uncertainty estimation and OOD detection, and improve transferability.
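
The idea of agreeing on training data while disagreeing on OOD data can be illustrated with a small sketch. The abstract does not state the actual objective, so everything below is an assumption made for illustration: a binary task, a first model that is already trained and held fixed, a second model scored by cross-entropy on labeled training points plus a penalty that shrinks when the two models predict opposite classes on unlabeled OOD points, and a hypothetical weight `alpha`. The function names are invented for this example and are not taken from the paper.

```python
import math

EPS = 1e-12  # guards against log(0)


def cross_entropy(p_pos, label):
    """Binary cross-entropy for one example; p_pos is the predicted P(y=1)."""
    if label == 1:
        return -math.log(p_pos + EPS)
    return -math.log(1.0 - p_pos + EPS)


def disagreement_penalty(p1_pos, p2_pos):
    """Assumed penalty: negative log-probability that the two models
    predict different classes. Small when they disagree, large when
    they confidently agree."""
    prob_disagree = p1_pos * (1.0 - p2_pos) + (1.0 - p1_pos) * p2_pos
    return -math.log(prob_disagree + EPS)


def second_model_objective(train_preds_m2, train_labels,
                           ood_preds_m1, ood_preds_m2, alpha=1.0):
    """Hypothetical objective for the second ensemble member.

    Fitting the training labels keeps it consistent with the first model
    on training data; the second term rewards disagreement on OOD inputs.
    """
    fit = sum(cross_entropy(p, y)
              for p, y in zip(train_preds_m2, train_labels)) / len(train_labels)
    disagree = sum(disagreement_penalty(p1, p2)
                   for p1, p2 in zip(ood_preds_m1, ood_preds_m2)) / len(ood_preds_m1)
    return fit + alpha * disagree


# Same fit on the training data, but different behavior on two OOD points.
train_preds = [0.9, 0.2]
train_labels = [1, 0]
ood_m1 = [0.9, 0.8]
loss_agreeing = second_model_objective(train_preds, train_labels, ood_m1, [0.9, 0.8])
loss_disagreeing = second_model_objective(train_preds, train_labels, ood_m1, [0.1, 0.2])
print(loss_agreeing > loss_disagreeing)
```

Under these assumptions, a second model that matches the first on the OOD points receives a higher objective value than one that fits the training labels equally well but contradicts the first model there, which is the pressure toward diverse predictive features the abstract describes.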


