An adversarial feature learning strategy for debiasing neural networks

01/30/2023
by   Rishabh Tiwari, et al.

Simplicity bias is the concerning tendency of deep networks to over-depend on simple, weakly predictive features, to the exclusion of stronger, more complex features. This causes biased, incorrect model predictions in many real-world applications, and is exacerbated by incomplete training data containing spurious feature-label correlations. We propose a direct, interventional method for addressing simplicity bias in DNNs, which we call the feature sieve. We aim to automatically identify and suppress easily-computable spurious features in the lower layers of the network, thereby allowing the higher network levels to extract and utilize richer, more meaningful representations. We provide concrete evidence of this differential suppression and enhancement of relevant features on both controlled datasets and real-world images, and report substantial gains on many real-world debiasing benchmarks (11.4 on Imagenet-A; 3.2 [...] incorporate knowledge about known spurious or biased attributes, despite our method not using any such information. We believe that our feature sieve work opens up exciting new research directions in automated adversarial feature extraction and representation learning for deep networks.


research
05/25/2022

Less Learn Shortcut: Analyzing and Mitigating Learning of Spurious Feature-Label Correlation

Many recent works indicate that the deep neural networks tend to take da...
research
11/04/2022

SelecMix: Debiased Learning by Contradicting-pair Sampling

Neural networks trained with ERM (empirical risk minimization) sometimes...
research
10/04/2022

Learning an Invertible Output Mapping Can Mitigate Simplicity Bias in Neural Networks

Deep Neural Networks are known to be brittle to even minor distribution ...
research
06/22/2022

Learning Debiased Classifier with Biased Committee

Neural networks are prone to be biased towards spurious correlations bet...
research
03/29/2023

Implicit Visual Bias Mitigation by Posterior Estimate Sharpening of a Bayesian Neural Network

The fairness of a deep neural network is strongly affected by dataset bi...
research
05/30/2023

Identifying Spurious Biases Early in Training through the Lens of Simplicity Bias

Neural networks trained with (stochastic) gradient descent have an induc...
research
06/14/2018

Neural Stethoscopes: Unifying Analytic, Auxiliary and Adversarial Network Probing

Model interpretability and systematic, targeted model adaptation present...
