Look Beyond Bias with Entropic Adversarial Data Augmentation

by   Thomas Duboudin, et al.

Deep neural networks do not discriminate between spurious and causal patterns, and will only learn the most predictive ones while ignoring the others. This shortcut learning behaviour is detrimental to a network's ability to generalize to an unknown test-time distribution in which the spurious correlations do not hold anymore. Debiasing methods were developed to make networks robust to such spurious biases but require to know in advance if a dataset is biased and make heavy use of minority counterexamples that do not display the majority bias of their class. In this paper, we argue that such samples should not be necessarily needed because the ”hidden” causal information is often also contained in biased images. To study this idea, we propose 3 publicly released synthetic classification benchmarks, exhibiting predictive classification shortcuts, each of a different and challenging nature, without any minority samples acting as counterexamples. First, we investigate the effectiveness of several state-of-the-art strategies on our benchmarks and show that they do not yield satisfying results on them. Then, we propose an architecture able to succeed on our benchmarks, despite their unusual properties, using an entropic adversarial data augmentation training scheme. An encoder-decoder architecture is tasked to produce images that are not recognized by a classifier, by maximizing the conditional entropy of its outputs, and keep as much as possible of the initial content. A precise control of the information destroyed, via a disentangling process, enables us to remove the shortcut and leave everything else intact. Furthermore, results competitive with the state-of-the-art on the BAR dataset ensure the applicability of our method in real-life situations.


page 1

page 5

page 8

page 9

page 11


OccamNets: Mitigating Dataset Bias by Favoring Simpler Hypotheses

Dataset bias and spurious correlations can significantly impair generali...

BiaSwap: Removing dataset bias with bias-tailored swapping augmentation

Deep neural networks often make decisions based on the spurious correlat...

SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification

In this paper, we present SpecAugment++, a novel data augmentation metho...

Data augmentation with mixtures of max-entropy transformations for filling-level classification

We address the problem of distribution shifts in test-time data with a p...

Pulling Up by the Causal Bootstraps: Causal Data Augmentation for Pre-training Debiasing

Machine learning models achieve state-of-the-art performance on many sup...

Assessing Dataset Bias in Computer Vision

A biased dataset is a dataset that generally has attributes with an unev...

Towards Learning an Unbiased Classifier from Biased Data via Conditional Adversarial Debiasing

Bias in classifiers is a severe issue of modern deep learning methods, e...

Please sign up or login with your details

Forgot password? Click here to reset