Adversarial Filters of Dataset Biases

by   Ronan Le Bras, et al.

Large neural models have demonstrated human-level performance on language and vision benchmarks such as ImageNet and Stanford Natural Language Inference (SNLI). Yet, their performance degrades considerably when tested on adversarial or out-of-distribution samples. This raises the question of whether these models have learned to solve a dataset rather than the underlying task by overfitting on spurious dataset biases. We investigate one recently proposed approach, AFLite, which adversarially filters such dataset biases, as a means to mitigate the prevalent overestimation of machine performance. We provide a theoretical understanding for AFLite, by situating it in the generalized framework for optimum bias reduction. Our experiments show that as a result of the substantial reduction of these biases, models trained on the filtered datasets yield better generalization to out-of-distribution tasks, especially when the benchmarks used for training are over-populated with biased samples. We show that AFLite is broadly applicable to a variety of both real and synthetic datasets for reduction of measurable dataset biases and provide extensive supporting analyses. Finally, filtering results in a large drop in model performance (e.g., from 92% to 62% for SNLI), while human performance still remains high. Our work thus shows that such filtered datasets can pose new research challenges for robust generalization by serving as upgraded benchmarks.


1 Introduction

Figure 1: Example images of the Monarch Butterfly and Chickadee from ImageNet. On the right are images in each category which were removed by AFLite, and on the left, the ones which were retained. The heatmap shows pairwise cosine similarity between EfficientNet-B7 features (Tan and Le, 2019). The retained images (left) show significantly greater diversity – such as the cocoon of a butterfly, or the non-canonical chickadee poses – also reflected by the cosine similarity values. This diversity suggests that the AFLite-filtered examples present a more accurate benchmark for the task of image classification, as opposed to fitting to particular dataset biases.
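The diversity measure referenced in the caption is plain pairwise cosine similarity between feature vectors; a minimal version (the vectors here are made up, not actual EfficientNet-B7 features):

```python
import math

def cosine(u, v):
    # cos(u, v) = <u, v> / (||u|| * ||v||)
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

print(cosine([1.0, 2.0], [2.0, 4.0]))  # parallel features have similarity ~1
```

Low pairwise similarity among retained images corresponds to the greater visual diversity described above.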

Large-scale neural networks have achieved superhuman performance across many popular AI benchmarks, for tasks as diverse as image recognition (ImageNet;

Russakovsky et al., 2015), natural language inference (SNLI; Bowman et al., 2015), and question answering (SQuAD; Rajpurkar et al., 2016). However, the performance of such neural models degrades considerably when tested on out-of-distribution or adversarial samples, otherwise known as data “in the wild” (Eykholt et al., 2018; Jia and Liang, 2017). This phenomenon indicates that high performance of the strongest AI models is often confined to specific datasets, implicitly making a closed-world assumption. In contrast, true learning of a task necessitates generalization, or an open-world assumption. A major impediment to generalization is the presence of spurious biases – unintended correlations between input and output – in existing datasets (Torralba and Efros, 2011). Such biases or artifacts (we will henceforth use the terms biases and artifacts interchangeably) are often introduced during data collection (Fouhey et al., 2018) or during human annotation (Gururangan et al., 2018; Poliak et al., 2018; Tsuchiya, 2018; Geva et al., 2019). Not only do dataset biases inevitably bias the models trained on them, but they have also been shown to significantly inflate model performance, leading to an overestimation of the true capabilities of current AI systems (Sakaguchi et al., 2020; Hendrycks et al., 2019).

Many recent studies have investigated task or dataset specific biases, including language bias in Visual Question Answering (Goyal et al., 2017), texture bias in ImageNet (Geirhos et al., 2018), and hypothesis-only reliance in Natural Language Inference (Gururangan et al., 2018). These studies have yielded similarly domain-specific algorithms to address the found biases. However, the vast majority of these studies follow a top-down framework where the bias reduction algorithms are essentially guided by researchers’ intuitions and domain insights on particular types of spurious biases. While promising, such approaches are fundamentally limited by what the algorithm designers can manually recognize and enumerate as unwanted biases.

Our work investigates AFLite, an alternative bottom-up approach to algorithmic bias reduction. AFLite (short for Lightweight Adversarial Filtering) was recently proposed by Sakaguchi et al. (2020)—albeit very succinctly—to systematically discover and filter any dataset artifact in crowdsourced commonsense problems. AFLite employs a model-based approach with the goal of removing spurious artifacts in data beyond what humans can intuitively recognize, but those which are exploited by powerful models. Figure 1 illustrates how AFLite reduces dataset biases in the ImageNet dataset for object classification.

This paper presents the first theoretical understanding and comprehensive empirical investigations into AFLite. More concretely, we make the following four novel contributions.

First, we situate AFLite in a theoretical framework for optimal bias reduction, and demonstrate that AFLite provides a practical approximation of AFOpt, the ideal but computationally intractable bias reduction method under this framework (§2).

Second, we present an extensive suite of experiments that were lacking in the work of Sakaguchi et al. (2020), to validate whether AFLite truly removes spurious biases in data as originally assumed. Our baselines and thorough analyses use both synthetic (thus easier to control) datasets (§3) as well as real datasets. The latter span benchmarks across NLP (§4) and vision (§5) tasks: the SNLI (Bowman et al., 2015) and MultiNLI (Williams et al., 2018) datasets for natural language inference, QNLI (Wang et al., 2018a) for question answering, and the ImageNet dataset (Russakovsky et al., 2015) for object recognition.

Third, we demonstrate that models trained on AFLite-filtered data generalize substantially better to out-of-domain samples, compared to models that are trained on the original biased datasets (§4, §5). These findings indicate that spurious biases in datasets make benchmarks artificially easier, as models learn to overly rely on these biases instead of learning more transferable features, thereby hurting out-of-domain generalization.

Finally, we show that AFLite-filtering makes widely used AI benchmarks considerably more challenging. We consistently observe a significant drop in the in-domain performance even for state-of-the-art models on all benchmarks, even though human performance still remains high; this suggests currently reported performance on benchmarks might be inflated. For instance, the best model on SNLI-AFLite achieves only 63% accuracy, a 30% drop compared to its accuracy on the original SNLI. These findings are especially surprising since AFLite maintains an identical train-test distribution, while also retaining a sizable training set.

In summary, AFLite-filtered datasets can serve as upgraded benchmarks, posing new research challenges for robust generalization.

2 AFLite

Large datasets run the risk of prioritizing performance on the data-rich head of the distribution, where examples are plentiful, and discounting the tail. AFLite seeks to minimize the ability of a model to exploit biases in the head of the distribution, while preserving the inherent complexity of the tail. In this section, we provide a formal framework for studying such bias reduction techniques, revealing that AFLite can be viewed as a practical approximation of a desirable but computationally intractable optimum bias reduction objective.


Let Φ be any feature representation defined over a dataset D. AFLite seeks a subset S ⊆ D that is maximally resilient to the features uncovered by Φ, that is, for any identically-distributed train-test split of S, features Φ should not help models generalize to the held-out set.


Let M denote a family of classification models (e.g., logistic regression, SVM, or a particular neural architecture) that can be trained on subsets S of D using features Φ. We define the representation bias of Φ in S w.r.t. M, denoted R(Φ, S, M), as the best possible out-of-sample classification accuracy achievable by models in M when predicting labels using features Φ. Given a target minimum reduced dataset size n, the goal is to find a subset S ⊆ D of size at least n that minimizes this representation bias in S w.r.t. M:

    argmin_{S ⊆ D, |S| ≥ n} R(Φ, S, M)    (1)


Eq. (1) corresponds to optimum bias reduction, referred to as AFOpt. We formulate R(Φ, S, M) as the expected classification accuracy resulting from the following process. Let P be a probability distribution over subsets T of S. The process is to randomly choose T with probability P(T), train a classifier h_T ∈ M on T, and evaluate its classification accuracy f(h_T, S∖T) on the held-out set S∖T. The resulting accuracy on S∖T itself is a random variable, since the training set T is randomly sampled. We define the expected value of this classification accuracy to be the representation bias:

    R(Φ, S, M) = E_{T∼P} [ f(h_T, S∖T) ]    (2)


The expectation in Eq. (2), however, involves a summation over exponentially many choices of T even to compute the representation bias for a single S. This makes optimizing Eq. (1), which involves a search over subsets S, highly intractable. To circumvent this challenge, we refactor R(Φ, S, M) as a sum over instances i ∈ S of the aggregate contribution of i to the representation bias across all T. Importantly, this summation has only |S| terms, allowing more efficient computation. We call this the predictability score p(i) for i: on average, how reliably can the label of i be predicted using features Φ when a model from M is trained on a randomly chosen training set not containing i. Instances with high predictability scores are undesirable, as their feature representation can be exploited to confidently and correctly predict such instances.

With some abuse of notation, for i ∈ S, let P(i) denote the marginal probability of choosing a subset T whose held-out set S∖T contains i. The ratio P(T)/P(i) is then the probability of T conditioned on S∖T containing i. Let f(h_T, i) be the classification accuracy of h_T on the single instance i. Then the expectation in Eq. (2) can be written in terms of p(i) as follows:

    R(Φ, S, M) = Σ_{i∈S} P(i) · p(i)    (3)

where p(i) is the predictability score of i, defined as:

    p(i) = Σ_{T : i ∈ S∖T} (P(T) / P(i)) · f(h_T, i) / |S∖T|    (4)


While this refactoring works for any probability distribution P with non-zero support on all instances, for simplicity of exposition, we assume P to be the uniform distribution over all subsets T of a fixed size m. This makes both P(i) and |S∖T| fixed constants; in particular, P(i) = (|S| − m)/|S| and |S∖T| = |S| − m. This yields a simplified predictability score p̄(i) and a factored reformulation of the representation bias from Eq. (2):

    p̄(i) = E_{T∼P : i∉T} [ f(h_T, i) ],    R(Φ, S, M) = (1/|S|) Σ_{i∈S} p̄(i)    (5)


While this refactoring reduces the exponential summation underlying the expectation in Eq. (2) to a linear sum, solving Eq. (1) for optimum bias reduction (AFOpt) remains challenging due to the exponentially many choices of S ⊆ D. However, the refactoring does enable computationally efficient heuristic approximations that start with S = D and iteratively filter out from S the most predictable instances, as identified by the (simplified) predictability scores p̄(i) computed over the current candidate for S. In all cases, we use a fixed training set size m. Further, since a larger filtered set is generally desirable, we terminate the filtering process early (i.e., while |S| > n) if the predictability score for every i ∈ S falls below a pre-specified early stopping threshold τ.

We consider three such heuristic approaches. (A) A simple greedy approach starts with the full set S = D, identifies an instance i ∈ S that maximizes p̄(i), removes it from S, and repeats up to |D| − n times. (B) A greedy slicing approach identifies the k instances with the highest predictability scores, removes all of them from S, and repeats the process up to (|D| − n)/k times. (C) A slice sampling approach, instead of greedily choosing the top k instances, randomly samples k instances with probabilities proportional to their predictability scores. The Gumbel method provides an efficient way to perform such sampling (Gumbel and Lieblein, 1954; Maddison et al., 2014; Kim et al., 2016; Balog et al., 2017; Kool et al., 2019), by independently perturbing the log of each predictability score with a Gumbel random variable and identifying the k instances with the highest perturbed scores (cf. Appendix A.1).
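Strategy (C) can be sketched with the Gumbel trick: perturbing each log-score with i.i.d. Gumbel(0, 1) noise and keeping the top k is equivalent to sampling k instances without replacement with probability proportional to their scores. The scores below are made up:

```python
import math
import random

def gumbel_top_k(scores, k, rng=None):
    """Sample k indices without replacement, proportional to scores."""
    rng = rng or random.Random(0)
    keyed = []
    for i, s in enumerate(scores):
        gumbel = -math.log(-math.log(rng.random()))  # Gumbel(0, 1) draw
        keyed.append((math.log(s) + gumbel, i))
    return [i for _, i in sorted(keyed, reverse=True)[:k]]

scores = [0.95, 0.91, 0.40, 0.88, 0.15, 0.99]  # predictability scores
print(gumbel_top_k(scores, 3))  # indices of 3 sampled instances
```

High-scoring instances are sampled most often, but low-scoring ones retain a non-zero chance of being filtered, unlike in greedy slicing.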

All three strategies can be improved further by considering not only the predictability scores of the top-k instances but also (via retraining without these instances) how their removal would influence the predictability scores of other instances in the next step. We found our computationally lighter approaches to work well even without the additional overhead of such look-ahead. AFLite implements the greedy slicing approach, and can thus be viewed as a scalable and practical approximation of the (intractable) AFOpt for optimum bias reduction.

Input: dataset D = (X, Y), pre-computed representation Φ(X), model family M, target dataset size n, number of random partitions t, training set size m, slice size k, early-stopping threshold τ
Output: reduced dataset S

S = D
while |S| > n do
    // Filtering phase
    forall i ∈ S do
        Initialize the multiset of out-of-sample predictions E(i) = ∅
    for iteration j : 1 … t do
        Randomly partition S into (T_j, S∖T_j) s.t. |T_j| = m
        Train a classifier L on {(Φ(x), y) | (x, y) ∈ T_j} (L is typically a linear classifier)
        forall i = (x, y) ∈ S∖T_j do
            Add the prediction L(Φ(x)) to E(i)
    forall i = (x, y) ∈ S do
        Compute the predictability score p̄(i) = |{ŷ ∈ E(i) | ŷ = y}| / |E(i)|
    Select up to k instances S′ ⊆ S with the highest predictability scores subject to p̄(i) ≥ τ
    S = S ∖ S′
    if |S′| < k then
        break
return S

Algorithm 1: AFLite

Implementation. Algorithm 1 provides an implementation of AFLite. The algorithm takes as input a dataset D, a representation Φ we are interested in minimizing the bias in, a model family M (e.g., linear classifiers), a target dataset size n, the number of random partitions t (which determines how many samples approximate the expectation in Eq. (4)), the training set size m for the classifiers, the slice size k, and an early-stopping filtering threshold τ. Importantly, for efficiency, Φ is provided to AFLite in the form of pre-computed embeddings for all instances of D. In practice, to obtain Φ, we train a first “warm-up” model on a small fraction of the data, chosen based on the learning curve in the low-data regime, and do not reuse this data for the rest of our experiments. Moreover, this fraction corresponds to the training size m for AFLite and remains unchanged across iterations. We follow the iterative filtering approach, starting with S = D and iteratively removing the instances with the highest predictability scores using the greedy slicing strategy. Slice size k and number of partitions t are determined by the available computation budget.

At each filtering phase, we train t models (linear classifiers) on t different random partitions of the data, and collect their predictions on the corresponding test sets. For each instance i, we compute its predictability score as the ratio of the number of times its label is predicted correctly to the total number of predictions for it. We rank the instances according to their predictability scores and use the greedy slicing strategy of removing the top-k instances whose scores are not less than the early-stopping threshold τ. We repeat this process until fewer than k instances pass the threshold in a filtering phase or fewer than n instances remain in the dataset.
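The greedy-slicing loop of Algorithm 1 can be sketched in pure Python. This is a toy illustration under stated assumptions: instances are (feature, label) pairs, the linear model family is replaced by a simple 1-nearest-neighbour rule, and the planted-bias dataset, hyperparameters (t, m, k, τ), and seed are all made up for the example:

```python
import random

def predict(train, x):
    # Stand-in "model family": 1-nearest-neighbour on a single feature
    return min(train, key=lambda p: abs(p[0] - x))[1]

def aflite(D, n, t=50, m=5, k=2, tau=0.75, seed=0):
    """Greedy slicing: repeatedly drop the k most predictable instances,
    with scores estimated from t random train/held-out partitions."""
    rng = random.Random(seed)
    S = list(D)
    while len(S) > n:
        stats = {i: [0, 0] for i in range(len(S))}   # [correct, total]
        for _ in range(t):
            idx = list(range(len(S)))
            rng.shuffle(idx)
            train = [S[i] for i in idx[:m]]
            for i in idx[m:]:                        # out-of-sample only
                stats[i][0] += predict(train, S[i][0]) == S[i][1]
                stats[i][1] += 1
        scores = sorted(((c / tot, i) for i, (c, tot) in stats.items() if tot),
                        reverse=True)
        remove = {i for s, i in scores[:k] if s >= tau}
        S = [x for j, x in enumerate(S) if j not in remove]
        if len(remove) < k:                          # early stopping
            break
    return S

# Planted bias: the feature alone nearly determines the label, except for two
# "hard" instances near the boundary, which AFLite should keep.
D = [(x, 0) for x in (0.05, 0.10, 0.15, 0.20)] + \
    [(x, 1) for x in (0.80, 0.85, 0.90, 0.95)] + \
    [(0.48, 1), (0.52, 0)]
filtered = aflite(D, n=4)
print(filtered)  # the two hard boundary instances survive the filtering
```

On this toy data, the cleanly separable instances receive predictability scores near 1.0 and are filtered out, while the ambiguous boundary instances score far below τ and are retained.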

3 Synthetic Data Experiments

We present experiments to evaluate whether AFLite successfully removes examples with spurious correlations in a synthetic setting. Our dataset consists of two-dimensional data, arranged in concentric circles, at two different levels of separation, as shown in Figure 2

. As is evident, a linear function is inadequate for separating the two classes; it requires a more complex non-linear model such as a support vector machine (SVM) with a radial basis function (RBF) kernel.

To simulate spurious correlations in the data, we add class-specific artificially constructed features (biases) sampled from two different Gaussian distributions. These features are added to only a fixed fraction of the data in each class, while for the rest of the data, we insert random (noise) features. The bias features make the task solvable through a linear function. Furthermore, for the first dataset, with the largest separation, we flipped the labels of some biased samples, making the data slightly adversarial even to the RBF kernel. Both models can clearly leverage the biases, and demonstrate improved performance over a baseline without biases (we use standard implementations from scikit-learn).

Once we apply AFLite, as expected, the number of biased samples is reduced considerably, making the task hard once again for the linear model, but still solvable for the non-linear one. The filtered dataset is shown in the bottom half of Fig. 2, and the captions indicate the performance of a linear and an SVM model (also see Appendix §A.2). For the first dataset, we see that AFLite removes most of those examples with flipped labels. These results show that AFLite indeed lowers the performance of models relying on biases by removing samples with spurious correlations from a dataset.
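A pure-Python sketch of such a biased dataset; the bias fraction, Gaussian parameters, and circle radii below are illustrative assumptions, not the paper's exact values:

```python
import math
import random

def make_biased_circles(n_per_class=200, bias_frac=0.75, seed=0):
    """Two concentric circles (classes 0 and 1) plus one appended feature:
    a class-specific Gaussian "bias" for the first bias_frac of each class,
    and pure Gaussian noise for the rest."""
    rng = random.Random(seed)
    data = []
    for label, radius in ((0, 1.0), (1, 2.0)):
        for j in range(n_per_class):
            theta = rng.uniform(0.0, 2.0 * math.pi)
            point = [radius * math.cos(theta), radius * math.sin(theta)]
            if j < bias_frac * n_per_class:
                # leaked feature: its sign alone reveals the label
                point.append(rng.gauss(-2.0 if label == 0 else 2.0, 0.5))
            else:
                point.append(rng.gauss(0.0, 0.5))    # uninformative noise
            data.append((point, label))
    rng.shuffle(data)
    return data

data = make_biased_circles()
```

On the first two dimensions the classes are only separable by a non-linear model, while the appended bias feature makes a linear classifier succeed on the biased majority of instances.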

Figure 2: Two sample biased datasets as input to AFLite (top). Blue and orange indicate two different classes. Only the original two dimensions are shown, not the bias features. For the dataset on the left, with the highest separation, we flip some labels at random, so even an RBF kernel cannot achieve perfect performance. AFLite makes the data more challenging for the models (bottom). Also see Appendix §A.2 for more details.

4 NLP Experiments

As our first real-world data evaluation for AFLite, we consider out-of-domain and in-domain generalization for a variety of language datasets. The primary task we consider is natural language inference (NLI) on the Stanford NLI dataset (Bowman et al., 2015, SNLI). Each instance in the NLI task consists of a premise-hypothesis sentence pair; the task is to predict whether the hypothesis entails, contradicts, or is neutral with respect to the premise.

Experimental Setup. We use feature representations from RoBERTa-large (Liu et al., 2019b), a large-scale pretrained masked language model, extracted from the final layer before the output layer after training on a random (warm-up) sample of the original training set. The resultant filtered NLI dataset is compared to the original dataset, as well as to a randomly subsampled dataset of the same size as the filtered data, amounting to only a third of the full dataset. The same RoBERTa-large architecture is used to train the three NLI models.

                         HANS                      NLI-Diagnostics                Stress
                         Lex.  Subseq.  Constit.   Knowl.  Logic  PAS   LxS.      Comp.  Distr.  Noise
SNLI (full, 550k)        88.4  28.2     21.7       51.8    57.8   72.6  65.7      77.9   73.5    79.8
Random subset (182k)     56.6  19.6     13.8       56.4    53.9   71.2  65.6      68.4   73.0    78.6
AFLite-filtered (182k)   94.1  46.3     38.5       53.9    58.7   69.9  66.5      79.1   72.0    79.5
Table 1: Zero-shot SNLI accuracy on three out-of-distribution evaluation tasks, comparing RoBERTa-large models trained on the original SNLI data (550k instances), on AFLite-filtered data (182k), and on a random subset with the same size as the filtered data. The reported accuracy is averaged across 5 random seeds. On the HANS dataset, all models are evaluated on the non-entailment cases of the three syntactic heuristics (Lexical overlap, Subsequence, and Constituent). The NLI-Diagnostics dataset is broken down into instances requiring logical reasoning (Logic), world and commonsense knowledge (Knowl.), lexical semantics (LxS.), or predicate-argument structures (PAS). Stress tests for NLI are further categorized into Competence, Distraction, and Noise tests.
                        Rd1   Rd2   Rd3
SNLI (full, 550k)       58.5  48.3  50.1
AFLite-filtered (182k)  65.1  49.1  52.8
Table 2: SNLI accuracy on Adversarial NLI using RoBERTa-large models pre-trained on the original SNLI data (550k instances) and on AFLite-filtered data (182k). Both models were finetuned on the in-distribution training data for each round (Rd1, Rd2, and Rd3).

4.1 Out-of-distribution Generalization

As motivated in Section §1, large-scale architectures often learn to solve datasets rather than the underlying task, by overfitting on unintended correlations between input and output in the data. However, this reliance can hurt generalization to out-of-distribution examples, which may not contain the same biases. We evaluate AFLite on this criterion for the NLI task.

Gururangan et al. (2018), among others, showed the existence of certain annotation artifacts (lexical associations etc.) in SNLI which make the task considerably easier for most current methods. This spurred the development of several out-of-distribution test sets which carefully control for the presence of said artifacts. We evaluate on four such out-of-distribution datasets: HANS (McCoy et al., 2019), NLI Diagnostics (Wang et al., 2018a), Stress tests (Naik et al., 2018) and Adversarial-NLI (Nie et al., 2019), see Appendix §A.3 for details. Given that these benchmarks are collected independently of the original SNLI task, the biases from SNLI are less likely to carry over; however these benchmarks might contain their own biases (Liu et al., 2019a).

Train Data                        Original  Random  AFLite   AFLite  AFLite     Δ
                                            (92k)   (GloVe)  (BERT)  (RoBERTa)
ESIM+ELMo (Peters et al., 2018)   88.7      86.0    61.5     54.2    51.9       36.8
BERT (Devlin et al., 2019)        91.3      87.6    74.7     61.8    57.0       34.3
RoBERTa (Liu et al., 2019b)       92.6      88.3    78.9     71.4    62.6       30.0
Max-PPMI                          54.5      52.0    41.1     41.5    41.9       12.6
BERT-HypOnly                      71.5      70.1    52.3     46.4    48.4       23.1
RoBERTa-HypOnly                   72.0      70.4    53.6     49.5    48.5       23.5
Human performance                 88.1      88.1    82.3     80.3    77.8       10.3
Training set size                 550k      92k     138k     109k    92k        458k
Table 3: Dev accuracy (%) on the original SNLI dataset and on the datasets obtained through AFLite filtering with different feature representations, alongside other baselines. "Random" indicates a randomly subsampled train dataset of the same size as the RoBERTa-filtered data. Δ indicates the difference between the model trained on the full data and the model trained on the RoBERTa-filtered data.

Table 1 shows results on three of the four diagnostic datasets (HANS, NLI-Diagnostics and Stress), where we perform a zero-shot evaluation of the models. Models trained on SNLI-AFLite consistently exceed or match the performance of the full model on the benchmarks above, up to standard deviation. To control for size, we compare to a baseline trained on a random subsample of the same size. AFLite models report higher generalization performance, suggesting that the filtered samples are more informative than a random subset. In particular, models trained on AFLite-filtered data substantially outperform the others on the challenging examples in the HANS benchmark, which targets models purely relying on lexical and syntactic cues.

Table 2 shows results on the Adversarial NLI benchmark, which allows for evaluation of transfer capabilities by finetuning models on each of the three training datasets (Rd1, Rd2 and Rd3). A RoBERTa-large model trained on SNLI-AFLite surpasses the model trained on the original SNLI in all three settings.

4.2 In-distribution Benchmark Re-estimation


additionally provides a more accurate estimation of the benchmark performance on several tasks. Here we simply lower the

AFLite early-stopping threshold, in order to filter most biased examples from the data, resulting in a stricter benchmark with 92k train samples.


In addition to RoBERTa-large, we consider here pre-computed embeddings from BERT-large (Devlin et al., 2019) and GloVe (Pennington et al., 2014), resulting in three different feature representations for SNLI: one from RoBERTa-large (Liu et al., 2019b), one from BERT-large, and one which uses the ESIM model (Chen et al., 2016) over GloVe embeddings. Table 3 shows the results for SNLI. In all cases, applying AFLite substantially reduces overall model accuracy, with typical drops of 15-35% depending on the models used for learning the feature representations and those used for evaluation of the filtered dataset. In general, performance is lowest when using the strongest model (RoBERTa) for learning the feature representations. The results also highlight the ability of weaker adversaries to produce datasets that are still challenging for much stronger models, with a drop of 13.7% for RoBERTa when the ESIM+GloVe features are used as the representation.

To control for the reduction in dataset size caused by filtering, we randomly subsample the original data, creating a subset whose size is approximately equal to that of the filtered data. All models achieve nearly the same performance as on the full dataset – even when trained on just one-fifth of the original data. This result further highlights that current benchmark datasets contain significant redundancy among their instances.

We also include two other baselines, which target known dataset artifacts in NLI. The first baseline uses the point-wise mutual information (PMI) between words in a given instance and the target label as its only feature. Hence it captures the extent to which datasets exhibit word-association biases, one particular class of spurious correlations. While this baseline is relatively weaker than the other models, its performance still drops by nearly 13% on the RoBERTa-filtered dataset. The second baseline trains only on the hypothesis of an NLI instance (-HypOnly). Such partial-input baselines (Gururangan et al., 2018) capture reliance on lexical cues in the hypothesis alone, instead of learning a semantic relationship between the hypothesis and premise. Filtering with RoBERTa reduces this baseline's performance by almost 24%. AFLite, which is agnostic to any particular known bias in the data, results in a drop of about 30% on the same dataset, indicating that it might be capturing a larger class of spurious biases than either of the above baselines.
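As an illustration of the word-association baseline, a minimal Max-PPMI-style classifier can be sketched as follows; the scoring rule (summed positive PMI per label) and the toy training pairs are assumptions for the example, not the paper's exact setup:

```python
import math
from collections import Counter

def train_ppmi(pairs):
    """Estimate positive PMI between each word and each label."""
    word_label, words, labels, n = Counter(), Counter(), Counter(), 0
    for text, label in pairs:
        for w in set(text.split()):
            word_label[w, label] += 1
            words[w] += 1
            labels[label] += 1
            n += 1
    return {
        (w, l): max(0.0, math.log(c * n / (words[w] * labels[l])))
        for (w, l), c in word_label.items()
    }

def predict(ppmi, text, label_set):
    # Score each label by summed positive PMI of the words it co-occurs with
    return max(label_set,
               key=lambda l: sum(ppmi.get((w, l), 0.0) for w in text.split()))

train = [("a man is sleeping", "contradiction"),
         ("a dog is outdoors", "entailment"),
         ("nobody is sleeping", "contradiction"),
         ("an animal is outdoors", "entailment")]
ppmi = train_ppmi(train)
print(predict(ppmi, "a cat is sleeping", {"entailment", "contradiction"}))
# "sleeping" is strongly associated with contradiction in the toy data
```

A classifier of this sort exploits exactly the word-association artifacts (e.g., "sleeping" signalling contradiction) that annotation is known to introduce into NLI hypotheses.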

Finally, to demonstrate the value of the iterative, ensemble-based AFLite algorithm, we compare with a baseline where, using a single model, we filter out the most predictable examples in a single iteration (a non-iterative, single-model version of AFLite). A RoBERTa-large model trained on this subset (of the same size as the AFLite-filtered data) achieves a higher dev accuracy than the 62.6% obtained on the AFLite-filtered dataset (see Table 3), making this baseline a sensible yet less effective approach. In particular, this illustrates the need for an iterative procedure involving models trained on multiple partitions of the remaining data in each iteration.

MultiNLI and QNLI

We also evaluate AFLite on another large-scale NLI dataset, multi-genre NLI (MultiNLI; Williams et al., 2018), and on the QNLI dataset (Wang et al., 2018a), a sentence-pair classification version of the SQuAD (Rajpurkar et al., 2016) question answering task (QNLI is stylized as an NLI classification task, where the goal is to determine whether or not a sentence contains the answer to a question). Results before and after AFLite are reported in Table 4. Since RoBERTa resulted in the largest drops in performance across the board on SNLI, we only experiment with RoBERTa as the adversary for MultiNLI and QNLI. While RoBERTa achieves over 90% accuracy on both original datasets, its performance drops to 66.2% for MultiNLI and to 77.7% for QNLI on the filtered datasets. Similarly, partial-input baseline performance also decreases substantially on both datasets compared to the originals. Overall, our experiments indicate that AFLite consistently results in reduced accuracy on the filtered datasets across multiple language benchmarks, even after controlling for the size of the training set.

MNLI                   Original  AFLite  Δ
BERT                   86.6      55.8    30.8
RoBERTa                90.3      66.2    24.1
BERT-PartialInput      59.7      43.2    16.5
RoBERTa-PartialInput   60.3      44.4    15.9
QNLI                   Original  AFLite  Δ
BERT                   92.0      63.5    28.5
RoBERTa                93.7      77.7    16.0
BERT-PartialInput      62.6      56.6    6.0
RoBERTa-PartialInput   63.9      59.4    4.5
Table 4: Dev accuracy (%) on the original and AFLite-filtered MNLI-matched and QNLI datasets. The -PartialInput baselines are models trained on only the hypotheses for MNLI instances and only the answers for QNLI. Δ indicates the difference in performance between the model trained on the original data and the model trained on AFLite-filtered data.

Table 3 shows that human performance on SNLI-AFLite is lower than that on the full SNLI (as measured by the annotator labels provided in the original SNLI validation data). This indicates that the filtered dataset is somewhat harder even for humans, though to a much lesser degree than for any model. Indeed, removal of examples with spurious correlations could inadvertently lead to removal of genuinely easy examples; this might be a limitation of a model-based bias reduction approach such as AFLite (see Appendix §A.5 for a qualitative analysis). Future bias reduction techniques should aim to leave human performance unaltered before and after reduction.

5 Vision Experiments

Train Data        EfficientNet-B5  EfficientNet-B7
Full ImageNet     16.5             20.6
Random subset     5.9              8.5
AFLite-filtered   7.2              10.4
Table 5: Top-1 accuracy (%) on ImageNet-A (Hendrycks et al., 2019), an adversarial evaluation set for image classification. The most powerful model, EfficientNet-B7, improves by 2% on out-of-distribution ImageNet-A images when trained on AFLite-filtered data rather than on a random subset of the same size.

We evaluate AFLite on image classification through ImageNet (ILSVRC2012). On ImageNet, we use the state-of-the-art EfficientNet-B7 model (Tan and Le, 2019) as our core feature extractor Φ. The EfficientNet model is learned from scratch on a fixed 20% sample of the ImageNet training set, using RandAugment data augmentation (Cubuk et al., 2019). We then use the 2560-dimensional features extracted by EfficientNet-B7 as the underlying representation for AFLite to filter the remaining dataset, stopping when the data size reaches 40% of ImageNet.

Adversarial Image Classification

In Table 5, we report performance of image classification models on ImageNet-A, a dataset with out-of-distribution images (Hendrycks et al., 2019). As shown, all EfficientNet models struggle on this task, even when trained on the entire ImageNet. However, we find that training on AFLite-filtered data leads to models with greater generalization, in comparison to training on a randomly sampled ImageNet of the same size, leading to up to 2% improvement in performance.

Train Data        Full   Random 40%  AFLite 40%  Δ
EfficientNet-B0   76.3   69.6        50.2        26.1
EfficientNet-B3   81.7   75.1        57.3        24.4
EfficientNet-B5   83.7   78.6        62.2        21.5
EfficientNet-B7   84.4   78.8        63.5        20.9
ResNet-34         78.4   65.9        46.9        31.5
ResNet-50         79.2   68.9        50.1        29.1
ResNet-101        80.1   70.1        52.2        27.9
ResNet-152        80.6   71.0        53.3        27.3
Table 6: Results on ImageNet, in Top-1 accuracy (%). We consider training on the 40% most challenging instances, as filtered by AFLite, and compare this to a random 40% subsample of ImageNet. We report results on the ImageNet validation set before and after filtering with AFLite. Δ indicates the difference in accuracy between the full model and the filtered model. Notably, evaluating on ImageNet-AFLite is much harder, resulting in a drop of nearly 21 percentage points in accuracy for the strongest model.

In-distribution Image Classification

In Table 6, we present ImageNet accuracy across the EfficientNet and ResNet (He et al., 2016) model families before and after filtering with AFLite. For evaluation, the ImageNet-AFLite validation set is much harder than the standard validation set (also see Figure 1). While the top performer after filtering is still EfficientNet-B7, its top-1 accuracy drops from 84.4% to 63.5%. In contrast, a model trained on a random subsample of the same size suffers a much smaller drop, most likely attributable to the reduction in training data alone.

Overall, these results suggest that image classification – even within a subset of the closed world of ImageNet – is far from solved. These results echo other findings that suggest that common biases that naturally occur in web-scale image data, such as towards canonical poses (Alcorn et al., 2019) or towards texture rather than shape (Geirhos et al., 2018), are problems for ImageNet-trained classifiers.

6 Related Work

AFLite is related to Zellers et al. (2018)’s adversarial filtering (AF) algorithm, yet distinct in two key ways: it is (i) much more broadly applicable (by not requiring over-generation of data instances), and (ii) considerably more lightweight (by not requiring re-training a model at each iteration of AF). Variants of this AF approach have recently been used to create other datasets such as HellaSwag (Zellers et al., 2019) and Abductive NLI (Bhagavatula et al., 2019) by iteratively perturbing dataset instances until a target model cannot fit the resulting dataset. While effective, these approaches run into three main pitfalls. First, dataset curators need to explicitly devise a strategy for collecting or generating perturbations of a given instance. Second, the approach runs the risk of distributional bias, where a discriminator can learn to distinguish between machine-generated and human-generated instances. Finally, it requires re-training a model at each iteration, which is computationally expensive, especially when using a large model such as BERT as the adversary. In contrast, AFLite focuses on removing dataset biases from existing datasets instead of adversarially perturbing instances. AFLite was earlier proposed by Sakaguchi et al. (2020) to create the Winogrande dataset. This paper presents more thorough experiments, theoretical justification, and results from generalizing the proposed approach to multiple popular NLP and vision datasets.

Li and Vasconcelos (2019) recently proposed REPAIR, a method to remove representation bias by dataset resampling. The motivation in REPAIR is to learn a probability distribution over the dataset that favors instances that are hard for a given representation. In addition, the implementation of REPAIR relies on in-training classification loss as opposed to out-of-sample generalization accuracy. RESOUND (Li et al., 2018) quantifies the representation biases of datasets. It uses the representation biases to assemble a new K-class dataset with smaller biases by sampling an existing C-class dataset (with K < C).

Arjovsky et al. (2019) propose Invariant Risk Minimization as an objective that promotes learning representations of the data which are stable across environments. Instead of learning optimal classifiers, AFLite aims to remove instances that exhibit artifacts in a dataset. Also related are approaches in He et al. (2019) where specific NLI biases are targeted; we show AFLite is capable of removing any spurious bias. Data selection methods such as Wang et al. (2018b) aim to filter data to preserve downstream performance, whereas AFLite is adversarial to this goal.

7 Conclusion

We presented a deep dive into AFLite, an iterative greedy algorithm that adversarially filters out spurious biases from data for accurate benchmark estimation. We presented a theoretical framework supporting AFLite, and showed its effectiveness in bias reduction on synthetic and real datasets, providing extensive analyses. We applied AFLite to four datasets, including widely used benchmarks such as SNLI and ImageNet, and showed that the strongest performance on the resulting filtered datasets drops by 30 points for SNLI and 20 points for ImageNet. We further showed that models trained on the AFLite-filtered subsets generalize better to out-of-distribution and adversarial test sets. We hope that dataset creators will employ AFLite to identify unobservable artifacts before releasing new challenge datasets, for more reliable estimates of task progress on future AI benchmarks. All datasets and code for this work will be made public soon.


  • M. A. Alcorn, Q. Li, Z. Gong, C. Wang, L. Mai, W. Ku, and A. Nguyen (2019) Strike (with) a pose: neural networks are easily fooled by strange poses of familiar objects. In CVPR, Cited by: §5.
  • M. Arjovsky, L. Bottou, I. Gulrajani, and D. Lopez-Paz (2019) Invariant risk minimization. Note: ArXiv:1907.02893 External Links: Link Cited by: §6.
  • M. Balog, N. Tripuraneni, Z. Ghahramani, and A. Weller (2017) Lost relatives of the Gumbel trick. In ICML, Cited by: §A.1, §2.
  • C. Bhagavatula, R. L. Bras, C. Malaviya, K. Sakaguchi, A. Holtzman, H. Rashkin, D. Downey, S. W. Yih, and Y. Choi (2019) Abductive commonsense reasoning. In ICLR, External Links: Link Cited by: §6.
  • S. R. Bowman, G. Angeli, C. Potts, and C. D. Manning (2015) A large annotated corpus for learning natural language inference. In EMNLP, External Links: Link Cited by: §1, §1, §4.
  • Q. Chen, X. Zhu, Z. Ling, S. Wei, H. Jiang, and D. Inkpen (2016) Enhanced LSTM for natural language inference. In ACL, External Links: Link Cited by: §4.2.
  • E. D. Cubuk, B. Zoph, J. Shlens, and Q. V. Le (2019) RandAugment: practical data augmentation with no separate search. arXiv preprint arXiv:1909.13719. Cited by: §A.6, §5.
  • J. Devlin, M. Chang, K. Lee, and K. Toutanova (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In NAACL-HLT, Cited by: §4.2, Table 3.
  • K. Eykholt, I. Evtimov, E. Fernandes, B. Li, A. Rahmati, C. Xiao, A. Prakash, T. Kohno, and D. X. Song (2018) Robust physical-world attacks on deep learning models. In CVPR, Cited by: §1.
  • D. F. Fouhey, W. Kuo, A. A. Efros, and J. Malik (2018) From lifestyle vlogs to everyday interactions. In CVPR, Cited by: §1.
  • R. Geirhos, P. Rubisch, C. Michaelis, M. Bethge, F. A. Wichmann, and W. Brendel (2018) ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. ICLR. Cited by: §1, §5.
  • M. Geva, Y. Goldberg, and J. Berant (2019) Are we modeling the task or the annotator? An investigation of annotator bias in natural language understanding datasets. In EMNLP, External Links: Link Cited by: §1.
  • Y. Goyal, T. Khot, D. Summers-Stay, D. Batra, and D. Parikh (2017) Making the V in VQA matter: Elevating the role of image understanding in visual question answering. In CVPR, Cited by: §1.
  • E. J. Gumbel and J. Lieblein (1954) Statistical theory of extreme values and some practical applications: a series of lectures. In Applied Mathematics Series, Vol. 33. Cited by: §A.1, §2.
  • S. Gururangan, S. Swayamdipta, O. Levy, R. Schwartz, S. Bowman, and N. A. Smith (2018) Annotation artifacts in natural language inference data. In NAACL, External Links: Link Cited by: §1, §1, §4.1, §4.2.
  • H. He, S. Zha, and H. Wang (2019) Unlearn dataset bias in natural language inference by fitting the residual. ArXiv. External Links: Link Cited by: §6.
  • K. He, X. Zhang, S. Ren, and J. Sun (2016) Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, Cited by: §A.6, §5.
  • D. Hendrycks, K. Zhao, S. Basart, J. Steinhardt, and D. Song (2019) Natural adversarial examples. arXiv preprint arXiv:1907.07174. Cited by: §1, §5, Table 5.
  • E. Jang, S. Gu, and B. Poole (2016) Categorical reparameterization with gumbel-softmax. In ICLR, Cited by: §A.1.
  • R. Jia and P. Liang (2017) Adversarial examples for evaluating reading comprehension systems. In EMNLP, External Links: Link Cited by: §1.
  • C. Kim, A. Sabharwal, and S. Ermon (2016) Exact sampling with integer linear programs and random perturbations. In AAAI, Cited by: §A.1, §2.
  • D. P. Kingma and J. Ba (2014) Adam: a method for stochastic optimization. Note: arXiv:1412.6980 External Links: Link Cited by: §A.4.
  • W. Kool, H. van Hoof, and M. Welling (2019) Stochastic beams and where to find them: the gumbel-top-k trick for sampling sequences without replacement. In ICML, Cited by: §A.1, §2.
  • Y. C. Li and N. Vasconcelos (2019) REPAIR: Removing representation bias by dataset resampling. In CVPR, Cited by: §6.
  • Y. Li, Y. Li, and N. Vasconcelos (2018) RESOUND: Towards action recognition without representation bias. In ECCV, Cited by: §6.
  • N. F. Liu, R. Schwartz, and N. A. Smith (2019a) Inoculation by fine-tuning: a method for analyzing challenge datasets. In NAACL, External Links: Link, Document Cited by: §4.1.
  • Y. Liu, M. Ott, N. Goyal, J. Du, M. S. Joshi, D. Chen, O. Levy, M. Lewis, L. S. Zettlemoyer, and V. Stoyanov (2019b) RoBERTa: A robustly optimized BERT pretraining approach. Note: ArXiv:1907.11692 External Links: Link Cited by: §4.2, Table 3, §4.
  • C. J. Maddison, A. Mnih, and Y. W. Teh (2016) The concrete distribution: a continuous relaxation of discrete random variables. In ICLR, Cited by: §A.1.
  • C. J. Maddison, D. Tarlow, and T. Minka (2014) A* sampling. In NeurIPS, Cited by: §A.1, §2.
  • T. McCoy, E. Pavlick, and T. Linzen (2019) Right for the wrong reasons: diagnosing syntactic heuristics in natural language inference. In ACL, External Links: Link, Document Cited by: 1st item, §4.1.
  • A. Naik, A. Ravichander, N. Sadeh, C. Rose, and G. Neubig (2018) Stress test evaluation for natural language inference. In ICCL, External Links: Link Cited by: 3rd item, §4.1.
  • Y. Nie, A. Williams, E. Dinan, M. Bansal, J. Weston, and D. Kiela (2019) Adversarial NLI: a new benchmark for natural language understanding. Note: arXiv:1910.14599 External Links: Link Cited by: 4th item, §4.1.
  • J. Pennington, R. Socher, and C. D. Manning (2014) GloVe: global vectors for word representation. In EMNLP, External Links: Link Cited by: §4.2.
  • M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. S. Zettlemoyer (2018) Deep contextualized word representations. In NAACL, External Links: Link Cited by: Table 3.
  • A. Poliak, J. Naradowsky, A. Haldar, R. Rudinger, and B. Van Durme (2018) Hypothesis only baselines in natural language inference. In *SEM, External Links: Link Cited by: §1.
  • P. Rajpurkar, J. Zhang, K. Lopyrev, and P. Liang (2016) SQuAD: 100, 000+ questions for machine comprehension of text. In EMNLP, External Links: Link Cited by: §1, §4.2.
  • O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei (2015) ImageNet Large Scale Visual Recognition Challenge. IJCV. External Links: Document Cited by: §1, §1.
  • K. Sakaguchi, R. L. Bras, C. Bhagavatula, and Y. Choi (2020) WINOGRANDE: an adversarial winograd schema challenge at scale. In AAAI, External Links: Link Cited by: §1, §1, §1, §6.
  • M. Tan and Q. Le (2019) EfficientNet: rethinking model scaling for convolutional neural networks. In ICML, Cited by: §A.6, Figure 1, §5.
  • A. Torralba and A. A. Efros (2011) Unbiased look at dataset bias. CVPR. Cited by: §1.
  • M. Tsuchiya (2018) Performance impact caused by hidden bias of training data for recognizing textual entailment. In LREC, Cited by: §1.
  • T. Vieira (2014) Gumbel-max trick and weighted reservoir sampling. External Links: Link Cited by: §A.1.
  • A. Wang, A. Singh, J. Michael, F. Hill, O. Levy, and S. R. Bowman (2018a) GLUE: a multi-task benchmark and analysis platform for natural language understanding. In ICLR, External Links: Link Cited by: 2nd item, §1, §4.1, §4.2.
  • T. Wang, J. Zhu, A. Torralba, and A. A. Efros (2018b) Dataset distillation. Note: arXiv:1811.10959 External Links: Link Cited by: §6.
  • A. Williams, N. Nangia, and S. Bowman (2018) A broad-coverage challenge corpus for sentence understanding through inference. In NAACL, External Links: Link, Document Cited by: §1, §4.2.
  • R. Zellers, Y. Bisk, R. Schwartz, and Y. Choi (2018) SWAG: A large-scale adversarial dataset for grounded commonsense inference. In EMNLP, External Links: Link Cited by: §6.
  • R. Zellers, A. Holtzman, Y. Bisk, A. Farhadi, and Y. Choi (2019) HellaSwag: Can a machine really finish your sentence?. In ACL, External Links: Link Cited by: §6.

Appendix A Appendix

a.1 Slice Sampling Details

The slice sampling approach can be efficiently implemented using what is known as the Gumbel method or Gumbel trick (Gumbel and Lieblein, 1954; Maddison et al., 2014), which uses random perturbations to turn sampling into the simpler problem of optimization. This has recently found success in several probabilistic inference applications (Kim et al., 2016; Jang et al., 2016; Maddison et al., 2016; Balog et al., 2017; Kool et al., 2019). Starting with the log-predictability scores log p(i) for instances i, the idea is to perturb them by adding independent random noise g_i drawn from the standard Gumbel distribution. Interestingly, the maximizer argmax_i (log p(i) + g_i) turns out to be an exact sample drawn from the (unnormalized) distribution defined by the scores p(i). Note that this maximizer is a random variable, since the g_i are drawn at random. This result can be generalized (Vieira, 2014) for slice sampling: the k highest values of the Gumbel-perturbed log-predictability scores correspond to sampling k items, without replacement, from the probability distribution defined by the p(i). The Gumbel method is typically applied to exponentially large combinatorial spaces, where it is challenging to scale up. In our setting, however, the overhead is minimal, since the cost of drawing a random g_i is negligible compared to that of computing p(i).
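As a concrete illustration, the Gumbel-top-k trick takes only a few lines (a generic numerical sketch, not the paper's implementation):

```python
import numpy as np

def gumbel_top_k(log_scores, k, rng):
    """Sample k distinct indices, with probability proportional to
    exp(log_scores), by adding independent standard Gumbel noise to each
    log-score and keeping the k largest perturbed values."""
    g = rng.gumbel(size=len(log_scores))   # standard Gumbel(0, 1) noise
    return np.argsort(-(log_scores + g))[:k]

rng = np.random.default_rng(0)
log_p = np.log(np.array([0.7, 0.2, 0.1]))  # toy predictability distribution
print(gumbel_top_k(log_p, 2, rng))         # two indices, sampled w/o replacement
```

With k = 1, this reduces to the classic Gumbel-max trick: over many draws, index 0 is selected with frequency approaching 0.7.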

a.2 Results on Synthetic Data Experiments

Figure 3: Four sample biased datasets as input to AFLite (top). Blue and orange indicate two different classes. Only the original two dimensions are shown, not the bias features. For the leftmost dataset with the highest separation, we flip some labels at random, so even an RBF kernel cannot achieve perfect performance. AFLite makes the data more challenging for the models (bottom).

Figure 3 shows the effect of AFLite on four synthetic datasets at increasing degrees of class separation, exhibiting phenomena similar to those shown in Figure 2. Accuracies of the SVM with RBF kernel and of logistic regression are shown in Table 7 for better readability. A stronger model such as the SVM is more robust to the presence of artifacts than a simple linear classifier. Thus, the implication for real datasets is to move towards models geared to the specific phenomena in the task, to avoid dependence on spurious artifacts.
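This effect is easy to reproduce in a toy setting (a hypothetical construction, loosely mirroring the synthetic experiments): plant an "artifact" feature that leaks the label on most instances, and a linear classifier scores far above what the true signal alone supports.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 1000
y = rng.integers(0, 2, size=n)
# Two overlapping 2-D Gaussians: the "real" signal is weak.
signal = rng.normal(loc=y[:, None] * 0.6, scale=1.0, size=(n, 2))
# A spurious artifact feature that leaks the label on 90% of instances.
leak = np.where(rng.random(n) < 0.9, y, 1 - y).astype(float)
X = np.hstack([signal, leak[:, None]])

X_tr, X_te = X[:800], X[800:]
y_tr, y_te = y[:800], y[800:]
linear = LogisticRegression().fit(X_tr, y_tr)
rbf = SVC(kernel="rbf").fit(X_tr, y_tr)
print("linear:", linear.score(X_te, y_te))  # inflated by the artifact
print("rbf:   ", rbf.score(X_te, y_te))
```

Filtering away the instances on which the leak feature is predictive would strip the linear model of its crutch, while the RBF model can still exploit the (weak) real signal.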

a.3 NLI Out-of-distribution Benchmarks

We describe the four out-of-distribution evaluation benchmarks for NLI from Section §4.1 below:

  • HANS (McCoy et al., 2019) contains evaluation examples designed to avoid common structural heuristics (such as word overlap) which could be used by models to correctly predict NLI inputs, without true inferential reasoning.

  • NLI Diagnostics (Wang et al., 2018a) is a set of hand-crafted examples designed to demonstrate model performance on several fine-grained semantic categories, such as logical reasoning and commonsense knowledge.

  • Stress tests for NLI (Naik et al., 2018) are a collection of tests targeting the weaknesses of strong NLI models, to check if these are robust to semantics (competence), irrelevance (distraction) and typos (noise).

  • Adversarial NLI (Nie et al., 2019) consists of premises collected from Wikipedia and other news corpora, and human generated hypotheses, arranged at different tiers of the challenge they present to a model, using a human and model in-the-loop procedure.

a.4 Hyperparameters for NLI

For all NLP experiments, our implementation is based on the Transformers repository from HuggingFace. We used the Adam optimizer (Kingma and Ba, 2014) for every training setup, with a learning rate of 1e-05 and an epsilon value of 1e-08, keeping other parameters the same as in the original PyTorch repository. We trained for 3 epochs for all *NLI tasks, maintaining a batch size of 92. Each experiment was performed on a single Quadro RTX 8000 GPU.
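These settings translate into an optimizer configuration along the following lines (a sketch; the `torch.nn.Linear` stands in for an actual Transformers NLI classifier):

```python
import torch

# Hyperparameters reported above, assumed to apply to every *NLI run.
LEARNING_RATE = 1e-5
ADAM_EPSILON = 1e-8
NUM_EPOCHS = 3
BATCH_SIZE = 92

# Placeholder model; in practice this would be e.g. a RoBERTa
# sequence-classification head over 3 NLI labels.
model = torch.nn.Linear(768, 3)
optimizer = torch.optim.Adam(model.parameters(),
                             lr=LEARNING_RATE, eps=ADAM_EPSILON)
```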

Results on the synthetic dataset are provided in Table 7. Please refer to Section 3 for a detailed description.

Separation  Model          Before AFLite  After AFLite
0.8         SVM-RBF        97.0           90.7
            Logistic Reg.  83.5           50.7
0.7         SVM-RBF        89.9           82.5
            Logistic Reg.  74.3           52.4
0.6         SVM-RBF        87.6           77.8
            Logistic Reg.  74.3           53.1
0.4         SVM-RBF        83.8           70.7
            Logistic Reg.  75.4           53.4
Table 7: Mean dev accuracy (%) of two models trained on four synthetic datasets before and after AFLite. Standard deviation across 10 runs with randomly chosen seeds is provided as a subscript. The datasets, also shown in Fig. 3, differ in the degree of separation between the two classes. Both models (SVM with an RBF kernel and a linear classifier with logistic regression) perform well on the original synthetic dataset, before filtering. The linear classifier performs well because the data contains spurious artifacts, making the task artificially easier for it. However, after AFLite, the linear model, relying mostly on the spurious features, clearly underperforms.

a.5 Qualitative Analysis of SNLI

Table 8 shows some examples removed and retained by AFLite on the SNLI dataset.

Removed by AFLite
Premise Hypothesis Label
A woman, in a green shirt, preparing to run on a treadmill. A woman is preparing to sleep on a treadmill. contradiction
The dog is catching a treat. The cat is not catching a treat. contradiction
Three young men are watching a tennis match on a large screen outdoors. Three young men watching a tennis match on a screen outdoors, because their brother is playing. neutral
A girl dressed in a pink shirt, jeans, and flip-flops sitting down playing with a lollipop machine. A funny person in a shirt. neutral
A man in a green apron smiles behind a food stand. A man smiles. entailment
A little girl with a hat sits between a woman’s feet in the sand in front of a pair of colorful tents. The girl is wearing a hat. entailment
Retained by AFLite
Premise Hypothesis Label
People are throwing tomatoes at each other. The people are having a food fight. entailment
A man poses for a photo in front of a Chinese building by jumping. The man is prepared for his photo. entailment
An older gentleman speaking at a podium. A man giving a speech neutral
A man poses for a photo in front of a Chinese building by jumping. The man has experience in taking photos. neutral
People are waiting in line by a food vendor. People sit and wait for their orders at a nice sit down restaurant. contradiction
Number 13 kicks a soccer ball towards the goal during children’s soccer game. A player passing the ball in a soccer game. contradiction
Table 8: Examples from SNLI, removed (top) and retained (bottom) by AFLite. As is evident, the retained instances are slightly more challenging and nuanced than the removed instances.

a.6 Hyperparameters for ImageNet

We trained our ImageNet models using v3-512 TPU pods. For EfficientNet (Tan and Le, 2019), we used RandAugment data augmentation (Cubuk et al., 2019) with 2 layers and a magnitude of 28, for all model sizes. We trained our models using a batch size of 4096 and a learning rate of 0.128, and kept other hyperparameters the same as in Tan and Le (2019). We trained for 350 epochs for all dataset sizes; when training on 20% or 40% of ImageNet (or a smaller dataset), we scaled the number of optimization steps accordingly. For ResNet (He et al., 2016), we used a learning rate of 0.1, a batch size of 8192, and trained for 90 epochs.
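Scaling the number of optimization steps for a fixed number of epochs is simple arithmetic; below is a sketch with the standard ImageNet-1k training-set size assumed:

```python
IMAGENET_TRAIN_SIZE = 1_281_167  # standard ImageNet-1k training set size
BATCH_SIZE = 4096
EPOCHS = 350

def total_steps(data_frac):
    """Optimization steps for EPOCHS passes over a data_frac subset."""
    n = int(IMAGENET_TRAIN_SIZE * data_frac)
    steps_per_epoch = -(-n // BATCH_SIZE)  # ceiling division
    return EPOCHS * steps_per_epoch

print(total_steps(1.0))   # full ImageNet
print(total_steps(0.4))   # AFLite-filtered 40% subset
```

Training on the 40% subset therefore uses roughly 40% of the optimization steps of a full-ImageNet run, since the epoch count is held fixed.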