Don't Discard All the Biased Instances: Investigating a Core Assumption in Dataset Bias Mitigation Techniques

09/01/2021
by Hossein Amirkhani, et al.

Existing techniques for mitigating dataset bias often leverage a biased model to identify biased instances. The role of these biased instances is then reduced during the training of the main model to enhance its robustness to out-of-distribution data. A common core assumption of these techniques is that the main model handles biased instances similarly to the biased model, in that it will resort to biases whenever they are available. In this paper, we show that this assumption does not hold in general. We carry out a critical investigation on two well-known datasets in the domain, MNLI and FEVER, along with two biased instance detection methods, partial-input and limited-capacity models. Our experiments show that in around a third to a half of instances, the biased model is unable to predict the main model's behavior, as indicated by the significantly different parts of the input on which the two models base their decisions. Based on a manual validation, we also show that this estimate is closely in line with human interpretation. Our findings suggest that down-weighting the instances flagged by bias detection methods, a widely practiced procedure, needlessly wastes training data. We release our code to facilitate reproducibility and future research.
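
To make the down-weighting procedure concrete, here is a minimal sketch of one common variant, example reweighting, in which each training instance is weighted by one minus the probability the biased model assigns to the gold label. This is an illustration under stated assumptions, not the authors' released code: the function names `example_weights` and `reweighted_nll`, the weighting formula, and the toy probabilities are all hypothetical choices made for this example.

```python
import math


def example_weights(bias_probs, labels):
    """Assumed scheme: weight = 1 - p_bias(gold label).

    Instances the biased model already solves confidently get a small weight.
    """
    return [1.0 - probs[y] for probs, y in zip(bias_probs, labels)]


def reweighted_nll(main_probs, bias_probs, labels):
    """Mean negative log-likelihood of the main model, scaled per instance."""
    weights = example_weights(bias_probs, labels)
    losses = [-math.log(probs[y]) for probs, y in zip(main_probs, labels)]
    return sum(w * l for w, l in zip(weights, losses)) / len(labels)


# Toy data: two instances, two classes (values are made up for illustration).
bias_probs = [[0.9, 0.1], [0.5, 0.5]]   # biased model's predicted distributions
main_probs = [[0.5, 0.5], [0.5, 0.5]]   # main model's predicted distributions
labels = [0, 1]                          # gold labels

weights = example_weights(bias_probs, labels)
loss = reweighted_nll(main_probs, bias_probs, labels)
```

In this toy case the first instance, which the biased model predicts confidently, contributes far less to the loss than the second. The abstract's point is that this reduction is applied even when the main model would not have relied on the bias for that instance.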

