An Investigation of Critical Issues in Bias Mitigation Techniques

04/01/2021
by Robik Shrestha, et al.

A critical problem in deep learning is that systems learn inappropriate biases, resulting in their inability to perform well on minority groups. This has led to the creation of multiple algorithms that endeavor to mitigate bias. However, it is not clear how effective these methods are. This is because study protocols differ among papers, systems are tested on datasets that fail to test many forms of bias, and systems have access to hidden knowledge or are tuned specifically to the test set. To address this, we introduce an improved evaluation protocol, sensible metrics, and a new dataset, which enables us to ask and answer critical questions about bias mitigation algorithms. We evaluate seven state-of-the-art algorithms using the same network architecture and hyperparameter selection policy across three benchmark datasets. We introduce a new dataset called Biased MNIST that enables assessment of robustness to multiple bias sources. We use Biased MNIST and a visual question answering (VQA) benchmark to assess robustness to hidden biases. Rather than only tuning to the test set distribution, we study robustness across different tuning distributions, which is critical because for many applications the test distribution may not be known during development. We find that algorithms exploit hidden biases, are unable to scale to multiple forms of bias, and are highly sensitive to the choice of tuning set. Based on our findings, we implore the community to adopt more rigorous assessment of future bias mitigation methods. All data, code, and results are publicly available at: https://github.com/erobic/bias-mitigators.
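To make the core failure mode concrete: a dataset like Biased MNIST pairs each label with a spurious attribute (e.g. a background color) that agrees with the label most of the time. The sketch below is a hedged illustration, not the paper's exact construction: it simulates such a correlation with NumPy and shows why a "shortcut" classifier that reads only the spurious attribute looks accurate overall yet fails on the minority group where the correlation breaks.

```python
import numpy as np

# Illustrative setup (assumed parameters, not the paper's exact ones):
# 10 classes, and a spurious attribute that matches the label with
# probability p_bias = 0.9; otherwise it is drawn uniformly at random.
rng = np.random.default_rng(0)
n, n_classes, p_bias = 20000, 10, 0.9

labels = rng.integers(0, n_classes, size=n)
agree = rng.random(n) < p_bias
attrs = np.where(agree, labels, rng.integers(0, n_classes, size=n))

# A shortcut classifier that predicts the label from the spurious
# attribute alone: high average accuracy, near-chance on the
# minority group where attribute and label decorrelate.
overall = (attrs == labels).mean()
minority = (attrs[~agree] == labels[~agree]).mean()
print(f"overall accuracy:  {overall:.2f}")   # roughly 0.91
print(f"minority accuracy: {minority:.2f}")  # roughly 0.10 (chance)
```

This gap between average and minority-group accuracy is exactly why the abstract argues that evaluation must report performance across bias groups rather than a single aggregate number.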


Related research

06/24/2019  RUBi: Reducing Unimodal Biases in Visual Question Answering
    Visual Question Answering (VQA) is the task of answering questions about...

10/10/2022  Language Prior Is Not the Only Shortcut: A Benchmark for Shortcut Learning in VQA
    Visual Question Answering (VQA) models are prone to learn the shortcut s...

01/28/2023  Bipol: Multi-axes Evaluation of Bias with Explainability in Benchmark Datasets
    We evaluate five English NLP benchmark datasets (available on the superG...

08/19/2023  Partition-and-Debias: Agnostic Biases Mitigation via A Mixture of Biases-Specific Experts
    Bias mitigation in image classification has been widely researched, and ...

09/01/2021  Don't Discard All the Biased Instances: Investigating a Core Assumption in Dataset Bias Mitigation Techniques
    Existing techniques for mitigating dataset bias often leverage a biased ...

05/17/2022  Unbiased Math Word Problems Benchmark for Mitigating Solving Bias
    In this paper, we revisit the solving bias when evaluating models on cur...

11/23/2022  BiasBed – Rigorous Texture Bias Evaluation
    The well-documented presence of texture bias in modern convolutional neu...
