REPAIR: Removing Representation Bias by Dataset Resampling

04/16/2019
by   Yi Li, et al.

Modern machine learning datasets can have biases for certain representations that are leveraged by algorithms to achieve high performance without learning to solve the underlying task. This problem is referred to as "representation bias". We investigate how to reduce the representation biases of a dataset and propose REPresentAtion bIas Removal (REPAIR), a new dataset resampling procedure. REPAIR formulates bias minimization as an optimization problem, seeking a weight distribution over examples that penalizes those that are easy for a classifier built on a given feature representation. Bias reduction is then equated to maximizing the ratio between the classification loss on the reweighted dataset and the uncertainty (entropy) of the ground-truth class labels. This is a minimax problem that REPAIR solves by alternating between updates of the classifier parameters and of the dataset resampling weights, using stochastic gradient descent. An experimental set-up is also introduced to measure the bias of any dataset for a given representation, and the impact of this bias on the performance of recognition models. Experiments with synthetic and action recognition data show that dataset REPAIR can significantly reduce representation bias, and lead to improved generalization of models trained on REPAIRed datasets. The tools used for characterizing representation bias, and the proposed dataset REPAIR algorithm, are available at https://github.com/JerryYLi/Dataset-REPAIR/.
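To make the alternating scheme concrete, below is a minimal NumPy sketch of the idea, not the authors' implementation. It uses a synthetic dataset with a "shortcut" feature, a logistic-regression classifier, and per-example resampling logits. The classifier takes a gradient-descent step on the weighted loss, while the resampling logits take a gradient-ascent step on the same loss, which downweights examples that the biased representation classifies easily. For simplicity the classes are balanced, so the label-entropy term in the ratio is treated as constant; all variable names and learning rates are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: feature 0 is a biased "shortcut" that agrees with
# the label y on 90% of examples; feature 1 is pure noise.
n = 2000
y = rng.integers(0, 2, n)
shortcut = np.where(rng.random(n) < 0.9, y, 1 - y).astype(float)
X = np.column_stack([shortcut + 0.1 * rng.standard_normal(n),
                     rng.standard_normal(n)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

Xb = np.column_stack([X, np.ones(n)])  # add bias column
theta = np.zeros(3)                    # classifier parameters
s = np.zeros(n)                        # per-example resampling logits
lr_theta, lr_s = 0.5, 5.0

for step in range(500):
    w = sigmoid(s)                 # example weights in (0, 1)
    p = w / w.sum()                # normalized resampling distribution
    q = sigmoid(Xb @ theta)        # classifier's predicted P(y=1)
    ce = -(y * np.log(q + 1e-12) + (1 - y) * np.log(1 - q + 1e-12))
    L = p @ ce                     # classification loss on reweighted data

    # Classifier: gradient DESCENT on the weighted loss.
    grad_theta = Xb.T @ (p * (q - y))
    theta -= lr_theta * grad_theta

    # Resampling weights: gradient ASCENT on the same loss,
    # penalizing examples whose loss is below the weighted average.
    grad_s = w * (1 - w) / w.sum() * (ce - L)
    s += lr_s * grad_s

# Bias-aligned (easy) examples should end up with lower weight.
aligned = shortcut == y
print(sigmoid(s)[aligned].mean(), sigmoid(s)[~aligned].mean())
```

After training, the mean weight of bias-aligned examples falls below that of bias-conflicting examples, so resampling by these weights yields a dataset on which the shortcut feature is less predictive. The full method additionally tracks the entropy of the class distribution under the reweighting, which matters when classes are imbalanced.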

