Denoising after Entropy-based Debiasing: A Robust Training Method for Dataset Bias with Noisy Labels

12/01/2022
by   Sumyeong Ahn, et al.

Improperly constructed datasets can lead to inaccurate inferences. For instance, models trained on biased datasets generalize poorly (i.e., dataset bias). Recent debiasing techniques improve generalization by de-emphasizing easy-to-learn samples (i.e., bias-aligned samples) and emphasizing difficult-to-learn samples (i.e., bias-conflicting samples). However, these techniques may fail in the presence of noisy labels, because the trained model treats noisy labels as difficult-to-learn and thus emphasizes them. In this study, we find that earlier approaches, which use the provided labels to quantify difficulty, can be affected by a small proportion of noisy labels. Furthermore, we find that running denoising algorithms before debiasing is ineffective, because denoising algorithms reduce the impact of difficult-to-learn samples, including valuable bias-conflicting samples. Therefore, we propose an approach called denoising after entropy-based debiasing (DENEB), which has three main stages. (1) The prejudice model is trained by emphasizing (bias-aligned, clean) samples, which are selected using a Gaussian Mixture Model. (2) A sampling probability proportional to the per-sample entropy of the prejudice model's output is computed for each sample. (3) The final model is trained using existing denoising algorithms on mini-batches constructed according to the computed sampling probabilities. Compared to existing debiasing and denoising algorithms, our method achieves better debiasing performance on multiple benchmarks.
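Stages (2) and (3) of the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `entropy_sampling_probs` and `sample_batch`, the `eps` clipping constant, and the uniform fallback for the all-zero-entropy case are assumptions made for illustration. It only shows how a sampling probability proportional to per-sample entropy might be computed from softmax outputs and used to draw mini-batch indices.

```python
import numpy as np


def entropy_sampling_probs(probs, eps=1e-12):
    """Return per-sample sampling probabilities proportional to entropy.

    probs: array of shape (n_samples, n_classes) holding the softmax
    outputs of a (hypothetical) prejudice model. `eps` is an assumed
    clipping constant that avoids log(0).
    """
    probs = np.asarray(probs, dtype=np.float64)
    # Shannon entropy per sample; clipping only affects the log argument.
    entropy = -np.sum(probs * np.log(np.clip(probs, eps, 1.0)), axis=1)
    total = entropy.sum()
    if total <= 0.0:
        # Assumed fallback: uniform sampling if every prediction is one-hot.
        return np.full(len(probs), 1.0 / len(probs))
    return entropy / total


def sample_batch(sampling_probs, batch_size, rng):
    """Draw mini-batch indices (with replacement) using the given probabilities."""
    return rng.choice(len(sampling_probs), size=batch_size, replace=True, p=sampling_probs)


if __name__ == "__main__":
    # Toy softmax outputs: uncertain, fully confident, mostly confident.
    toy_probs = [[0.5, 0.5], [1.0, 0.0], [0.9, 0.1]]
    p = entropy_sampling_probs(toy_probs)
    rng = np.random.default_rng(0)
    print(p)
    print(sample_batch(p, batch_size=8, rng=rng))
```

In this sketch, samples on which the prejudice model is uncertain (high entropy) are drawn more often, while samples it predicts with full confidence are effectively never drawn. The resulting index batches would then be fed to whichever denoising algorithm trains the final model.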


research
05/29/2022

BiasEnsemble: Revisiting the Importance of Amplifying Bias for Debiasing

In image classification, "debiasing" aims to train a classifier to be le...
research
05/31/2022

Mitigating Dataset Bias by Using Per-sample Gradient

The performance of deep neural networks is strongly influenced by the tr...
research
05/19/2023

Moment Matching Denoising Gibbs Sampling

Energy-Based Models (EBMs) offer a versatile framework for modeling comp...
research
03/20/2023

PASS: Peer-Agreement based Sample Selection for training with Noisy Labels

Noisy labels present a significant challenge in deep learning because mo...
research
06/10/2021

A Mathematical Foundation for Robust Machine Learning based on Bias-Variance Trade-off

A common assumption in machine learning is that samples are independentl...
research
01/23/2022

The risk of bias in denoising methods

Experimental datasets are growing rapidly in size, scope, and detail, bu...
research
10/30/2017

Denoising random forests

This paper proposes a novel type of random forests called a denoising ra...
