Sample Selection with Uncertainty of Losses for Learning with Noisy Labels

06/01/2021
by Xiaobo Xia, et al.

In learning with noisy labels, the sample selection approach is very popular: it regards small-loss data as correctly labeled during training. However, losses are generated on the fly by the model being trained on noisy labels, so large-loss data are likely, but not certain, to be mislabeled. There are actually two possibilities for a large-loss data point: (a) it is mislabeled, and its loss decreases more slowly than that of other data, since deep neural networks "learn patterns first"; (b) it belongs to an underrepresented group of data and has not been selected yet. In this paper, we incorporate the uncertainty of losses by adopting interval estimation instead of point estimation of losses: lower bounds of the confidence intervals of losses, derived from distribution-free concentration inequalities, are used for sample selection rather than the losses themselves. In this way, we also give large-loss but rarely selected data a try; we can then better distinguish between cases (a) and (b) by checking whether the losses effectively decrease, relative to the uncertainty, after the try. As a result, we can better explore underrepresented data that are correctly labeled but seem mislabeled at first glance. Experiments demonstrate that the proposed method is superior to baselines and robust to a broad range of label noise types.
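To make the selection rule concrete, below is a minimal sketch of lower-confidence-bound sample selection, assuming a Hoeffding-type concentration inequality for bounded losses. The function name lcb_sample_selection and its parameters are illustrative only; the paper's exact inequality, confidence schedule, and selection procedure may differ.

    import numpy as np

    def lcb_sample_selection(loss_history, select_ratio, delta=0.05, loss_range=1.0):
        """Select likely-clean samples by the lower confidence bound (LCB)
        of their losses, rather than by the point-estimate loss itself.

        loss_history : list of 1-D arrays; loss_history[i] holds the losses
                       recorded for sample i over the epochs it was selected.
        select_ratio : fraction of samples to keep as "clean".
        delta        : confidence parameter for the Hoeffding-style interval.
        loss_range   : assumed range of the (clipped) loss, used by the bound.
        """
        n_samples = len(loss_history)
        lcb = np.empty(n_samples)
        for i, losses in enumerate(loss_history):
            n = len(losses)
            if n == 0:
                # Never-selected samples get the most optimistic bound,
                # so underrepresented data are given a try.
                lcb[i] = -np.inf
                continue
            mean = losses.mean()
            # Distribution-free (Hoeffding) half-width; it shrinks as a
            # sample is observed more often, so uncertainty fades with
            # accumulating evidence.
            half_width = loss_range * np.sqrt(np.log(1.0 / delta) / (2.0 * n))
            lcb[i] = mean - half_width
        # Keep the samples whose optimistic loss estimate is smallest.
        k = int(select_ratio * n_samples)
        return np.argsort(lcb)[:k]

Note how the sketch handles case (b) above: a sample that has never been selected receives the most optimistic bound and is tried at least once, while a sample whose loss stays large even after repeated observation (case (a), shrinking interval) is eventually excluded.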


