Classification with Noisy Labels by Importance Reweighting

11/27/2014
by Tongliang Liu, et al.

In this paper, we study a classification problem in which sample labels are randomly corrupted. In this scenario, there is an unobservable sample with noise-free labels. However, before being observed, the true labels are independently flipped with a probability ρ∈[0,0.5), and the random label noise can be class-conditional. Here, we address two fundamental problems raised by this scenario. The first is how to best use the abundant surrogate loss functions designed for the traditional classification problem when there is label noise. We prove that any surrogate loss function can be used for classification with noisy labels by using importance reweighting, with consistency assurance that the label noise does not ultimately hinder the search for the optimal classifier of the noise-free sample. The other is the open problem of how to obtain the noise rate ρ. We show that the rate is upper bounded by the conditional probability P(y|x) of the noisy sample. Consequently, the rate can be estimated, because the upper bound can be easily reached in classification problems. Experimental results on synthetic and real datasets confirm the efficiency of our methods.
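The two ideas in the abstract can be illustrated with a small sketch. It assumes binary labels in {-1, +1}, class-conditional flip rates (called `rho_plus` and `rho_minus` here, names chosen for illustration), and an already-fitted estimate of the noisy conditional probability P(y=+1|x). Under that noise model, the noisy posterior is P_ρ(y|x) = (1 - ρ₊ - ρ₋) P(y|x) + ρ₋ᵧ, where ρ₋ᵧ is the rate at which the opposite class is flipped into y. This gives both the upper bound on the noise rate and the weight P(y|x) / P_ρ(y|x). The function names and the clipping are assumptions of this sketch, not the authors' implementation.

```python
import numpy as np

def estimate_noise_rates(p_pos_noisy):
    """Estimate flip rates from noisy posteriors P_rho(y=+1|x).

    Each rate is upper bounded by a noisy conditional probability,
    so the minimum over the sample is used as the estimate.
    """
    p = np.asarray(p_pos_noisy, dtype=float)
    rho_minus = p.min()          # P(noisy=+1 | true=-1) <= P_rho(+1|x)
    rho_plus = (1.0 - p).min()   # P(noisy=-1 | true=+1) <= P_rho(-1|x)
    return rho_plus, rho_minus

def importance_weights(p_pos_noisy, y_noisy, rho_plus, rho_minus):
    """Per-sample weights beta = P(y|x) / P_rho(y|x) for observed noisy labels."""
    p = np.asarray(p_pos_noisy, dtype=float)
    y = np.asarray(y_noisy)
    # Noisy posterior of the label that was actually observed.
    p_y = np.where(y == 1, p, 1.0 - p)
    # Rate at which the opposite class is flipped into the observed label.
    rho_other = np.where(y == 1, rho_minus, rho_plus)
    denom = (1.0 - rho_plus - rho_minus) * np.maximum(p_y, 1e-12)
    # Clip at zero (an assumption of this sketch) to absorb estimation error.
    return np.clip((p_y - rho_other) / denom, 0.0, None)

# Hypothetical noisy posteriors for three points and their observed labels.
p_hat = [0.1, 0.5, 0.9]
y_hat = [1, -1, 1]
rho_plus, rho_minus = estimate_noise_rates(p_hat)
weights = importance_weights(p_hat, y_hat, rho_plus, rho_minus)
```

The resulting weights multiply the per-sample value of whichever surrogate loss is used, so the weighted empirical risk on the noisy sample targets the risk on the noise-free sample.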


