Class2Simi: A New Perspective on Learning with Label Noise

by   Songhua Wu, et al.

Label noise is ubiquitous in the era of big data. Deep learning algorithms can easily fit the noise and thus cannot generalize well without properly modeling the noise. In this paper, we propose a new perspective on dealing with label noise called Class2Simi. Specifically, we transform the training examples with noisy class labels into pairs of examples with noisy similarity labels and propose a deep learning framework to learn robust classifiers directly with the noisy similarity labels. Note that a class label shows the class that an instance belongs to; while a similarity label indicates whether or not two instances belong to the same class. It is worthwhile to perform the transformation: We prove that the noise rate for the noisy similarity labels is lower than that of the noisy class labels, because similarity labels themselves are robust to noise. For example, given two instances, even if both of their class labels are incorrect, their similarity label could be correct. Due to the lower noise rate, Class2Simi achieves remarkably better classification accuracy than its baselines that directly deals with the noisy class labels.


page 1

page 2

page 3

page 4


Multi-Class Classification from Noisy-Similarity-Labeled Data

A similarity label indicates whether two instances belong to the same cl...

LNL+K: Learning with Noisy Labels and Noise Source Distribution Knowledge

Learning with noisy labels (LNL) is challenging as the model tends to me...

Learning to Combat Noisy Labels via Classification Margins

A deep neural network trained on noisy labels is known to quickly lose i...

Confident Learning: Estimating Uncertainty in Dataset Labels

Learning exists in the context of data, yet notions of confidence typica...

CNT (Conditioning on Noisy Targets): A new Algorithm for Leveraging Top-Down Feedback

We propose a novel regularizer for supervised learning called Conditioni...

Quantity vs Quality: Investigating the Trade-Off between Sample Size and Label Reliability

In this paper, we study learning in probabilistic domains where the lear...

Making Every Label Count: Handling Semantic Imprecision by Integrating Domain Knowledge

Noisy data, crawled from the web or supplied by volunteers such as Mecha...

Please sign up or login with your details

Forgot password? Click here to reset