Label Noise Types and Their Effects on Deep Learning

03/23/2020
by Görkem Algan, et al.

The recent success of deep learning is mostly due to the availability of big datasets with clean annotations. However, gathering a cleanly annotated dataset is not always feasible due to practical challenges. As a result, label noise is a common problem in datasets, and numerous methods for training deep neural networks in the presence of noisy labels have been proposed in the literature. These methods commonly use benchmark datasets with synthetic label noise added to the training set. However, there are multiple types of label noise, and each has its own characteristic impact on learning. Since each work generates a different kind of label noise, it is difficult to test and compare these algorithms fairly. In this work, we provide a detailed analysis of the effects of different kinds of label noise on learning. Moreover, we propose a generic framework to generate feature-dependent label noise, which we show to be the most challenging case for learning. Our proposed method aims to emphasize similarities among data instances by sparsely distributing them in the feature domain. With this approach, samples that are more likely to be mislabeled are detected from their softmax probabilities, and their labels are flipped to the corresponding class. The proposed method can be applied to any clean dataset to synthesize feature-dependent noisy labels. To make it easier for other researchers to test their algorithms with noisy labels, we share corrupted labels for the most commonly used benchmark datasets. Our code and generated noisy synthetic labels are available online.
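To make the contrast between noise types concrete, the following is a minimal sketch and not the authors' released code. It assumes softmax outputs from some model trained on the clean data are already available, and it assumes that "more likely to be mislabeled" can be approximated by a small margin between the true-class probability and the strongest competing class; the abstract does not specify the exact selection rule. The function names `feature_dependent_noise` and `uniform_noise` are hypothetical.

```python
import numpy as np

def feature_dependent_noise(probs, labels, noise_ratio):
    """Flip the labels of the samples whose softmax output is most ambiguous.

    Assumption: ambiguity is measured as the margin between the true-class
    probability and the highest probability among the other classes.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    n = len(labels)
    idx = np.arange(n)

    # Probability the model assigns to the true class.
    p_true = probs[idx, labels]

    # Mask the true class to find the most likely wrong class.
    masked = probs.copy()
    masked[idx, labels] = -np.inf
    runner_up = masked.argmax(axis=1)
    p_other = masked[idx, runner_up]

    # A small (or negative) margin means the sample is easily confused.
    margin = p_true - p_other
    n_flip = int(round(noise_ratio * n))
    flip_idx = np.argsort(margin, kind="stable")[:n_flip]

    noisy = labels.copy()
    noisy[flip_idx] = runner_up[flip_idx]
    return noisy

def uniform_noise(labels, num_classes, noise_ratio, seed=0):
    """Baseline: flip randomly chosen samples to a random different class."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    n = len(labels)
    n_flip = int(round(noise_ratio * n))
    flip_idx = rng.choice(n, size=n_flip, replace=False)

    noisy = labels.copy()
    # An offset in [1, num_classes - 1] guarantees a different class.
    offsets = rng.integers(1, num_classes, size=n_flip)
    noisy[flip_idx] = (labels[flip_idx] + offsets) % num_classes
    return noisy

# Toy example: 4 samples, 3 classes (made-up probabilities).
probs = np.array([
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.10, 0.20, 0.70],
    [0.30, 0.60, 0.10],
])
labels = np.array([0, 0, 2, 0])
noisy = feature_dependent_noise(probs, labels, noise_ratio=0.5)
print(noisy)
```

In this toy input, the second and fourth samples have the smallest margins, so they are the ones relabeled, each to its strongest competing class. Uniform noise, by comparison, ignores the features entirely and picks both the samples and the target classes at random.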


research
07/23/2021

A Realistic Simulation Framework for Learning with Label Noise

We propose a simulation framework for generating realistic instance-depe...
research
05/22/2021

Generation and Analysis of Feature-Dependent Pseudo Noise for Training Deep Neural Networks

Training Deep neural networks (DNNs) on noisy labeled datasets is a chal...
research
01/04/2023

Towards the Identifiability in Noisy Label Learning: A Multinomial Mixture Approach

Learning from noisy labels plays an important role in the deep learning ...
research
04/19/2021

Do We Really Need Gold Samples for Sample Weighting Under Label Noise?

Learning with label noise has gained significant traction recently due ...
research
08/05/2022

Neighborhood Collective Estimation for Noisy Label Identification and Correction

Learning with noisy labels (LNL) aims at designing strategies to improve...
research
02/01/2021

Learning to Combat Noisy Labels via Classification Margins

A deep neural network trained on noisy labels is known to quickly lose i...
research
01/24/2021

Analysing the Noise Model Error for Realistic Noisy Label Data

Distant and weak supervision allow to obtain large amounts of labeled tr...
