Deep Learning is Robust to Massive Label Noise

by David Rolnick et al.

Deep neural networks trained on large supervised datasets have led to impressive results in recent years. However, since well-annotated datasets can be prohibitively expensive and time-consuming to collect, recent work has explored the use of larger but noisier datasets that can be obtained more easily. In this paper, we investigate the behavior of deep neural networks on training sets with massively noisy labels. We show that successful learning is possible even with an essentially arbitrary amount of noise. For example, on MNIST we find that accuracy above 90 percent is still attainable even when the dataset has been diluted with 100 noisy examples for each clean example. This behavior holds across multiple patterns of label noise, even when noisy labels are biased towards confusing classes. Further, we show how the dataset size required for successful training grows with the level of label noise. Finally, we present simple, actionable techniques for improving learning in the regime of high label noise.
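The dilution experiment the abstract describes can be sketched as follows: for each clean example, append α copies of randomly selected inputs paired with labels drawn uniformly at random. This is a minimal illustration, not the authors' code; the helper name `dilute_with_noise` and the choice to reuse clean inputs as the noisy examples' inputs are assumptions made for the sketch.

```python
import numpy as np

def dilute_with_noise(labels, alpha, num_classes, seed=None):
    """Dilute a clean label set with alpha uniformly-noisy examples per
    clean example.

    Returns (indices, all_labels): `indices` points into the clean inputs
    (so the caller can duplicate images accordingly), and `all_labels`
    contains the clean labels followed by the uniformly random noisy ones.
    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    n = len(labels)
    # Reuse randomly chosen clean inputs as the inputs of the noisy examples.
    noisy_idx = rng.integers(0, n, size=alpha * n)
    # Uniform label noise: each noisy label drawn uniformly over all classes.
    noisy_labels = rng.integers(0, num_classes, size=alpha * n)
    indices = np.concatenate([np.arange(n), noisy_idx])
    all_labels = np.concatenate([labels, noisy_labels])
    return indices, all_labels

# Example: 5 clean examples diluted at alpha = 100 yields 505 training labels.
idx, lab = dilute_with_noise(np.array([0, 1, 2, 3, 4]), alpha=100,
                             num_classes=10, seed=0)
```

With α = 100 as in the MNIST experiment, roughly 99% of training labels are uninformative, yet the clean labels still dominate the gradient signal on average because the noisy labels are unbiased.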


