ProMix: Combating Label Noise via Maximizing Clean Sample Utility

07/21/2022
by Haobo Wang, et al.

The ability to train deep neural networks under label noise is appealing, as imperfectly annotated data are relatively cheap to obtain. State-of-the-art approaches are based on semi-supervised learning (SSL): they select small-loss examples as clean and then apply SSL techniques for boosted performance. However, the selection step typically yields only a medium-sized, decent-enough clean subset, overlooking a rich set of additional clean samples. In this work, we propose ProMix, a novel noisy-label learning framework that attempts to maximize the utility of clean samples for boosted performance. At the core of our method is a matched high-confidence selection technique that selects examples whose predictions are both highly confident and matched with their given labels. Combined with small-loss selection, our method achieves a precision of 99.27% and a recall of 98.22% in detecting clean samples on the CIFAR-10N dataset. Based on such a large set of clean data, ProMix improves over the best baseline method by +2.67% on average. The code and data are available at https://github.com/Justherozen/ProMix
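The two selection rules described above can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: the function name, the confidence threshold `tau`, and the use of a fixed small-loss fraction (the paper fits a mixture model to the loss distribution instead) are all assumptions for the sake of the example.

```python
import numpy as np

def select_clean(probs, labels, tau=0.9, small_loss_frac=0.5):
    """Illustrative clean-sample selection in the spirit of ProMix.

    probs:  (N, C) array of softmax outputs
    labels: (N,) array of given (possibly noisy) integer labels
    Returns a boolean mask over the N samples.
    """
    n = len(labels)

    # Matched high-confidence selection: keep samples where the model's
    # prediction agrees with the given label AND its confidence exceeds tau.
    pred = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    matched = (pred == labels) & (conf > tau)

    # Small-loss selection: cross-entropy loss on the given label; keep the
    # fraction of samples with the smallest loss. (ProMix models the loss
    # distribution rather than using a hard fraction; this is a stand-in.)
    loss = -np.log(probs[np.arange(n), labels] + 1e-12)
    k = int(small_loss_frac * n)
    small_loss = np.zeros(n, dtype=bool)
    small_loss[np.argsort(loss)[:k]] = True

    # The union of both selections enlarges the clean subset, which is the
    # key idea: maximize the utility of clean samples.
    return matched | small_loss
```

For example, a sample predicted correctly with 95% confidence passes the matched high-confidence test even if its loss rank would have excluded it, and vice versa, so the union recovers clean samples that either rule alone would miss.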


