Searching to Exploit Memorization Effect in Learning from Corrupted Labels

11/06/2019
by   Hansi Yang, et al.

Sample-selection approaches, which attempt to pick out clean instances from a noisy training set, have become a promising direction for robust learning from corrupted labels. These methods all build on the memorization effect: deep networks learn easy patterns first and only gradually over-fit the noisy training set. In this paper, we show that properly selecting instances so that training benefits the most from the memorization effect is a hard problem; in particular, memorization depends heavily on many factors, e.g., the data set and the network architecture. Nonetheless, there still exist general patterns in how memorization occurs. These observations motivate us to exploit the memorization effect with automated machine learning (AutoML) techniques. First, we design an expressive yet compact search space based on the observed general patterns. Then, we propose a natural gradient-based search algorithm to traverse this space efficiently. Finally, extensive experiments on both synthetic and benchmark data sets demonstrate that the proposed method is not only much more efficient than existing AutoML algorithms but also achieves much better performance than state-of-the-art approaches for learning from corrupted labels.
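Sample-selection methods of the kind the abstract describes typically treat small-loss instances as likely clean, since under the memorization effect easy (clean) patterns are fitted before noisy ones. The sketch below illustrates that general idea only, not the paper's searched selection policy; the linear keep-ratio schedule and the `warmup` parameter are illustrative assumptions.

```python
import numpy as np

def select_small_loss(losses, keep_ratio):
    """Return indices of the `keep_ratio` fraction of samples with the
    smallest loss. Under the memorization effect, low-loss samples early
    in training are more likely to carry clean labels."""
    losses = np.asarray(losses)
    n_keep = max(1, int(len(losses) * keep_ratio))
    return np.argsort(losses)[:n_keep]

def keep_ratio_schedule(epoch, noise_rate, warmup=10):
    """Illustrative schedule: linearly drop the kept fraction from 1.0
    to (1 - noise_rate) over the first `warmup` epochs, then hold it."""
    return 1.0 - noise_rate * min(epoch / warmup, 1.0)

# Example: with an estimated 40% noise rate, at epoch 5 we would keep
# 80% of each mini-batch, chosen by smallest per-sample loss.
batch_losses = [0.1, 2.0, 0.05, 1.5]
ratio = keep_ratio_schedule(epoch=5, noise_rate=0.4)
clean_idx = select_small_loss(batch_losses, ratio)
```

Only the selected indices would then contribute to the gradient update, so the network keeps fitting the likely-clean subset rather than the corrupted labels.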

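The abstract's search algorithm is natural gradient-based. As a hedged illustration of that family of methods (not the authors' actual algorithm), the following is a minimal natural-evolution-strategies-style update of a Gaussian search distribution over a single scalar hyperparameter, e.g., one knob of a selection schedule; the objective and all parameter names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def nes_step(mu, sigma, objective, pop=20, lr=0.1):
    """One natural-gradient update of the mean of a Gaussian search
    distribution N(mu, sigma^2) over a scalar parameter, using Monte
    Carlo samples of a black-box objective to be maximized."""
    eps = rng.standard_normal(pop)                    # perturbations
    rewards = np.array([objective(mu + sigma * e) for e in eps])
    # Standardize rewards for a stable, scale-free gradient estimate.
    rewards = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    grad_mu = (rewards * eps).mean() / sigma          # vanilla gradient
    # The Fisher information for mu is 1/sigma^2, so the natural
    # gradient rescales the vanilla estimate by sigma^2.
    return mu + lr * sigma**2 * grad_mu
```

Iterating `nes_step` with a validation-accuracy proxy as the objective drives the search distribution toward better-performing configurations without requiring gradients of the objective itself.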

