Regularization in neural network optimization via trimmed stochastic gradient descent with noisy label

12/21/2020
by Kensuke Nakamura, et al.

Regularization is essential for avoiding over-fitting to the training data in neural network optimization and leads to better generalization of the trained networks. Label noise provides strong implicit regularization by replacing the ground-truth labels of training examples with uniform random labels. However, it can also produce undesirable, misleading gradients because of the large losses associated with incorrect labels. We propose a first-order optimization method, Label-Noised Trim-SGD, which combines label noise with example trimming in order to remove these outliers. The proposed algorithm allows us to impose larger label noise and obtain a stronger regularization effect than the original methods. We analyze the method quantitatively by comparing the behavior of label noise, example trimming, and the proposed algorithm. We also present empirical results on major benchmarks with standard networks, where our method outperforms state-of-the-art optimization methods.
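The abstract does not give implementation details, but the core idea it describes, injecting label noise into each mini-batch and then trimming the highest-loss examples before the gradient step, can be sketched roughly as follows. This is a minimal PyTorch illustration, not the paper's actual procedure: the function name `label_noised_trim_sgd_step`, the noise probability `p_noise`, and the trim count `k_trim` are illustrative assumptions.

```python
# Minimal sketch of one "label noise + example trimming" SGD step, based only on
# the abstract's description. Hyperparameter values are placeholders, not the
# settings used in the paper.
import torch
import torch.nn.functional as F

def label_noised_trim_sgd_step(model, optimizer, images, labels,
                               num_classes, p_noise=0.1, k_trim=8):
    # 1) Label noise: with probability p_noise, replace a target label
    #    with a uniformly random class.
    noisy = labels.clone()
    flip = torch.rand(labels.size(0), device=labels.device) < p_noise
    noisy[flip] = torch.randint(num_classes, (int(flip.sum()),),
                                device=labels.device)

    # 2) Per-example losses on the noised targets.
    logits = model(images)
    losses = F.cross_entropy(logits, noisy, reduction='none')

    # 3) Example trimming: drop the k_trim examples with the largest losses,
    #    i.e. the outliers whose incorrect labels would produce misleading
    #    gradients.
    keep = losses.size(0) - k_trim
    trimmed_losses, _ = torch.topk(losses, keep, largest=False)

    # 4) Ordinary SGD step on the mean of the remaining losses.
    optimizer.zero_grad()
    trimmed_losses.mean().backward()
    optimizer.step()
```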


Related research

- Label Noise SGD Provably Prefers Flat Global Minimizers (06/11/2021): In overparametrized models, the noise in stochastic gradient descent (SG...
- Improving Generalization by Controlling Label-Noise Information in Neural Network Weights (02/19/2020): In the presence of noisy or incorrect labels, neural networks have the u...
- Shape Matters: Understanding the Implicit Bias of the Noise Covariance (06/15/2020): The noise in stochastic gradient descent (SGD) provides a crucial implic...
- Robust and On-the-fly Dataset Denoising for Image Classification (03/24/2020): Memorization in over-parameterized neural networks could severely hurt g...
- Label Aggregation via Finding Consensus Between Models (07/19/2018): Label aggregation is an efficient and low cost way to make large dataset...
- Over-training with Mixup May Hurt Generalization (03/02/2023): Mixup, which creates synthetic training instances by linearly interpolat...
- Adversarial Training for EM Classification Networks (11/20/2020): We present a novel variant of Domain Adversarial Networks with impactful...
