Anticorrelated Noise Injection for Improved Generalization

02/06/2022
by Antonio Orvieto, et al.

Injecting artificial noise into gradient descent (GD) is commonly employed to improve the performance of machine learning models. Usually, uncorrelated noise is used in such perturbed gradient descent (PGD) methods. It is, however, not known if this is optimal or whether other types of noise could provide better generalization performance. In this paper, we zoom in on the problem of correlating the perturbations of consecutive PGD steps. We consider a variety of objective functions for which we find that GD with anticorrelated perturbations ("Anti-PGD") generalizes significantly better than GD and standard (uncorrelated) PGD. To support these experimental findings, we also derive a theoretical analysis that demonstrates that Anti-PGD moves to wider minima, while GD and PGD remain stuck in suboptimal regions or even diverge. This new connection between anticorrelated noise and generalization opens the field to novel ways to exploit noise for training machine learning models.
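
To make the scheme concrete, here is a minimal sketch of Anti-PGD in Python, assuming (consistent with the abstract's description of correlating consecutive perturbations) that the injected perturbation at step k is the difference of two consecutive i.i.d. Gaussian draws, xi_{k+1} - xi_k. The function name anti_pgd and the default step size, noise scale, and step count are illustrative choices, not values taken from the paper.

import numpy as np

def anti_pgd(grad, x0, lr=0.1, sigma=0.1, steps=1000, seed=0):
    # Perturbed gradient descent with anticorrelated noise:
    # the perturbation added at step k is xi_{k+1} - xi_k, where the xi_k
    # are i.i.d. N(0, sigma^2 I). Consecutive perturbations are therefore
    # negatively correlated, unlike standard PGD, which would add a fresh
    # independent xi_{k+1} at every step.
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    xi_prev = rng.normal(0.0, sigma, size=x.shape)
    for _ in range(steps):
        xi_next = rng.normal(0.0, sigma, size=x.shape)
        x = x - lr * grad(x) + (xi_next - xi_prev)  # anticorrelated injection
        xi_prev = xi_next
    return x

# Usage on a toy quadratic L(x) = 0.5 * ||x||^2, whose gradient is x:
x_final = anti_pgd(lambda x: x, x0=[2.0, -1.5], lr=0.1, sigma=0.05)

Replacing the injected term (xi_next - xi_prev) with xi_next alone recovers standard uncorrelated PGD, and setting sigma = 0 recovers plain GD, so the three methods compared in the abstract can be run on the same objective with one-line changes.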
