On the Implicit Bias of Adam

08/31/2023
by Matias D. Cattaneo, et al.

In previous literature, backward error analysis was used to find ordinary differential equations (ODEs) approximating the gradient descent trajectory. It was found that finite step sizes implicitly regularize solutions because terms appearing in the ODEs penalize the two-norm of the loss gradients. We prove that the existence of similar implicit regularization in RMSProp and Adam depends on their hyperparameters and the training stage, but with a different "norm" involved: the corresponding ODE terms either penalize the (perturbed) one-norm of the loss gradients or, on the contrary, hinder its decrease (the latter case being typical). We also conduct numerical experiments and discuss how the proven facts can influence generalization.
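
For context, here is the form of the prior backward-error-analysis result for plain gradient descent that the abstract refers to (a minimal sketch; the h/4 constant is the standard one for gradient descent with step size h, while the exact RMSProp/Adam coefficients and the precise perturbed one-norm are defined in the paper and not reproduced here):

\[
\dot\theta \;=\; -\nabla \tilde L(\theta),
\qquad
\tilde L(\theta) \;=\; L(\theta) \;+\; \frac{h}{4}\,\bigl\|\nabla L(\theta)\bigr\|_2^2 ,
\]

so the discrete iterates \(\theta_{k+1} = \theta_k - h\,\nabla L(\theta_k)\) stay close to the flow of a modified loss whose extra term penalizes the two-norm of the gradient. Schematically, the result stated above replaces this extra term by one built from a perturbed one-norm \(\|\nabla L(\theta)\|_1\) (perturbed through the \(\epsilon\) in the adaptive update's denominator), and whether that term penalizes the one-norm or hinders its decrease depends on the hyperparameters and the training stage.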

