Label noise (stochastic) gradient descent implicitly solves the Lasso for quadratic parametrisation

06/20/2022
by Loucas Pillaud-Vivien, et al.

Understanding the implicit bias of training algorithms is crucial to explaining the success of overparametrised neural networks. In this paper, we study the role of label noise in the training dynamics of a quadratically parametrised model through its continuous-time version. We explicitly characterise the solution chosen by the stochastic flow and prove that it implicitly solves a Lasso program. To complete our analysis, we provide non-asymptotic convergence guarantees for the dynamics, as well as conditions for support recovery. We also provide experimental results that support our theoretical claims. Our findings highlight that structured noise can induce better generalisation, and they help explain the superior performance of stochastic dynamics observed in practice.
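To make the phenomenon concrete, below is a minimal numerical sketch, not the paper's construction: full-batch gradient descent with freshly resampled label noise on a sparse regression problem, under a quadratic parametrisation (the signed variant w = u * u - v * v, assumed here for illustration), compared against a Lasso solution computed by proximal gradient descent (ISTA). All problem sizes, the step size, the noise level, and in particular the penalty lam are illustrative choices, not the constants derived in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Underdetermined sparse regression problem (all sizes are illustrative).
n, d = 40, 100                       # fewer samples than dimensions
w_star = np.zeros(d)
w_star[:3] = [2.0, -1.5, 1.0]        # 3-sparse ground truth
X = rng.standard_normal((n, d)) / np.sqrt(n)
y = X @ w_star                       # clean labels; noise is injected per step

# Label-noise gradient descent on the quadratic parametrisation
# w = u * u - v * v (signed variant, an assumption for this sketch).
eta, sigma, T = 0.05, 0.5, 100_000   # step size, label-noise std, iterations
u = np.full(d, 0.1)                  # small balanced initialisation
v = np.full(d, 0.1)
for _ in range(T):
    eps = sigma * rng.standard_normal(n)      # fresh label noise each step
    w = u * u - v * v
    r = X.T @ (X @ w - (y + eps)) / n         # gradient of the loss w.r.t. w
    # Chain rule through w(u, v); simultaneous update of both factors.
    u, v = u - eta * 2 * u * r, v + eta * 2 * v * r
w_noise_gd = u * u - v * v

# Lasso baseline solved by proximal gradient descent (ISTA).
lam = 0.01                           # illustrative penalty, not the paper's constant
step = n / np.linalg.norm(X, 2) ** 2          # 1 / Lipschitz constant of the smooth part
beta = np.zeros(d)
for _ in range(20_000):
    z = beta - step * (X.T @ (X @ beta - y) / n)
    beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)

print("support of label-noise GD solution:", np.flatnonzero(np.abs(w_noise_gd) > 1e-2))
print("support of Lasso solution:         ", np.flatnonzero(np.abs(beta) > 1e-2))
```

With small balanced initialisation, the noiseless gradient flow on this parametrisation is already known to be biased toward sparse interpolators; the abstract's claim is stronger, namely that label noise yields an explicit Lasso characterisation, with an effective penalty determined by the dynamics rather than chosen by hand as above.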


Related research:

- Implicit Bias of SGD for Diagonal Linear Networks: a Provable Benefit of Stochasticity (06/17/2021)
- Asymmetric Heavy Tails and Implicit Bias in Gaussian Noise Injections (02/13/2021)
- Stochastic modified equations for the asynchronous stochastic gradient descent (05/21/2018)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks (03/03/2023)
- Saddle-to-Saddle Dynamics in Diagonal Linear Networks (04/02/2023)
