Rigorous dynamical mean field theory for stochastic gradient descent methods

10/12/2022
by Cédric Gerbelot, et al.

We prove closed-form equations for the exact high-dimensional asymptotics of a family of first-order gradient-based methods that learn an estimator (e.g. M-estimator, shallow neural network, ...) from observations on Gaussian data by empirical risk minimization. This family includes widely used algorithms such as stochastic gradient descent (SGD) and Nesterov acceleration. The obtained equations match those resulting from the discretization of dynamical mean-field theory (DMFT) equations from statistical physics when applied to gradient flow. Our proof method allows us to give an explicit description of how memory kernels build up in the effective dynamics, and to include non-separable update functions, which allows datasets with non-identity covariance matrices. Finally, we provide numerical implementations of the equations for SGD with generic extensive batch size and constant learning rates.
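
To make the setting concrete, the following is a minimal sketch of the kind of dynamics the abstract describes: SGD with a constant learning rate and an extensive batch size (a fixed fraction of the n samples) on Gaussian data, minimizing an empirical risk. It is a direct finite-size simulation, not the paper's numerical implementation of the DMFT equations. The ridge-regularized squared loss, the linear teacher `w_star`, the function name `run_sgd`, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def run_sgd(n=400, d=200, batch_fraction=0.5, lr=0.1, reg=0.01,
            steps=200, seed=0):
    """Simulate mini-batch SGD on a ridge-regularized least-squares risk.

    All modelling choices here (squared loss, linear teacher, noise level)
    are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    w_star = rng.standard_normal(d)          # hypothetical teacher vector
    X = rng.standard_normal((n, d))          # Gaussian data, identity covariance
    y = X @ w_star + 0.1 * rng.standard_normal(n)

    w = np.zeros(d)
    # "Extensive" batch size: proportional to the number of samples n.
    batch_size = max(1, int(batch_fraction * n))
    overlaps = np.empty(steps)
    for t in range(steps):
        # Draw a fresh mini-batch without replacement at every step.
        idx = rng.choice(n, size=batch_size, replace=False)
        residual = X[idx] @ w - y[idx]
        grad = X[idx].T @ residual / batch_size + reg * w
        w = w - lr * grad                    # constant learning rate
        # Normalized overlap with the teacher, a typical summary statistic.
        overlaps[t] = w @ w_star / d
    return w, overlaps

w_final, overlaps = run_sgd()
```

The array `overlaps` records a low-dimensional summary of the trajectory. Scalar quantities of this type are what closed-form asymptotic equations are usually compared against as n and d grow at a fixed ratio.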

research
06/10/2020

Dynamical mean-field theory for stochastic gradient descent in Gaussian mixture classification

We analyze in a closed form the learning dynamics of stochastic gradient...
research
09/09/2023

Stochastic Gradient Descent outperforms Gradient Descent in recovering a high-dimensional signal in a glassy energy landscape

Stochastic Gradient Descent (SGD) is an out-of-equilibrium algorithm use...
research
11/03/2021

Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks

Understanding the properties of neural networks trained via stochastic g...
research
02/29/2020

Toward a theory of optimization for over-parameterized systems of non-linear equations: the lessons of deep learning

The success of deep learning is due, to a great extent, to the remarkabl...
research
12/14/2021

The high-dimensional asymptotics of first order methods with random data

We study a class of deterministic flows in ℝ^d× k, parametrized by a ran...
research
10/11/2019

A Finite-Volume Method for Fluctuating Dynamical Density Functional Theory

In this work we introduce a finite-volume numerical scheme for solving s...
research
02/23/2021

Online Stochastic Gradient Descent Learns Linear Dynamical Systems from A Single Trajectory

This work investigates the problem of estimating the weight matrices of ...
