Hitting the High-Dimensional Notes: An ODE for SGD learning dynamics on GLMs and multi-index models

We analyze the dynamics of streaming stochastic gradient descent (SGD) in the high-dimensional limit when applied to generalized linear models and multi-index models (e.g. logistic regression, phase retrieval) with general data covariance. In particular, we demonstrate a deterministic equivalent of SGD in the form of a system of ordinary differential equations that describes a wide class of statistics, such as the risk and other measures of sub-optimality. This equivalence holds with overwhelming probability when the number of model parameters grows proportionally to the number of samples. The framework allows us to obtain learning rate thresholds for the stability of SGD as well as convergence guarantees. In addition to the deterministic equivalent, we introduce an SDE with a simplified diffusion coefficient (homogenized SGD), which allows us to analyze the dynamics of general statistics of SGD iterates. Finally, we illustrate the theory on some standard examples and present numerical simulations that match the theory closely.
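To make the setting concrete, the following is a minimal simulation sketch of streaming (one-pass) SGD, not the paper's ODE or its homogenized SGD. It assumes the simplest case, linear least squares with identity data covariance, and every specific in it is an illustrative assumption: the function names, the planted target `theta_star`, the noise level, and the step size `gamma / d`. It tracks the population risk along the trajectory, so that independent runs can be compared to see how little the curve depends on the random samples when the dimension is large.

```python
import numpy as np


def population_risk(theta, theta_star, noise_std):
    # Exact population risk for least squares with identity covariance:
    # 0.5 * E[(x @ theta - y)^2] = 0.5 * ||theta - theta_star||^2 + 0.5 * noise_std^2
    return 0.5 * np.sum((theta - theta_star) ** 2) + 0.5 * noise_std ** 2


def streaming_sgd_risk(d, gamma, n_steps, seed, noise_std=0.1):
    """Run one-pass SGD on a fresh sample per step; return the risk trajectory.

    Hypothetical setup (not from the paper): x ~ N(0, I_d), y = x @ theta_star + noise,
    theta_star fixed with unit norm, step size gamma / d.
    """
    rng = np.random.default_rng(seed)
    theta_star = np.ones(d) / np.sqrt(d)  # planted target, identical across seeds
    theta = np.zeros(d)
    risks = np.empty(n_steps + 1)
    for k in range(n_steps):
        risks[k] = population_risk(theta, theta_star, noise_std)
        x = rng.standard_normal(d)                      # fresh data point (streaming)
        y = x @ theta_star + noise_std * rng.standard_normal()
        grad = (x @ theta - y) * x                      # gradient of 0.5 * (x @ theta - y)^2
        theta -= (gamma / d) * grad
    risks[n_steps] = population_risk(theta, theta_star, noise_std)
    return risks


if __name__ == "__main__":
    d = 400
    run_a = streaming_sgd_risk(d, gamma=0.5, n_steps=4 * d, seed=0)
    run_b = streaming_sgd_risk(d, gamma=0.5, n_steps=4 * d, seed=1)
    print("initial risk:", run_a[0])
    print("final risk (seed 0, seed 1):", run_a[-1], run_b[-1])
    print("max gap between the two risk curves:", np.max(np.abs(run_a - run_b)))
```

Plotting `run_a` and `run_b` against `k / d` is one way to see the two trajectories nearly overlap; increasing `d` should shrink the gap further, which is the kind of concentration a deterministic equivalent describes.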

