Homogenization of SGD in high-dimensions: Exact dynamics and generalization properties

05/14/2022
by Courtney Paquette et al.

We develop a stochastic differential equation, called homogenized SGD, for analyzing the dynamics of stochastic gradient descent (SGD) on a high-dimensional random least squares problem with ℓ^2-regularization. We show that homogenized SGD is the high-dimensional equivalent of SGD: for any quadratic statistic (e.g., population risk with quadratic loss), the statistic under the iterates of SGD converges to the statistic under homogenized SGD when the number of samples n and the number of features d are polynomially related (d^c < n < d^1/c for some c > 0). By analyzing homogenized SGD, we provide exact non-asymptotic high-dimensional expressions for the generalization performance of SGD in terms of the solution of a Volterra integral equation. Further, we provide the exact value of the limiting excess risk in the case of quadratic losses when trained by SGD. The analysis is formulated for data matrices and target vectors that satisfy a family of resolvent conditions, which can roughly be viewed as a weak (non-quantitative) form of delocalization of sample-side singular vectors of the data. Several motivating applications are provided, including sample covariance matrices with independent samples and random features with non-generative model targets.

Related research

- Implicit Regularization or Implicit Conditioning? Exact Risk Trajectories of SGD in High Dimensions (06/15/2022)
- High-dimensional limit of one-pass SGD on least squares (04/13/2023)
- SGD in the Large: Average-case Analysis, Asymptotics, and Stepsize Criticality (02/08/2021)
- High-dimensional Central Limit Theorems for Linear Functionals of Online Least-Squares SGD (02/20/2023)
- Learning High-Dimensional Single-Neuron ReLU Networks with Finite Samples (03/03/2023)
- A classification for the performance of online SGD for high-dimensional inference (03/23/2020)
- Hitting the High-Dimensional Notes: An ODE for SGD learning dynamics on GLMs and multi-index models (08/17/2023)
