High-dimensional scaling limits and fluctuations of online least-squares SGD with smooth covariance

We derive high-dimensional scaling limits and fluctuations for the online least-squares Stochastic Gradient Descent (SGD) algorithm by explicitly accounting for the properties of the data-generating model. Our approach treats the SGD iterates as an interacting particle system, where the expected interaction is characterized by the covariance structure of the input. Assuming smoothness conditions on moments of order up to eight, and without explicitly assuming Gaussianity, we establish the high-dimensional scaling limits and fluctuations in the form of infinite-dimensional Ordinary Differential Equations (ODEs) or Stochastic Differential Equations (SDEs). Our results reveal a precise three-step phase transition of the iterates: they go from ballistic, to diffusive, and finally to purely random behavior as the noise variance goes from low, to moderate, and finally to very high. In the low-noise setting, we further characterize the precise fluctuations of the (scaled) iterates as infinite-dimensional SDEs. We also show the existence and uniqueness of solutions to the derived limiting ODEs and SDEs. Our results have several applications, including the characterization of the limiting mean-square estimation or prediction errors and their fluctuations, which can be obtained by solving the limiting equations analytically or numerically.
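For readers who want a concrete picture of the algorithm being analyzed, the following is a minimal simulation sketch of online least-squares SGD, in which each step uses one fresh sample (x, y) and updates theta <- theta + eta * (y - <x, theta>) * x. Everything specific here is an assumption made for illustration and is not taken from the paper: the Gaussian-kernel covariance in `smooth_covariance`, the sinusoidal target `theta_star`, the step size `0.2 / d`, and the use of Gaussian inputs (chosen only for convenience, since the abstract does not assume Gaussianity).

```python
import numpy as np


def smooth_covariance(d, length_scale=0.2):
    # Hypothetical smooth covariance: Sigma[i, j] depends smoothly on the
    # rescaled coordinates i/d and j/d through a Gaussian kernel.
    grid = np.arange(d) / d
    diff = grid[:, None] - grid[None, :]
    return np.exp(-diff**2 / (2.0 * length_scale**2))


def online_least_squares_sgd(d=50, n_steps=2000, step_size=None,
                             noise_std=0.1, seed=0):
    """Run one pass of online least-squares SGD on synthetic data.

    Returns the final iterate and the prediction error
    (theta - theta_star)^T Sigma (theta - theta_star) after every step.
    """
    rng = np.random.default_rng(seed)
    if step_size is None:
        step_size = 0.2 / d  # assumed scaling, chosen only for stability

    sigma = smooth_covariance(d)
    # Square root of Sigma via eigendecomposition; clipping guards against
    # tiny negative eigenvalues caused by round-off.
    evals, evecs = np.linalg.eigh(sigma)
    root = evecs * np.sqrt(np.clip(evals, 0.0, None))

    # Hypothetical ground-truth parameter, smooth in the coordinate index.
    theta_star = np.sin(2.0 * np.pi * np.arange(d) / d)
    theta = np.zeros(d)

    errors = np.empty(n_steps + 1)
    delta = theta - theta_star
    errors[0] = delta @ sigma @ delta

    for t in range(n_steps):
        # Fresh sample at every step (online setting).
        x = root @ rng.standard_normal(d)
        y = x @ theta_star + noise_std * rng.standard_normal()
        # Stochastic gradient step on the squared loss 0.5 * (y - <x, theta>)^2.
        theta = theta + step_size * (y - x @ theta) * x
        delta = theta - theta_star
        errors[t + 1] = delta @ sigma @ delta

    return theta, errors
```

Calling `online_least_squares_sgd()` and plotting the returned `errors` for several values of `noise_std` gives an informal view of how the noise level changes the trajectory of the prediction error; it does not reproduce the scaling limits derived in the paper.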


