Stochastic Modified Equations and Dynamics of Stochastic Gradient Algorithms I: Mathematical Foundations

11/05/2018
by   Qianxiao Li, et al.

We develop the mathematical foundations of the stochastic modified equations (SME) framework for analyzing the dynamics of stochastic gradient algorithms, in which the algorithms are approximated by a class of stochastic differential equations with small noise parameters. We prove that this approximation can be understood mathematically as a weak approximation, which leads to a number of precise and useful results on the approximation of stochastic gradient descent (SGD), momentum SGD and the stochastic Nesterov accelerated gradient method in the general setting of stochastic objectives. We also demonstrate through explicit calculations that this continuous-time approach can uncover important analytical insights into the stochastic gradient algorithms under consideration, insights that may not be easy to obtain in a purely discrete-time setting.
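
To make the notion of weak approximation concrete, the following is a minimal sketch and is not taken from the paper. It assumes a one-dimensional toy objective f_γ(x) = (x − γ)²/2 with γ ~ N(0, σ²), and it assumes the commonly used first-order SME form dX = −∇f(X) dt + sqrt(η Σ(X)) dW, which for this toy problem reduces to the Ornstein-Uhlenbeck process dX = −X dt + sqrt(η) σ dW. Both the SGD iterates and this SDE have closed-form second moments here, so the weak error for the test function g(x) = x² can be computed exactly, with no sampling. The function names and default parameters are illustrative choices.

```python
import math


def sgd_second_moment(x0, eta, sigma, T):
    """Exact E[x_k^2] for SGD x_{k+1} = x_k - eta * (x_k - gamma_k), k = T / eta.

    Assumes gamma_k ~ N(0, sigma^2) i.i.d. (toy setting, not from the paper).
    """
    k = int(round(T / eta))
    decay = (1.0 - eta) ** k
    mean = decay * x0
    # Variance recursion v_{k+1} = (1 - eta)^2 v_k + eta^2 sigma^2 with v_0 = 0,
    # summed as a geometric series.
    var = eta * sigma ** 2 * (1.0 - decay ** 2) / (2.0 - eta)
    return mean ** 2 + var


def sme_second_moment(x0, eta, sigma, T):
    """Exact E[X_T^2] for the assumed SME dX = -X dt + sqrt(eta) * sigma dW."""
    mean = x0 * math.exp(-T)
    var = 0.5 * eta * sigma ** 2 * (1.0 - math.exp(-2.0 * T))
    return mean ** 2 + var


def weak_error(eta, x0=1.0, sigma=1.0, T=1.0):
    """|E g(x_k) - E g(X_T)| for the test function g(x) = x^2 at time T = k * eta."""
    return abs(sgd_second_moment(x0, eta, sigma, T)
               - sme_second_moment(x0, eta, sigma, T))


if __name__ == "__main__":
    for eta in (0.1, 0.01, 0.001):
        print(f"eta={eta:<6} weak error={weak_error(eta):.3e}")
```

In this toy setting, shrinking the learning rate η by a factor of ten shrinks the error by roughly the same factor, which is the behavior expected of an order-1 weak approximation. The sketch checks a single test function on a single quadratic objective, so it illustrates the definition rather than the general results the abstract refers to.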

research
11/19/2015

Stochastic modified equations and adaptive stochastic gradient algorithms

We develop the method of stochastic modified equations (SME), in which s...
research
11/20/2019

Bayesian interpretation of SGD as Ito process

The current interpretation of stochastic gradient descent (SGD) as a sto...
research
06/01/2022

Computing the Variance of Shuffling Stochastic Gradient Algorithms via Power Spectral Density Analysis

When solving finite-sum minimization problems, two common alternatives t...
research
03/07/2020

Stochastic Modified Equations for Continuous Limit of Stochastic ADMM

Stochastic version of alternating direction method of multiplier (ADMM) ...
research
11/27/2017

Asymptotic Analysis via Stochastic Differential Equations of Gradient Descent Algorithms in Statistical and Computational Paradigms

This paper investigates asymptotic behaviors of gradient descent algorit...
research
02/02/2019

Uniform-in-Time Weak Error Analysis for Stochastic Gradient Descent Algorithms via Diffusion Approximation

Diffusion approximation provides weak approximation for stochastic gradi...
research
07/14/2018

Generalization in quasi-periodic environments

By and large the behavior of stochastic gradient is regarded as a challe...
