Hitting Time of Stochastic Gradient Langevin Dynamics to Stationary Points: A Direct Analysis

04/30/2019
by Xi Chen, et al.

Stochastic gradient Langevin dynamics (SGLD) is a fundamental algorithm in stochastic optimization. Recent work by Zhang et al. [2017] analyzes the hitting time of SGLD to first- and second-order stationary points. The proof in Zhang et al. [2017] is a two-stage procedure that goes through a bound on the Cheeger constant, which is rather complicated and leads to loose bounds. In this paper, using intuitions from stochastic differential equations, we provide a direct analysis of the hitting times of SGLD to first- and second-order stationary points. Our analysis is straightforward and relies only on basic tools from linear algebra and probability theory. It also yields tighter bounds than those of Zhang et al. [2017] and shows the explicit dependence of the hitting time on different factors, including dimensionality, smoothness, noise strength, and step size. Under suitable conditions, we show that the hitting time of SGLD to first-order stationary points can be dimension-independent. Moreover, we apply our analysis to several important online estimation problems in machine learning, including linear regression, matrix factorization, and online PCA.
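
To make the notion of a hitting time to a first-order stationary point concrete, here is a minimal sketch in Python. It is not the paper's algorithm statement or its analysis: the update form (a noisy gradient step plus Gaussian noise scaled by the square root of the step size), the function name `sgld_hitting_time`, and all parameter names and default values are assumptions chosen for illustration. The hitting time is taken to be the first iteration at which the true gradient norm falls below a tolerance `eps`.

```python
import numpy as np

def sgld_hitting_time(grad, x0, step=1e-2, sigma=0.1, grad_noise=0.1,
                      eps=0.5, max_iter=100_000, seed=0):
    """Run an SGLD-style iteration and return (n, x), where n is the first
    iteration with ||grad(x)|| <= eps, or (None, x) if max_iter is reached.

    All parameter names and defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for n in range(max_iter):
        # Stop as soon as x is an approximate first-order stationary point.
        if np.linalg.norm(grad(x)) <= eps:
            return n, x
        # Stochastic gradient: true gradient plus zero-mean noise (assumed Gaussian).
        g = grad(x) + grad_noise * rng.standard_normal(x.shape)
        # Langevin step: gradient descent plus injected noise of size sigma * sqrt(step).
        x = x - step * g + sigma * np.sqrt(step) * rng.standard_normal(x.shape)
    return None, x

# Toy objective f(x) = 0.5 * ||x||^2, whose gradient is x itself.
hit, x_hit = sgld_hitting_time(lambda x: x, np.full(5, 3.0))
print("hitting time:", hit)
```

Varying `step`, `sigma`, and the dimension of `x0` in this toy setup gives a rough empirical feel for the factors the abstract lists, though it says nothing about the bounds proved in the paper.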


research
02/13/2019

Stochastic Gradient Descent Escapes Saddle Points Efficiently

This paper considers the perturbed stochastic gradient descent algorithm...
research
04/19/2019

SSRGD: Simple Stochastic Recursive Gradient Descent for Escaping Saddle Points

We analyze stochastic gradient algorithms for optimizing nonconvex probl...
research
10/21/2020

On Random Subset Generalization Error Bounds and the Stochastic Gradient Langevin Dynamics Algorithm

In this work, we unify several expected generalization error bounds base...
research
04/11/2021

Learning from Censored and Dependent Data: The case of Linear Dynamics

Observations from dynamical systems often exhibit irregularities, such a...
research
05/30/2023

KrADagrad: Kronecker Approximation-Domination Gradient Preconditioned Stochastic Optimization

Second order stochastic optimizers allow parameter update step size and ...
research
11/17/2020

Resolving Molecular Contributions of Ion Channel Noise to Interspike Interval Variability through Stochastic Shielding

The contributions of independent noise sources to the variability of act...
research
05/06/2021

Probabilistic Bigraphs

Bigraphs are a universal computational modelling formalism for the spati...
