Stochastic Gradient Descent Escapes Saddle Points Efficiently

02/13/2019
by   Chi Jin, et al.
20

This paper considers the perturbed stochastic gradient descent algorithm and shows that it finds ϵ-second order stationary points (∇ f(x)≤ϵ and ∇^2 f(x) ≽ -√(ϵ)I) in Õ(d/ϵ^4) iterations, giving the first result that has linear dependence on dimension for this setting. For the special case, where stochastic gradients are Lipschitz, the dependence on dimension reduces to polylogarithmic. In addition to giving new results, this paper also presents a simplified proof strategy that gives a shorter and more elegant proof of previously known results (Jin et al. 2017) on perturbed gradient descent algorithm.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
04/30/2019

Hitting Time of Stochastic Gradient Langevin Dynamics to Stationary Points: A Direct Analysis

Stochastic gradient Langevin dynamics (SGLD) is a fundamental algorithm ...
research
08/17/2020

A Realistic Example in 2 Dimension that Gradient Descent Takes Exponential Time to Escape Saddle Points

Gradient descent is a popular algorithm in optimization, and its perform...
research
11/28/2021

Escape saddle points by a simple gradient-descent based algorithm

Escaping saddle points is a central research topic in nonconvex optimiza...
research
02/09/2016

Poor starting points in machine learning

Poor (even random) starting points for learning/training/optimization ar...
research
07/03/2019

Distributed Learning in Non-Convex Environments – Part II: Polynomial Escape from Saddle-Points

The diffusion strategy for distributed learning from streaming data empl...
research
05/27/2022

HOUDINI: Escaping from Moderately Constrained Saddles

We give the first polynomial time algorithms for escaping from high-dime...

Please sign up or login with your details

Forgot password? Click here to reset