Second-Order Guarantees of Stochastic Gradient Descent in Non-Convex Optimization

08/19/2019
by Stefan Vlaski et al.

Recent years have seen increased interest in performance guarantees of gradient descent algorithms for non-convex optimization. A number of works have shown that gradient noise plays a critical role in the ability of gradient descent recursions to efficiently escape saddle points and reach second-order stationary points. Most available works require the gradient noise component to be bounded with probability one or sub-Gaussian, and leverage concentration inequalities to arrive at high-probability results. We present an alternate approach, relying primarily on mean-square arguments, and show that a more relaxed relative bound on the gradient noise variance is sufficient to ensure efficient escape from saddle points, without the need to inject additional noise, employ alternating step-sizes, or rely on a global dispersive noise assumption, as long as a gradient noise component is present in a descent direction for every saddle point.
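
The following is a minimal numerical sketch of the phenomenon the abstract describes: an iterate initialized at a strict saddle point, which deterministic gradient descent would never leave, is pushed toward a minimizer by the gradient noise alone. The toy objective f(x, y) = (x^2 - 1)^2 + y^2, the step-size, the noise scale, and the relative noise scaling are illustrative assumptions, not the authors' algorithm or analysis.

```python
import numpy as np

def grad(w):
    # Gradient of the toy objective f(x, y) = (x^2 - 1)^2 + y^2,
    # which has a strict saddle point at (0, 0) and minima at (+-1, 0).
    x, y = w
    return np.array([4 * x * (x**2 - 1), 2 * y])

def sgd(w0, mu=0.01, sigma=0.1, steps=5000, seed=0):
    # Constant step-size stochastic gradient descent. The perturbation is
    # scaled with (1 + gradient norm) to loosely mimic a relative variance
    # bound; this scaling is an illustrative assumption, not the paper's model.
    rng = np.random.default_rng(seed)
    w = np.array(w0, dtype=float)
    for _ in range(steps):
        g = grad(w)
        noise = sigma * (1.0 + np.linalg.norm(g)) * rng.standard_normal(2)
        w = w - mu * (g + noise)
    return w

# Started exactly at the saddle point, deterministic gradient descent stays
# there forever; the noise supplies a component in a descent direction and
# the iterates drift toward one of the minima near (+-1, 0).
print("final iterate:", sgd(w0=[0.0, 0.0]))
```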


Related research

Escaping Saddle Points for Zeroth-order Nonconvex Optimization using Estimated Gradient Descent (10/03/2019)
Gradient descent and its variants are widely used in machine learning. H...

How to Escape Saddle Points Efficiently (03/02/2017)
This paper shows that a perturbed form of gradient descent converges to ...

On Noisy Negative Curvature Descent: Competing with Gradient Descent for Faster Non-convex Optimization (09/25/2017)
The Hessian-vector product has been utilized to find a second-order stat...

Distributed Learning in Non-Convex Environments – Part II: Polynomial Escape from Saddle-Points (07/03/2019)
The diffusion strategy for distributed learning from streaming data empl...

Distributed Learning in Non-Convex Environments – Part I: Agreement at a Linear Rate (07/03/2019)
Driven by the need to solve increasingly complex optimization problems i...

Swarm-Based Gradient Descent Method for Non-Convex Optimization (11/30/2022)
We introduce a new Swarm-Based Gradient Descent (SBGD) method for non-co...
