First-order Methods Almost Always Avoid Saddle Points

10/20/2017
by   Jason D. Lee, et al.
0

We establish that first-order methods avoid saddle points for almost all initializations. Our results apply to a wide variety of first-order methods, including gradient descent, block coordinate descent, mirror descent and variants thereof. The connecting thread is that such algorithms can be studied from a dynamical systems perspective in which appropriate instantiations of the Stable Manifold Theorem allow for a global stability analysis. Thus, neither access to second-order derivative information nor randomness beyond initialization is necessary to provably avoid saddle points.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
02/16/2016

Gradient Descent Converges to Minimizers

We show that gradient descent converges to a local minimizer, almost sur...
research
07/11/2018

The Limit Points of (Optimistic) Gradient Descent in Min-Max Optimization

Motivated by applications in Optimization, Game Theory, and the training...
research
08/07/2019

Distributed Gradient Descent: Nonconvergence to Saddle Points and the Stable-Manifold Theorem

The paper studies a distributed gradient descent (DGD) process and consi...
research
08/03/2022

Gradient descent provably escapes saddle points in the training of shallow ReLU networks

Dynamical systems theory has recently been applied in optimization to pr...
research
10/29/2019

Efficiently avoiding saddle points with zero order methods: No gradients required

We consider the case of derivative-free algorithms for non-convex optimi...
research
05/25/2023

Two-timescale Extragradient for Finding Local Minimax Points

Minimax problems are notoriously challenging to optimize. However, we de...

Please sign up or login with your details

Forgot password? Click here to reset