Heavy-ball Algorithms Always Escape Saddle Points

by   Tao Sun, et al.
National University of Defense Technology
Hunan University
NetEase, Inc

Nonconvex optimization algorithms with random initialization have attracted increasing attention recently. It has been shown that many first-order methods avoid saddle points almost surely when started from a random point. In this paper, we answer the question: do nonconvex heavy-ball algorithms with random initialization avoid saddle points? The answer is yes! Directly applying the existing proof technique to heavy-ball algorithms is difficult because each heavy-ball iteration depends on both the current and the previous point, so the algorithm cannot be written as an iteration xk+1 = g(xk) for a single mapping g. To this end, we design a new mapping on a product space; after a suitable transformation, the heavy-ball algorithm can be interpreted as iterating this mapping. Theoretically, we prove that heavy-ball gradient descent escapes saddle points under a larger stepsize than plain gradient descent. We also consider the heavy-ball proximal point algorithm and prove that it, too, always escapes saddle points.
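The product-space device described in the abstract can be sketched concretely: stack the current and previous iterates into one state z = (xk, xk-1), so the two-step heavy-ball recursion becomes a single mapping g with z → g(z). The sketch below assumes illustrative values for the stepsize alpha, the momentum beta, and the test function (a simple quadratic saddle); none of these specific choices come from the paper.

```python
def heavy_ball_map(z, grad_f, alpha=0.1, beta=0.5):
    """One step of the mapping g on the stacked state z = (x_k, x_{k-1}).

    Implements x_{k+1} = x_k - alpha * grad_f(x_k) + beta * (x_k - x_{k-1}),
    returned as the new stacked state (x_{k+1}, x_k).
    """
    x_curr, x_prev = z
    g = grad_f(x_curr)
    x_next = tuple(
        xc - alpha * gi + beta * (xc - xp)
        for xc, xp, gi in zip(x_curr, x_prev, g)
    )
    return (x_next, x_curr)


# Illustrative test function: f(x, y) = (x^2 - y^2) / 2 has a strict saddle
# at the origin, with negative curvature along the y-axis.
def grad_f(x):
    return (x[0], -x[1])


# Start slightly off the saddle's stable manifold (the x-axis).
z = ((0.5, 1e-3), (0.5, 1e-3))
for _ in range(200):
    z = heavy_ball_map(z, grad_f)

# The iterate leaves the saddle along the negative-curvature direction:
# the x-coordinate contracts toward 0 while |y| grows.
print(z[0])
```

Viewing the pair (xk, xk-1) as one state is what lets the stable-manifold style of argument, originally phrased for iterations xk+1 = g(xk), apply to a method with memory.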




