On Noisy Negative Curvature Descent: Competing with Gradient Descent for Faster Non-convex Optimization

09/25/2017
by Mingrui Liu, et al.

The Hessian-vector product has been utilized to find a second-order stationary solution with strong complexity guarantees (e.g., almost linear time complexity in the problem's dimensionality). In this paper, we propose to further reduce the number of Hessian-vector products for faster non-convex optimization. Previous algorithms need to approximate the smallest eigenvalue with sufficient precision (e.g., ϵ_2 ≪ 1) in order to achieve a sufficiently accurate second-order stationary solution (i.e., λ_min(∇^2 f(x)) ≥ -ϵ_2). In contrast, the proposed algorithms only need to compute the smallest eigenvector, approximating the corresponding eigenvalue up to a small power of the current gradient's norm. As a result, they can dramatically reduce the number of Hessian-vector products during the course of optimization, before first-order stationary points (e.g., saddle points) are reached. The key building block of the proposed algorithms is a novel updating step named the NCG step, which lets a noisy negative curvature descent step compete with a gradient descent step. We show that the worst-case time complexity of the proposed algorithms, under their more favorable prescribed accuracy requirements, matches the best in the literature for achieving a second-order stationary point, but with an arguably smaller per-iteration cost. We also show that the proposed algorithms can benefit from an inexact Hessian: we develop variants that accept an inexact Hessian under a mild condition while achieving the same goal. Moreover, we develop a stochastic algorithm for finite-sum or infinite-sum non-convex optimization problems. To the best of our knowledge, the proposed stochastic algorithm is the first to converge to a second-order stationary point with high probability in a time complexity that is independent of the sample size and almost linear in the dimensionality.
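The abstract is prose-only, so the following is a minimal NumPy sketch of the idea it describes, not the authors' algorithm: a negative curvature direction is estimated from Hessian-vector products with an eigenvalue accuracy tied to a small power of the current gradient norm, and the resulting step competes with a plain gradient step. Everything here is an illustrative assumption, including the function names (`hessian_vector_product`, `approx_smallest_eigvec`, `ncg_step`), the finite-difference Hessian-vector product, the shifted power iteration standing in for the paper's eigen-solver (e.g., Lanczos), and the step sizes and exponent `gamma`.

```python
# Sketch of an NCG-style step: negative curvature descent competing with
# gradient descent, using only Hessian-vector products. Illustrative, not
# the paper's implementation.
import numpy as np

def hessian_vector_product(grad_fn, x, v, r=1e-6):
    # Finite-difference approximation of H(x) v from gradient calls only:
    # Hv ~= (grad f(x + r v) - grad f(x - r v)) / (2 r).
    return (grad_fn(x + r * v) - grad_fn(x - r * v)) / (2 * r)

def approx_smallest_eigvec(grad_fn, x, dim, L, tol, max_iters=200, rng=None):
    # Shifted power iteration on M = L*I - H: its top eigenvector is the
    # eigenvector of H with the smallest eigenvalue (L bounds ||H||).
    # `tol` controls how accurately the eigenvalue is resolved; the point
    # of the paper is that tol can be as loose as a power of ||grad f(x)||.
    rng = rng or np.random.default_rng(0)
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    lam = 0.0
    for _ in range(max_iters):
        Mv = L * v - hessian_vector_product(grad_fn, x, v)
        new_lam = float(v @ Mv)
        v = Mv / np.linalg.norm(Mv)
        if abs(new_lam - lam) <= tol:
            break
        lam = new_lam
    # Rayleigh quotient of H along v: approximate smallest eigenvalue.
    hv = hessian_vector_product(grad_fn, x, v)
    return v, float(v @ hv)

def ncg_step(f, grad_fn, x, L, gamma=0.5):
    # One NCG-style update: take whichever of (a) a gradient step or
    # (b) a step along the (noisy) negative curvature direction decreases
    # f more. The eigenvalue accuracy is set to ||g||^gamma, so far from
    # first-order stationarity few Hessian-vector products are needed.
    g = grad_fn(x)
    gnorm = np.linalg.norm(g)
    x_gd = x - g / L                                  # gradient step

    tol = max(gnorm ** gamma, 1e-8)                   # loose, gradient-tied accuracy
    v, lam = approx_smallest_eigvec(grad_fn, x, x.size, L, tol)
    if lam < 0:
        if g @ v > 0:                                 # orient as a descent direction
            v = -v
        x_nc = x + (abs(lam) / L) * v                 # negative curvature step
    else:
        x_nc = x_gd
    return x_gd if f(x_gd) <= f(x_nc) else x_nc
```

In this sketch the "competition" is a direct comparison of function values at the two candidate points; the paper's analysis instead guarantees a decrease at least matching gradient descent's, which is what lets the noisy (loosely resolved) curvature direction be safe to use.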

