Log In Sign Up

Finding Local Minima via Stochastic Nested Variance Reduction

by   Dongruo Zhou, et al.

We propose two algorithms that can find local minima faster than the state-of-the-art algorithms in both finite-sum and general stochastic nonconvex optimization. At the core of the proposed algorithms is One-epoch-SNVRG^+ using stochastic nested variance reduction (Zhou et al., 2018a), which outperforms the state-of-the-art variance reduction algorithms such as SCSG (Lei et al., 2017). In particular, for finite-sum optimization problems, the proposed SNVRG^++Neon2^finite algorithm achieves Õ(n^1/2ϵ^-2+nϵ_H^-3+n^3/4ϵ_H^-7/2) gradient complexity to converge to an (ϵ, ϵ_H)-second-order stationary point, which outperforms SVRG+Neon2^finite (Allen-Zhu and Li, 2017) , the best existing algorithm, in a wide regime. For general stochastic optimization problems, the proposed SNVRG^++Neon2^online achieves Õ(ϵ^-3+ϵ_H^-5+ϵ^-2ϵ_H^-3) gradient complexity, which is better than both SVRG+Neon2^online (Allen-Zhu and Li, 2017) and Natasha2 (Allen-Zhu, 2017) in certain regimes. Furthermore, we explore the acceleration brought by third-order smoothness of the objective function.


page 1

page 2

page 3

page 4


Third-order Smoothness Helps: Even Faster Stochastic Optimization Algorithms for Finding Local Minima

We propose stochastic optimization algorithms that can find local minima...

Stochastic Nested Variance Reduction for Nonconvex Optimization

We study finite-sum nonconvex optimization problems, where the objective...

Simple and Optimal Stochastic Gradient Methods for Nonsmooth Nonconvex Optimization

We propose and analyze several stochastic gradient algorithms for findin...

Saving Gradient and Negative Curvature Computations: Finding Local Minima More Efficiently

We propose a family of nonconvex optimization algorithms that are able t...

On the Adaptivity of Stochastic Gradient-Based Optimization

Stochastic-gradient-based optimization has been a core enabling methodol...

Training Structured Neural Networks Through Manifold Identification and Variance Reduction

This paper proposes an algorithm (RMDA) for training neural networks (NN...

Multi-Level Composite Stochastic Optimization via Nested Variance Reduction

We consider multi-level composite optimization problems where each mappi...