Faster Perturbed Stochastic Gradient Methods for Finding Local Minima

10/25/2021
by Zixiang Chen, et al.

Escaping saddle points and finding local minima is a central problem in nonconvex optimization. Perturbed gradient methods are perhaps the simplest approach to this problem. However, to find (ϵ, √(ϵ))-approximate local minima, the best known stochastic gradient complexity for this class of algorithms is Õ(ϵ^-3.5), which is not optimal. In this paper, we propose Pullback, a faster perturbed stochastic gradient framework for finding local minima. We show that Pullback with stochastic gradient estimators such as SARAH/SPIDER and STORM can find (ϵ, ϵ_H)-approximate local minima within Õ(ϵ^-3 + ϵ_H^-6) stochastic gradient evaluations (or Õ(ϵ^-3) when ϵ_H = √(ϵ)). The core idea of our framework is a step-size “pullback” scheme that controls the average movement of the iterates, which leads to faster convergence to local minima. Experiments on matrix factorization problems corroborate our theory.
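To make the step-size “pullback” idea concrete, below is a minimal, hypothetical sketch: a perturbed stochastic gradient loop that caps the norm of each update at a radius r (the “pullback”), and injects a small random perturbation near stationary points to help escape saddles. This is not the paper’s Pullback algorithm, which pairs the scheme with SARAH/SPIDER or STORM estimators and comes with complexity guarantees; all names and constants here (eta, r, perturb_radius) are illustrative assumptions.

```python
import numpy as np


def pullback_sgd(grad_fn, x0, eta=0.01, r=0.1, perturb_radius=1e-3,
                 n_iters=1000, rng=None):
    """Schematic perturbed SGD with a pullback-style step-size cap.

    grad_fn: returns a stochastic gradient estimate at x.
    r: maximum allowed movement per iteration (the "pullback" radius).
    perturb_radius: when the gradient estimate is this small, add noise
    to help the iterate escape saddle points.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_iters):
        g = grad_fn(x)  # stochastic gradient estimate
        if np.linalg.norm(g) <= perturb_radius:
            # Near-stationary: perturb randomly to escape a possible saddle.
            x = x + rng.normal(scale=perturb_radius, size=x.shape)
            continue
        step = eta * g
        move = np.linalg.norm(step)
        if move > r:
            # "Pull back" the step so the iterate moves at most r,
            # keeping the average movement of the iterates controlled.
            step *= r / move
        x = x - step
    return x


if __name__ == "__main__":
    # Toy example: f(x) = x0^2 - x1^2 + x1^4/4 has a saddle at the origin
    # and local minima at (0, ±sqrt(2)).
    demo_rng = np.random.default_rng(0)

    def noisy_grad(x):
        g = np.array([2.0 * x[0], -2.0 * x[1] + x[1] ** 3])
        return g + demo_rng.normal(scale=0.01, size=2)

    x_final = pullback_sgd(noisy_grad, np.zeros(2))
    print(x_final)  # x1 should settle near ±sqrt(2), away from the saddle
```

The cap on per-step movement is what distinguishes this sketch from plain perturbed SGD: bounding how far each iterate can travel is the mechanism the abstract credits for the faster convergence to local minima.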


Related research

12/18/2017 · Third-order Smoothness Helps: Even Faster Stochastic Optimization Algorithms for Finding Local Minima
We propose stochastic optimization algorithms that can find local minima...

11/08/2017 · Stochastic Cubic Regularization for Fast Nonconvex Optimization
This paper proposes a stochastic variant of a classic algorithm---the cu...

06/29/2020 · Optimization Landscape of Tucker Decomposition
Tucker decomposition is a popular technique for many data analysis and m...

12/01/2021 · Spurious Valleys, Spurious Minima and NP-hardness of Sparse Matrix Factorization With Fixed Support
The problem of approximating a dense matrix by a product of sparse facto...

06/14/2023 · Noise Stability Optimization for Flat Minima with Optimal Convergence Rates
We consider finding flat, local minimizers by adding average weight pert...

05/10/2019 · The sharp, the flat and the shallow: Can weakly interacting agents learn to escape bad minima?
An open problem in machine learning is whether flat minima generalize be...

02/02/2022 · Flipping the switch on local exploration: Genetic Algorithms with Reversals
One important feature of complex systems is problem domains that have m...
