
Natasha: Faster Non-Convex Stochastic Optimization Via Strongly Non-Convex Parameter

by Zeyuan Allen-Zhu, et al.

Given a nonconvex function f(x) that is an average of n smooth functions, we design stochastic first-order methods to find its approximate stationary points. The performance of our new methods depends on the smallest (negative) eigenvalue -σ of the Hessian. This parameter σ captures how strongly nonconvex f(x) is, and is analogous to the strong convexity parameter in convex optimization. At least in theory, our methods outperform known (offline) methods for a range of values of σ, and can also be used to find approximate local minima. Our result implies an interesting dichotomy: there exists a threshold σ_0 such that the currently fastest methods for σ>σ_0 and for σ<σ_0 behave differently: the former scales with n^{2/3} and the latter scales with n^{3/4}.
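As a rough illustration of the parameter σ, the sketch below computes it for a toy finite sum of quadratics f_i(x) = 0.5 x^T A_i x, whose Hessians are constant. The function name `nonconvexity_parameter` and the matrices are hypothetical and chosen only for illustration; they are not taken from the paper. For a general f, the Hessian varies with x and σ would have to bound the smallest eigenvalue over all x.

```python
import numpy as np

def nonconvexity_parameter(hessians):
    """Return sigma >= 0 so that the averaged Hessian has smallest
    eigenvalue -sigma (sigma = 0 if the average is convex).

    Assumption: each f_i is quadratic, so its Hessian is constant.
    """
    avg_hessian = np.mean(hessians, axis=0)       # Hessian of f = (1/n) * sum_i f_i
    lam_min = np.linalg.eigvalsh(avg_hessian)[0]  # eigvalsh returns eigenvalues in ascending order
    return max(0.0, -lam_min)

# Toy example with n = 2 components (hypothetical values).
hessians = np.array([
    np.diag([2.0, -1.0]),   # f_1 is nonconvex along the second coordinate
    np.diag([4.0, 0.5]),    # f_2 is convex
])
sigma = nonconvexity_parameter(hessians)
```

Here the averaged Hessian is diag(3.0, -0.25), so the individual component f_1 is more strongly nonconvex than the average f; it is the average that determines σ in this sketch.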

