
Natasha 2: Faster Non-Convex Optimization Than SGD
We design a stochastic algorithm to train any smooth neural network to ε...

Why Do Local Methods Solve Nonconvex Problems?
Nonconvex optimization is ubiquitous in modern machine learning. Resear...

Lower Bounds for Smooth Nonconvex Finite-Sum Optimization
Smooth finite-sum optimization has been widely studied in both convex an...

Linear Speedup in Saddle-Point Escape for Decentralized Non-Convex Optimization
Under appropriate cooperation protocols and parameter choices, fully dec...

Breaking Reversibility Accelerates Langevin Dynamics for Global Non-Convex Optimization
Langevin dynamics (LD) has been proven to be a powerful technique for op...

A Conservation Law Method in Optimization
We propose some algorithms to find local minima in nonconvex optimizatio...

Non-Stationary Stochastic Optimization with Local Spatial and Temporal Changes
We consider a nonstationary sequential stochastic optimization problem,...
Natasha: Faster Non-Convex Stochastic Optimization via Strongly Non-Convex Parameter
Given a nonconvex function f(x) that is an average of n smooth functions, we design stochastic first-order methods to find its approximate stationary points. The performance of our new methods depends on the smallest (negative) eigenvalue σ of the Hessian. This parameter σ captures how strongly nonconvex f(x) is, and is analogous to the strong convexity parameter in convex optimization. At least in theory, our methods outperform known (offline) methods for a range of values of σ, and can also be used to find approximate local minima. Our result implies an interesting dichotomy: there exists a threshold σ_0 such that the currently fastest methods for σ > σ_0 and for σ < σ_0 behave differently: the former scale with n^(2/3) and the latter with n^(3/4).
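The parameter σ above is defined from the Hessian's smallest (most negative) eigenvalue. As a minimal illustration of that definition (not of the Natasha algorithm itself), the sketch below builds a hypothetical toy objective f(x) that is an average of n smooth quadratics and computes σ numerically; all names and the choice of quadratic f_i are assumptions for this example.

```python
import numpy as np

# Hypothetical toy objective: f(x) = (1/n) * sum_i f_i(x), with
# f_i(x) = 0.5 * x^T A_i x. Each A_i is symmetric but not necessarily
# positive semidefinite, so f can be nonconvex.
rng = np.random.default_rng(0)
n, d = 4, 3
As = [(M + M.T) / 2 for M in rng.normal(size=(n, d, d))]

# For quadratics the Hessian of f is constant: the average of the A_i.
H = sum(As) / n

# Eigenvalues in ascending order; the first entry is the smallest one.
eigs = np.linalg.eigvalsh(H)

# sigma = magnitude of the smallest negative eigenvalue (0 if f is convex).
sigma = max(0.0, -eigs[0])
print(sigma)
```

In the abstract's terms, σ = 0 recovers the convex case, while larger σ means f is "more strongly nonconvex", which is the regime that determines which method class (n^(2/3) vs. n^(3/4) scaling) is currently fastest.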