Natasha: Faster Non-Convex Stochastic Optimization Via Strongly Non-Convex Parameter

02/02/2017 ∙ by Zeyuan Allen-Zhu, et al.

Given a nonconvex function f(x) that is an average of n smooth functions, we design stochastic first-order methods to find its approximate stationary points. The performance of our new methods depends on the smallest (negative) eigenvalue -σ of the Hessian. This parameter σ captures how strongly nonconvex f(x) is, and is analogous to the strong convexity parameter in convex optimization. At least in theory, our methods outperform known (offline) methods for a range of values of σ, and can also be used to find approximate local minima. Our result implies an interesting dichotomy: there exists a threshold σ_0 such that the currently fastest methods for σ>σ_0 and for σ<σ_0 behave differently: the former scales with n^2/3 and the latter scales with n^3/4.
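To make the parameter σ concrete: it is the magnitude of the most negative eigenvalue of the Hessian of f, and it is zero exactly when f is convex. The following is a minimal sketch (not from the paper) that computes σ for a toy quadratic whose Hessian is a fixed, hypothetical matrix A:

```python
import numpy as np

# Toy non-convex quadratic f(x) = 0.5 * x^T A x.
# Its Hessian is A everywhere, so the strong non-convexity
# parameter is sigma = max(0, -lambda_min(A)).
A = np.diag([2.0, 1.0, -0.5])  # hypothetical example matrix

eigvals = np.linalg.eigvalsh(A)   # eigenvalues of the Hessian
sigma = max(0.0, -eigvals.min())  # sigma = 0.5 here: f is strongly non-convex

print(sigma)
```

For a general smooth f the Hessian varies with x, and σ is a uniform bound over the domain; this sketch only illustrates the definition at a single point.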


