Stochastic Gradient Descent Escapes Saddle Points Efficiently

02/13/2019
by Chi Jin, et al.

This paper considers the perturbed stochastic gradient descent algorithm and shows that it finds ϵ-second-order stationary points (‖∇f(x)‖ ≤ ϵ and ∇²f(x) ≽ −√ϵ·I) in Õ(d/ϵ⁴) iterations, the first result with only linear dependence on dimension in this setting. In the special case where the stochastic gradients are Lipschitz, the dimension dependence improves to polylogarithmic. Beyond these new results, the paper also presents a simplified proof strategy that yields a shorter and more elegant proof of the previously known results of Jin et al. (2017) on the perturbed gradient descent algorithm.
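
The algorithm itself is simply SGD with an isotropic perturbation injected into every stochastic gradient step. Below is a minimal illustrative sketch in Python; the hyperparameters (step size eta, perturbation radius r, iteration count) and the toy objective are assumptions chosen for demonstration, not the paper's theoretical settings.

```python
import numpy as np

def perturbed_sgd(stoch_grad, x0, eta=1e-3, r=1e-2, n_iters=50_000, seed=0):
    """Sketch of perturbed SGD: each step uses a stochastic gradient plus
    isotropic Gaussian noise (per-coordinate std r/sqrt(d), so the
    perturbation's norm is roughly r in expectation)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    d = x.size
    for _ in range(n_iters):
        g = stoch_grad(x)                              # unbiased gradient estimate
        xi = rng.normal(scale=r / np.sqrt(d), size=d)  # injected perturbation
        x = x - eta * (g + xi)
    return x

# Toy usage (assumed example): f(x, y) = (x^2 - 1)^2 / 4 + y^2 / 2 has a
# strict saddle at the origin and minima at (+-1, 0); perturbed SGD started
# exactly at the saddle escapes it and converges near a minimum.
rng = np.random.default_rng(1)
def stoch_grad(x):
    exact = np.array([x[0]**3 - x[0], x[1]])
    return exact + 0.01 * rng.normal(size=2)           # bounded-variance noise

print(perturbed_sgd(stoch_grad, x0=np.zeros(2)))       # lands near (+-1, 0)
```

The injected noise is what distinguishes this from plain SGD: at a strict saddle the stochastic gradient can be arbitrarily small, and the isotropic perturbation guarantees a component along the negative-curvature direction, along which the iterate is then amplified away from the saddle.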
