Escaping Saddle-Points Faster under Interpolation-like Conditions

by Abhishek Roy, et al.

In this paper, we show that under over-parametrization several standard stochastic optimization algorithms escape saddle-points and converge to local-minimizers much faster. One of the fundamental aspects of over-parametrized models is that they are capable of interpolating the training data. We show that, under interpolation-like assumptions satisfied by the stochastic gradients in an over-parametrization setting, the first-order oracle complexity of the Perturbed Stochastic Gradient Descent (PSGD) algorithm to reach an ϵ-local-minimizer matches the corresponding deterministic rate of Õ(1/ϵ^2). We next analyze the Stochastic Cubic-Regularized Newton (SCRN) algorithm under interpolation-like conditions, and show that its oracle complexity to reach an ϵ-local-minimizer is Õ(1/ϵ^2.5). While this complexity is better than the corresponding complexity of either PSGD, or SCRN without interpolation-like assumptions, it does not match the rate of Õ(1/ϵ^1.5) corresponding to the deterministic Cubic-Regularized Newton method. It seems further Hessian-based interpolation-like assumptions are necessary to bridge this gap. We also discuss the corresponding improved complexities in the zeroth-order settings.
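To make the saddle-escape idea concrete, here is a minimal sketch of a simplified perturbed-SGD loop on a toy two-dimensional function. Everything in it is an illustrative assumption rather than the paper's algorithm or setting: the objective `f`, the multiplicative noise model in `stochastic_grad` (chosen so that the stochastic gradient vanishes wherever the true gradient does, which loosely mimics an interpolation-like condition), and all step-size, threshold, and radius values. It also omits the waiting period and termination test that analyzed PSGD variants typically use.

```python
import math
import random


def f(x, y):
    # Toy objective (assumption): saddle at (0, 0), local minima at (0, +1) and (0, -1).
    return 0.5 * x**2 + 0.25 * y**4 - 0.5 * y**2


def grad(x, y):
    # Exact gradient of f.
    return x, y**3 - y


def stochastic_grad(x, y, rng, noise=0.5):
    # Hypothetical noise model: per-coordinate multiplicative noise, so the
    # stochastic gradient is exactly zero wherever the true gradient is zero.
    gx, gy = grad(x, y)
    return (gx * (1.0 + noise * rng.gauss(0.0, 1.0)),
            gy * (1.0 + noise * rng.gauss(0.0, 1.0)))


def perturbed_sgd(x, y, steps=2000, lr=0.05, g_thresh=1e-3,
                  radius=1e-2, perturb=True, seed=0):
    rng = random.Random(seed)
    for _ in range(steps):
        gx, gy = stochastic_grad(x, y, rng)
        if perturb and math.hypot(gx, gy) < g_thresh:
            # Small gradient: jump to a point drawn uniformly from a small disk
            # around the iterate, so a saddle's descent direction gets picked up.
            angle = rng.uniform(0.0, 2.0 * math.pi)
            r = radius * math.sqrt(rng.random())
            x, y = x + r * math.cos(angle), y + r * math.sin(angle)
        else:
            # Otherwise take an ordinary SGD step.
            x, y = x - lr * gx, y - lr * gy
    return x, y


if __name__ == "__main__":
    # Starting on the line y = 0, unperturbed SGD can never leave that line.
    print("plain SGD:    ", perturbed_sgd(1.0, 0.0, perturb=False))
    print("perturbed SGD:", perturbed_sgd(1.0, 0.0, perturb=True))
```

In this toy setup, the unperturbed run stays on the line y = 0 and drifts into the saddle at the origin, while the perturbed run is kicked off that line and settles near one of the two minima. The sketch says nothing about the complexity rates stated in the abstract; it only illustrates the mechanism of adding a perturbation when the gradient is small.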


