Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation

10/11/2019
by Si Yi Meng, et al.

We consider stochastic second-order methods for minimizing strongly convex functions under an interpolation condition satisfied by over-parameterized models. Under this condition, we show that the regularized sub-sampled Newton method (R-SSN) achieves global linear convergence with an adaptive step size and a constant batch size. By growing the batch size for both the sub-sampled gradient and Hessian, we show that R-SSN can converge at a quadratic rate in a local neighbourhood of the solution. We also show that R-SSN attains local linear convergence for the family of self-concordant functions. Furthermore, we analyse stochastic BFGS algorithms in the interpolation setting and prove their global linear convergence. We empirically evaluate stochastic L-BFGS and a "Hessian-free" implementation of R-SSN for binary classification on synthetic, linearly separable datasets and on real medium-sized datasets under a kernel mapping. Our experimental results show the fast convergence of these methods, both in the number of iterations and in wall-clock time.
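
To make the R-SSN update concrete, here is a minimal NumPy sketch of one sub-sampled Newton step for a logistic-regression objective. It is illustrative only: the fixed step size and constant Hessian regularizer below are placeholder assumptions, not the adaptive step-size and batch-growing rules analysed in the paper, and the function and parameter names are hypothetical.

```python
import numpy as np

def rssn_step(w, X, y, batch_size, lam=1e-3, step_size=1.0, rng=None):
    """One regularized sub-sampled Newton (R-SSN) style step for logistic regression.

    Sketch only: `lam` and `step_size` are fixed placeholders standing in for the
    paper's regularization and adaptive step-size choices.
    """
    if rng is None:
        rng = np.random.default_rng()
    n, d = X.shape

    # Sub-sample a mini-batch used for both the gradient and Hessian estimates.
    idx = rng.choice(n, size=batch_size, replace=False)
    Xb, yb = X[idx], y[idx]                      # labels assumed in {0, 1}

    # Sub-sampled gradient and Hessian of the logistic loss.
    p = 1.0 / (1.0 + np.exp(-Xb @ w))            # predicted probabilities
    grad = Xb.T @ (p - yb) / batch_size
    curv = p * (1.0 - p)                         # per-example curvature weights
    hess = (Xb * curv[:, None]).T @ Xb / batch_size

    # Regularize the sub-sampled Hessian and solve for the Newton direction.
    direction = np.linalg.solve(hess + lam * np.eye(d), grad)
    return w - step_size * direction

# Example usage on synthetic linearly separable data (mirroring the paper's setup):
# rng = np.random.default_rng(0)
# X = rng.standard_normal((1000, 20))
# y = (X @ rng.standard_normal(20) > 0).astype(float)
# w = np.zeros(20)
# for _ in range(50):
#     w = rssn_step(w, X, y, batch_size=128, rng=rng)
```

A "Hessian-free" implementation, as evaluated in the paper, would avoid forming `hess` explicitly and instead solve the regularized Newton system with a matrix-free method such as conjugate gradients using Hessian-vector products.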

Related research

06/11/2020: Adaptive Gradient Methods Converge Faster with Over-Parameterization (and you can do a line-search)
07/17/2022: SP2: A Second Order Stochastic Polyak Method
11/03/2022: Extra-Newton: A First Approach to Noise-Adaptive Accelerated Second-Order Methods
05/09/2015: Newton Sketch: A Linear-time Optimization Algorithm with Linear-Quadratic Convergence
06/03/2020: On the Promise of the Stochastic Generalized Gauss-Newton Method for Training DNNs
09/28/2020: Escaping Saddle-Points Faster under Interpolation-like Conditions
11/28/2022: Stochastic Steffensen method
