Average-case Acceleration Through Spectral Density Estimation

02/12/2020
by   Fabian Pedregosa, et al.
0

We develop a framework for designing optimal quadratic optimization methods in terms of their average-case runtime. This yields a new class of methods that achieve acceleration through a model of the Hessian's expected spectral density. We develop explicit algorithms for the uniform, Marchenko-Pastur, and exponential distributions. These methods are momentum-based gradient algorithms whose hyper-parameters can be estimated without knowledge of the Hessian's smallest singular value, in contrast with classical accelerated methods like Nesterov acceleration and Polyak momentum. Empirical results on quadratic, logistic regression and neural networks show the proposed methods always match and in many cases significantly improve over classical accelerated methods.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
06/22/2022

Provable Acceleration of Heavy Ball beyond Quadratics for a Class of Polyak-Łojasiewicz Functions when the Non-Convexity is Averaged-Out

Heavy Ball (HB) nowadays is one of the most popular momentum methods in ...
research
07/07/2016

Nesterov's Accelerated Gradient and Momentum as approximations to Regularised Update Descent

We present a unifying framework for adapting the update direction in gra...
research
01/23/2021

Acceleration Methods

This monograph covers some recent advances on a range of acceleration te...
research
06/19/2020

How Does Momentum Help Frank Wolfe?

We unveil the connections between Frank Wolfe (FW) type algorithms and t...
research
06/07/2021

Dynamics of Stochastic Momentum Methods on Large-scale, Quadratic Models

We analyze a class of stochastic gradient algorithms with momentum on a ...
research
01/17/2020

Gradient descent with momentum — to accelerate or to super-accelerate?

We consider gradient descent with `momentum', a widely used method for l...
research
07/18/2018

Convergence guarantees for RMSProp and ADAM in non-convex optimization and their comparison to Nesterov acceleration on autoencoders

RMSProp and ADAM continue to be extremely popular algorithms for trainin...

Please sign up or login with your details

Forgot password? Click here to reset