Provable Acceleration of Heavy Ball beyond Quadratics for a Class of Polyak-Łojasiewicz Functions when the Non-Convexity is Averaged-Out

by   Jun-Kun Wang, et al.

Heavy Ball (HB) nowadays is one of the most popular momentum methods in non-convex optimization. It has been widely observed that incorporating the Heavy Ball dynamic in gradient-based methods accelerates the training process of modern machine learning models. However, the progress on establishing its theoretical foundation of acceleration is apparently far behind its empirical success. Existing provable acceleration results are of the quadratic or close-to-quadratic functions, as the current techniques of showing HB's acceleration are limited to the case when the Hessian is fixed. In this work, we develop some new techniques that help show acceleration beyond quadratics, which is achieved by analyzing how the change of the Hessian at two consecutive time points affects the convergence speed. Based on our technical results, a class of Polyak-Łojasiewicz (PL) optimization problems for which provable acceleration can be achieved via HB is identified. Moreover, our analysis demonstrates a benefit of adaptively setting the momentum parameter.


page 1

page 2

page 3

page 4


On the fast convergence of minibatch heavy ball momentum

Simple stochastic momentum methods are widely used in machine learning o...

Just a Momentum: Analytical Study of Momentum-Based Acceleration Methods in Paradigmatic High-Dimensional Non-Convex Problem

When optimizing over loss functions it is common practice to use momentu...

Average-case Acceleration Through Spectral Density Estimation

We develop a framework for designing optimal quadratic optimization meth...

Understanding Modern Techniques in Optimization: Frank-Wolfe, Nesterov's Momentum, and Polyak's Momentum

In the first part of this dissertation research, we develop a modular fr...

Acceleration Methods

This monograph covers some recent advances on a range of acceleration te...

Momentum Accelerated Multigrid Methods

In this paper, we propose two momentum accelerated MG cycles. The main i...

Quickly Finding a Benign Region via Heavy Ball Momentum in Non-Convex Optimization

The Heavy Ball Method, proposed by Polyak over five decades ago, is a fi...