Just a Momentum: Analytical Study of Momentum-Based Acceleration Methods in Paradigmatic High-Dimensional Non-Convex Problem

02/23/2021
by   Stefano Sarao Mannelli, et al.
0

When optimizing over loss functions it is common practice to use momentum-based accelerated methods rather than vanilla gradient-based method. Despite widely applied to arbitrary loss function, their behaviour in generically non-convex, high dimensional landscapes is poorly understood. In this work we used dynamical mean field theory techniques to describe analytically the average behaviour of these methods in a prototypical non-convex model: the (spiked) matrix-tensor model. We derive a closed set of equations that describe the behaviours of several algorithms including heavy-ball momentum and Nesterov acceleration. Additionally we characterize the evolution of a mathematically equivalent physical system of massive particles relaxing toward the bottom of an energetic landscape. Under the correct mapping the two dynamics are equivalent and it can be noticed that having a large mass increases the effective time step of the heavy ball dynamics leading to a speed up.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
06/22/2022

Provable Acceleration of Heavy Ball beyond Quadratics for a Class of Polyak-Łojasiewicz Functions when the Non-Convexity is Averaged-Out

Heavy Ball (HB) nowadays is one of the most popular momentum methods in ...
research
10/04/2020

Quickly Finding a Benign Region via Heavy Ball Momentum in Non-Convex Optimization

The Heavy Ball Method, proposed by Polyak over five decades ago, is a fi...
research
07/13/2022

AdamNODEs: When Neural ODE Meets Adaptive Moment Estimation

Recent work by Xia et al. leveraged the continuous-limit of the classica...
research
07/28/2023

Minimal error momentum Bregman-Kaczmarz

The Bregman-Kaczmarz method is an iterative method which can solve stron...
research
03/01/2023

A Unified Momentum-based Paradigm of Decentralized SGD for Non-Convex Models and Heterogeneous Data

Emerging distributed applications recently boosted the development of de...
research
01/26/2022

Born-Infeld (BI) for AI: Energy-Conserving Descent (ECD) for Optimization

We introduce a novel framework for optimization based on energy-conservi...
research
06/07/2021

Dynamics of Stochastic Momentum Methods on Large-scale, Quadratic Models

We analyze a class of stochastic gradient algorithms with momentum on a ...

Please sign up or login with your details

Forgot password? Click here to reset