On the Influence of Momentum Acceleration on Online Learning

03/14/2016
by Kun Yuan, et al.

The article examines in some detail the convergence rate and mean-square-error performance of momentum stochastic gradient methods in the constant step-size and slow adaptation regime. The results establish that momentum methods are equivalent to the standard stochastic gradient method with a re-scaled (larger) step-size value. The size of the re-scaling is determined by the value of the momentum parameter. The equivalence result is established for all time instants and not only in steady-state. The analysis is carried out for general strongly convex and smooth risk functions, and is not limited to quadratic risks. One notable conclusion is that the well-known benefits of momentum constructions for deterministic optimization problems do not necessarily carry over to the adaptive online setting when small constant step-sizes are used to enable continuous adaptation and learning in the presence of persistent gradient noise. From simulations, the equivalence between momentum and standard stochastic gradient methods is also observed for non-differentiable and non-convex problems.
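The equivalence described above can be illustrated on a toy problem: heavy-ball momentum with step-size μ and momentum parameter β behaves, in the small step-size regime, like standard stochastic gradient with the re-scaled step-size μ/(1−β). The sketch below is illustrative only; the 1-D quadratic risk, the specific values of μ, β, and the noise level are assumptions for demonstration, not taken from the paper.

```python
import numpy as np

# Toy 1-D quadratic risk J(w) = E[(w - 1)^2] / 2, minimized at w* = 1,
# with additive gradient noise. Parameter choices are illustrative.
rng = np.random.default_rng(0)
mu, beta, sigma = 0.01, 0.9, 0.1   # small constant step-size, momentum, noise std
n_iter = 5000

def noisy_grad(w):
    """Stochastic gradient of J at w: true gradient (w - 1) plus noise."""
    return (w - 1.0) + sigma * rng.standard_normal()

# Heavy-ball momentum stochastic gradient:
#   m_k = beta * m_{k-1} + grad_k,  w_k = w_{k-1} - mu * m_k
w_mom, m = 0.0, 0.0
for _ in range(n_iter):
    m = beta * m + noisy_grad(w_mom)
    w_mom -= mu * m

# Standard stochastic gradient with the re-scaled step-size mu / (1 - beta)
w_sgd = 0.0
for _ in range(n_iter):
    w_sgd -= (mu / (1.0 - beta)) * noisy_grad(w_sgd)

print(w_mom, w_sgd)  # both iterates hover around the minimizer w* = 1
```

With a small μ, both iterates fluctuate around the same minimizer with comparable steady-state spread, consistent with the paper's conclusion that momentum acts like a larger effective step-size rather than providing genuine acceleration under persistent gradient noise.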


research
02/13/2020

Convergence of a Stochastic Gradient Method with Momentum for Nonsmooth Nonconvex Optimization

Stochastic gradient methods with momentum are widely used in application...
research
08/25/2022

Accelerated Sparse Recovery via Gradient Descent with Nonlinear Conjugate Gradient Momentum

This paper applies an idea of adaptive momentum for the nonlinear conjug...
research
03/21/2018

Stochastic Learning under Random Reshuffling

In empirical risk optimization, it has been observed that stochastic gra...
research
04/20/2017

Performance Limits of Stochastic Sub-Gradient Learning, Part II: Multi-Agent Case

The analysis in Part I revealed interesting properties for subgradient l...
research
03/22/2022

Resonance in Weight Space: Covariate Shift Can Drive Divergence of SGD with Momentum

Most convergence guarantees for stochastic gradient descent with momentu...
research
10/29/2021

Does Momentum Help? A Sample Complexity Analysis

Momentum methods are popularly used in accelerating stochastic iterative...
research
02/12/2022

From Online Optimization to PID Controllers: Mirror Descent with Momentum

We study a family of first-order methods with momentum based on mirror d...
