
How Does Momentum Help Frank Wolfe?

06/19/2020
by   Bingcong Li, et al.

We unveil the connections between Frank Wolfe (FW) type algorithms and the momentum used in Accelerated Gradient Methods (AGM). On the negative side, these connections illustrate why momentum is unlikely to be effective for FW type algorithms in general. The encouraging message behind this link, on the other hand, is that momentum is useful for FW on a class of problems. In particular, we prove that a momentum variant of FW, which we term Accelerated Frank Wolfe (AFW), converges at a faster rate of Õ(1/k²) on certain constraint sets, while maintaining the same O(1/k) rate as FW in the general case. Given that the possible acceleration of AFW comes at almost no extra cost, it is a competitive alternative to FW. Numerical experiments on benchmark machine learning tasks further validate our theoretical findings.
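The difference between vanilla FW and a momentum variant can be sketched as follows. The exact AFW update rule is given in the paper and is not reproduced here; the momentum step below (applying the linear minimization oracle to a weighted running average of past gradients, in the spirit of AGM) is an illustrative assumption only, shown on the probability simplex with a simple quadratic objective.

```python
import numpy as np

def lmo_simplex(g):
    """Linear minimization oracle over the probability simplex:
    argmin over the simplex of <g, s> is attained at a vertex
    (a standard basis vector), so no projection is ever needed."""
    s = np.zeros_like(g)
    s[np.argmin(g)] = 1.0
    return s

def frank_wolfe(grad, x0, steps=200):
    """Vanilla FW with the classic step size gamma_k = 2/(k+2)."""
    x = x0.copy()
    for k in range(steps):
        s = lmo_simplex(grad(x))          # call LMO on the current gradient
        gamma = 2.0 / (k + 2)
        x = (1 - gamma) * x + gamma * s   # convex combination stays feasible
    return x

def momentum_fw(grad, x0, steps=200):
    """Hypothetical momentum variant: the LMO is applied to a weighted
    average of past gradients rather than the current gradient.
    The weights here are illustrative, not the paper's AFW schedule."""
    x = x0.copy()
    d = np.zeros_like(x0)                 # gradient momentum buffer
    for k in range(steps):
        delta = 2.0 / (k + 2)             # averaging weight (assumption)
        d = (1 - delta) * d + delta * grad(x)
        s = lmo_simplex(d)                # call LMO on the averaged gradient
        gamma = 2.0 / (k + 2)
        x = (1 - gamma) * x + gamma * s
    return x
```

Both iterations cost one gradient evaluation and one LMO call per step, which is why the abstract describes the acceleration as coming at almost no extra cost.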

