
How Does Momentum Help Frank Wolfe?
We unveil the connections between Frank Wolfe (FW) type algorithms and the momentum in Accelerated Gradient Methods (AGM). On the negative side, these connections illustrate why momentum is unlikely to be effective for FW type algorithms in general. The encouraging message, on the other hand, is that momentum is useful for FW on a class of problems. In particular, we prove that a momentum variant of FW, which we term accelerated Frank Wolfe (AFW), converges at the faster rate Õ(1/k^2) on certain constraint sets, while retaining the same O(1/k) rate as FW in the general case. Given that AFW achieves this possible acceleration at almost no extra cost, it is a competitive alternative to FW. Numerical experiments on benchmark machine learning tasks validate our theoretical findings.
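The abstract does not spell out the AFW update, but the general idea of adding momentum to FW can be sketched as follows: instead of calling the linear minimization oracle (LMO) on the current gradient, call it on a running average of past gradients. The sketch below, over the probability simplex, is a minimal illustration of this idea under that assumption; it is not the paper's exact AFW algorithm, and the `momentum` parameter and quadratic objective are hypothetical choices for demonstration.

```python
import numpy as np

def lmo_simplex(g):
    """LMO over the probability simplex: argmin_{s in simplex} <g, s>
    is the vertex at the smallest coordinate of g."""
    s = np.zeros_like(g)
    s[np.argmin(g)] = 1.0
    return s

def frank_wolfe(grad, x0, steps=200, momentum=0.0):
    """Frank-Wolfe over the simplex. With momentum > 0 the LMO is applied
    to an exponential average of past gradients -- a hypothetical momentum
    variant for illustration, not the paper's exact AFW update."""
    x, d = x0.copy(), np.zeros_like(x0)
    for k in range(1, steps + 1):
        g = grad(x)
        d = momentum * d + (1 - momentum) * g   # averaged gradient direction
        s = lmo_simplex(d)                      # vertex minimizing <d, s>
        gamma = 2.0 / (k + 2)                   # standard open-loop step size
        x = (1 - gamma) * x + gamma * s         # convex combination stays feasible
    return x

# Quadratic f(x) = 0.5 * ||x - b||^2 with the minimizer b inside the simplex.
b = np.array([0.2, 0.5, 0.3])
grad = lambda x: x - b
x0 = np.ones(3) / 3
x_fw = frank_wolfe(grad, x0)                    # plain FW
x_afw = frank_wolfe(grad, x0, momentum=0.9)     # momentum variant
```

Because every iterate is a convex combination of simplex vertices, both runs remain feasible without any projection, which is the main practical appeal of FW-type methods; the momentum variant changes only the direction fed to the LMO, so the per-iteration cost is essentially unchanged.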