Provably Faster Gradient Descent via Long Steps

by Benjamin Grimmer, et al.
Johns Hopkins University

This work establishes provably faster convergence rates for gradient descent in smooth convex optimization via a computer-assisted analysis technique. Our theory allows nonconstant stepsize policies with frequent long steps, potentially violating descent, by analyzing the overall effect of many iterations at once rather than the typical one-iteration inductions used in most first-order method analyses. We show that long steps, which may increase the objective value in the short term, lead to provably faster convergence in the long term. A conjecture towards proving a faster O(1/(T log T)) rate for gradient descent is also motivated, along with simple numerical validation.
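To make the idea of a nonconstant stepsize policy concrete, here is a minimal Python sketch of gradient descent that cycles through a fixed pattern of normalized stepsizes h_k, taking steps of length h_k / L. The toy quadratic, the helper names (`gradient_descent`, `f`, `grad_f`), and the pattern (2.9, 1.5) are assumptions chosen for illustration only; they are not the certified patterns from the paper. The run shows the behavior the abstract describes: the long step raises the objective at first, yet the value falls over the full run.

```python
def gradient_descent(grad, x0, pattern, smoothness, num_steps):
    """Gradient descent with stepsizes h_k / smoothness, cycling through `pattern`."""
    x = list(x0)
    iterates = [list(x)]
    for k in range(num_steps):
        h = pattern[k % len(pattern)]  # normalized stepsize for this iteration
        g = grad(x)
        x = [xi - (h / smoothness) * gi for xi, gi in zip(x, g)]
        iterates.append(list(x))
    return iterates


# Hypothetical toy objective: f(x) = 0.5 * sum(c_i * x_i^2), which is L-smooth
# and convex with L = max(c_i).
CURVATURES = [1.0, 0.5, 0.1]
L_SMOOTH = max(CURVATURES)


def f(x):
    return 0.5 * sum(c * xi * xi for c, xi in zip(CURVATURES, x))


def grad_f(x):
    return [c * xi for c, xi in zip(CURVATURES, x)]


X0 = [1.0, 1.0, 1.0]
# Illustrative pattern (an assumption): a long step of 2.9/L, then a step of 1.5/L.
long_iterates = gradient_descent(grad_f, X0, [2.9, 1.5], L_SMOOTH, 40)
long_values = [f(x) for x in long_iterates]

if __name__ == "__main__":
    print("initial objective:      ", long_values[0])
    print("after first long step:  ", long_values[1])
    print("after 40 steps:         ", long_values[-1])
```

A single run on one quadratic says nothing about worst-case rates; it only illustrates why an analysis of long steps has to look at several iterations at once instead of requiring descent at every step.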



Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent

Nesterov's accelerated gradient descent (AGD), an instance of the genera...

AdaX: Adaptive Gradient Descent with Exponential Long Term Memory

Although adaptive optimization algorithms such as Adam show fast converg...

Convex optimization over a probability simplex

We propose a new iteration scheme, the Cauchy-Simplex, to optimize conve...

The Power of Normalization: Faster Evasion of Saddle Points

A commonly used heuristic in non-convex optimization is Normalized Gradi...

Survey Descent: A Multipoint Generalization of Gradient Descent for Nonsmooth Optimization

For strongly convex objectives that are smooth, the classical theory of ...

Exponential convergence of Sobolev gradient descent for a class of nonlinear eigenproblems

We propose to use the Łojasiewicz inequality as a general tool for analy...

Convergence rates of the stochastic alternating algorithm for bi-objective optimization

Stochastic alternating algorithms for bi-objective optimization are cons...

Code Repositories


Certificates proving the convergence rates claimed in Table 1 of the (forthcoming) paper "Provably Faster Gradient Descent via Long Steps" by Benjamin Grimmer. The Mathematica notebooks include everything in rational form, along with exact-arithmetic computations verifying all of the needed (spectral) properties of the certificates.
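As a rough illustration of what an exact-arithmetic spectral check can look like, the sketch below tests whether a symmetric matrix with rational entries is positive semidefinite, using Python's `fractions.Fraction` so that no rounding occurs. This is an assumption about the kind of property being verified, not a port of the repository's Mathematica notebooks; the function name `is_psd_exact` and the example matrices are hypothetical.

```python
from fractions import Fraction


def is_psd_exact(matrix):
    """Return True if a symmetric rational matrix is positive semidefinite.

    Uses symmetric Gaussian elimination (Schur complements) in exact arithmetic.
    """
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    for i in range(n):
        for j in range(n):
            if a[i][j] != a[j][i]:
                raise ValueError("matrix must be symmetric")
    for k in range(n):
        pivot = a[k][k]
        if pivot < 0:
            return False
        if pivot == 0:
            # In a PSD matrix, a zero diagonal entry forces its whole row to be zero.
            if any(a[k][j] != 0 for j in range(k + 1, n)):
                return False
            continue
        # Replace the trailing block by its Schur complement with respect to the pivot.
        for i in range(k + 1, n):
            factor = a[i][k] / pivot
            for j in range(k + 1, n):
                a[i][j] -= factor * a[k][j]
    return True


if __name__ == "__main__":
    third = Fraction(1, 3)
    print(is_psd_exact([[2, -1], [-1, 2]]))                # positive definite
    print(is_psd_exact([[third, third], [third, third]]))  # singular but PSD
    print(is_psd_exact([[1, 2], [2, 1]]))                  # indefinite
```

Because every comparison is made on exact rationals, a borderline case such as a singular matrix is decided without any numerical tolerance.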

