This work establishes provably faster convergence rates for gradient descent
in smooth convex optimization via a computer-assisted analysis technique. Our
theory allows nonconstant stepsize policies with frequent long steps that may
violate descent, established by analyzing the overall effect of many iterations
at once rather than by the typical one-iteration inductions used in most
first-order method analyses. We show that long steps, which may increase
the objective value in the short term, lead to provably faster convergence in
the long term. A conjecture towards proving a faster O(1/(T log T)) rate for
gradient descent is also motivated along with simple numerical validation.
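As a minimal illustrative sketch of the long-step idea (not the certified stepsize pattern from the paper), the following runs gradient descent on a simple convex quadratic with a repeating stepsize cycle mixing short steps with one long step of normalized length greater than 2. The matrix, pattern, and cycle count are illustrative choices: the objective can rise at a long step, yet the iterates still converge.

```python
import numpy as np

def gd_with_pattern(A, x0, pattern, cycles):
    """Gradient descent on f(x) = 0.5 x^T A x with a repeating
    stepsize pattern (stepsizes normalized by the smoothness constant L),
    recording the objective value at every iterate."""
    f = lambda x: 0.5 * x @ A @ x
    L = np.max(np.linalg.eigvalsh(A))  # smoothness constant of f
    x = x0.copy()
    history = [f(x)]
    for _ in range(cycles):
        for h in pattern:
            x = x - (h / L) * (A @ x)  # gradient of f is A x
            history.append(f(x))
    return x, history

# Illustrative problem and hypothetical cycle: three short steps,
# then one long step of normalized length 4 > 2.
A = np.diag([1.0, 0.25])
x0 = np.array([1.0, 1.0])
pattern = [1.5, 1.5, 1.5, 4.0]
x, hist = gd_with_pattern(A, x0, pattern, cycles=25)
```

On this example, the objective increases at some iterations (the long steps violate descent) while the final objective value is still driven near zero, matching the short-term/long-term tradeoff described above.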
Certificates proving the convergence rates claimed in Table 1 of the (forthcoming) paper "Provably Faster Gradient Descent via Long Steps" by Benjamin Grimmer. The Mathematica notebooks include everything in rational form, with exact-arithmetic computations verifying all of the needed (spectral) properties of the certificates.