Adaptive gradient descent without descent

10/21/2019
by Yura Malitsky, et al.

We present a strikingly simple proof that two rules are sufficient to automate gradient descent: 1) don't increase the stepsize too fast and 2) don't overstep the local curvature. No functional values are needed, no line search, no information about the function except for its gradients. By following these rules, you get a method adaptive to the local geometry, with convergence guarantees depending only on smoothness in a neighborhood of a solution. Provided that the problem is convex, our method converges even if the global smoothness constant is infinite. As an illustration, it can minimize any twice continuously differentiable convex function. We examine its performance on a range of convex and nonconvex problems, including matrix factorization and training of ResNet-18.
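The two rules above can be sketched as a step-size update that uses only consecutive gradients: a growth cap that keeps the stepsize from increasing too fast, and a local inverse-curvature estimate built from the difference of iterates and gradients. The sketch below is a minimal illustration of this idea; the exact constants and the initialization are assumptions, not the paper's precise algorithm.

```python
import numpy as np

def adaptive_gd(grad, x0, n_iters=1000):
    """Sketch of gradient descent with an adaptive stepsize.

    Two rules (constants here are illustrative assumptions):
      1) growth cap:    lam <= sqrt(1 + theta) * lam_prev
      2) curvature cap: lam <= ||x - x_prev|| / (2 * ||g - g_prev||)
    Only gradients are used: no function values, no line search.
    """
    x_prev = np.asarray(x0, dtype=float)
    g_prev = grad(x_prev)
    lam = 1e-10            # tiny initial step; the rules grow it automatically
    theta = np.inf         # ratio lam_k / lam_{k-1}; inf disables the cap at k=0
    x = x_prev - lam * g_prev
    for _ in range(n_iters):
        g = grad(x)
        diff_x = np.linalg.norm(x - x_prev)
        diff_g = np.linalg.norm(g - g_prev)
        # local inverse-curvature estimate; guard against zero gradient change
        curv = diff_x / (2.0 * diff_g) if diff_g > 0 else np.inf
        lam_new = min(np.sqrt(1.0 + theta) * lam, curv)
        theta = lam_new / lam
        lam = lam_new
        x_prev, g_prev = x, g
        x = x - lam * g
    return x
```

On a simple ill-conditioned quadratic, the stepsize settles near the reciprocal of the local curvature without any tuning, e.g. `adaptive_gd(lambda x: A @ x, x0)` for a positive definite matrix `A` drives the iterates toward the minimizer at zero.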


