UAdam: Unified Adam-Type Algorithmic Framework for Non-Convex Stochastic Optimization

05/09/2023
by Yiming Jiang, et al.

Adam-type algorithms have become a preferred choice for optimization in deep learning; however, despite their success, their convergence is still not well understood. To this end, we introduce a unified framework for Adam-type algorithms, called UAdam. It is equipped with a general form of the second-order moment, which allows Adam and its variants, such as NAdam, AMSGrad, AdaBound, AdaFom, and Adan, to be recovered as special cases. The framework is supported by a rigorous convergence analysis of UAdam in the non-convex stochastic setting, showing that UAdam converges to a neighborhood of stationary points at a rate of 𝒪(1/T), and that the size of this neighborhood decreases as the momentum factor β increases. Importantly, our analysis only requires the first-order momentum factor to be close enough to 1, without any restrictions on the second-order momentum factor. The theoretical results also show that vanilla Adam can converge when its hyperparameters are selected appropriately, which provides a theoretical guarantee for the analysis, applications, and further development of the whole class of Adam-type algorithms.
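To make the "general second-order moment" idea concrete, below is a minimal Python sketch of a generic Adam-type step in which the second-moment rule is supplied as a pluggable function, so variants like Adam and AMSGrad differ only in that rule. The function names and the simplified recursion (bias correction is omitted) are illustrative assumptions for this sketch, not the exact UAdam update from the paper.

import numpy as np

def adam_step(x, m, state, grad, second_moment, lr=1e-3, beta1=0.9, eps=1e-8):
    # One generic Adam-type step; `second_moment` defines the variant.
    m = beta1 * m + (1 - beta1) * grad      # first-order momentum: EMA of gradients
    v, state = second_moment(state, grad)   # general second-order moment
    x = x - lr * m / (np.sqrt(v) + eps)     # adaptively preconditioned update
    return x, m, state

def adam_v(state, grad, beta2=0.999):
    # Vanilla Adam: EMA of squared gradients.
    v = beta2 * state + (1 - beta2) * grad ** 2
    return v, v

def amsgrad_v(state, grad, beta2=0.999):
    # AMSGrad: precondition with the running max of the Adam EMA,
    # so the effective step size never increases.
    v, v_max = state
    v = beta2 * v + (1 - beta2) * grad ** 2
    v_max = np.maximum(v_max, v)
    return v_max, (v, v_max)

# Toy usage: minimize f(x) = 0.5 * ||x||^2, whose gradient at x is x.
x, m, state = np.array([5.0, -3.0]), np.zeros(2), np.zeros(2)
for _ in range(2000):
    x, m, state = adam_step(x, m, state, x, adam_v, lr=0.05)
print(x)  # approaches the stationary point [0, 0]

Swapping adam_v for amsgrad_v (with state initialized as a pair of zero arrays) changes only the second-moment rule, which is exactly the degree of freedom the unified framework parameterizes.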


Related research

07/20/2023 · Convergence of Adam for Non-convex Objectives: Relaxed Hyperparameters and Non-ergodic Case
Adam is a commonly used stochastic optimization algorithm in machine lea...

08/20/2022 · Adam Can Converge Without Any Modification on Update Rules
Ever since Reddi et al. 2018 pointed out the divergence issue of Adam, m...

04/12/2016 · Unified Convergence Analysis of Stochastic Momentum Methods for Convex and Non-convex Optimization
Recently, stochastic momentum methods have been widely adopted in train...

11/23/2018 · A Sufficient Condition for Convergences of Adam and RMSProp
Adam and RMSProp, as two of the most influential adaptive stochastic alg...

09/18/2023 · Fitchean Ignorance and First-order Ignorance: A Neighborhood Look
In a seminal work, Fine classifies several forms of ignorance, am...

02/12/2022 · From Online Optimization to PID Controllers: Mirror Descent with Momentum
We study a family of first-order methods with momentum based on mirror d...

11/03/2017 · First-order Stochastic Algorithms for Escaping From Saddle Points in Almost Linear Time
Two classes of methods have been proposed for escaping from saddle point...
