ELRA: Exponential learning rate adaption gradient descent optimization method

09/12/2023
by Alexander Kleinsorge, et al.

We present a novel, fast (exponential rate adaption), ab initio (hyper-parameter-free) gradient-based optimization algorithm. The main idea of the method is to adapt the learning rate α through situational awareness, mainly striving for orthogonal neighboring gradients. The method has a high success and fast convergence rate and does not rely on hand-tuned parameters, giving it greater universality. It can be applied to problems of any dimension n and scales only linearly (of order O(n)) with the dimension of the problem. It optimizes convex and non-convex continuous landscapes that provide some kind of gradient. In contrast to the Ada-family (AdaGrad, AdaMax, AdaDelta, Adam, etc.), the method is rotation invariant: optimization path and performance are independent of coordinate choices. The impressive performance is demonstrated by extensive experiments on the MNIST benchmark data-set against state-of-the-art optimizers. We name this new class of optimizers after its core idea: Exponential Learning Rate Adaption - ELRA. We present it in two variants, c2min and p2min, with slightly different control. The authors strongly believe that ELRA will open a completely new research direction for gradient descent optimization.
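To make the core idea concrete, here is a minimal, hypothetical Python sketch of a cosine-driven exponential learning-rate rule: it grows α while consecutive gradients still agree and shrinks it when they oppose each other, so the step size settles where neighboring gradients are roughly orthogonal. The function name elra_like_descent, the growth factor, and the toy quadratic are illustrative assumptions; the paper's actual c2min and p2min control rules are not reproduced here.

```python
import numpy as np

def elra_like_descent(grad_fn, x0, alpha0=1e-3, growth=2.0, steps=300):
    """Hypothetical sketch of the ELRA core idea (not the published
    c2min/p2min rules): multiply the learning rate by growth**cos, where
    cos is the cosine between consecutive gradients. Parallel gradients
    grow alpha, opposing gradients shrink it, orthogonal gradients leave
    it unchanged. Uses only dot products and norms, so the update is
    rotation invariant and costs O(n) per step."""
    x = np.asarray(x0, dtype=float)
    alpha = alpha0
    g_prev = grad_fn(x)
    for _ in range(steps):
        x = x - alpha * g_prev          # plain gradient step
        g = grad_fn(x)
        cos = g @ g_prev / (np.linalg.norm(g) * np.linalg.norm(g_prev) + 1e-12)
        alpha *= growth ** cos          # exponential learning-rate adaption
        g_prev = g
    return x

# Toy usage on an ill-conditioned convex quadratic f(x) = 0.5 * x^T A x.
A = np.diag([1.0, 4.0, 16.0])
x_star = elra_like_descent(lambda x: A @ x, x0=np.ones(3))
print(x_star)  # should move close to the minimizer at the origin
```

In this sketch the learning rate adapts itself starting from an arbitrary small value, which mirrors the hyper-parameter-free claim: no tuned schedule is required, only a rule that reacts to how consecutive gradients relate to each other.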

research 07/06/2022
BFE and AdaBFE: A New Approach in Learning Rate Automation for Stochastic Optimization
In this paper, a new gradient-based optimization approach by automatical...

research 05/25/2023
DoWG Unleashed: An Efficient Universal Parameter-Free Gradient Descent Method
This paper proposes a new easy-to-implement parameter-free gradient-base...

research 04/21/2020
AdaX: Adaptive Gradient Descent with Exponential Long Term Memory
Although adaptive optimization algorithms such as Adam show fast converg...

research 06/11/2023
Parameter-free version of Adaptive Gradient Methods for Strongly-Convex Functions
The optimal learning rate for adaptive gradient methods applied to λ-str...

research 01/10/2020
Tangent-Space Gradient Optimization of Tensor Network for Machine Learning
The gradient-based optimization method for deep machine learning models ...

research 09/27/2022
The Curse of Unrolling: Rate of Differentiating Through Optimization
Computing the Jacobian of the solution of an optimization problem is a c...

research 11/30/2019
Learning Rate Dropout
The performance of a deep neural network is highly dependent on its trai...
