BFE and AdaBFE: A New Approach in Learning Rate Automation for Stochastic Optimization

07/06/2022
by Xin Cao, et al.

In this paper, a new gradient-based optimization approach that automatically adjusts the learning rate is proposed. The approach can be used to design both non-adaptive and adaptive learning rates. I first introduce the non-adaptive learning rate optimization method, Binary Forward Exploration (BFE), and then develop the corresponding adaptive per-parameter learning rate method, Adaptive BFE (AdaBFE). This approach offers an alternative way to optimize the learning rate on top of the stochastic gradient descent (SGD) algorithm, alongside existing non-adaptive learning rate methods such as SGD, momentum, and Nesterov, and adaptive learning rate methods such as AdaGrad, AdaDelta, and Adam. The purpose of developing this approach is not to beat the benchmarks of other methods but to offer a different perspective on optimizing gradient descent, although comparative studies with previous methods are presented in the following sections. The approach is intended to be heuristic and to inspire researchers to improve gradient-based optimization in combination with previous methods.
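The abstract names BFE and AdaBFE but does not spell out their update rules. As a purely illustrative sketch of the general idea of automatically adjusting the learning rate on top of SGD (not the paper's actual BFE algorithm), one could grow the step size when a trial step decreases the loss and shrink it when it does not; the function names and the doubling/halving factors below are assumptions chosen for illustration only.

    import numpy as np

    def sgd_with_forward_exploration(grad_fn, loss_fn, w0, lr=0.1, steps=100):
        """Illustrative SGD variant that adjusts the learning rate each step
        by a simple forward exploration: take a trial step, double the rate
        if the loss improves, halve it otherwise. This is a sketch of the
        general automatic-learning-rate idea, not the paper's BFE rule."""
        w, cur_loss = w0.copy(), loss_fn(w0)
        for _ in range(steps):
            g = grad_fn(w)
            trial = w - lr * g                # forward exploration step
            trial_loss = loss_fn(trial)
            if trial_loss < cur_loss:         # step helped: accept it and grow the rate
                w, cur_loss = trial, trial_loss
                lr *= 2.0
            else:                             # step overshot: shrink the rate, retry next iteration
                lr *= 0.5
        return w, lr

    # Usage on a toy quadratic objective f(w) = 0.5 * ||w||^2
    if __name__ == "__main__":
        loss_fn = lambda w: 0.5 * float(w @ w)
        grad_fn = lambda w: w
        w_opt, final_lr = sgd_with_forward_exploration(grad_fn, loss_fn, np.ones(5))
        print(f"final loss={loss_fn(w_opt):.2e}, final lr={final_lr:.3f}")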


Related research

07/09/2022
Improved Binary Forward Exploration: Learning Rate Scheduling Method for Stochastic Optimization
A new gradient-based optimization approach by automatically scheduling t...

04/20/2023
Angle based dynamic learning rate for gradient descent
In our work, we propose a novel yet simple approach to obtain an adaptiv...

09/05/2017
Stochastic Gradient Descent: Going As Fast As Possible But Not Faster
When applied to training deep neural networks, stochastic gradient desce...

04/02/2022
AdaSmooth: An Adaptive Learning Rate Method based on Effective Ratio
It is well known that we need to choose the hyper-parameters in Momentum...

09/12/2023
ELRA: Exponential learning rate adaption gradient descent optimization method
We present a novel, fast (exponential rate adaption), ab initio (hyper-p...

07/06/2020
TDprop: Does Jacobi Preconditioning Help Temporal Difference Learning?
We investigate whether Jacobi preconditioning, accounting for the bootst...

09/21/2019
Using Statistics to Automate Stochastic Optimization
Despite the development of numerous adaptive optimizers, tuning the lear...
