Automatic, Dynamic, and Nearly Optimal Learning Rate Specification by Local Quadratic Approximation

04/07/2020
by Yingqiu Zhu, et al.

In deep learning tasks, the learning rate determines the update step size at each iteration and therefore plays a critical role in gradient-based optimization. However, in practice the choice of an appropriate learning rate typically relies on subjective judgment. In this work, we propose a novel optimization method based on local quadratic approximation (LQA). At each update step, given the gradient direction, we locally approximate the loss function by a standard quadratic function of the learning rate. We then propose an approximation step that obtains a nearly optimal learning rate in a computationally efficient way. The proposed LQA method has three important features. First, the learning rate is determined automatically at each update step. Second, it is dynamically adjusted according to the current value of the loss function and the current parameter estimates. Third, with the gradient direction fixed, the proposed method yields nearly the greatest possible reduction in the loss function. Extensive experiments demonstrate the strengths of the proposed LQA method.
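To make the approximation step concrete, here is a minimal Python sketch of one LQA-style update. It fits a quadratic in the learning rate from two probe evaluations of the loss at ±delta0 along the negative gradient direction, then steps to the fitted minimizer. The helper name `lqa_step`, the probe width `delta0`, and the fallback rule are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np

def lqa_step(theta, loss_fn, grad, delta0=0.1):
    """One gradient step with the learning rate chosen by a local
    quadratic approximation of the loss along -grad (a sketch of LQA)."""
    # Loss as a function of the step size eta: phi(eta) = loss(theta - eta * grad).
    phi0 = loss_fn(theta)                       # phi(0)
    phi_plus = loss_fn(theta - delta0 * grad)   # phi(+delta0)
    phi_minus = loss_fn(theta + delta0 * grad)  # phi(-delta0)

    # Fit phi(eta) ~= phi0 + b*eta + a*eta^2 through the three evaluations.
    b = (phi_plus - phi_minus) / (2.0 * delta0)
    a = (phi_plus + phi_minus - 2.0 * phi0) / (2.0 * delta0 ** 2)

    if a > 0:
        eta = -b / (2.0 * a)  # minimizer of the fitted quadratic
    else:
        eta = delta0          # assumed fallback when the fit is not locally convex

    return theta - eta * grad, eta

# Toy usage: minimize 0.5 * ||theta||^2, whose gradient is theta itself.
# For an exactly quadratic loss the fit is exact, so one step reaches the minimum.
theta = np.array([3.0, -2.0])
loss = lambda t: 0.5 * float(t @ t)
for _ in range(3):
    theta, eta = lqa_step(theta, loss, grad=theta)
    print(f"eta={eta:.4f}, loss={loss(theta):.6f}")
```

Because only a few extra loss evaluations are needed per step, no Hessian is ever formed, which is what makes a nearly optimal step size computationally affordable in this sketch.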


