QLAB: Quadratic Loss Approximation-Based Optimal Learning Rate for Deep Learning

02/01/2023
by Minghan Fu, et al.

We propose a learning rate adaptation scheme, called QLAB, for descent optimizers. We derive QLAB by optimizing a quadratic approximation of the loss function, and QLAB can be combined with any optimizer that provides a descent update direction. Computing an adaptive learning rate with QLAB requires only one extra evaluation of the loss function. We theoretically prove the convergence of descent optimizers equipped with QLAB. We demonstrate the effectiveness of QLAB on a range of optimization problems by combining it with stochastic gradient descent, stochastic gradient descent with momentum, and Adam. The performance is validated on multi-layer neural networks, CNN, VGG-Net, ResNet, and ShuffleNet with two datasets, MNIST and CIFAR10.
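The core idea of fitting a quadratic along the descent direction can be sketched as follows. This is a minimal illustration based only on the abstract, not the paper's exact algorithm: given the current loss, the directional derivative, and one extra loss evaluation at a trial step, we fit q(eta) = L0 + g*eta + a*eta^2 and take its minimizer as the learning rate. The function name `qlab_step_size` and the fallback behavior are assumptions for illustration.

```python
import numpy as np

def qlab_step_size(loss_fn, w, d, grad, trial=0.1):
    """Estimate the step size minimizing a quadratic fit of the loss
    along the descent direction d (a sketch of the idea; the paper's
    exact scheme may differ)."""
    l0 = loss_fn(w)
    g = float(np.dot(grad, d))          # directional derivative at eta = 0
    l_trial = loss_fn(w + trial * d)    # the single extra loss evaluation
    # Fit q(eta) = l0 + g*eta + a*eta^2 through the trial point.
    a = (l_trial - l0 - g * trial) / trial ** 2
    if a <= 0:                          # no positive curvature: keep trial step
        return trial
    return -g / (2.0 * a)               # minimizer of the quadratic

# Toy check: for L(w) = 0.5 ||w||^2 with d = -grad, the loss along eta
# is exactly quadratic and the optimal step is eta = 1.
loss = lambda v: 0.5 * float(np.dot(v, v))
w = np.array([3.0, -4.0])
grad = w
d = -grad
eta = qlab_step_size(loss, w, d, grad)
print(eta)  # 1.0 for this exactly quadratic loss
```

For a loss that is exactly quadratic along the search direction, the fitted step is the true minimizer; for general deep-learning losses it is only an approximation, which is why a convergence analysis is needed.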


