LRTuner: A Learning Rate Tuner for Deep Neural Networks

05/30/2021
by Nikhil Iyer, et al.

One very important hyperparameter for training deep neural networks is the learning rate schedule of the optimizer. The choice of learning rate schedule determines the computational cost of approaching a minimum, how closely the minimum is actually approached, and, most importantly, the kind of local minimum (wide or narrow) attained. The kind of minimum attained has a significant impact on the generalization accuracy of the network. Current systems employ hand-tuned learning rate schedules, which are painstakingly tuned for each network and dataset. Given that the state space of schedules is huge, finding a satisfactory learning rate schedule can be very time-consuming. In this paper, we present LRTuner, a method for tuning the learning rate as training proceeds. Our method works with any optimizer, and we demonstrate results on SGD with Momentum and Adam optimizers. We extensively evaluate LRTuner on multiple datasets, models, and across optimizers. We compare favorably against standard learning rate schedules for each dataset and model, including ImageNet on Resnet-50, Cifar-10 on Resnet-18, and SQuAD fine-tuning on BERT. For example, on ImageNet with Resnet-50, LRTuner shows up to 0.2% improvement in test accuracy over the hand-tuned baseline schedule. Moreover, LRTuner can achieve the same accuracy as the baseline schedule in 29% fewer optimization steps.
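To make the setting concrete, the sketch below shows the general idea of adjusting the learning rate step by step during training, assuming PyTorch with SGD + Momentum. The propose_lr function here is a hypothetical placeholder (linear warmup followed by cosine decay) standing in for the decision that LRTuner makes dynamically from observations gathered as training proceeds; it is not the paper's actual algorithm.

# Minimal sketch, assuming PyTorch with SGD + Momentum. `propose_lr` is a
# hypothetical placeholder rule, not LRTuner's method.
import math
import torch
import torch.nn as nn

model = nn.Linear(10, 2)                                    # toy model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

def propose_lr(step, base_lr=0.1, warmup=50, total_steps=1000):
    # Placeholder: linear warmup, then cosine decay. LRTuner instead chooses
    # the learning rate from measurements taken during training.
    if step < warmup:
        return base_lr * (step + 1) / warmup
    return 0.5 * base_lr * (1 + math.cos(math.pi * step / total_steps))

for step in range(1000):
    x = torch.randn(32, 10)                                 # dummy batch
    y = torch.randint(0, 2, (32,))
    for group in optimizer.param_groups:                    # set LR each step
        group["lr"] = propose_lr(step)
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

Because the learning rate is written into the optimizer's parameter groups before every step, the same per-step adjustment loop applies unchanged to Adam or any other optimizer, which is the sense in which such a tuner is optimizer-agnostic.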


