Super-Convergence with an Unstable Learning Rate

02/22/2021
by Samet Oymak, et al.

Conventional wisdom dictates that the learning rate should be in the stable regime so that gradient-based algorithms do not blow up. This note introduces a simple scenario where an unstable learning rate scheme leads to super-fast convergence, with the convergence rate depending only logarithmically on the condition number of the problem. Our scheme uses a Cyclical Learning Rate (CLR) in which we periodically take one large unstable step and several small stable steps to compensate for the instability. These findings also help explain the empirical observations of [Smith and Topin, 2019], who claim that CLR with a large maximum learning rate leads to "super-convergence". We prove that our scheme excels in problems where the Hessian exhibits a bimodal spectrum and the eigenvalues can be grouped into two clusters (small and large). The unstable step is the key to enabling fast convergence over the small eigen-spectrum.
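The mechanism described in the abstract can be illustrated on a toy quadratic. The sketch below is not the authors' exact schedule: the eigenvalues, the two step sizes, the cycle length of one large step plus three small steps, and the helper names `gradient_descent` and `norm` are all assumptions chosen for illustration. It runs gradient descent on f(x) = 0.5 * sum(lam_i * x_i^2), whose Hessian has a bimodal spectrum, and compares a cyclical schedule against a constant stable step size over the same number of iterations.

```python
def gradient_descent(eigenvalues, x0, step_sizes):
    """Run GD on f(x) = 0.5 * sum(lam_i * x_i**2); return the final iterate."""
    x = list(x0)
    for eta in step_sizes:
        # In coordinate i the gradient is lam_i * x_i, so each step
        # multiplies x_i by (1 - eta * lam_i).
        x = [xi - eta * lam * xi for lam, xi in zip(eigenvalues, x)]
    return x


def norm(x):
    """Euclidean norm of a list of floats."""
    return sum(xi * xi for xi in x) ** 0.5


# Assumed bimodal spectrum: a small cluster near 0.01 and a large cluster near 1.
eigenvalues = [0.009, 0.01, 0.9, 1.0]
x0 = [1.0, 1.0, 1.0, 1.0]

# Stable step: 1 / (largest eigenvalue). GD on this quadratic is stable
# only for eta < 2 / max(eigenvalues) = 2.
eta_small = 1.0 / max(eigenvalues)
# Unstable step: tuned to the small cluster, far above the stability limit.
eta_large = 1.0 / 0.01

# One cycle: a single unstable step, then three stable steps that shrink
# the large-eigenvalue coordinates the unstable step just amplified.
cycle = [eta_large] + [eta_small] * 3
num_cycles = 5
cyclical_steps = cycle * num_cycles
constant_steps = [eta_small] * len(cyclical_steps)

err_after_one_large_step = norm(gradient_descent(eigenvalues, x0, [eta_large]))
err_cyclical = norm(gradient_descent(eigenvalues, x0, cyclical_steps))
err_constant = norm(gradient_descent(eigenvalues, x0, constant_steps))

print("after one unstable step:", err_after_one_large_step)
print("cyclical schedule, 20 steps:", err_cyclical)
print("constant stable step, 20 steps:", err_constant)
```

In this toy setting the single large step temporarily inflates the error along the large eigenvalues, the following small steps remove that error, and the net effect per cycle is a strong contraction on both clusters. With the constant stable step, the small-eigenvalue coordinates shrink by only about one percent per iteration, so the error after the same 20 steps stays close to its initial value.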


research
01/22/2019

DTN: A Learning Rate Scheme with Convergence Rate of O(1/t) for SGD

We propose a novel diminishing learning rate scheme, coined Decreasing-T...
research
08/23/2017

Super-Convergence: Very Fast Training of Residual Networks Using Large Learning Rates

In this paper, we show a phenomenon, which we named "super-convergence",...
research
10/07/2021

Large Learning Rate Tames Homogeneity: Convergence and Balancing Effect

Recent empirical advances show that training deep models with large lear...
research
04/02/2021

Neurons learn slower than they think

Recent studies revealed complex convergence dynamics in gradient-based m...
research
05/20/2016

Convergence of Contrastive Divergence with Annealed Learning Rate in Exponential Family

In our recent paper, we showed that in exponential family, contrastive d...
research
05/11/2023

On the convergence of the MLE as an estimator of the learning rate in the Exp3 algorithm

When fitting the learning data of an individual to algorithm-like learni...
research
08/31/2020

Super-linear convergence in the p-adic QR-algorithm

The QR-algorithm is one of the most important algorithms in linear algeb...
