Exploring loss function topology with cyclical learning rates

02/14/2017
by Leslie N. Smith, et al.

We present observations and discussion of previously unreported phenomena discovered while training residual networks. The goal of this work is to better understand the nature of neural networks through the examination of these new empirical results. These behaviors were identified through the application of Cyclical Learning Rates (CLR) and linear network interpolation. Among these behaviors are counterintuitive increases and decreases in training loss and instances of rapid training. For example, we demonstrate how CLR can produce greater testing accuracy than traditional training despite using large learning rates. Files to replicate these results are available at https://github.com/lnsmith54/exploring-loss
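
As a rough, self-contained sketch of the two tools named in the abstract, the NumPy snippet below implements a triangular cyclical learning rate schedule and linear interpolation between two sets of network parameters. The function names, hyperparameter values (base_lr, max_lr, step_size), and placeholder weight vectors are illustrative assumptions, not code or settings from the paper or its repository.

```python
import numpy as np

def triangular_clr(iteration, step_size, base_lr=0.001, max_lr=0.1):
    # Triangular CLR policy: the learning rate ramps linearly from base_lr
    # up to max_lr and back down over one cycle of 2 * step_size iterations.
    cycle = np.floor(1 + iteration / (2 * step_size))
    x = np.abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)

def interpolate_weights(theta_a, theta_b, alpha):
    # Linear network interpolation: theta(alpha) = (1 - alpha) * theta_a + alpha * theta_b.
    # Evaluating the training loss at each theta(alpha) traces a 1-D slice
    # of the loss surface between two solutions.
    return [(1.0 - alpha) * wa + alpha * wb for wa, wb in zip(theta_a, theta_b)]

if __name__ == "__main__":
    # Print the learning rate over one full cycle.
    step_size = 4
    for it in range(2 * step_size + 1):
        print(f"iter {it:2d}  lr = {triangular_clr(it, step_size):.4f}")

    # Toy weight vectors standing in for two trained networks.
    theta_a = [np.zeros(3)]
    theta_b = [np.ones(3)]
    for alpha in np.linspace(0.0, 1.0, 5):
        print(alpha, interpolate_weights(theta_a, theta_b, alpha)[0])
```

In practice, the placeholder vectors would be replaced by the parameters of two trained checkpoints, and the training loss evaluated at each interpolated point to trace a one-dimensional slice of the loss surface between them.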
