Gradient-only line searches: An Alternative to Probabilistic Line Searches

03/22/2019
by Dominic Kafka, et al.

Step sizes in neural network training are largely determined by predetermined rules such as fixed learning rates and learning rate schedules, which require user input to select their functional form and associated hyperparameters. Global optimization strategies to resolve these hyperparameters are computationally expensive, whereas line searches can adaptively resolve learning rate schedules. However, line searches have largely fallen out of favor due to the discontinuities induced by mini-batch sampling. Notwithstanding, probabilistic line searches have recently demonstrated viability in resolving learning rates for stochastic loss functions: they construct surrogates with confidence intervals and restrict the rate at which the search domain can grow along a search direction. This paper introduces Gradient-Only Line Searches that are inexact (GOLS-I) as an alternative strategy to automatically resolve learning rates in stochastic cost functions, over a range of 15 orders of magnitude, without the use of surrogates. We show that GOLS-I is a competitive strategy for reliably resolving step sizes, performing well while being easy to implement. In the context of mini-batch sampling, we open the discussion on how to split the effort between resolving quality search directions and resolving quality step size estimates along a search direction.
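To make the core idea concrete, below is a minimal sketch of an inexact gradient-only line search in the spirit of GOLS-I: the step size grows or shrinks by a constant factor until the directional derivative along the search direction changes sign, locating a stochastic non-negative associated gradient projection point without any surrogate model. The function name gols_i, the growth factor eta, the step bounds alpha_min and alpha_max, and the noisy quadratic toy problem are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def gols_i(x, d, grad_fn, alpha0=1.0, eta=2.0,
           alpha_min=1e-8, alpha_max=1e7, max_evals=50):
    """Inexact gradient-only line search (illustrative sketch).

    Grows or shrinks the step size by a constant factor until the
    directional derivative along d flips sign, i.e. until a stochastic
    non-negative associated gradient projection point is bracketed.
    """
    alpha = alpha0
    dd = grad_fn(x + alpha * d) @ d  # directional derivative at alpha
    if dd < 0.0:
        # Derivative negative: the step is conservative; grow it until the
        # directional derivative becomes non-negative.
        while dd < 0.0 and alpha * eta <= alpha_max and max_evals > 0:
            alpha *= eta
            dd = grad_fn(x + alpha * d) @ d
            max_evals -= 1
    else:
        # Derivative non-negative: the step overshoots; shrink it until the
        # directional derivative becomes negative again.
        while dd >= 0.0 and alpha / eta >= alpha_min and max_evals > 0:
            alpha /= eta
            dd = grad_fn(x + alpha * d) @ d
            max_evals -= 1
    return alpha

# Toy usage: noisy quadratic loss L(x) = 0.5 x^T A x, whose perturbed
# gradient mimics mini-batch sampling noise.
rng = np.random.default_rng(0)
A = np.diag([1.0, 10.0])
grad_fn = lambda x: A @ x + 0.01 * rng.standard_normal(2)

x = np.array([5.0, 5.0])
for _ in range(20):
    d = -grad_fn(x)                  # steepest-descent search direction
    x = x + gols_i(x, d, grad_fn) * d
print(x)                             # approaches the minimum at the origin
```

Because every evaluation uses a fresh stochastic gradient, the accepted step only needs to bracket a sign change rather than minimize the (discontinuous) sampled loss, which is what allows such a search to remain stable under mini-batch noise.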


Related research

06/29/2020
Gradient-only line searches to automatically determine learning rates for a variety of stochastic training algorithms
Gradient-only and probabilistic line searches have recently reintroduced...

01/15/2020
Resolving learning rates adaptively by locating Stochastic Non-Negative Associated Gradient Projection Points using line searches
Learning rates in stochastic neural network training are currently deter...

08/31/2021
Using a one dimensional parabolic model of the full-batch loss to estimate learning rates during training
A fundamental challenge in Deep Learning is to find optimal step sizes f...

05/23/2021
GOALS: Gradient-Only Approximations for Line Searches Towards Robust and Consistent Training of Deep Neural Networks
Mini-batch sub-sampling (MBSS) is favored in deep neural network trainin...

06/22/2023
Don't be so Monotone: Relaxing Stochastic Line Search in Over-Parameterized Models
Recent works have shown that line search methods can speed up Stochastic...

03/29/2017
Probabilistic Line Searches for Stochastic Optimization
In deterministic optimization, line searches are a standard tool ensurin...

09/15/2019
Empirical study towards understanding line search approximations for training neural networks
Choosing appropriate step sizes is critical for reducing the computation...
