A straightforward line search approach on the expected empirical loss for stochastic deep learning problems

10/02/2020
by Maximus Mutschler, et al.

A fundamental challenge in deep learning is that the optimal step sizes for stochastic gradient descent updates are unknown. In classical optimization, line searches are used to determine good step sizes, but in deep learning, searching for good step sizes on the expected empirical loss is too costly because the measured losses are noisy. This empirical work shows that, for common deep learning tasks, the expected empirical loss can be approximated on vertical cross sections at low cost. This is achieved by applying traditional one-dimensional function fitting to noisy losses measured along such cross sections. The distance to a minimum of the resulting approximation is then used as the step size for the update. This approach yields a robust and straightforward optimization method that performs well across datasets and architectures without the need for hyperparameter tuning.
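The abstract describes fitting a one-dimensional function to noisy losses sampled along a line in parameter space and stepping to the minimum of the fit. Below is a minimal sketch of that idea, assuming a parabolic fit via numpy.polyfit; the function names (loss_fn, line_search_step) and all parameters are illustrative, not the authors' implementation:

```python
import numpy as np

def line_search_step(loss_fn, params, direction, max_step=1.0, num_samples=10):
    """Estimate a step size by fitting a parabola to noisy losses
    measured along `direction` (a vertical cross section of the loss).

    loss_fn(params) returns a noisy mini-batch loss; `params` and
    `direction` are flat numpy arrays. All names here are illustrative.
    """
    steps = np.linspace(0.0, max_step, num_samples)
    losses = np.array([loss_fn(params + s * direction) for s in steps])

    # Fit a parabola a*s^2 + b*s + c to the measured (step, loss) pairs.
    a, b, c = np.polyfit(steps, losses, deg=2)

    if a <= 0:  # fit is concave; fall back to the best sampled step
        return float(steps[np.argmin(losses)])

    # The vertex of the parabola estimates the minimum along the line.
    return float(np.clip(-b / (2 * a), 0.0, max_step))
```

In an actual training loop, `direction` would typically be the negative mini-batch gradient, and the returned step would replace a hand-tuned learning rate for that update.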


Related research

08/31/2021  Using a one dimensional parabolic model of the full-batch loss to estimate learning rates during training
A fundamental challenge in Deep Learning is to find optimal step sizes f...

06/22/2023  Don't be so Monotone: Relaxing Stochastic Line Search in Over-Parameterized Models
Recent works have shown that line search methods can speed up Stochastic...

03/18/2013  Margins, Shrinkage, and Boosting
This manuscript shows that AdaBoost and its immediate variants can produ...

11/13/2021  Bolstering Stochastic Gradient Descent with Model Building
Stochastic gradient descent method and its variants constitute the core ...

05/02/2023  Random Function Descent
While gradient based methods are ubiquitous in machine learning, selecti...

06/05/2023  Searching for Optimal Per-Coordinate Step-sizes with Multidimensional Backtracking
The backtracking line-search is an effective technique to automatically ...

10/16/2020  The Deep Bootstrap: Good Online Learners are Good Offline Generalizers
We propose a new framework for reasoning about generalization in deep le...
