1 Introduction
The great recent successes of machine learning generally rely on models with high complexity (e.g. deep models) and extensive datasets. Those models usually require running a training procedure over a (large) set of parameters, which often amounts to minimising a loss function with an iterative algorithm such as stochastic gradient descent (SGD) or the nonlinear conjugate gradient method. In the deep learning community, a particular focus has been given to SGD algorithms such as Adagrad [9], RMSProp [39], and in particular Adam [1], which despite recent discussions and improvements [23, 30] can be considered as the state of the art and is widely used in practice [12]. Unfortunately, this training procedure is often extremely time-consuming due to the high model complexity and the amount of data at hand; hence, performing it with the best possible efficiency is paramount for many applications, in particular those for which training is done repeatedly, or continuously, for example in streaming models.

The performance of nearly all SGD variants depends critically on the choice of the learning rate [3], which in short tunes by how much the algorithm should follow the (noisy) gradient signal. The default choice for most algorithms is a constant learning rate, although recent experiments showed that a time-varying value can be extremely beneficial [2, 33, 34]. In any case, learning rate values need to be set, and tuning them by hand can be excessively burdensome. Bayesian optimisation (BO), on the other hand, is now a well-established tool for model selection and parameter tuning [4, 35]. While learning rates are sometimes included in the parameters to be tuned [4], to our knowledge no work has been dedicated to speeding up SGD algorithms on the fly using BO: this is the purpose of the present work.
Let f(θ; τ) be the objective optimised by SGD; typically in deep learning, f can be a mean squared error or an evidence lower bound (ELBO). (In the following, without loss of generality, we use the convention that f should be maximised.) Here θ ∈ Θ denotes a set of parameters, while τ defines the learning task at hand, that is, a particular model structure and a dataset. The set of tasks 𝒯 is typically a singleton (when a single model is fitted to a single dataset), or a discrete set, for instance when the same model is used to fit different datasets. Given τ, an SGD algorithm produces a sequence of parameters θ_1, …, θ_N, with N a predefined number of iterations (i.e. optimisation steps).
In our setup we assume that the learning rate varies from one iteration to the next. One way to parametrise it, which avoids working directly on a high-dimensional vector of size N, is to use a piecewise constant form. Defining a sequence of change points 0 = t_0 < t_1 < … < t_L = N, the learning rate curve is defined by constants η_1, …, η_L such that η(t) = η_i for t ∈ [t_{i−1}, t_i). Assuming lower and upper bounds for the learning rate, without loss of generality our tuning parameters can be rescaled so that η = (η_1, …, η_L) ∈ [0, 1]^L. For any task τ, our objective is to seek the best possible learning rate schedule, that is, the one that maximises f after N steps:
(1)  η*(τ) = argmax_{η ∈ [0,1]^L} f(θ_N(η, τ); τ),

with θ_N(η, τ) the parameters returned after N iterations (now written as a function of the learning rate schedule η and the task τ).
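The piecewise-constant parametrisation above can be sketched in a few lines of code. This is a minimal illustration, not the paper's implementation; the names `change_points` and `rates` are ours.

```python
def make_schedule(change_points, rates):
    """Return eta(t): the constant learning rate of the interval containing t.

    change_points: increasing list [t_0, t_1, ..., t_L] with t_0 = 0.
    rates: list [eta_1, ..., eta_L], one constant per interval.
    """
    assert len(change_points) == len(rates) + 1

    def eta(t):
        # eta(t) = eta_i for t in [t_{i-1}, t_i)
        for i in range(len(rates)):
            if change_points[i] <= t < change_points[i + 1]:
                return rates[i]
        return rates[-1]  # convention: keep the last rate for t >= t_L

    return eta

# A schedule with L = 3 intervals over N = 600 iterations.
schedule = make_schedule([0, 100, 300, 600], [1e-2, 3e-3, 1e-3])
```

The tuning problem of eq. 1 then amounts to searching over the vector `rates` (rescaled to the unit cube) for a fixed set of change points.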
Now, BO classically relies on Gaussian process (GP) surrogate models of the function to optimise. While predicting the optimisation trace (that is, the value of f(θ_t; τ) for all t) has already been addressed in related settings [28, 8, 19], adapting those models to make them dependent on a learning rate that changes over time is often not possible. Moreover, changing the learning rate often induces drastic changes in the trace behaviour, which makes typical parametric models unfit for the task. Our first contribution is the design of an autoregressive model for the trace of an SGD algorithm, which flexibly adjusts to abrupt changes of behaviour. The model, which serves as a base for our sampling strategies, is presented in section 2.

In the classical BO setting [32], an initial set of experiments, typically with space-filling properties, is run first. The gathered data is then used to build a predictive model for f(θ_N(η, τ); τ), which allows us to estimate η*(τ). In BO, such an estimate is gradually improved by designing a sequence of additional experiments that enhance the prediction capability of the model, following an exploration / exploitation trade-off.

However, oftentimes the user is faced with training a single model, without much prior information on an appropriate choice of the learning rate schedule. In that case, the relevance of BO is somewhat debatable, as it is uncertain that the outcome of several SGD runs would outperform a single but much longer one with an empirically chosen learning rate. To overcome this, [38, 22, 10] used forecasting models to stop unpromising runs early and drastically reduce the computational cost of the BO loop. Yet, such approaches can only provide long-term recommendations at the price of repeated long runs, and are mostly relevant when other parameters are tuned simultaneously with the learning rate.
We propose here an alternative strategy, unexplored in the BO literature, which is to adapt a single run on the fly rather than terminate some and start new ones from scratch. In addition, as modern hardware architecture favours parallel computing, we want to leverage the fact that training procedures can be run in parallel (say, one on each available GPU). Section 3 is dedicated to this problem.
In other practical situations, training is done repeatedly over similar tasks: for instance the same model fitted to different datasets, or different variations of a model fitted to a dataset. In that case, one seeks an optimal mapping η*: 𝒯 → [0, 1]^L [or profile optima, 11] (from the space of tasks to the space of learning rates), such that η*(τ) maximises f(θ_N(η, τ); τ) for each τ. Transferring information from one task to another allows designing strategies much more efficient than running independent BO loops for each task [37]. This framework is considered in section 4.
2 A GP-based NARX Model For Optimisation Traces
For simplicity, we focus for now on the case where 𝒯 is a singleton (i.e. we consider a single dataset and model) and remove the dependence on τ from our notations. The generalisation to multiple tasks is deferred to section 4.
2.1 Modelling of Optimisation Traces
Define first y(t) = f(θ_t), t = 0, …, N, the trace of an optimisation run, and denote by y_i = y(t_i) the trace value at each change point of the learning rate.
There exist many options to fit a parametric model to y: for instance, in [8], 11 models for traces are proposed, including exponential and power forms. While those forms make sense with a constant learning rate, they cannot properly fit a varying one, as changing the learning rate drastically modifies the trace dynamics. This is illustrated in fig. 2, left, where the visible trace trend abruptly changes every time the learning rate changes (see also fig. 4). Hence, instead of fitting a single parametric model, we propose to use a composite one, based on an autoregressive formulation.
First, we model y_0 as a Gaussian variable, y_0 ∼ N(μ_0, σ_0²), to account for randomness in the starting point. Then, we propose to model the trace using a nonlinear autoregressive model with exogenous inputs [NARX(q), 21], whose general expression is:

y_{i+1} = φ(y_i, …, y_{i−q+1}, η_{i+1}, …, η_{i−q+2}) + ε_{i+1},

where the new state is a nonlinear function φ of the past states and exogenous inputs (here, the learning rates) plus some independent noise ε. We propose a specific form for φ as follows:

φ(y_i, η_{i+1}) = y_i + h(t_{i+1} − t_i; g(y_i, η_{i+1})),

where g is a latent function that modulates the increments according to the current and past trace values and learning rates, and h (monotone in time, to ensure the monotonicity of the trace) is a link function that returns the trace increment for any given time according to its parameters. As the trace may be recorded at times outside {t_0, …, t_L}, our model for the trace is:

y(t) = y_i + h(t − t_i; g(y_i, η_{i+1})) + ε(t),  t ∈ [t_i, t_{i+1}),

where ε(t) represents the noise in the case where the objective function is not evaluated exactly and/or unaccounted-for deviations between observations and the model.
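To make the autoregressive structure concrete, the following sketch simulates a NARX(1) trace with a linear link. The modulating function `g` below is a hand-picked stand-in for the latent GP of the paper (chosen only to mimic flatter slopes near convergence), and all names are illustrative.

```python
import random

def g(y, eta):
    # Stand-in for the latent GP: flatter slopes when y is already high
    # (close to convergence), larger slopes for larger learning rates.
    return eta * max(0.0, 1.0 - y)

def simulate_trace(y0, rates, interval_len, noise=0.0, seed=0):
    """Simulate trace values at the change points t_0, t_1, ...:
    y_{i+1} = y_i + h(t_{i+1} - t_i; g(y_i, eta_{i+1})) + eps."""
    rng = random.Random(seed)
    ys = [y0]
    for eta in rates:
        incr = g(ys[-1], eta) * interval_len  # linear link: slope x duration
        ys.append(ys[-1] + incr + rng.gauss(0.0, noise))
    return ys

trace = simulate_trace(0.0, [0.05, 0.05, 0.01], interval_len=5)
```

Changing one entry of `rates` changes the slope of the corresponding segment only through `g`, which is exactly the abrupt-change behaviour the composite model is designed to capture.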
In the following, we use either piecewise linear or piecewise exponential forms for the traces. With a defining the linear slope, and with b corresponding to a logit offset and c to a logit rate, this gives:

(2)  h(t; a) = Ψ(a) t,

(3)  h(t; b, c) = σ(b + Ψ(c) t) − σ(b),

where the parameters are given by a (multi-output) GP g, Ψ is here the softplus function, Ψ(x) = log(1 + eˣ), and σ the logistic function, ensuring monotonicity. Note that any of the parametric models of e.g. [8] may be used here as h. In a sense, our model extends those to a time-varying learning rate.
Focusing on the linear case, and setting q = 1, we have: y(t_{i+1}) = y_i + Ψ(g(y_i, η_{i+1})) (t_{i+1} − t_i) + ε. We directly see the dynamics implied by our model: increments are linear with respect to time, but the slope depends nonlinearly on an exogenous input (the learning rate) and on the current state (intuitively, we expect flatter slopes if y_i is high, as the model is close to convergence). Considering the exponential case (eq. 3) and setting q = 1, we show GP surfaces in fig. 1 for b (left) and c (right).
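The two link functions can be sketched as below. The linear form follows eq. 2 directly; for the saturating form we use one monotone choice with h(0) = 0, consistent with a logit offset b and logit rate c — the paper's exact parametric form may differ.

```python
import math

def softplus(x):
    # Psi(x) = log(1 + e^x), strictly positive, ensures monotone increments
    return math.log1p(math.exp(x))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def h_linear(t, a):
    # Piecewise-linear link (eq. 2): positive slope Psi(a) times elapsed time.
    return softplus(a) * t

def h_exponential(t, b, c):
    # Saturating link with logit offset b and rate c; monotone in t, zero at
    # t = 0. Assumed form for illustration only.
    return sigmoid(b + softplus(c) * t) - sigmoid(b)
```

In the full model, (a) or (b, c) are the outputs of the latent GP g evaluated at the current trace value and learning rate.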
Importantly, the same g is used over all the time intervals and only depends on (y_i, η_{i+1}). Hence, in the NARX(1) case, our model implies that the same initial conditions and learning rates would lead to the same outcome, regardless of the time the algorithm has been run before t_i. In that sense, the model can be considered as Markovian, as it is independent of the path followed by the optimiser before t_i. While our model is general for higher-order Markov chains (i.e. NARX(q)), we found empirically that the current formulation provides the right trade-off between accuracy, robustness and ease of inference, as the latent function g is then only defined over a low-dimensional space.

Finally, we follow the chained GP framework [31] for g, and choose a GP prior with independent components.
2.2 Learning Using Variational Inference
We consider here a fixed link function, so the inference task boils down to estimating the parameters of the latent function g and of y_0 (the starting value of the trace). Assume that we have run the optimiser R times with different learning rate schedules and recorded the trace of each run at a set of times (in [0, N]). Let y_j(t) denote an observation of the trace at time t for a learning rate schedule η^j. We assume that each trace is evaluated at the beginning of each time interval (i.e. at t_0, …, t_L), which gives (indirect) observations of g, which we denote henceforth 𝒟.
The parameters of y_0 can be inferred by maximum likelihood from the recorded starting values. Now, we focus on learning the latent GP function g. Without loss of generality, we assume in the following that our link function only requires a single GP parameter (i.e. p = 1). When a link function depends on more than one GP, we simply apply the same procedure in parallel and model every function independently.
We follow a fully Bayesian approach to infer g from the given dataset. We place a GP prior on the latent function g and assume that the noise ε is i.i.d. Gaussian with zero mean and variance σ², so that:

y_{i+1} | g ∼ N( y_i + h(t_{i+1} − t_i; g(y_i, η_{i+1})), σ² ).

For general link functions, exact inference of the latent function g is not possible due to the nonlinear transformation.
A classical solution is to follow the Sparse Variational GP (SVGP) framework [40, 15], which relies on two components. First, it introduces a set of inducing variables u, whose main use is to specify the function value of the posterior GP at a specific set of pseudo inputs Z, i.e. u = g(Z). The distribution of the inducing variables is specified by a fully parameterised Gaussian with mean m and covariance S, which are the variational parameters we want to learn. The prior GP on g can then be conditioned on u, which leads to a marginal posterior with mean μ(x) and variance v(x):
(4)  μ(x) = k_x^⊤ K_{ZZ}^{−1} m,   v(x) = k(x, x) − k_x^⊤ K_{ZZ}^{−1} (K_{ZZ} − S) K_{ZZ}^{−1} k_x,

where k_x = k(Z, x) and K_{ZZ} = k(Z, Z).
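The sparse conditional above is standard and can be sketched directly with NumPy; this is a generic SVGP posterior (with an assumed RBF kernel for illustration), not the paper's GPflow implementation.

```python
import numpy as np

def rbf(A, B, variance=1.0, lengthscale=1.0):
    # Squared-exponential kernel between input sets A (n x d) and B (m x d).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

def svgp_posterior(x, Z, m, S, kern=rbf, jitter=1e-8):
    """Marginal posterior mean/variance of an SVGP at test inputs x, given
    inducing inputs Z and variational distribution q(u) = N(m, S)."""
    Kzz = kern(Z, Z) + jitter * np.eye(len(Z))
    Kzx = kern(Z, x)
    Kxx = kern(x, x)
    A = np.linalg.solve(Kzz, Kzx)          # K_ZZ^{-1} k_x
    mean = A.T @ m
    cov = Kxx - A.T @ (Kzz - S) @ A        # prior cov corrected by q(u)
    return mean, np.diag(cov)
```

Sanity checks: with S ≈ 0 the posterior interpolates the inducing values with near-zero variance, and with q(u) equal to the prior the predictive variance falls back to the prior variance.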
With this approximation in place we can set up our model's optimisation objective, which is a lower bound on the log marginal likelihood [ELBO, 16], equal to

(5)  ℒ = Σ_n E_{q(g_n)}[log p(y_n | g_n)] − KL[q(u) ‖ p(u)],

where q(g_n) is the marginal posterior of eq. 4 at the n-th data point, and KL[q(u) ‖ p(u)] is the Kullback–Leibler divergence between the approximate posterior and the prior of u [24]. It can be calculated analytically given the Gaussianity of both the prior and the posterior on u. The expectation can be estimated in an unbiased way using Monte Carlo, by sampling from the posterior (eq. 4) and propagating the samples through the link function. Optimising eq. 5 with respect to the model parameters can be done by gradient descent thanks to automatic differentiation toolkits. In our experiments we made use of the GPflow library [25]. More precisely, we used the provided multi-output framework for GPs [41], which is well-suited for implementing and optimising these complex, composite GP models.
Note that as the approximation is sparse (i.e. it relies on a few inducing points), it can handle much larger datasets than classical GP models. This is a decisive advantage here as traces typically contain thousands of datapoints.
2.3 Generating Trace Predictions
Since the main objective is to maximise f after N iterations, we would like to predict y(t_L), which requires access to y(t_{L−1}). Recursively, we see that predicting y(t_L) is achieved by predicting the corresponding sequence y_1, …, y_{L−1}. Importantly, the distribution of the y_i's is not available analytically because of the arbitrary link function. Hence, we must resort to sampling. Given the recursive structure (the value of y_i is necessary to draw y_{i+1}), we sample first y_1, then y_2 after conditioning on y_1, and recursively sample y_{i+1} (i ≥ 2) after conditioning on y_1, …, y_i. Drawing multiple samples for a given η may be used to provide any statistic of y(t_L), such as mean, variance and quantiles.
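The recursive sampling scheme can be sketched as below. The function `sample_increment` stands in for a draw from the model's posterior over the next increment; the toy version given here is only illustrative.

```python
import random
import statistics

def sample_final_traces(y0, rates, sample_increment, n_samples=200, seed=0):
    """Draw n_samples of the final trace value y(t_L) by recursively sampling
    each increment conditionally on the current state.
    sample_increment(rng, y, eta) stands in for one posterior draw of the
    next increment given state y and learning rate eta."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_samples):
        y = y0
        for eta in rates:            # one draw per learning-rate interval
            y = y + sample_increment(rng, y, eta)
        finals.append(y)
    return finals

# Illustrative stand-in: noisy, state-dependent increment.
def toy_increment(rng, y, eta):
    return max(0.0, eta * (1.0 - y)) + rng.gauss(0.0, 0.01)

finals = sample_final_traces(0.0, [0.5, 0.3, 0.2], toy_increment)
deciles = statistics.quantiles(finals, n=10)  # e.g. for quantile statistics
```

Any statistic of y(t_L) (mean, variance, quantiles) is then read off the empirical distribution of `finals`.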
3 Dynamic Tuning of the Learning Rate
We address now the question of dynamically tuning the learning rate for a single model and task, using the model previously defined. To do so, we depart from the standard BO approach, by modifying a small set of runs on the fly instead of repeatedly starting new ones.
3.1 Proposed Strategy
We consider the following framework. We assume that K SGD runs are performed in parallel with different learning rates. For simplicity, we assume synchronicity (all runs progress at the same speed). Each run is carried out independently over the time intervals [t_0, t_1], …, [t_{L−1}, t_L]. At each t_i, the traces are fed to the model, which is then used to schedule new learning rates for the next interval.
The learning rates are chosen as follows. At initialisation, as there is no model to help making decisions, the η_1^k's are taken on a uniform grid between bounds. At any t_i (i ≥ 1), the current trace observations are first integrated into the model. Then, each new learning rate is chosen to maximise the α_k-quantile (computed here empirically by sampling) of the trace at the end of the next interval:

(6)  η_{i+1}^k = argmax_η q_{α_k}[ y(t_{i+1}) | η ],  k = 1, …, K.

The α_k's are used here to balance exploitation and exploration: α's close to one may lead to very optimistic choices (learning rates for which the outcome is highly uncertain) while α's close to zero result in risk-averse choices (guaranteed immediate performance). Hence, to maximise our diversity of choices we spread the α_k's evenly over (0, 1). In the case K = 1, this forces us to choose α = 0.5, which is a risk-neutral strategy. Following an optimistic strategy (say, α > 0.5) instead may enhance exploration and improve long-term performance.
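The quantile-based acquisition of eq. 6 can be sketched over a discrete set of candidate rates. Here `simulate_final(rng, eta)` stands in for one model sample of the end-of-interval trace value (section 2.3); all names are illustrative.

```python
import random
import statistics

def choose_rates(candidates, alphas, simulate_final, n_samples=100, seed=0):
    """For each parallel run k, pick the candidate learning rate maximising
    the empirical alpha_k-quantile of the predicted end-of-interval trace
    (eq. 6)."""
    rng = random.Random(seed)
    chosen = []
    for alpha in alphas:
        best_eta, best_q = None, float("-inf")
        for eta in candidates:
            draws = [simulate_final(rng, eta) for _ in range(n_samples)]
            cuts = statistics.quantiles(draws, n=100)  # 99 percentile cuts
            q = cuts[max(0, min(98, int(alpha * 100) - 1))]
            if q > best_q:
                best_eta, best_q = eta, q
        chosen.append(best_eta)
    return chosen
```

Small α's pick rates with good guaranteed outcomes; large α's pick rates whose upper tail is promising, even if uncertain.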
In addition, at each t_i we greedily select the run with the highest current trace and duplicate it K times while discarding the others. While this was found to accelerate the performance significantly, keeping each run may prove a valid alternative on problems that require more exploration of learning rate schedules.
The pseudocode of the strategy is given in alg. 1. Note that a relevant choice of L is problem-dependent: a large L allows more changes of the learning rate value, but increases the computational overhead due to model fitting and to solving eq. 6. Besides, to facilitate inference, the trace should not be sliced into too many parts (fig. 2). In the experiment reported below, our choice of L resulted in a negligible BO overhead.
3.2 Experiment: Dynamic Tuning of the Learning Rate on CIFAR
We apply our approach to the training of a vanilla ResNet [14] neural network with 56 layers on the classification dataset CIFAR-10 [20], which contains 60,000 32×32 colour images. We use an implementation of the ResNet model for the CIFAR dataset available in Keras [7]. We first split the dataset into 50,000 training and 10,000 testing images. The Adam optimiser is used for 100 epochs to maximise the (negative) cross-entropy of future predictions.
Our BO setup is as follows: the 100 epochs are divided into L equal intervals and K optimisations are run in parallel. The objective is recorded every 50 iterations. The GP model for g uses a Matérn kernel, a linear link function and a sparse set of inducing points. A practical issue we face is that abruptly increasing the learning rate sometimes causes aberrant behaviour (which can be seen as large peaks in the trace in fig. 3). To avoid this problem, we use a GP classification model [15] to predict which runs are likely to fail based on past trace and learning rate values. The optimisation of eq. 6 is then restricted to the values of η for which the probability of failure is below a threshold. We set the failure threshold for a given trace inversely proportional to its quantile level α_k, as we want traces with larger α_k to be more explorative. In addition, we limit the maximum change of the learning rate to one order of magnitude.

As baselines, we use five constant learning rate schedules uniformly spread in log space, and 12 learning rates with exponential decay (three initial values and four decay rates), such that the learning rate in each epoch equals the initial value multiplied by the decay rate raised to the epoch number.
Figure 3 shows the dynamics of our approach. The initial interval shows the large performance differences when using different learning rates. Here, a very large learning rate is best at first, but almost immediately becomes suboptimal. After a quarter of the optimisation budget, the optimal learning rate consistently takes values in a narrow range, slowly decreasing over time. The algorithm captures this behaviour, being very exploratory at first and much less so towards the last intervals.
Compared with constant learning rate schedules, our approach largely outperforms any of them. Even in the case where no parallel computation is performed, our approach would still outperform any constant learning rate, as those seem to have already converged to suboptimal values after 40,000 iterations. Our approach also outperforms all exponential decay schedules but one. For this problem, a properly tuned exponential decay seems like a very efficient solution, and our dynamic tuning captures this solution. Arguably, five runs with different exponential decays might outperform our dynamic approach, but this would critically depend on the chosen bounds for the parameters and on luck in the design of experiments. Standard BO (over the decay parameters) might be an alternative, but five observations would be too few to run it.
4 Multi-Task Scheduling
4.1 Multi-Task Learning
We now consider the case where 𝒯 is discrete and relatively small, but can be increased when a new task needs to be solved, similarly to [29]. The objective is then to find an optimal set of learning rate schedules {η*(τ)}_{τ ∈ 𝒯} rather than a single one. However, as an efficient learning rate schedule for a task is often found to perform well for another, we assume that the values of the set share some resemblance.
Several sampling strategies have been proposed recently in this context [37, 11, 29, 27]. However, all exploit the fact that posterior distributions are available in closed form. As our model is sampling-based, using those approaches would be either impractical or overly expensive computationally. Hence, we propose a new principled strategy, adapted from the TruVar algorithm of [6], originally proposed for optimisation and level-set estimation.
We first extend our model to multiple tasks, by indexing the latent GP on τ on top of y and η. Then, following [37], we assume a product kernel for g:

k((y, η, τ), (y′, η′, τ′)) = k_{yη}((y, η), (y′, η′)) × k_τ(τ, τ′).
To facilitate inference, we assume further that the tasks can be embedded in a low-dimensional latent space, i.e. each τ is represented by a latent vector w_τ ∈ ℝ^d. This results in a small set of parameters to infer (the locations of the tasks in the latent space), independently of the number of runs.
4.2 Sequential Infill Strategy
In a nutshell, TruVar repeatedly applies two steps: 1) select a set of reference points (e.g. for optimisation, potential maximisers), then 2) find the observation that greedily shrinks the sum of prediction variances at reference points.
We adapt here this strategy to uncover profile optima, that is:

η*(τ) = argmax_{η ∈ [0,1]^L} f(θ_N(η, τ); τ),  for all τ ∈ 𝒯.

The original algorithm selects as reference points all the points for which an upper confidence bound (UCB) of the objective is higher than a threshold. As we work with continuous design spaces, we decided to simplify this step and consider as reference points the maximisers of the UCB of the final trace value for each task, that is:

(7)  η^{UCB}(τ) = argmax_η q_α[ y(t_L) | η, τ ],  τ ∈ 𝒯,

with α > 0.5, so that the quantile defines a UCB for y(t_L). Note that to ensure theoretical guarantees, UCB strategies generally require quantile orders that increase with time [17]. However, a constant value usually works best in practice [36, 5], so we focus on this case here.
Due to the lack of data, the performance at the η^{UCB}(τ)'s is uncertain, which can be quantified by the mean of the predictive variances at those points:

V = (1/|𝒯|) Σ_{τ ∈ 𝒯} var[ y(t_L) | η^{UCB}(τ), τ ].

Note that as each η^{UCB}(τ) is chosen using a UCB, it is likely to correspond to values for which the model has a high prediction variance. So, V may increase monotonically with α, which acts as a tuning parameter for the exploration / exploitation trade-off.
Now, we would like to find the run (learning rate schedule and task) that reduces V the most. Assume a potential candidate (η, τ), that would provide, if evaluated, an additional set of observations Y_{η,τ}. Conditioning the model on this new data would reduce the prediction uncertainty at the reference points (by the law of total variance), which we can measure with

V(η, τ) = (1/|𝒯|) Σ_{τ′ ∈ 𝒯} var[ y(t_L) | η^{UCB}(τ′), τ′ ; Y_{η,τ} ],

where var[· ; Y_{η,τ}] denotes the variance conditionally on Y_{η,τ}. In the case of regular GP models, V(η, τ) is actually available in closed form, independently of the values of Y_{η,τ} [6]. This is not the case here, so we replace V(η, τ) by its expectation over the values of Y_{η,τ}, which leads to the following sampling strategy:

(8)  (η_{next}, τ_{next}) = argmin_{η, τ} E_{Y_{η,τ}}[ V(η, τ) ].
In practice, this criterion is not available in closed form, and must be computed using a double Monte Carlo loop. However, conditioning on Y_{η,τ} can be approximated simply, as follows. First, samples of Y_{η,τ} are obtained recursively, as in section 2.3. Then, conditioning on each of those samples and computing the new conditional variance as in section 2.3 allows us to compute eq. 8.
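The double Monte Carlo loop can be sketched generically. Here `sample_obs` and `var_given` stand in for, respectively, fantasising an observation set at the candidate and recomputing the posterior variance after conditioning on it — both would be provided by the sampling-based model of section 2.3; all names are illustrative.

```python
import random
import statistics

def expected_posterior_variance(candidate, reference_points, sample_obs,
                                var_given, n_outer=50, seed=0):
    """Double Monte Carlo estimate of the criterion of eq. 8: the expected
    (over fantasised observations at `candidate`) mean posterior variance
    of the final trace at the reference points."""
    rng = random.Random(seed)
    outer = []
    for _ in range(n_outer):                  # outer loop: fantasised data
        obs = sample_obs(rng, candidate)
        vs = [var_given(ref, obs) for ref in reference_points]  # inner loop
        outer.append(statistics.fmean(vs))
    return statistics.fmean(outer)

def select_next(candidates, reference_points, sample_obs, var_given):
    """Pick the (learning rate, task) candidate minimising the criterion."""
    return min(candidates,
               key=lambda c: expected_posterior_variance(
                   c, reference_points, sample_obs, var_given))
```

The candidate that most shrinks the expected uncertainty at the reference points is run next, after which the reference points themselves are recomputed.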
Once η_{next} and τ_{next} are obtained, the corresponding experiment is run and the model is updated, which in turn leads to a new set of reference points, etc. Once the budget is exhausted, the final set of schedules may be chosen using a different α (either 0.5 for a risk-neutral solution or a smaller value for a risk-averse one). The pseudocode of the strategy is given in alg. 2.
4.3 Experiment: Multi-Task Setting with SVGP on MNIST
To illustrate our strategy, we consider the following setup. We use the MNIST digit dataset, split into five binary classification problems (0 against 1, 2 against 3 and so on until 8 against 9). Our goal is to fit a sparse GP classification model to each of these datasets, by maximising its ELBO using the Adam algorithm.
We choose here a fixed number of Adam iterations and of learning rate values per schedule. The learning rates are constrained to lie between lower and upper bounds. A set of initial runs is performed with learning rates chosen by Latin hypercube sampling (in logarithmic space), and further runs are added sequentially according to our TruVar strategy. The GP model for g uses a Matérn kernel, an exponential link function, a sparse set of inducing points and a low-dimensional latent space of tasks. To ease the resolution of eq. 7, we follow a greedy approach by searching for one learning rate value at a time, starting from η_1, which is in line with the Markov assumption of the model.
Figure 4 shows the learning rate profiles and corresponding runs obtained after running the procedure, as well as some predictions for randomly chosen learning rates. We first observe the flexibility of our model, which is able to capture complex traces while providing relevant uncertainty estimates (top plots). Then, for all tasks, the learning rates found reach the upper bound at first and decrease when the trace reaches a plateau. The optimal way of decreasing the learning rate depends on the task. One can see that the predictions are uncertain, but only on one side (some samples largely overestimate the true trace, but median ones are quite close to the truth).
4.4 Extension: Warm-Starting SGD for a New Task
Now, assume that a new task τ_{new} is added to the current set 𝒯. Unfortunately, our model cannot be used directly, as the value of the corresponding latent variable w_{new} is unknown. A first solution is to find a "universal" tuning: this can be obtained by maximising the prediction of the final trace averaged over all possible values for w_{new}. This average can be calculated by Monte Carlo, assuming a probability measure for w_{new}, for instance the Lebesgue measure over the convex hull of the existing task embeddings.
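The "universal" tuning can be sketched as a Monte Carlo average over the latent task variable. Here `predict_final(eta, w)` stands in for the model's posterior mean of the final trace for embedding w; the toy predictor and all names are illustrative.

```python
import random
import statistics

def universal_rate(candidates, w_samples, predict_final):
    """Pick the learning rate maximising the predicted final trace averaged
    over possible latent-task locations w (Monte Carlo over a chosen measure
    on w)."""
    def averaged(eta):
        return statistics.fmean(predict_final(eta, w) for w in w_samples)
    return max(candidates, key=averaged)

# Illustrative setup: w sampled uniformly over the hull of existing task
# embeddings; a toy predictor whose optimum shifts with w.
rng = random.Random(0)
w_samples = [rng.uniform(0.2, 0.8) for _ in range(500)]
predict = lambda eta, w: -(eta - w) ** 2
best = universal_rate([0.1 * i for i in range(11)], w_samples, predict)
```

The selected rate is the one that performs best on average over the tasks the model deems plausible, before any data on the new task is collected.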
Alternatively, one might want to spend some computing budget to learn w_{new} and then achieve a better learning rate tuning. Assuming again a measure for w_{new}, an informative experiment would correspond to a learning rate for which the proportion of the variance of the predictor due to the uncertainty on w_{new} is maximal. Averaging over all time steps, we define our sampling strategy (with again a criterion computable by Monte Carlo) as:

η_{next} = argmax_η (1/L) Σ_{i=1}^{L} var_{w_{new}}( E[ y(t_i) | η, w_{new} ] ) / var( y(t_i) | η ),

with the outer variance and expectation taken with respect to the measure on w_{new}. Note that once w_{new} is estimated, it is possible to apply alg. 1, exploiting the flexibility of dynamic tuning while leveraging information from previous runs. A more integrated approach would use alg. 1 directly while measuring and accounting for uncertainty in w_{new}; this is left for future work.
5 Concluding Comments
We proposed a probabilistic model for the traces of optimisers, whose input parameters (the choice of learning rate values) correspond to particular periods of the optimisation. This allowed us to define a versatile framework to tackle a set of problems: tuning the optimiser for a set of similar tasks, warm-starting it for a new task, or online adaptation of the learning rate for a cold-started run.
A convergence proof for the multi-task strategy has not been considered here. We believe that the results of [6] may be adapted to our case: this is left for future work. Other possible extensions are to apply our framework to other optimisers: for instance, to control the population sizes of evolutionary strategy algorithms such as CMA-ES [13], for which adaptation mechanisms have been found promising [26]. Finally, additional efficiency could be achieved by leveraging the use of varying dataset sizes, in the spirit of [18, 10] for instance.
References
 Andrychowicz et al. [2016] Andrychowicz, M., Denil, M., Gomez, S., Hoffman, M.W., Pfau, D., Schaul, T., Shillingford, B., De Freitas, N.: Learning to learn by gradient descent by gradient descent. In: Advances in Neural Information Processing Systems. pp. 3981–3989 (2016)
 Baydin et al. [2017] Baydin, A.G., Cornish, R., Rubio, D.M., Schmidt, M., Wood, F.: Online learning rate adaptation with hypergradient descent. arXiv preprint arXiv:1703.04782 (2017)
 Bengio [2012] Bengio, Y.: Practical recommendations for gradient-based training of deep architectures. In: Neural networks: Tricks of the trade, pp. 437–478. Springer (2012)
 Bergstra et al. [2011] Bergstra, J.S., Bardenet, R., Bengio, Y., Kégl, B.: Algorithms for hyperparameter optimization. In: Advances in neural information processing systems. pp. 2546–2554 (2011)
 Bogunovic et al. [2018] Bogunovic, I., Scarlett, J., Jegelka, S., Cevher, V.: Adversarially robust optimization with Gaussian processes. In: Advances in Neural Information Processing Systems. pp. 5760–5770 (2018)
 Bogunovic et al. [2016] Bogunovic, I., Scarlett, J., Krause, A., Cevher, V.: Truncated variance reduction: A unified approach to Bayesian optimization and levelset estimation. In: Advances in neural information processing systems. pp. 1507–1515 (2016)
 Chollet [2009] Chollet, F.: Keras implementation of ResNet for CIFAR. https://keras.io/examples/cifar10_resnet/ (2009)

 Domhan et al. [2015] Domhan, T., Springenberg, J.T., Hutter, F.: Speeding up automatic hyperparameter optimization of deep neural networks by extrapolation of learning curves. In: Twenty-Fourth International Joint Conference on Artificial Intelligence (2015)
 Duchi et al. [2011] Duchi, J., Hazan, E., Singer, Y.: Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research 12(Jul), 2121–2159 (2011)
 Falkner et al. [2018] Falkner, S., Klein, A., Hutter, F.: BOHB: Robust and efficient hyperparameter optimization at scale. arXiv preprint arXiv:1807.01774 (2018)
 Ginsbourger et al. [2014] Ginsbourger, D., Baccou, J., Chevalier, C., Perales, F., Garland, N., Monerie, Y.: Bayesian adaptive reconstruction of profile optima and optimizers. SIAM/ASA Journal on Uncertainty Quantification 2(1), 490–510 (2014)
 Gugger and Howard [2018] Gugger, S., Howard, J.: AdamW and super-convergence is now the fastest way to train neural nets (Jul 2018), https://www.fast.ai/2018/07/02/adamweightdecay/
 Hansen and Ostermeier [2001] Hansen, N., Ostermeier, A.: Completely derandomized selfadaptation in evolution strategies. Evolutionary computation 9(2), 159–195 (2001)

 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 770–778 (2016)
 Hensman et al. [2015] Hensman, J., Matthews, A.G.d.G., Ghahramani, Z.: Scalable variational Gaussian process classification. In: Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics (2015)
 Hoffman et al. [2013] Hoffman, M.D., Blei, D.M., Wang, C., Paisley, J.: Stochastic Variational Inference. Journal of Machine Learning Research (2013)
 Kaufmann et al. [2012] Kaufmann, E., Cappé, O., Garivier, A.: On Bayesian upper confidence bounds for bandit problems. In: Artificial intelligence and statistics. pp. 592–600 (2012)
 Klein et al. [2017a] Klein, A., Falkner, S., Bartels, S., Hennig, P., Hutter, F.: Fast Bayesian optimization of machine learning hyperparameters on large datasets. In: International Conference on Artificial Intelligence and Statistics (AISTATS 2017). pp. 528–536. PMLR (2017a)
 Klein et al. [2017b] Klein, A., Falkner, S., Springenberg, J.T., Hutter, F.: Learning curve prediction with Bayesian neural networks. In: ICLR (2017b)
 Krizhevsky and Hinton [2009] Krizhevsky, A., Hinton, G.: Learning multiple layers of features from tiny images. Tech. rep., Citeseer (2009)
 Leontaritis and Billings [1985] Leontaritis, I., Billings, S.A.: Inputoutput parametric models for nonlinear systems part ii: stochastic nonlinear systems. International journal of control 41(2), 329–344 (1985)
 Li et al. [2018] Li, L., Jamieson, K., DeSalvo, G., Rostamizadeh, A., Talwalkar, A.: Hyperband: A novel bandit-based approach to hyperparameter optimization. Journal of Machine Learning Research 18(185), 1–52 (2018)
 Loshchilov and Hutter [2019] Loshchilov, I., Hutter, F.: Decoupled weight decay regularization. In: ICLR (2019)
 Matthews et al. [2016] Matthews, A.G.d.G., Hensman, J., Turner, R., Ghahramani, Z.: On sparse variational methods and the Kullback-Leibler divergence between stochastic processes. Journal of Machine Learning Research 51, 231–239 (2016)

 Matthews et al. [2017] Matthews, A.G.d.G., Van Der Wilk, M., Nickson, T., Fujii, K., Boukouvalas, A., León-Villagrá, P., Ghahramani, Z., Hensman, J.: GPflow: A Gaussian process library using TensorFlow. The Journal of Machine Learning Research 18(1), 1299–1304 (2017)
 Nishida and Akimoto [2018] Nishida, K., Akimoto, Y.: PSA-CMA-ES: CMA-ES with population size adaptation. In: Proceedings of the Genetic and Evolutionary Computation Conference. pp. 865–872 (2018)
 Pearce and Branke [2018] Pearce, M., Branke, J.: Continuous multitask Bayesian optimisation with correlation. European Journal of Operational Research 270(3), 1074–1085 (2018)
 Picheny and Ginsbourger [2013] Picheny, V., Ginsbourger, D.: A nonstationary space-time Gaussian Process model for partially converged simulations. SIAM/ASA Journal on Uncertainty Quantification 1(1), 57–78 (2013)
 Poloczek et al. [2016] Poloczek, M., Wang, J., Frazier, P.I.: Warm starting Bayesian optimization. In: Proceedings of the 2016 Winter Simulation Conference. pp. 770–781. IEEE Press (2016)
 Reddi et al. [2018] Reddi, S.J., Kale, S., Kumar, S.: On the convergence of ADAM and beyond. In: ICLR (2018)
 Saul et al. [2016] Saul, A.D., Hensman, J., Vehtari, A., Lawrence, N.D., et al.: Chained Gaussian Processes. In: AISTATS. pp. 1431–1440 (2016)
 Shahriari et al. [2016] Shahriari, B., Swersky, K., Wang, Z., Adams, R.P., De Freitas, N.: Taking the human out of the loop: A review of Bayesian optimization. Proceedings of the IEEE 104(1), 148–175 (2016)
 Smith [2017] Smith, L.N.: Cyclical learning rates for training neural networks. In: 2017 IEEE Winter Conference on Applications of Computer Vision (WACV). pp. 464–472. IEEE (2017)
 Smith and Topin [2019] Smith, L.N., Topin, N.: Super-convergence: Very fast training of neural networks using large learning rates. In: Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications. vol. 11006, p. 1100612. International Society for Optics and Photonics (2019)
 Snoek et al. [2012] Snoek, J., Larochelle, H., Adams, R.P.: Practical Bayesian optimization of machine learning algorithms. In: Advances in neural information processing systems. pp. 2951–2959 (2012)
 Srinivas et al. [2010] Srinivas, N., Krause, A., Kakade, S., Seeger, M.: Gaussian Process optimization in the bandit setting: no regret and experimental design. In: Proceedings of the 27th International Conference on International Conference on Machine Learning. pp. 1015–1022. Omnipress (2010)
 Swersky et al. [2013] Swersky, K., Snoek, J., Adams, R.P.: Multitask Bayesian optimization. In: Advances in neural information processing systems. pp. 2004–2012 (2013)
 Swersky et al. [2014] Swersky, K., Snoek, J., Adams, R.P.: Freezethaw Bayesian optimization. arXiv preprint arXiv:1406.3896 (2014)
 Tieleman and Hinton [2012] Tieleman, T., Hinton, G.: Lecture 6.5 - RMSProp: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural networks for machine learning 4(2), 26–31 (2012)
 Titsias [2009] Titsias, M.: Variational Learning of Inducing Variables in Sparse Gaussian Processes. Artificial Intelligence and Statistics (2009)
 van der Wilk et al. [2020] van der Wilk, M., Dutordoir, V., John, S., Artemev, A., Adam, V., Hensman, J.: A framework for interdomain and multioutput Gaussian processes. arXiv:2003.01115 (2020), https://arxiv.org/abs/2003.01115
 Wilson et al. [2018] Wilson, J., Hutter, F., Deisenroth, M.: Maximizing acquisition functions for Bayesian optimization. In: Advances in Neural Information Processing Systems. pp. 9884–9895 (2018)