
Robust Learning Rate Selection for Stochastic Optimization via Splitting Diagnostic

by Matteo Sordello, et al.

This paper proposes SplitSGD, a new stochastic optimization algorithm with a dynamic learning rate selection rule. The procedure decreases the learning rate, for better adaptation to the local geometry of the objective function, whenever a stationary phase is detected, that is, whenever the iterates are likely to be bouncing around a vicinity of a local minimum. Detection is performed by splitting the single optimization thread into two and using the inner products of the gradients from the two threads as a measure of stationarity. This learning rate selection is provably valid, robust to the choice of initial parameters, easy to implement, and incurs essentially no additional computational cost. Finally, we illustrate the robust convergence properties of SplitSGD through extensive experiments.
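To make the splitting diagnostic concrete, here is a minimal, illustrative sketch of a SplitSGD-style loop on a noisy quadratic objective. All parameter names (`split_len`, `q`, `decay`, etc.), the merge-by-averaging step, and the majority-vote threshold are assumptions for illustration, not the paper's exact procedure: the thread is periodically split in two, and a high fraction of negative inner products between the two threads' gradients is taken as evidence of a stationary phase, triggering a learning rate decay.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_grad(x, sigma=0.5):
    # Stochastic gradient of f(x) = 0.5 * ||x||^2 with Gaussian noise.
    return x + sigma * rng.standard_normal(x.shape)

def splitsgd_sketch(x0, lr=0.5, decay=0.5, n_windows=6, split_len=25, q=0.4):
    """Hypothetical SplitSGD-style loop (illustrative, not the paper's exact rule).

    In each window, the single thread is split into two independent SGD
    threads. Near a stationary point the gradients are noise-dominated,
    so inner products between the two threads' gradients are negative
    roughly half the time; a fraction above the threshold q triggers a
    learning rate decay.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(n_windows):
        x1, x2 = x.copy(), x.copy()
        negatives = 0
        for _ in range(split_len):
            g1, g2 = noisy_grad(x1), noisy_grad(x2)
            if g1 @ g2 < 0:          # evidence of stationarity
                negatives += 1
            x1 -= lr * g1
            x2 -= lr * g2
        x = 0.5 * (x1 + x2)          # merge the two threads (assumed rule)
        if negatives / split_len > q:
            lr *= decay              # stationary phase detected: decay
    return x, lr

x_final, lr_final = splitsgd_sketch(np.array([5.0, -3.0]))
print(np.linalg.norm(x_final), lr_final)
```

Because the diagnostic only compares gradients the two threads would compute anyway, it adds essentially no overhead beyond running the split threads, which matches the paper's claim that the selection rule is nearly free computationally.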


Understanding and Detecting Convergence for Stochastic Gradient Descent with Momentum

Convergence detection of iterative stochastic optimization methods is of...

A Probabilistically Motivated Learning Rate Adaptation for Stochastic Optimization

Machine learning practitioners invest significant manual and computation...

Using Statistics to Automate Stochastic Optimization

Despite the development of numerous adaptive optimizers, tuning the lear...

Training Deep Networks without Learning Rates Through Coin Betting

Deep learning methods achieve state-of-the-art performance in many appli...

Data augmentation as stochastic optimization

We present a theoretical framework recasting data augmentation as stocha...

Improving reinforcement learning algorithms: towards optimal learning rate policies

This paper investigates to what extent we can improve reinforcement lear...

A comparison of learning rate selection methods in generalized Bayesian inference

Generalized Bayes posterior distributions are formed by putting a fracti...