Making SGD Parameter-Free

05/04/2022
by Yair Carmon, et al.

We develop an algorithm for parameter-free stochastic convex optimization (SCO) whose rate of convergence is only a double-logarithmic factor larger than the optimal rate for the corresponding known-parameter setting. In contrast, the best previously known rates for parameter-free SCO are based on online parameter-free regret bounds, which contain unavoidable excess logarithmic terms compared to their known-parameter counterparts. Our algorithm is conceptually simple, has high-probability guarantees, and is also partially adaptive to unknown gradient norms, smoothness, and strong convexity. At the heart of our results is a novel parameter-free certificate for SGD step size choice, and a time-uniform concentration result that assumes no a priori bounds on SGD iterates.
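The core idea is to choose the SGD step size from observed quantities rather than from problem parameters such as the distance to the optimum or the gradient norm. As a rough, hypothetical illustration of that idea (not the paper's certificate or bisection procedure), the Python sketch below runs fixed-step-size SGD over a geometric grid of candidate step sizes on a synthetic least-squares problem and keeps the candidate with the best empirical objective on fresh samples; the objective, grid bounds, and sample counts are all illustrative assumptions.

```python
# Hypothetical sketch: pick an SGD step size from data, with no a priori
# problem parameters. This is NOT the paper's algorithm; it only
# illustrates data-driven step size selection on a toy problem.
import numpy as np

rng = np.random.default_rng(0)

def stochastic_grad(x, a, b):
    """Gradient of 0.5 * (a @ x - b)^2 for a single sample (a, b)."""
    return a * (a @ x - b)

def run_sgd(eta, x0, data, steps):
    """Fixed-step-size SGD; returns the averaged iterate."""
    A, y = data
    n = len(y)
    x = x0.copy()
    avg = np.zeros_like(x0)
    for _ in range(steps):
        i = rng.integers(n)
        x = x - eta * stochastic_grad(x, A[i], y[i])
        avg += x
    return avg / steps

def estimate_objective(x, data, samples=200):
    """Empirical half-squared-error on freshly drawn sample indices."""
    A, y = data
    idx = rng.integers(len(y), size=samples)
    return 0.5 * np.mean((A[idx] @ x - y[idx]) ** 2)

# Synthetic least-squares instance (illustrative only).
d, n = 10, 1000
A = rng.normal(size=(n, d))
x_star = rng.normal(size=d)
y = A @ x_star + 0.1 * rng.normal(size=n)
data, x0 = (A, y), np.zeros(d)

# Geometric grid of candidate step sizes; selection uses only observed values.
best_eta, best_val = None, np.inf
for eta in np.geomspace(1e-4, 1e0, num=9):
    x_bar = run_sgd(eta, x0, data, steps=2000)
    val = estimate_objective(x_bar, data)
    if val < best_val:
        best_eta, best_val = eta, val

print(f"selected step size: {best_eta:.2e}, estimated objective: {best_val:.4f}")
```

A grid search like this pays a multiplicative cost in the number of candidates; the appeal of the paper's certificate-based approach is that it reaches a near-optimal step size while losing only a double-logarithmic factor over the known-parameter rate.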


