Surrogate Losses for Online Learning of Stepsizes in Stochastic Non-Convex Optimization

01/25/2019
by   Zhenxun Zhuang, et al.

Stochastic Gradient Descent (SGD) has played a central role in machine learning. However, it requires a carefully hand-picked stepsize for fast convergence, which is notoriously tedious and time-consuming to tune. Over the last several years, a plethora of adaptive gradient-based algorithms have emerged to ameliorate this problem. They have proved effective at reducing the labor of tuning in practice, but many of them lack theoretical guarantees even in the convex setting. In this paper, we propose new surrogate losses to cast the problem of learning the optimal stepsizes for the stochastic optimization of a non-convex smooth objective function as an online convex optimization problem. This allows the use of no-regret online algorithms to compute optimal stepsizes on the fly. In turn, this yields an SGD algorithm with self-tuned stepsizes whose convergence rates adapt automatically to the level of noise.
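The abstract describes the mechanism only at a high level. The sketch below is a minimal illustration of the general idea, not the paper's exact algorithm: the stepsize eta is treated as the decision variable of an online learner, which runs projected online gradient descent on a convex quadratic surrogate loss. The specific surrogate form, the use of two independent stochastic gradients, and the names grad_fn, M, and lr_eta are all illustrative assumptions.

import numpy as np

def sgd_with_learned_stepsize(grad_fn, x0, T=1000, M=1.0, eta0=0.1, lr_eta=0.01):
    """Illustrative sketch: SGD whose stepsize is tuned on the fly by a
    no-regret online algorithm. Assumes grad_fn(x) returns a fresh,
    independent stochastic gradient on each call, and M estimates the
    smoothness constant of the objective."""
    x, eta = np.asarray(x0, dtype=float), eta0
    for t in range(T):
        g = grad_fn(x)        # stochastic gradient used for the SGD step
        g_prime = grad_fn(x)  # second, independent stochastic gradient
        # Hypothetical surrogate loss, convex in eta:
        #   l_t(eta) = -eta * <g', g> + (M / 2) * eta**2 * ||g||**2
        # Its derivative in eta drives a projected online gradient step.
        d_eta = -np.dot(g_prime, g) + M * eta * np.dot(g, g)
        eta = max(eta - lr_eta * d_eta, 0.0)  # project onto eta >= 0
        x = x - eta * g                       # SGD step with learned stepsize
    return x

# Example usage on a noisy quadratic; the stepsize adapts to the noise level.
rng = np.random.default_rng(0)
grad = lambda x: 2 * x + rng.normal(scale=0.1, size=x.shape)
x_final = sgd_with_learned_stepsize(grad, x0=np.ones(5))

Because the surrogate is convex in eta even when the underlying objective is non-convex, any no-regret online learner (projected online gradient descent is used here only for simplicity) can play the role of the stepsize tuner.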


Related research

05/24/2019: Momentum-Based Variance Reduction in Non-Convex SGD
Variance reduction has emerged in recent years as a strong competitor to...

03/11/2021: BODAME: Bilevel Optimization for Defense Against Model Extraction
Model extraction attacks have become serious issues for service provider...

05/21/2018: On the Convergence of Stochastic Gradient Descent with Adaptive Stepsizes
Stochastic gradient descent is the method of choice for large scale opti...

10/28/2019: Online Stochastic Gradient Descent with Arbitrary Initialization Solves Non-smooth, Non-convex Phase Retrieval
In recent literature, a general two step procedure has been formulated f...

02/10/2020: Adaptive Online Learning with Varying Norms
Given any increasing sequence of norms ‖·‖_0, ..., ‖·‖_{T-1}, we provide an onli...

07/17/2023: Universal Online Learning with Gradual Variations: A Multi-layer Online Ensemble Approach
In this paper, we propose an online convex optimization method with two ...

06/17/2023: Adaptive Strategies in Non-convex Optimization
An algorithm is said to be adaptive to a certain parameter (of the probl...
