Adaptivity of Stochastic Gradient Methods for Nonconvex Optimization

02/13/2020
by Samuel Horvath et al.

Adaptivity is an important yet under-studied property in modern optimization theory. The gap between the state-of-the-art theory and current practice is striking in that algorithms with desirable theoretical guarantees typically involve drastically different settings of hyperparameters, such as step-size schemes and batch sizes, in different regimes. Despite the appealing theoretical results, such divisive strategies provide little, if any, insight to practitioners seeking algorithms that work broadly without tweaking the hyperparameters. In this work, blending the "geometrization" technique introduced by Lei and Jordan (2016) with the SARAH algorithm of Nguyen et al. (2017), we propose the Geometrized SARAH algorithm for non-convex finite-sum and stochastic optimization. Our algorithm is proved to achieve adaptivity to both the magnitude of the target accuracy and the Polyak-Łojasiewicz (PL) constant, if present. In addition, it simultaneously achieves the best-available convergence rate for non-PL objectives while outperforming existing algorithms for PL objectives.
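For context, below is a minimal sketch of the plain SARAH inner loop of Nguyen et al. (2017) that Geometrized SARAH builds on; it is not the geometrized variant proposed in the paper. The helper `grad_i`, the fixed inner-loop length `m`, and the constant step size `eta` are illustrative assumptions rather than the paper's adaptive schedule.

```python
import numpy as np

def sarah_epoch(grad_i, x0, n, eta, m, rng=None):
    """One epoch of the plain SARAH recursive gradient estimator.

    grad_i(x, i) -- stochastic gradient of the i-th component at x (hypothetical helper)
    x0           -- starting iterate for the epoch
    n            -- number of finite-sum components
    eta          -- constant step size (the paper instead derives an adaptive choice)
    m            -- inner-loop length
    """
    rng = rng or np.random.default_rng()
    # Anchor step: full gradient at the start of the epoch.
    v = np.mean([grad_i(x0, i) for i in range(n)], axis=0)
    x_prev, x = x0, x0 - eta * v
    for _ in range(m):
        i = rng.integers(n)
        # Recursive (biased) estimator: correct the previous estimate with
        # the gradient difference at a single freshly sampled component.
        v = grad_i(x, i) - grad_i(x_prev, i) + v
        x_prev, x = x, x - eta * v
    return x
```

In the paper's variant, the point returned from each epoch and the loop length are chosen via the geometrization technique, which is what yields adaptivity to the target accuracy and the PL constant.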

