On the Convergence of SARAH and Beyond

06/05/2019
by Bingcong Li, et al.

The main theme of this work is a unifying algorithm, abbreviated as L2S (LoopLess SARAH), that can deal with (strongly) convex and nonconvex empirical risk minimization (ERM) problems. It broadens a recently developed variance reduction method known as SARAH. L2S enjoys a linear convergence rate for strongly convex problems, which also implies that the last iterate of SARAH's inner loop converges linearly. For convex problems, unlike SARAH, L2S can afford step and mini-batch sizes that do not depend on the data size n, and the complexity needed to guarantee E[‖∇F(x)‖²] ≤ ϵ is O(n + n/ϵ). For nonconvex problems, on the other hand, the complexity is O(n + √n/ϵ). In parallel to L2S, a few side results are developed. Leveraging an aggressive step size, D2S is proposed as a more efficient alternative to L2S and SARAH-like algorithms; specifically, D2S requires a reduced IFO complexity of O((n + κ̄) ln(1/ϵ)) for strongly convex problems, where κ̄ denotes the average condition number. Moreover, to avoid the tedious selection of the optimal step size, an automatic tuning scheme is developed, which attains empirical performance comparable to that of SARAH with a judiciously tuned step size.
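To make the "loopless" idea in the abstract concrete, below is a minimal sketch of a loopless SARAH-style recursive gradient method, illustrated on a least-squares problem. The restart probability p, the step size eta, and the objective are illustrative assumptions for this example, not parameters prescribed by the paper: at each iteration the estimator is either restarted with a full gradient (with probability p) or updated with SARAH's recursive rule, replacing SARAH's explicit inner loop.

```python
# A minimal sketch of a loopless SARAH-style method (assumed setup:
# least squares, restart probability p ~ 1/n, fixed step size eta).
import numpy as np

def loopless_sarah(A, b, eta=0.05, p=None, iters=2000, seed=0):
    """Minimize F(x) = (1/2n) * ||Ax - b||^2 with a SARAH-type estimator.

    With probability p the estimator is restarted with the full gradient;
    otherwise it is updated recursively:
        v_t = grad_i(x_t) - grad_i(x_{t-1}) + v_{t-1}.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    p = p if p is not None else 1.0 / n   # restart roughly once per epoch
    x_prev = np.zeros(d)
    v = A.T @ (A @ x_prev - b) / n        # full gradient at initialization
    x = x_prev - eta * v
    for _ in range(iters):
        if rng.random() < p:
            v = A.T @ (A @ x - b) / n     # restart: full gradient
        else:
            i = rng.integers(n)           # one sampled component gradient
            gi_new = A[i] * (A[i] @ x - b[i])
            gi_old = A[i] * (A[i] @ x_prev - b[i])
            v = gi_new - gi_old + v       # SARAH recursive update
        x_prev, x = x, x - eta * v
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A, x_true = rng.standard_normal((200, 10)), rng.standard_normal(10)
    b = A @ x_true
    x_hat = loopless_sarah(A, b)
    print("gradient norm:", np.linalg.norm(A.T @ (A @ x_hat - b) / 200))
```

Because restarts happen by coin flip rather than on a fixed inner-loop schedule, the last iterate is analyzed directly, which is what permits step and mini-batch sizes that do not scale with n in the convex case.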


Related research

Adaptive Step Sizes in Variance Reduction via Regularization (10/15/2019)
The main goal of this work is equipping convex and nonconvex problems wi...

Accelerating Mini-batch SARAH by Step Size Rules (06/20/2019)
StochAstic Recursive grAdient algoritHm (SARAH), originally proposed for...

Stochastic algorithms with geometric step decay converge linearly on sharp functions (07/22/2019)
Stochastic (sub)gradient methods require step size schedule tuning to pe...

SVRG meets SAGA: k-SVRG --- A Tale of Limited Memory (05/02/2018)
In recent years, many variance reduced algorithms for empirical risk min...

On the convergence and sampling of randomized primal-dual algorithms and their application to parallel MRI reconstruction (07/25/2022)
The Stochastic Primal-Dual Hybrid Gradient or SPDHG is an algorithm prop...

Searching for Optimal Per-Coordinate Step-sizes with Multidimensional Backtracking (06/05/2023)
The backtracking line-search is an effective technique to automatically ...

Almost Tune-Free Variance Reduction (08/25/2019)
The variance reduction class of algorithms including the representative ...
