ANITA: An Optimal Loopless Accelerated Variance-Reduced Gradient Method

03/21/2021
by Zhize Li, et al.

We propose a novel accelerated variance-reduced gradient method called ANITA for finite-sum optimization. In this paper, we consider both the general convex and the strongly convex settings. In the general convex setting, ANITA achieves the convergence result O(n min{1 + log(1/(ϵ√n)), log √n} + √(nL/ϵ)), improving on the previous best result O(n min{log(1/ϵ), log n} + √(nL/ϵ)) given by Varag (Lan et al., 2019). In particular, for a very wide range of ϵ, namely ϵ ∈ (0, L/(n log²√n)] ∪ [1/√n, +∞), where ϵ is the error tolerance f(x_T) − f* ≤ ϵ and n is the number of data samples, ANITA achieves the optimal convergence result O(n + √(nL/ϵ)), matching the lower bound Ω(n + √(nL/ϵ)) provided by Woodworth and Srebro (2016). To the best of our knowledge, ANITA is the first accelerated algorithm to exactly achieve this optimal result O(n + √(nL/ϵ)) for general convex finite-sum problems. In the strongly convex setting, we also show that ANITA achieves the optimal convergence result O((n + √(nL/μ)) log(1/ϵ)), matching the lower bound Ω((n + √(nL/μ)) log(1/ϵ)) provided by Lan and Zhou (2015). Moreover, ANITA enjoys a simpler loopless algorithmic structure, unlike previous accelerated algorithms such as Katyusha (Allen-Zhu, 2017) and Varag (Lan et al., 2019), which use a less convenient double-loop structure. Finally, experimental results show that ANITA converges faster than the previous state-of-the-art method Varag (Lan et al., 2019), validating our theoretical results and confirming the practical superiority of ANITA.
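To illustrate what the "loopless" structure means in contrast to the double-loop design of Katyusha and Varag, the following minimal Python sketch implements a generic loopless variance-reduced gradient step in the style of L-SVRG. This is an assumption-laden illustration, not ANITA's exact update rules (which additionally involve acceleration/momentum parameters); all function names and parameter values below are hypothetical choices for the sketch.

import numpy as np

def loopless_vr_gd(grad_i, full_grad, x0, n, step_size=0.01, p=None, iters=1000, seed=0):
    """Loopless variance-reduced gradient descent (L-SVRG-style sketch).

    grad_i(x, i): gradient of the i-th component f_i at x.
    full_grad(x): gradient of the full objective (1/n) * sum_i f_i at x.
    """
    rng = np.random.default_rng(seed)
    p = 1.0 / n if p is None else p            # snapshot refresh probability
    x = np.array(x0, dtype=float)
    w = x.copy()                               # snapshot point
    gw = full_grad(w)                          # full gradient at the snapshot
    for _ in range(iters):
        i = rng.integers(n)                    # sample one component uniformly
        # unbiased variance-reduced gradient estimator
        g = grad_i(x, i) - grad_i(w, i) + gw
        x = x - step_size * g
        if rng.random() < p:                   # loopless: refresh snapshot by a coin flip,
            w = x.copy()                       # instead of at the start of an outer-loop epoch
            gw = full_grad(w)
    return x

# Example usage on a least-squares problem (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2:
A, b = np.random.randn(100, 5), np.random.randn(100)
L_max = max(np.sum(A * A, axis=1))             # smoothness constant of the individual components
x_hat = loopless_vr_gd(grad_i=lambda x, i: (A[i] @ x - b[i]) * A[i],
                       full_grad=lambda x: A.T @ (A @ x - b) / len(b),
                       x0=np.zeros(5), n=len(b), step_size=1.0 / (6 * L_max))

With refresh probability p ≈ 1/n, each iteration costs O(1) component gradients plus an amortized O(1) share of a full-gradient pass, so the single-loop design matches the cost of epoch-based methods while being simpler to state and analyze.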


