
Optimal Finite-Sum Smooth Non-Convex Optimization with SARAH
The total complexity (measured as the total number of gradient computati...

Stochastic Variance Reduction via Accelerated Dual Averaging for Finite-Sum Optimization
In this paper, we introduce a simplified and unified method for finites...

ZeroSARAH: Efficient Nonconvex Finite-Sum Optimization with Zero Full Gradient Computation
We propose ZeroSARAH – a novel variant of the variance-reduced method SA...

Fiducial Matching for the Approximate Posterior: F-ABC
F-ABC is introduced, using universal sufficient statistics, unlike previ...

A General Distributed Dual Coordinate Optimization Framework for Regularized Loss Minimization
In modern large-scale machine learning applications, the training data a...

Variance Reduction via Primal-Dual Accelerated Dual Averaging for Nonsmooth Convex Finite-Sums
We study structured nonsmooth convex finite-sum optimization that appear...

On the Convergence of Nesterov's Accelerated Gradient Method in Stochastic Settings
We study Nesterov's accelerated gradient method in the stochastic approx...
ANITA: An Optimal Loopless Accelerated Variance-Reduced Gradient Method
We propose a novel accelerated variance-reduced gradient method called ANITA for finite-sum optimization. In this paper, we consider both the general convex and strongly convex settings. In the general convex setting, ANITA achieves the convergence result O(n min{1 + log(1/(ϵ√n)), log√n} + √(nL/ϵ)), which improves the previous best result O(n min{log(1/ϵ), log n} + √(nL/ϵ)) given by Varag (Lan et al., 2019). In particular, for a very wide range of ϵ, i.e., ϵ ∈ (0, L/(n log²√n)] ∪ [1/√n, +∞), where ϵ is the error tolerance f(x_T) - f^* ≤ ϵ and n is the number of data samples, ANITA achieves the optimal convergence result O(n + √(nL/ϵ)), matching the lower bound Ω(n + √(nL/ϵ)) provided by Woodworth and Srebro (2016). To the best of our knowledge, ANITA is the first accelerated algorithm that exactly achieves this optimal result O(n + √(nL/ϵ)) for general convex finite-sum problems. In the strongly convex setting, we also show that ANITA achieves the optimal convergence result O((n + √(nL/μ)) log(1/ϵ)), matching the lower bound Ω((n + √(nL/μ)) log(1/ϵ)) provided by Lan and Zhou (2015). Moreover, ANITA enjoys a simpler loopless algorithmic structure, unlike previous accelerated algorithms such as Katyusha (Allen-Zhu, 2017) and Varag (Lan et al., 2019), which use an inconvenient double-loop structure. Finally, the experimental results show that ANITA converges faster than the previous state of the art, Varag (Lan et al., 2019), validating our theoretical results and confirming the practical superiority of ANITA.