ANITA: An Optimal Loopless Accelerated Variance-Reduced Gradient Method

03/21/2021
by Zhize Li, et al.

We propose a novel accelerated variance-reduced gradient method called ANITA for finite-sum optimization. In this paper, we consider both the general convex and the strongly convex settings. In the general convex setting, ANITA achieves the convergence result O(n min{1 + log(1/(ϵ√n)), log √n} + √(nL/ϵ)), which improves on the previous best result O(n min{log(1/ϵ), log n} + √(nL/ϵ)) given by Varag (Lan et al., 2019). In particular, for a very wide range of ϵ, i.e., ϵ ∈ (0, L/(n log²√n)] ∪ [1/√n, +∞), where ϵ is the error tolerance f(x_T) − f* ≤ ϵ and n is the number of data samples, ANITA can achieve the optimal convergence result O(n + √(nL/ϵ)), matching the lower bound Ω(n + √(nL/ϵ)) provided by Woodworth and Srebro (2016). To the best of our knowledge, ANITA is the first accelerated algorithm that exactly achieves this optimal result O(n + √(nL/ϵ)) for general convex finite-sum problems. In the strongly convex setting, we also show that ANITA can achieve the optimal convergence result O((n + √(nL/μ)) log(1/ϵ)), matching the lower bound Ω((n + √(nL/μ)) log(1/ϵ)) provided by Lan and Zhou (2015). Moreover, ANITA enjoys a simpler loopless algorithmic structure, unlike previous accelerated algorithms such as Katyusha (Allen-Zhu, 2017) and Varag (Lan et al., 2019), which use an inconvenient double-loop structure. Finally, experimental results also show that ANITA converges faster than the previous state-of-the-art Varag (Lan et al., 2019), validating our theoretical results and confirming the practical superiority of ANITA.
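To illustrate what "loopless" means in contrast to the double-loop structure of Katyusha and Varag, the sketch below shows a generic loopless variance-reduced gradient step in the style of L-SVRG, where the full-gradient snapshot is refreshed with a small probability inside a single loop rather than at the start of each fixed-length inner loop. This is a minimal illustrative sketch, not ANITA's actual update rules or acceleration scheme; the function grad_i, the step size 1/L, and the refresh probability p = 1/n are assumptions made for the example.

```python
import numpy as np

def loopless_vr_gradient(grad_i, x0, n, L, p=None, T=1000, rng=None):
    """Minimal sketch of a loopless variance-reduced gradient method
    (L-SVRG-style single loop with probabilistic snapshot refresh).

    NOTE: this is NOT ANITA's update rule; it only illustrates the
    loopless structure that the abstract contrasts with double-loop
    methods such as Katyusha and Varag.

    grad_i(x, i) -- gradient of the i-th component f_i at x (assumed callable)
    n            -- number of data samples / components
    L            -- smoothness constant, used here for the step size 1/L
    p            -- probability of refreshing the snapshot (default 1/n)
    """
    rng = np.random.default_rng() if rng is None else rng
    p = 1.0 / n if p is None else p
    eta = 1.0 / L                      # simple illustrative step size

    x = x0.copy()
    w = x0.copy()                      # snapshot point
    full_grad = np.mean([grad_i(w, i) for i in range(n)], axis=0)

    for _ in range(T):
        i = rng.integers(n)            # sample one component uniformly
        # variance-reduced gradient estimator built from the snapshot
        g = grad_i(x, i) - grad_i(w, i) + full_grad
        x = x - eta * g
        # loopless trick: with small probability p, refresh the snapshot
        # and its full gradient, instead of running a fixed-length inner loop
        if rng.random() < p:
            w = x.copy()
            full_grad = np.mean([grad_i(w, i) for i in range(n)], axis=0)
    return x
```

The single-loop structure with a probabilistic refresh is what removes the bookkeeping of an inner/outer loop; the expected cost per snapshot refresh (about n gradient evaluations every 1/p iterations) matches that of a double-loop epoch.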
