ANITA: An Optimal Loopless Accelerated Variance-Reduced Gradient Method
We propose a novel accelerated variance-reduced gradient method called ANITA for finite-sum optimization. In this paper, we consider both the general convex and the strongly convex settings. In the general convex setting, ANITA achieves the convergence result O(n min{1 + log(1/(ϵ√n)), log(√n)} + √(nL/ϵ)), which improves the previous best result O(n min{log(1/ϵ), log n} + √(nL/ϵ)) given by Varag (Lan et al., 2019). In particular, for a very wide range of ϵ, i.e., ϵ ∈ (0, L/(n log²(√n))] ∪ [1/√n, +∞), where ϵ is the error tolerance f(x_T) − f^* ≤ ϵ and n is the number of data samples, ANITA achieves the optimal convergence result O(n + √(nL/ϵ)), matching the lower bound Ω(n + √(nL/ϵ)) provided by Woodworth and Srebro (2016). To the best of our knowledge, ANITA is the first accelerated algorithm that exactly achieves this optimal result O(n + √(nL/ϵ)) for general convex finite-sum problems. In the strongly convex setting, we also show that ANITA achieves the optimal convergence result O((n + √(nL/μ)) log(1/ϵ)), matching the lower bound Ω((n + √(nL/μ)) log(1/ϵ)) provided by Lan and Zhou (2015). Moreover, ANITA enjoys a simpler loopless algorithmic structure, unlike previous accelerated algorithms such as Katyusha (Allen-Zhu, 2017) and Varag (Lan et al., 2019), which rely on an inconvenient double-loop structure. Finally, experimental results show that ANITA converges faster than the previous state-of-the-art method Varag (Lan et al., 2019), validating our theoretical results and confirming the practical superiority of ANITA.
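To make the optimality claim above easier to verify, the following is a short sketch, using only the complexity expression quoted in this abstract, of how the general convex rate collapses to O(n + √(nL/ϵ)) on the two stated ranges of ϵ:

```latex
% Sketch: the general convex rate of ANITA quoted above is
%   O( n min{ 1 + log(1/(eps*sqrt(n))), log(sqrt(n)) } + sqrt(nL/eps) ).
% On the two stated ranges of eps it reduces to the optimal O(n + sqrt(nL/eps)).
\[
  \epsilon \ge \tfrac{1}{\sqrt{n}}
  \;\Rightarrow\; \log\tfrac{1}{\epsilon\sqrt{n}} \le 0
  \;\Rightarrow\; n\min\{\cdot\} \le n
  \;\Rightarrow\; O\!\Big(n + \sqrt{\tfrac{nL}{\epsilon}}\Big),
\]
\[
  \epsilon \le \tfrac{L}{n\log^{2}\!\sqrt{n}}
  \;\Rightarrow\; \sqrt{\tfrac{nL}{\epsilon}} \ge n\log\sqrt{n} \ge n\min\{\cdot\}
  \;\Rightarrow\; O\!\Big(\sqrt{\tfrac{nL}{\epsilon}}\Big)
  \subseteq O\!\Big(n + \sqrt{\tfrac{nL}{\epsilon}}\Big).
\]
```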
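The "loopless" structure mentioned above can be illustrated with a minimal single-loop variance-reduced sketch in the style of L-SVRG, where the full-gradient snapshot is refreshed with a small probability inside the main loop rather than in a separate outer loop. This is only an illustration of the loopless idea, not the ANITA update rule (which is given in the full paper); the names grad_i, lr, p, and num_steps are illustrative assumptions.

```python
import numpy as np

def loopless_vr_gradient(grad_i, x0, n, lr=0.1, p=None, num_steps=1000, rng=None):
    """Illustrative loopless variance-reduced gradient sketch (L-SVRG style).

    grad_i(x, i) should return the gradient of the i-th component function at x.
    This is NOT the ANITA update; it only shows the single-loop structure.
    """
    rng = np.random.default_rng() if rng is None else rng
    p = 1.0 / n if p is None else p          # probability of refreshing the snapshot
    x = np.array(x0, dtype=float)
    w = x.copy()                              # snapshot point
    full_grad = np.mean([grad_i(w, i) for i in range(n)], axis=0)

    for _ in range(num_steps):                # a single loop: no inner/outer epochs
        i = rng.integers(n)
        # variance-reduced stochastic gradient estimator
        g = grad_i(x, i) - grad_i(w, i) + full_grad
        x = x - lr * g
        # with small probability, refresh the snapshot and its full gradient;
        # this replaces the outer loop of double-loop methods
        if rng.random() < p:
            w = x.copy()
            full_grad = np.mean([grad_i(w, i) for i in range(n)], axis=0)
    return x
```

By contrast, double-loop methods such as Katyusha and Varag recompute the snapshot at the start of every outer epoch, which introduces an extra loop and its associated parameters.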