Nesterov Accelerated Shuffling Gradient Method for Convex Optimization

02/07/2022
by Trang H. Tran, et al.

In this paper, we propose the Nesterov Accelerated Shuffling Gradient (NASG) method, a new algorithm for convex finite-sum minimization problems. Our method integrates the classical Nesterov acceleration momentum with different shuffling sampling schemes. We show that our algorithm attains an improved rate of 𝒪(1/T) under unified shuffling schemes, where T is the number of epochs; this rate is better than that of any other shuffling gradient method in the convex regime. Our convergence analysis requires neither a bounded-domain nor a bounded-gradient assumption. For randomized shuffling schemes, we improve the convergence bound further. Moreover, under a suitable initial condition, we show that our method converges faster in a small neighborhood of the solution. Numerical simulations demonstrate the efficiency of our algorithm.
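The abstract describes the method only at a high level: each epoch makes one shuffling pass over the n component functions, with Nesterov's momentum layered on top. The following minimal Python sketch, on a toy least-squares finite sum, illustrates one way such a loop can be organized; the epoch-level placement of the momentum step, the schedule beta = (t-1)/(t+2), the step size, and all names are assumptions made for illustration, not the authors' exact NASG update.

    # A minimal sketch, assuming the momentum step is applied once per epoch.
    # All names, schedules, and step sizes here are illustrative assumptions,
    # not the paper's exact method.
    import numpy as np

    rng = np.random.default_rng(0)

    # Toy convex finite sum: f(x) = (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2
    n, d = 100, 10
    A = rng.normal(size=(n, d))
    b = rng.normal(size=n)

    def component_grad(x, i):
        # Gradient of the i-th component f_i(x) = 0.5 * (a_i^T x - b_i)^2.
        return (A[i] @ x - b[i]) * A[i]

    x = np.zeros(d)        # current epoch iterate
    x_prev = x.copy()      # previous epoch iterate, used by the momentum step
    lr = 0.5               # epoch-level step size (assumed constant)

    for t in range(1, 51):                 # T = 50 epochs
        # Nesterov-style extrapolation, applied once per epoch (assumption).
        beta = (t - 1) / (t + 2)           # a standard momentum schedule
        y = x + beta * (x - x_prev)
        x_prev = x.copy()

        # One shuffling pass: visit all n components in a fresh random order,
        # scaling each component step by 1/n so a full pass moves roughly
        # one epoch-level step along the average gradient.
        w = y.copy()
        for i in rng.permutation(n):
            w -= (lr / n) * component_grad(w, i)
        x = w

    print("final objective:", 0.5 * np.mean((A @ x - b) ** 2))

Applying the extrapolation once per epoch keeps the inner pass identical to a standard shuffling gradient sweep; per-iteration momentum is another plausible arrangement, and the paper should be consulted for the exact update.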


Related research

05/29/2019
A unified variance-reduced accelerated gradient method for convex optimization
We propose a novel randomized incremental gradient algorithm, namely, VA...

02/25/2020
Statistically Preconditioned Accelerated Gradient Method for Distributed Optimization
We consider the setting of distributed empirical risk minimization where...

05/25/2021
Practical Schemes for Finding Near-Stationary Points of Convex Finite-Sums
The problem of finding near-stationary points in convex optimization has...

11/24/2020
Shuffling Gradient-Based Methods with Momentum
We combine two advanced ideas widely used in optimization for machine le...

11/05/2020
Accelerated Additive Schwarz Methods for Convex Optimization with Adaptive Restart
Based on an observation that additive Schwarz methods for general convex...

02/27/2020
On the Convergence of Nesterov's Accelerated Gradient Method in Stochastic Settings
We study Nesterov's accelerated gradient method in the stochastic approx...

03/20/2012
On the Equivalence between Herding and Conditional Gradient Algorithms
We show that the herding procedure of Welling (2009) takes exactly the f...