
A Unified Convergence Analysis for Shuffling-Type Gradient Methods

by Lam M. Nguyen, et al.

In this paper, we provide a unified convergence analysis for a class of shuffling-type gradient methods for solving a well-known finite-sum minimization problem common in machine learning. This class covers variants such as randomized reshuffling, single shuffling, and cyclic/incremental gradient schemes. We consider two settings: strongly convex and non-convex problems. Our main contribution is new non-asymptotic and asymptotic convergence rates for this general class of shuffling-type gradient methods in both settings. While our rate for the non-convex case is new (i.e., not previously known under standard assumptions), the rate for the strongly convex case matches the best-known results up to a constant factor. Unlike existing works in this direction, however, we rely only on standard assumptions such as smoothness and strong convexity. Finally, we empirically illustrate the effect of learning rates via non-convex logistic regression and neural network training examples.
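To make the class of methods concrete, the sketch below shows the common template the abstract describes: each epoch performs one full pass over the n component gradients, and the three variants differ only in how the pass order is chosen. The function names, signature, and the least-squares usage example are illustrative assumptions, not the paper's actual pseudocode or experimental setup.

```python
import numpy as np

def shuffling_gradient_method(grad_i, w0, n, epochs, lr,
                              scheme="random_reshuffling", seed=0):
    """Generic shuffling-type gradient method (illustrative sketch).

    grad_i(i, w) returns the gradient of the i-th component f_i at w.
    One epoch = one pass over all n components in the chosen order.
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float)
    fixed_perm = rng.permutation(n)  # reused every epoch by single shuffling
    for _ in range(epochs):
        if scheme == "random_reshuffling":
            perm = rng.permutation(n)   # fresh permutation each epoch
        elif scheme == "single_shuffling":
            perm = fixed_perm           # one permutation, fixed across epochs
        else:
            perm = np.arange(n)         # cyclic/incremental: fixed order 0..n-1
        for i in perm:
            w = w - lr * grad_i(i, w)   # incremental step on one component
    return w
```

As a toy usage example, minimizing (1/n) * sum_i 0.5 * (w - b_i)^2 drives the iterate toward the mean of b for any of the three schemes, since the minimizer of this finite sum is mean(b).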



