Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond

03/13/2023
by   Jaeyoung Cha, et al.
0

We study convergence lower bounds of without-replacement stochastic gradient descent (SGD) for solving smooth (strongly-)convex finite-sum minimization problems. Unlike most existing results focusing on final iterate lower bounds in terms of the number of components n and the number of epochs K, we seek bounds for arbitrary weighted average iterates that are tight in all factors including the condition number κ. For SGD with Random Reshuffling, we present lower bounds that have tighter κ dependencies than existing bounds. Our results are the first to perfectly close the gap between lower and upper bounds for weighted average iterates in both strongly-convex and convex cases. We also prove weighted average iterate lower bounds for arbitrary permutation-based SGD, which apply to all variants that carefully choose the best permutation. Our bounds improve the existing bounds in factors of n and κ and thereby match the upper bounds shown for a recently proposed algorithm called GraB.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/19/2023

Lower Generalization Bounds for GD and SGD in Smooth Stochastic Convex Optimization

Recent progress was made in characterizing the generalization error of g...
research
10/20/2021

Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond

In distributed learning, local SGD (also known as federated averaging) a...
research
07/31/2019

How Good is SGD with Random Shuffling?

We study the performance of stochastic gradient descent (SGD) on smooth ...
research
06/21/2023

Empirical Risk Minimization with Shuffled SGD: A Primal-Dual Perspective and Improved Bounds

Stochastic gradient descent (SGD) is perhaps the most prevalent optimiza...
research
02/04/2022

Polynomial convergence of iterations of certain random operators in Hilbert space

We study the convergence of random iterative sequence of a family of ope...
research
02/19/2021

Permutation-Based SGD: Is Random Optimal?

A recent line of ground-breaking results for permutation-based SGD has c...
research
07/03/2016

Understanding the Energy and Precision Requirements for Online Learning

It is well-known that the precision of data, hyperparameters, and intern...

Please sign up or login with your details

Forgot password? Click here to reset