Sampling without Replacement Leads to Faster Rates in Finite-Sum Minimax Optimization

06/07/2022
by Aniket Das, et al.

We analyze the convergence rates of stochastic gradient algorithms for smooth finite-sum minimax optimization and show that, for many such algorithms, sampling the data points without replacement leads to faster convergence compared to sampling with replacement. For the smooth and strongly convex-strongly concave setting, we consider gradient descent ascent and the proximal point method, and present a unified analysis of two popular without-replacement sampling strategies, namely Random Reshuffling (RR), which shuffles the data every epoch, and Single Shuffling or Shuffle Once (SO), which shuffles only at the beginning. We obtain tight convergence rates for RR and SO and demonstrate that these strategies lead to faster convergence than uniform sampling. Moving beyond convexity, we obtain similar results for smooth nonconvex-nonconcave objectives satisfying a two-sided Polyak-Łojasiewicz inequality. Finally, we demonstrate that our techniques are general enough to analyze the effect of data-ordering attacks, where an adversary manipulates the order in which data points are supplied to the optimizer. Our analysis also recovers tight rates for the incremental gradient method, where the data points are not shuffled at all.
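To make the sampling strategies concrete, below is a minimal NumPy sketch (not from the paper) that runs stochastic gradient descent ascent on a toy strongly convex-strongly concave quadratic finite-sum problem under the four data orderings discussed in the abstract: uniform sampling with replacement, Random Reshuffling (RR), Shuffle Once (SO), and the unshuffled incremental gradient ordering. The toy objective, step size, and function names are illustrative assumptions, not the paper's setup or experiments.

```python
import numpy as np

def sgda(sampling, n=32, d=5, epochs=200, lr=0.02, seed=0):
    """Stochastic gradient descent ascent on a toy strongly convex-strongly
    concave finite-sum minimax problem min_x max_y (1/n) sum_i f_i(x, y),
    run under one of four data-ordering schemes."""
    rng = np.random.default_rng(seed)
    # Components f_i(x, y) = (a_i/2)||x||^2 - (b_i/2)||y||^2 + x^T C_i y
    a = rng.uniform(1.0, 2.0, n)
    b = rng.uniform(1.0, 2.0, n)
    C = rng.normal(size=(n, d, d)) / np.sqrt(d)
    x, y = rng.normal(size=d), rng.normal(size=d)

    so_perm = rng.permutation(n)           # fixed once for Shuffle Once
    for _ in range(epochs):
        if sampling == "uniform":          # with replacement, uniform over components
            order = rng.integers(0, n, size=n)
        elif sampling == "rr":             # Random Reshuffling: new permutation each epoch
            order = rng.permutation(n)
        elif sampling == "so":             # Shuffle Once: same permutation every epoch
            order = so_perm
        elif sampling == "ig":             # Incremental Gradient: fixed natural order
            order = np.arange(n)
        else:
            raise ValueError(sampling)
        for i in order:
            gx = a[i] * x + C[i] @ y       # grad_x f_i(x, y)
            gy = -b[i] * y + C[i].T @ x    # grad_y f_i(x, y)
            x, y = x - lr * gx, y + lr * gy  # simultaneous GDA step
    # The toy problem's unique saddle point is (0, 0), so the iterate norm
    # measures distance to the solution.
    return np.sqrt(np.linalg.norm(x) ** 2 + np.linalg.norm(y) ** 2)

for scheme in ["uniform", "rr", "so", "ig"]:
    print(f"{scheme:>7s}: distance to saddle point = {sgda(scheme):.2e}")
```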


