Faster Single-loop Algorithms for Minimax Optimization without Strong Concavity

12/10/2021
by Junchi Yang, et al.

Gradient descent ascent (GDA), the simplest single-loop algorithm for nonconvex minimax optimization, is widely used in practical applications such as generative adversarial networks (GANs) and adversarial training. Despite its desirable simplicity, recent work shows that GDA has inferior convergence rates in theory, even when the objective is assumed strongly concave in one variable. This paper establishes new convergence results for two alternative single-loop algorithms, alternating GDA and smoothed GDA, under the milder assumption that the objective satisfies the Polyak-Łojasiewicz (PŁ) condition with respect to one variable. We prove that, to find an ϵ-stationary point, (i) alternating GDA and its stochastic variant (without mini-batch) require O(κ^2 ϵ^-2) and O(κ^4 ϵ^-4) iterations, respectively, while (ii) smoothed GDA and its stochastic variant (without mini-batch) require O(κ ϵ^-2) and O(κ^2 ϵ^-4) iterations, respectively, where κ denotes the condition number. The latter greatly improves over vanilla GDA and gives the best complexity results known to date among single-loop algorithms under comparable settings. We further showcase the empirical efficiency of these algorithms in training GANs and in robust nonlinear regression.
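To make the two update rules concrete, the following is a minimal Python sketch on a toy quadratic objective that is strongly concave (hence PŁ) in y. The objective, step sizes, smoothing weight p, and averaging rate beta are illustrative placeholders chosen for this toy problem, not the constants analyzed in the paper.

# Toy objective f(x, y) = 0.5*x^2 + 2*x*y - 0.5*y^2, strongly concave in y,
# so it satisfies the PŁ condition in y. All constants below are
# illustrative placeholders, not the step sizes from the paper's analysis.
def grad_x(x, y):
    return x + 2.0 * y            # df/dx

def grad_y(x, y):
    return 2.0 * x - y            # df/dy

def alternating_gda(x, y, eta_x=0.05, eta_y=0.5, iters=300):
    # Alternating GDA: the ascent step on y uses the freshly updated x.
    for _ in range(iters):
        x = x - eta_x * grad_x(x, y)
        y = y + eta_y * grad_y(x, y)
    return x, y

def smoothed_gda(x, y, eta_x=0.05, eta_y=0.5, p=2.0, beta=0.2, iters=300):
    # Smoothed GDA: x descends the proximally smoothed objective
    # f(x, y) + (p/2)*(x - z)^2, while the auxiliary anchor z slowly tracks x.
    z = x
    for _ in range(iters):
        x = x - eta_x * (grad_x(x, y) + p * (x - z))
        y = y + eta_y * grad_y(x, y)
        z = z + beta * (x - z)
    return x, y

# Both runs should approach the unique stationary point (0, 0).
print(alternating_gda(1.0, 1.0))
print(smoothed_gda(1.0, 1.0))

The only difference between the two schemes is the extra quadratic term and the slowly moving anchor z in smoothed GDA, which is what yields the improved κ-dependence stated in the abstract.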


