
Competitive Gradient Descent
We introduce a new algorithm for the numerical computation of Nash equil...

Solving Min-Max Optimization with Hidden Structure via Gradient Descent Ascent
Many recent AI architectures are inspired by zero-sum games; however, th...

Convergence of Learning Dynamics in Stackelberg Games
This paper investigates the convergence of learning dynamics in Stackelb...

On Solving Minimax Optimization Locally: A Follow-the-Ridge Approach
Many tasks in modern machine learning can be formulated as finding equil...

Asymptotic behaviour of learning rates in Armijo's condition
Fix a constant 0<α<1. For a C^1 function f:ℝ^k→ℝ, a point x and a posit...
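The Armijo condition mentioned in this abstract can be sketched as a backtracking line search: shrink the step size δ until f(x − δ∇f(x)) ≤ f(x) − αδ‖∇f(x)‖². A minimal illustration (function names and constants below are illustrative, not taken from the paper):

```python
import numpy as np

def armijo_step_size(f, grad_f, x, alpha=0.3, delta0=1.0, shrink=0.5):
    """Backtrack until Armijo's condition holds:
    f(x - delta * g) <= f(x) - alpha * delta * ||g||^2."""
    g = grad_f(x)
    delta = delta0
    while f(x - delta * g) > f(x) - alpha * delta * np.dot(g, g):
        delta *= shrink
    return delta

# Toy example: f(x) = 0.5 * ||x||^2, so grad f(x) = x.
f = lambda x: 0.5 * np.dot(x, x)
grad_f = lambda x: x
x = np.array([2.0, -1.0])
delta = armijo_step_size(f, grad_f, x)
```

The returned δ is guaranteed to satisfy the condition by construction; how δ behaves asymptotically along the iterates is the question the paper studies.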

Linear Last-iterate Convergence for Matrix Games and Stochastic Games
Optimistic Gradient Descent Ascent (OGDA) algorithm for saddle-point opt...

A Provably Convergent and Practical Algorithm for Min-max Optimization with Applications to GANs
We present a new algorithm for optimizing min-max loss functions that ar...
Gradient Descent-Ascent Provably Converges to Strict Local Minmax Equilibria with a Finite Timescale Separation
We study the role that a finite timescale separation parameter τ has on gradient descent-ascent in two-player nonconvex, nonconcave zero-sum games, where the learning rate of player 1 is denoted by γ_1 and the learning rate of player 2 is defined to be γ_2=τγ_1. Existing work analyzing the role of timescale separation in gradient descent-ascent has primarily focused on the edge cases of players sharing a learning rate (τ=1) and the maximizing player approximately converging between each update of the minimizing player (τ→∞). For the parameter choice of τ=1, it is known that the learning dynamics are not guaranteed to converge to a game-theoretically meaningful equilibrium in general. In contrast, Jin et al. (2020) showed that the stable critical points of gradient descent-ascent coincide with the set of strict local minmax equilibria as τ→∞. In this work, we bridge the gap between past work by showing there exists a finite timescale separation parameter τ^∗ such that x^∗ is a stable critical point of gradient descent-ascent for all τ∈(τ^∗, ∞) if and only if it is a strict local minmax equilibrium. Moreover, we provide an explicit construction for computing τ^∗ along with corresponding convergence rates and results under deterministic and stochastic gradient feedback. The convergence results we present are complemented by a non-convergence result: given a critical point x^∗ that is not a strict local minmax equilibrium, there exists a finite timescale separation τ_0 such that x^∗ is unstable for all τ∈(τ_0, ∞). Finally, we empirically demonstrate on the CIFAR-10 and CelebA datasets the significant impact timescale separation has on training performance.
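The τ-scaled dynamics described above can be illustrated on a toy quadratic zero-sum game (the function and all constants below are illustrative choices, not taken from the paper): both players update simultaneously, with the maximizing player's learning rate set to γ_2 = τγ_1.

```python
import numpy as np

def gda(grad_x, grad_y, x, y, gamma1, tau, steps):
    """Simultaneous gradient descent-ascent with gamma2 = tau * gamma1."""
    for _ in range(steps):
        gx, gy = grad_x(x, y), grad_y(x, y)
        x, y = x - gamma1 * gx, y + tau * gamma1 * gy
    return x, y

# Toy game f(x, y) = -3x^2 - y^2 + 4xy (illustrative).
# (0, 0) is a strict local minmax point: d^2f/dy^2 = -2 < 0 and the Schur
# complement -6 - 4 * (-1/2) * 4 = 2 > 0. Yet GDA with tau = 1 is unstable
# there; a sufficiently large timescale separation tau stabilizes it.
grad_x = lambda x, y: -6 * x + 4 * y   # df/dx
grad_y = lambda x, y: -2 * y + 4 * x   # df/dy

x1, y1 = gda(grad_x, grad_y, 1.0, 1.0, gamma1=0.01, tau=1.0, steps=2000)  # diverges
x6, y6 = gda(grad_x, grad_y, 1.0, 1.0, gamma1=0.01, tau=6.0, steps=2000)  # -> (0, 0)
```

For this quadratic the continuous-time threshold is τ^∗ = 3 (it makes the trace of the game Jacobian negative), matching the paper's qualitative claim that stability holds for all τ beyond a finite τ^∗.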