Limiting Behaviors of Nonconvex-Nonconcave Minimax Optimization via Continuous-Time Systems
Unlike nonconvex optimization, where gradient descent is guaranteed to converge to a local optimizer, algorithms for nonconvex-nonconcave minimax optimization can have topologically different solution paths: sometimes converging to a solution, sometimes never converging and instead following a limit cycle, and sometimes diverging. In this paper, we study the limiting behaviors of three classic minimax algorithms: gradient descent ascent (GDA), alternating gradient descent ascent (AGDA), and the extragradient method (EGM). Numerically, we observe that all of these limiting behaviors can arise in generative adversarial network (GAN) training. To explain these different behaviors, we study the high-order resolution continuous-time dynamics that correspond to each algorithm, which yield sufficient (and almost necessary) conditions for local convergence of each method. Moreover, this ODE perspective allows us to characterize the phase transitions between these different limiting behaviors that are caused by introducing regularization into the problem instance.
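The contrast between the three algorithms can be seen on the classic bilinear toy problem min_x max_y f(x, y) = xy, which is not from the paper but is a standard illustration: simultaneous GDA spirals away from the saddle point, alternating GDA orbits it on a bounded cycle, and the extragradient method converges to it. A minimal sketch:

```python
import numpy as np

# Toy bilinear minimax problem f(x, y) = x * y (min over x, max over y).
# Gradients: df/dx = y, df/dy = x. The unique saddle point is (0, 0).

def gda(x, y, eta):
    # Simultaneous gradient descent ascent: both players use the current iterate.
    return x - eta * y, y + eta * x

def agda(x, y, eta):
    # Alternating GDA: the ascent step sees the freshly updated x.
    x_new = x - eta * y
    return x_new, y + eta * x_new

def egm(x, y, eta):
    # Extragradient: take a trial step, then update using the trial-point gradients.
    x_half, y_half = x - eta * y, y + eta * x
    return x - eta * y_half, y + eta * x_half

def run(step, iters=500, eta=0.1):
    x, y = 1.0, 1.0
    for _ in range(iters):
        x, y = step(x, y, eta)
    return np.hypot(x, y)  # distance to the saddle point (0, 0)

# GDA diverges (distance grows), AGDA cycles (distance stays bounded),
# EGM converges (distance shrinks toward zero).
for name, step in [("GDA", gda), ("AGDA", agda), ("EGM", egm)]:
    print(f"{name}: final distance to saddle = {run(step):.4f}")
```

On this instance the behavior is exact: each GDA step multiplies the squared distance by 1 + eta^2, AGDA's update matrix has unit determinant with eigenvalues on the unit circle, and EGM contracts the distance by a factor below one per step, matching the three limiting behaviors the abstract describes.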