
AdaGDA: Faster Adaptive Gradient Descent Ascent Methods for Minimax Optimization

by Feihu Huang, et al.

In this paper, we propose a class of faster adaptive gradient descent ascent methods for solving nonconvex-strongly-concave minimax problems, using the unified adaptive matrices introduced in SUPER-ADAM. Specifically, we propose a fast adaptive gradient descent ascent (AdaGDA) method based on the basic momentum technique, which reaches a sample complexity of O(κ^4 ϵ^-4) for finding an ϵ-stationary point without large batches. This improves on the existing result for adaptive minimax optimization methods by a factor of O(√κ). Moreover, we present an accelerated version of AdaGDA (VR-AdaGDA) based on the momentum-based variance-reduced technique, which achieves the best known sample complexity of O(κ^3 ϵ^-3) for finding an ϵ-stationary point without large batches. Further assuming a bounded Lipschitz parameter of the objective function, we prove that our VR-AdaGDA method reaches a lower sample complexity of O(κ^2.5 ϵ^-3) with a mini-batch size of O(κ). In particular, we provide an effective convergence analysis framework for our adaptive methods based on unified adaptive matrices, which include almost all existing adaptive learning rates.
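To make the idea of momentum-based adaptive gradient descent ascent concrete, here is a minimal Python sketch on a toy problem. It is not the paper's exact AdaGDA update; it only illustrates the general pattern the abstract describes: momentum estimates of the stochastic gradients, an adaptive scaling for each variable, descent in x and ascent in y. Everything specific is an assumption made for illustration: the toy objective f(x, y) = y sin(x) - y^2/2 (nonconvex in x, strongly concave in y), the Adam-style scalar second-moment scaling with a lower bound rho, the Gaussian gradient noise, all step sizes, and the function names.

```python
import math
import random


def grad_x(x, y):
    # Partial derivative in x of the assumed toy objective f(x, y) = y*sin(x) - y**2/2.
    return y * math.cos(x)


def grad_y(x, y):
    # Partial derivative in y of the same toy objective.
    return math.sin(x) - y


def primal_grad(x):
    # With y*(x) = sin(x), the primal function is Phi(x) = sin(x)**2 / 2,
    # so Phi'(x) = sin(x)*cos(x); |Phi'(x)| measures stationarity.
    return math.sin(x) * math.cos(x)


def adaptive_gda(x0, y0, steps=10000, lr_x=0.001, lr_y=0.05,
                 momentum=0.9, beta=0.999, rho=0.1, noise=0.01, seed=0):
    """Simplified momentum-based adaptive GDA (illustrative sketch only)."""
    rng = random.Random(seed)
    x, y = x0, y0
    v = w = 0.0   # momentum estimates of the gradients in x and y
    a = b = 0.0   # running second moments used for the adaptive scaling
    for _ in range(steps):
        # Noisy gradients stand in for single-sample stochastic gradients.
        gx = grad_x(x, y) + noise * rng.gauss(0.0, 1.0)
        gy = grad_y(x, y) + noise * rng.gauss(0.0, 1.0)
        # Basic momentum: exponential moving average of the gradients.
        v = momentum * v + (1.0 - momentum) * gx
        w = momentum * w + (1.0 - momentum) * gy
        # Adam-style second moments; adding rho keeps the scaling bounded below.
        a = beta * a + (1.0 - beta) * gx * gx
        b = beta * b + (1.0 - beta) * gy * gy
        # Descent step on x, ascent step on y, each with its own adaptive scale.
        x -= lr_x * v / (math.sqrt(a) + rho)
        y += lr_y * w / (math.sqrt(b) + rho)
    return x, y


if __name__ == "__main__":
    x, y = adaptive_gda(1.0, 0.0)
    print("x =", x, "y =", y, "|Phi'(x)| =", abs(primal_grad(x)))
```

The step size for x is deliberately much smaller than the one for y, so that y can track the inner maximizer sin(x) while x moves slowly. This two-timescale choice is an assumption of the sketch, made to keep the toy dynamics stable, and is not taken from the paper's parameter settings.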




BiAdam: Fast Adaptive Bilevel Optimization Methods

Bilevel optimization recently has attracted increased interest in machin...

Gradient Descent Ascent for Min-Max Problems on Riemannian Manifold

In the paper, we study a class of useful non-convex minimax optimization...

Bregman Gradient Policy Optimization

In this paper, we design a novel Bregman gradient policy optimization fr...

Accelerated Zeroth-Order Momentum Methods from Mini to Minimax Optimization

In the paper, we propose a new accelerated zeroth-order momentum (Acc-ZO...

SUPER-ADAM: Faster and Universal Framework of Adaptive Gradients

Adaptive gradient methods have shown excellent performance for solving m...

Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning

Temporal-Difference (TD) learning with nonlinear smooth function approxi...

Private optimization in the interpolation regime: faster rates and hardness results

In non-private stochastic convex optimization, stochastic gradient metho...