Can We Find Nash Equilibria at a Linear Rate in Markov Games?

03/03/2023
by Zhuoqing Song et al.

We study decentralized learning in two-player zero-sum discounted Markov games, where the goal is to design a policy optimization algorithm for either agent that satisfies two properties. First, a player does not need to know the opponent's policy in order to update its own. Second, when both players adopt the algorithm, their joint policy converges to a Nash equilibrium of the game. To this end, we construct a meta-algorithm that provably finds a Nash equilibrium at a global linear rate. The meta-algorithm interweaves two base algorithms via homotopy continuation: one enjoys local linear convergence, while the other converges globally but at a slower sublinear rate. By switching between these two base algorithms, the globally convergent one essentially serves as a "guide" that identifies a benign neighborhood in which the locally convergent one is fast. However, since the exact size of this neighborhood is unknown, we apply a doubling trick to alternate between the two base algorithms. The switching scheme is delicately designed so that the aggregated performance of the meta-algorithm is driven by the fast, locally convergent base algorithm. Furthermore, we prove that both base algorithms can be instantiated by variants of the optimistic gradient descent/ascent (OGDA) method, which is of independent interest.
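The switching scheme described above can be illustrated with a minimal sketch. This is not the paper's algorithm: the paper's base algorithms are OGDA variants for Markov games with specific guarantees, whereas here, purely for illustration, both phases are plain OGDA on an unconstrained bilinear matrix game min_x max_y x^T A y (whose unique equilibrium is the origin when A is full rank), run at two hypothetical step sizes standing in for the "slow but global" and "fast but local" roles. The distance-to-equilibrium measure `gap` is likewise only available here because the equilibrium is known in closed form.

```python
import numpy as np

def ogda(A, x, y, eta, steps):
    """Optimistic gradient descent/ascent on min_x max_y x^T A y.
    Each player uses only its own gradient, so the update is decentralized."""
    gx_prev, gy_prev = A @ y, A.T @ x
    for _ in range(steps):
        gx, gy = A @ y, A.T @ x
        # optimistic update: current gradient plus a correction by the previous one
        x = x - eta * (2 * gx - gx_prev)
        y = y + eta * (2 * gy - gy_prev)
        gx_prev, gy_prev = gx, gy
    return x, y

def gap(A, x, y):
    # distance to the unique equilibrium (0, 0) of this unconstrained game;
    # usable here only because the equilibrium is known in closed form
    return np.linalg.norm(x) + np.linalg.norm(y)

def homotopy_meta(A, x, y, eta_slow, eta_fast, rounds):
    """Doubling-trick switching: alternate a globally convergent 'guide' phase
    and a locally fast phase, doubling the per-phase budget each round since
    the size of the benign neighborhood is unknown."""
    budget = 1
    for _ in range(rounds):
        x, y = ogda(A, x, y, eta_slow, budget)   # guide toward a benign region
        cand = ogda(A, x, y, eta_fast, budget)   # attempt the fast algorithm
        if gap(A, *cand) < gap(A, x, y):         # keep the fast iterate if it helped
            x, y = cand
        budget *= 2                              # doubling trick
    return x, y

# A well-conditioned full-rank game, so (0, 0) is the unique equilibrium
A = np.array([[1.0, 0.5], [-0.5, 1.0]])
x0, y0 = np.array([1.0, -1.0]), np.array([0.5, 1.0])
x, y = homotopy_meta(A, x0, y0, eta_slow=0.1, eta_fast=0.2, rounds=10)
print(gap(A, x, y))  # prints a small value: the iterates approach the equilibrium
```

The acceptance test in `homotopy_meta` mirrors the idea that the fast algorithm's progress is only trusted once the guide has placed the iterate close enough to the equilibrium; in the paper this is arranged so that the aggregate rate is driven by the fast base algorithm.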


