
Last-Iterate Convergence with Full- and Noisy-Information Feedback in Two-Player Zero-Sum Games

by Kenshi Abe, et al.

The theory of learning in games is prominent in the AI community, motivated by emerging applications such as multi-agent reinforcement learning and Generative Adversarial Networks. We propose Mutation-driven Multiplicative Weights Update (M2WU) for learning an equilibrium in two-player zero-sum normal-form games and prove that it exhibits last-iterate convergence in both full- and noisy-information feedback settings. In the full-information feedback setting, the players observe the exact gradient vectors of their utility functions. In the noisy-information feedback setting, they observe only noisy gradient vectors. Existing algorithms, including the well-known Multiplicative Weights Update (MWU) and Optimistic MWU (OMWU) algorithms, fail to converge to a Nash equilibrium with noisy-information feedback. In contrast, M2WU exhibits last-iterate convergence to a stationary point near a Nash equilibrium in both feedback settings. We then prove that it converges to an exact Nash equilibrium when the mutation term is adapted iteratively. We empirically confirm that M2WU outperforms MWU and OMWU in exploitability and convergence rates.
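The abstract does not state the update rule, so the following is only a minimal sketch of the idea under stated assumptions. It implements a standard multiplicative weights step and, when `mu > 0`, adds a perturbation of the form `mu * (c - p) / p` toward a uniform reference strategy `c`. That form is an assumption borrowed from replicator-mutator dynamics, not something confirmed by the text above. The rock-paper-scissors matrix, the starting strategies, and all function and parameter names are hypothetical choices for illustration, and only the full-information (exact gradient) setting is shown.

```python
import math

# Rock-paper-scissors payoffs for the row player (illustrative choice).
RPS = [[0.0, -1.0, 1.0],
       [1.0, 0.0, -1.0],
       [-1.0, 1.0, 0.0]]


def gradients(A, x, y):
    """Exact utility gradients: A y for the row player, -A^T x for the column player."""
    g_row = [sum(A[i][j] * y[j] for j in range(len(y))) for i in range(len(x))]
    g_col = [-sum(A[i][j] * x[i] for i in range(len(x))) for j in range(len(y))]
    return g_row, g_col


def update(p, g, eta, mu):
    """One multiplicative weights step on strategy p with gradient g.

    mu = 0 gives plain MWU. mu > 0 adds the ASSUMED mutation term
    mu * (c - p) / p, with c the uniform reference strategy.
    """
    c = 1.0 / len(p)
    logits = [math.log(p[a]) + eta * (g[a] + mu * (c - p[a]) / p[a])
              for a in range(len(p))]
    top = max(logits)  # subtract the max before exponentiating for stability
    w = [math.exp(v - top) for v in logits]
    total = sum(w)
    return [v / total for v in w]


def exploitability(A, x, y):
    """Sum of both players' best-response gains; zero exactly at a Nash equilibrium."""
    g_row, g_col = gradients(A, x, y)
    return max(g_row) + max(g_col)


def run(A, steps, eta, mu, x0=(0.5, 0.3, 0.2), y0=(0.2, 0.5, 0.3)):
    """Simultaneous updates for both players; returns the last iterate."""
    x, y = list(x0), list(y0)
    for _ in range(steps):
        g_row, g_col = gradients(A, x, y)
        x, y = update(x, g_row, eta, mu), update(y, g_col, eta, mu)
    return x, y
```

To compare the two variants, one could call `run(RPS, steps=5000, eta=0.05, mu=0.1)` and `run(RPS, steps=5000, eta=0.05, mu=0.0)` and pass each result to `exploitability`. In this toy game the uniform reference strategy coincides with the equilibrium, so the sketch does not illustrate convergence to a stationary point merely near an equilibrium, nor the iterative adaptation of the mutation term.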




Mutation-Driven Follow the Regularized Leader for Last-Iterate Convergence in Zero-Sum Games

In this study, we consider a variant of the Follow the Regularized Leade...

Exponential Convergence of Gradient Methods in Concave Network Zero-sum Games

Motivated by Generative Adversarial Networks, we study the computation o...

Forward Looking Best-Response Multiplicative Weights Update Methods

We propose a novel variant of the multiplicative weights update method w...

A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games

Algorithms designed for single-agent reinforcement learning (RL) general...

Fast Policy Extragradient Methods for Competitive Games with Entropy Regularization

This paper investigates the problem of computing the equilibrium of comp...

Learning in Matrix Games can be Arbitrarily Complex

A growing number of machine learning architectures, such as Generative A...

Learning in Games with Quantized Payoff Observations

This paper investigates the impact of feedback quantization on multi-age...