Consensus Multiplicative Weights Update: Learning to Learn using Projector-based Game Signatures

by Nelson Vadori, et al.

Recently, Optimistic Multiplicative Weights Update (OMWU) was proven to be the first constant step-size algorithm in the online no-regret framework to enjoy last-iterate convergence to Nash Equilibria in the constrained zero-sum bimatrix case, where weights represent the probabilities of playing pure strategies. We introduce the second such algorithm, Consensus MWU (CMWU), for which we prove local convergence and show empirically that it enjoys faster and more robust convergence than OMWU. Our algorithm shows the importance of a new object, the simplex Hessian, as well as of the interaction of the game with the (eigen)space of vectors summing to zero, which we believe future research can build on. Like OMWU, CMWU has convergence guarantees in the zero-sum case only, but Cheung and Piliouras (2020) recently showed that OMWU and MWU display opposite convergence properties depending on whether the game is zero-sum or cooperative. Inspired by this work and the recent literature on learning to optimize for single functions, we extend CMWU to non-zero-sum games by introducing a new framework for online learning in games, where the update rule's gradient and Hessian coefficients along a trajectory are learnt by a reinforcement learning policy that is conditioned on the nature of the game: the game signature. We construct the latter using a new canonical decomposition of two-player games into eight components corresponding to commutative projection operators, generalizing and unifying recent game concepts studied in the literature. We show empirically that our new learning policy is able to exploit the game signature across a wide range of game types.
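
To make the contrast between MWU and OMWU concrete, the following is a minimal sketch of the two standard update rules on a zero-sum bimatrix game. It does not implement CMWU, whose update rule is not stated in the abstract. The game (matching pennies), the step size `eta`, the starting strategies, and the names `run` and `normalize` are all assumptions chosen for illustration.

```python
import numpy as np

def normalize(w):
    """Rescale positive weights so they sum to one (a point on the simplex)."""
    return w / w.sum()

def run(A, x0, y0, eta=0.1, steps=2000, optimistic=False):
    """Play a zero-sum game where x maximizes x^T A y and y minimizes it.

    optimistic=False: plain MWU, weights multiplied by exp(eta * gradient).
    optimistic=True:  OMWU, which uses 2 * (current gradient) - (previous gradient).
    """
    x, y = x0.copy(), y0.copy()
    gx_prev, gy_prev = A @ y, -A.T @ x  # first optimistic step equals a plain MWU step
    for _ in range(steps):
        gx, gy = A @ y, -A.T @ x        # payoff gradients of the two players
        if optimistic:
            sx, sy = 2 * gx - gx_prev, 2 * gy - gy_prev
        else:
            sx, sy = gx, gy
        x = normalize(x * np.exp(eta * sx))
        y = normalize(y * np.exp(eta * sy))
        gx_prev, gy_prev = gx, gy
    return x, y

# Matching pennies: the unique Nash equilibrium is uniform play for both players.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
x0 = np.array([0.6, 0.4])   # illustrative starting strategies (assumed)
y0 = np.array([0.45, 0.55])
eq = np.array([0.5, 0.5, 0.5, 0.5])

def dist(x, y):
    """Euclidean distance of the joint strategy (x, y) from the equilibrium."""
    return float(np.linalg.norm(np.concatenate([x, y]) - eq))

d0 = dist(x0, y0)
d_mwu = dist(*run(A, x0, y0, optimistic=False))
d_omwu = dist(*run(A, x0, y0, optimistic=True))
print("initial:", d0, "MWU last iterate:", d_mwu, "OMWU last iterate:", d_omwu)
```

In this sketch the last iterate of plain MWU ends up farther from the equilibrium than where it started, while the OMWU iterate approaches it, which is the last-iterate behavior the abstract refers to for the zero-sum case.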




