A Generalized Training Approach for Multiagent Learning

09/27/2019
by Paul Müller, et al.

This paper investigates a population-based training regime based on game-theoretic principles called Policy-Space Response Oracles (PSRO). PSRO is general in the sense that it (1) encompasses well-known algorithms such as fictitious play and double oracle as special cases, and (2) in principle applies to general-sum, many-player games. Despite this, prior studies of PSRO have focused on two-player zero-sum games, a regime wherein Nash equilibria are tractably computable. In moving from two-player zero-sum games to more general settings, computation of Nash equilibria quickly becomes infeasible. Here, we extend the theoretical underpinnings of PSRO by considering an alternative solution concept, α-Rank, which is unique (thus faces no equilibrium selection issues, unlike Nash) and tractable to compute in general-sum, many-player settings. We establish convergence guarantees in several game classes, and identify links between Nash equilibria and α-Rank. We demonstrate the competitive performance of α-Rank-based PSRO against an exact Nash solver-based PSRO in 2-player Kuhn and Leduc Poker. We then go beyond the reach of prior PSRO applications by considering 3- to 5-player poker games, yielding instances where α-Rank achieves faster convergence than approximate Nash solvers, thus establishing it as a favorable general games solver. We also carry out an initial empirical validation in MuJoCo soccer, illustrating the feasibility of the proposed approach in another complex domain.
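The abstract names fictitious play as one special case of PSRO. As a rough illustration of that special case only, the sketch below runs fictitious play on rock-paper-scissors, a two-player zero-sum game: each player repeatedly best-responds to the opponent's empirical action counts. It is not the paper's PSRO implementation and does not involve α-Rank; the payoff matrix, the function names `best_response` and `fictitious_play`, the initial action, and the tie-breaking rule are all assumptions chosen for this example.

```python
# Illustrative sketch only: fictitious play on rock-paper-scissors.
# All names and parameters here are hypothetical, not taken from the paper.

# Payoffs for the row player; the game is zero-sum, so the column
# player receives the negation. Action order: rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],   # rock
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def best_response(opponent_counts, payoff):
    """Return the pure action with the highest payoff against the
    opponent's empirical action counts (ties go to the lowest index)."""
    values = [sum(p * c for p, c in zip(row, opponent_counts)) for row in payoff]
    return max(range(len(values)), key=values.__getitem__)


def fictitious_play(payoff, iterations=10000):
    """Run simultaneous fictitious play and return both players'
    empirical action frequencies."""
    n = len(payoff)
    # Column player's payoffs, indexed [column action][row action].
    col_payoff = [[-payoff[i][j] for i in range(n)] for j in range(n)]

    # Assumption: both players start with one play of action 0 (rock).
    row_counts = [1] + [0] * (n - 1)
    col_counts = [1] + [0] * (n - 1)

    for _ in range(iterations):
        # Each player best-responds to the other's history so far.
        r = best_response(col_counts, payoff)
        c = best_response(row_counts, col_payoff)
        row_counts[r] += 1
        col_counts[c] += 1

    row_total = sum(row_counts)
    col_total = sum(col_counts)
    return ([x / row_total for x in row_counts],
            [x / col_total for x in col_counts])


if __name__ == "__main__":
    row_freq, col_freq = fictitious_play(PAYOFF)
    print("row player frequencies:", row_freq)
    print("column player frequencies:", col_freq)
```

In this zero-sum game the empirical frequencies drift toward the uniform mixture over the three actions, which is the game's Nash equilibrium. Roughly speaking, PSRO generalizes this loop by replacing the exact best response with a learned oracle and the empirical average with a configurable meta-solver over the population of policies.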


research
01/03/2019

On Finding Local Nash Equilibria (and Only Local Nash Equilibria) in Zero-Sum Games

We propose a two-timescale algorithm for finding local Nash equilibria i...
research
09/21/2019

Multiagent Evaluation under Incomplete Information

This paper investigates the evaluation of learned multiagent strategies ...
research
06/12/2022

A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games

Algorithms designed for single-agent reinforcement learning (RL) general...
research
02/24/2022

Semidefinite games

We introduce and study the class of semidefinite games, which generalize...
research
09/28/2022

Meta-Learning in Games

In the literature on game-theoretic equilibrium finding, focus has mainl...
research
04/20/2020

Real World Games Look Like Spinning Tops

This paper investigates the geometrical properties of real world games (...
research
09/09/2019

A fixed-point policy-iteration-type algorithm for symmetric nonzero-sum stochastic impulse games

Nonzero-sum stochastic differential games with impulse controls offer a ...
