Open-ended Learning in Symmetric Zero-sum Games

01/23/2019
by David Balduzzi, et al.

Zero-sum games such as chess and poker are, abstractly, functions that evaluate pairs of agents, for example labeling them 'winner' and 'loser'. If the game is approximately transitive, then self-play generates sequences of agents of increasing strength. However, nontransitive games, such as rock-paper-scissors, can exhibit strategic cycles, and there is no longer a clear objective: we want agents to increase in strength, but against whom is unclear. In this paper, we introduce a geometric framework for formulating agent objectives in zero-sum games, in order to construct adaptive sequences of objectives that yield open-ended learning. The framework allows us to reason about population performance in nontransitive games, and enables the development of a new algorithm (rectified Nash response, PSRO_rN) that uses game-theoretic niching to construct diverse populations of effective agents, producing a stronger set of agents than existing algorithms. We apply PSRO_rN to two highly nontransitive resource allocation games and find that PSRO_rN consistently outperforms the existing alternatives.
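The distinction between transitive and nontransitive games can be made concrete with a small sketch. The code below is only an illustration and is not taken from the paper: it treats a symmetric zero-sum game as an antisymmetric payoff table over pairs of strategies, then searches for a strategic cycle. The names `PAYOFF`, `is_antisymmetric`, `find_cycle`, and `transitive_payoff` are hypothetical, and modelling a transitive game as a difference of scalar ratings is an assumption made for the example.

```python
# Illustrative sketch only; all names here are hypothetical, not from the paper.
# PAYOFF[i][j] > 0 means strategy i beats strategy j.
STRATEGIES = ["rock", "paper", "scissors"]
PAYOFF = [
    [0, -1, 1],   # rock loses to paper, beats scissors
    [1, 0, -1],   # paper beats rock, loses to scissors
    [-1, 1, 0],   # scissors loses to rock, beats paper
]


def is_antisymmetric(payoff):
    """Symmetric zero-sum: what i wins against j is what j loses against i."""
    n = len(payoff)
    return all(payoff[i][j] == -payoff[j][i] for i in range(n) for j in range(n))


def find_cycle(payoff):
    """Return (i, j, k) where i beats j, j beats k, and k beats i, else None."""
    n = len(payoff)
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if payoff[i][j] > 0 and payoff[j][k] > 0 and payoff[k][i] > 0:
                    return (i, j, k)
    return None


def transitive_payoff(ratings):
    """Assumed model of a transitive game: payoff is a difference of ratings."""
    return [[ri - rj for rj in ratings] for ri in ratings]


if __name__ == "__main__":
    cycle = find_cycle(PAYOFF)
    print("cycle in rock-paper-scissors:", [STRATEGIES[i] for i in cycle])
    print("cycle in rating-based game:", find_cycle(transitive_payoff([1, 2, 3])))
```

In the rating-based game a single number ranks every strategy, so "stronger" is well defined and no cycle can exist. In rock-paper-scissors every strategy is beaten by some other strategy, which is why the objective "increase in strength" needs an opponent or population to be stated against.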


research
12/15/2020

Evolutionary Game Theory Squared: Evolving Agents in Endogenously Evolving Zero-Sum Games

The predominant paradigm in evolutionary game theory and more generally ...
research
10/08/2021

Pick Your Battles: Interaction Graphs as Population-Level Objectives for Strategic Diversity

Strategic diversity is often essential in games: in multi-player games, ...
research
04/20/2020

Real World Games Look Like Spinning Tops

This paper investigates the geometrical properties of real world games (...
research
10/05/2021

Stochastic Multiplicative Weights Updates in Zero-Sum Games

We study agents competing against each other in a repeated network zero-...
research
05/31/2022

Simplex NeuPL: Any-Mixture Bayes-Optimality in Symmetric Zero-sum Games

Learning to play optimally against any mixture over a diverse set of str...
research
09/08/2017

Cycles in adversarial regularized learning

Regularized learning is a fundamental technique in online optimization, ...
research
06/05/2023

Calibrated Stackelberg Games: Learning Optimal Commitments Against Calibrated Agents

In this paper, we introduce a generalization of the standard Stackelberg...
