
Learning to Play No-Press Diplomacy with Best Response Policy Iteration

06/08/2020
by Thomas Anthony, et al.

Recent advances in deep reinforcement learning (RL) have led to considerable progress in many 2-player zero-sum games, such as Go, Poker and StarCraft. The purely adversarial nature of such games allows for a conceptually simple and principled application of RL methods. However, real-world settings are many-agent, and agent interactions are complex mixtures of common-interest and competitive aspects. We consider Diplomacy, a 7-player board game designed to accentuate dilemmas resulting from many-agent interactions. It also features a large combinatorial action space and simultaneous moves, both of which are challenging for RL algorithms. We propose a simple yet effective approximate best response operator, designed to handle large combinatorial action spaces and simultaneous moves. We also introduce a family of policy iteration methods that approximate fictitious play. With these methods, we successfully apply RL to Diplomacy: we show that our agents convincingly outperform the previous state of the art, and game-theoretic equilibrium analysis shows that the new process yields consistent improvements.
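
For readers unfamiliar with fictitious play, the following is a minimal sketch of the classic textbook procedure on a small two-player zero-sum matrix game. It is not the paper's method: the paper targets 7-player Diplomacy with approximate best responses, whereas this sketch uses exact best responses on rock-paper-scissors. The function name `fictitious_play`, its parameters, and the example payoff matrix are assumptions made for illustration only.

```python
import numpy as np

def fictitious_play(payoff, iterations=2000):
    """Textbook fictitious play for a two-player zero-sum matrix game.

    payoff[i, j] is the row player's payoff; the column player gets -payoff[i, j].
    Returns the empirical (time-averaged) strategies of both players.
    Illustrative sketch only; names and defaults are hypothetical.
    """
    payoff = np.asarray(payoff, dtype=float)
    n_rows, n_cols = payoff.shape
    row_counts = np.zeros(n_rows)
    col_counts = np.zeros(n_cols)
    # Arbitrary initial actions so the empirical frequencies are defined.
    row_counts[0] += 1
    col_counts[0] += 1
    for _ in range(iterations):
        # Each player best-responds to the opponent's empirical action counts.
        row_br = np.argmax(payoff @ col_counts)   # row player maximizes payoff
        col_br = np.argmin(row_counts @ payoff)   # column player minimizes it
        row_counts[row_br] += 1
        col_counts[col_br] += 1
    return row_counts / row_counts.sum(), col_counts / col_counts.sum()

# Rock-paper-scissors payoffs for the row player (rows and columns: R, P, S).
rps = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]
row_strategy, col_strategy = fictitious_play(rps)
```

In this toy game the empirical strategies drift toward the uniform mixture, which is the equilibrium of rock-paper-scissors. The exact `argmax` over actions is what becomes infeasible in a large combinatorial action space, which is the gap an approximate best response operator is meant to fill.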

10/10/2021

Reinforcement Learning In Two Player Zero Sum Simultaneous Action Games

Two player zero sum simultaneous action games are common in video games,...
02/21/2017

Beating the World's Best at Super Smash Bros. with Deep Reinforcement Learning

There has been a recent explosion in the capabilities of game-playing ar...
08/17/2020

Playing Catan with Cross-dimensional Neural Network

Catan is a strategic board game having interesting properties, including...
06/12/2022

A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games

Algorithms designed for single-agent reinforcement learning (RL) general...
09/11/2020

Physically Embedded Planning Problems: New Challenges for Reinforcement Learning

Recent work in deep reinforcement learning (RL) has produced algorithms ...
07/13/2022

Self-Play PSRO: Toward Optimal Populations in Two-Player Zero-Sum Games

In competitive two-agent environments, deep reinforcement learning (RL) ...