Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning

10/11/2022
by   Anton Bakhtin, et al.

No-press Diplomacy is a complex strategy game involving both cooperation and competition that has served as a benchmark for multi-agent AI research. While self-play reinforcement learning has resulted in numerous successes in purely adversarial games like chess, Go, and poker, self-play alone is insufficient for achieving optimal performance in domains involving cooperation with humans. We address this shortcoming by first introducing a planning algorithm we call DiL-piKL that regularizes a reward-maximizing policy toward a human imitation-learned policy. We prove that this is a no-regret learning algorithm under a modified utility function. We then show that DiL-piKL can be extended into a self-play reinforcement learning algorithm we call RL-DiL-piKL that provides a model of human play while simultaneously training an agent that responds well to this human model. We used RL-DiL-piKL to train an agent we name Diplodocus. In a 200-game no-press Diplomacy tournament involving 62 human participants spanning skill levels from beginner to expert, two Diplodocus agents both achieved a higher average score than all other participants who played more than two games, and ranked first and third according to an Elo ratings model.
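The regularization idea in the abstract can be illustrated with a minimal sketch. It assumes a single decision with fixed action values and a fixed human-imitation ("anchor") policy, and it is not the paper's DiL-piKL procedure, which the abstract describes as a no-regret learning algorithm. Under those assumptions, maximizing expected value minus lam times the KL divergence from the anchor has a standard closed form: the policy is proportional to anchor(a) * exp(Q(a) / lam). The names kl_regularized_policy, q_values, anchor_policy, and lam are hypothetical, chosen for illustration only.

```python
import numpy as np

def kl_regularized_policy(q_values, anchor_policy, lam):
    """Return the policy maximizing E_pi[Q] - lam * KL(pi || anchor).

    Illustrative closed form only (an assumption of this sketch, not the
    paper's algorithm): pi(a) is proportional to anchor(a) * exp(Q(a) / lam).
    """
    q = np.asarray(q_values, dtype=float)
    tau = np.asarray(anchor_policy, dtype=float)
    # Work in log space for numerical stability; actions the anchor never
    # plays (tau == 0) get a logit of -inf and therefore zero probability.
    with np.errstate(divide="ignore"):
        logits = np.log(tau) + q / lam
    logits -= logits.max()  # shift so the largest logit is 0 before exp
    weights = np.exp(logits)
    return weights / weights.sum()

# Hypothetical numbers: three actions, anchor prefers action 1,
# while the action values prefer action 0.
q = [1.0, 0.0, 0.5]
tau = [0.1, 0.7, 0.2]
near_human = kl_regularized_policy(q, tau, lam=1e6)   # strong regularization
near_greedy = kl_regularized_policy(q, tau, lam=0.01)  # weak regularization
```

A large lam keeps the result close to the anchor policy, while a small lam moves it toward the value-maximizing action, so lam controls how human-like the resulting policy is.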


