A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning

11/02/2017
by Marc Lanctot, et al.
To achieve general intelligence, agents must learn how to interact with others in a shared environment: this is the challenge of multiagent reinforcement learning (MARL). The simplest form is independent reinforcement learning (InRL), where each agent treats its experience as part of its (non-stationary) environment. In this paper, we first observe that policies learned using InRL can overfit to the other agents' policies during training, failing to sufficiently generalize during execution. We introduce a new metric, joint-policy correlation, to quantify this effect. We describe an algorithm for general MARL, based on approximate best responses to mixtures of policies generated using deep reinforcement learning, and empirical game-theoretic analysis to compute meta-strategies for policy selection. The algorithm generalizes previous ones such as InRL, iterated best response, double oracle, and fictitious play. Then, we present a scalable implementation which reduces the memory requirement using decoupled meta-solvers. Finally, we demonstrate the generality of the resulting policies in two partially observable settings: gridworld coordination games and poker.
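
The loop described in the abstract (best responses to mixtures of policies, with a meta-strategy computed on the empirical game) can be illustrated on a small matrix game. The following is a minimal sketch under strong simplifying assumptions, not the paper's implementation: pure strategies of rock-paper-scissors stand in for learned policies, an exact best response stands in for the deep reinforcement learning oracle, and fictitious play on the restricted game stands in for the meta-solver. The names `fictitious_play`, `best_response`, and `response_oracle_loop` are hypothetical and chosen only for this example.

```python
# Illustrative sketch only. Pure strategies stand in for learned policies,
# exact best responses stand in for deep RL oracles, and fictitious play
# stands in for the meta-solver. All function names are hypothetical.


def fictitious_play(payoffs, iterations=2000):
    """Approximate meta-strategies for a two-player zero-sum matrix game.

    payoffs[i][j] is the row player's payoff; the column player gets -payoffs[i][j].
    Returns the empirical action frequencies of both players.
    """
    n_rows, n_cols = len(payoffs), len(payoffs[0])
    row_counts = [0] * n_rows
    col_counts = [0] * n_cols
    row_counts[0] = 1  # arbitrary initial action for each player
    col_counts[0] = 1
    for _ in range(iterations):
        # Each player best-responds to the opponent's empirical counts so far.
        row_values = [sum(payoffs[i][j] * col_counts[j] for j in range(n_cols))
                      for i in range(n_rows)]
        col_values = [-sum(payoffs[i][j] * row_counts[i] for i in range(n_rows))
                      for j in range(n_cols)]
        row_counts[row_values.index(max(row_values))] += 1
        col_counts[col_values.index(max(col_values))] += 1
    row_total, col_total = sum(row_counts), sum(col_counts)
    return ([c / row_total for c in row_counts],
            [c / col_total for c in col_counts])


def best_response(values):
    """Index of the highest-valued strategy (ties go to the lowest index)."""
    return values.index(max(values))


def response_oracle_loop(game):
    """Grow each player's population with best responses to the opponent's
    meta-strategy until neither player finds a new strategy to add."""
    n_rows, n_cols = len(game), len(game[0])
    row_pop, col_pop = [0], [0]  # each population starts with one strategy
    while True:
        # Empirical game: payoffs restricted to the current populations.
        empirical = [[game[i][j] for j in col_pop] for i in row_pop]
        row_meta, col_meta = fictitious_play(empirical)
        # Oracle step: best response over ALL strategies to the opponent's mixture.
        row_values = [sum(game[i][j] * p for j, p in zip(col_pop, col_meta))
                      for i in range(n_rows)]
        col_values = [-sum(game[i][j] * p for i, p in zip(row_pop, row_meta))
                      for j in range(n_cols)]
        row_br, col_br = best_response(row_values), best_response(col_values)
        new_row, new_col = row_br not in row_pop, col_br not in col_pop
        if not (new_row or new_col):
            # No new strategy for either player: stop with the current mixtures.
            return row_pop, row_meta, col_pop, col_meta
        if new_row:
            row_pop.append(row_br)
        if new_col:
            col_pop.append(col_br)


# Rock-paper-scissors, row player's payoffs (rows and columns: rock, paper, scissors).
RPS = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]

if __name__ == "__main__":
    row_pop, row_meta, col_pop, col_meta = response_oracle_loop(RPS)
    print("row population:", row_pop, "meta-strategy:", row_meta)
    print("col population:", col_pop, "meta-strategy:", col_meta)
```

In this toy setting the loop terminates because each non-final iteration adds at least one previously unseen pure strategy. Swapping the meta-solver changes the behaviour: for instance, putting all meta-strategy weight on the most recently added strategy would give an iterated-best-response variant, which is one way to read the abstract's claim that the algorithm generalizes earlier methods.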
