A Maximum Mutual Information Framework for Multi-Agent Reinforcement Learning

06/04/2020
by Woojun Kim, et al.

In this paper, we propose a maximum mutual information (MMI) framework for multi-agent reinforcement learning (MARL) that enables multiple agents to learn coordinated behaviors by regularizing the accumulated return with the mutual information between their actions. By introducing a latent variable that induces nonzero mutual information between actions and applying a variational bound, we derive a tractable lower bound on the MMI-regularized objective function. Applying policy iteration to maximize this lower bound, we obtain a practical algorithm named variational maximum mutual information multi-agent actor-critic (VM3-AC), which follows the centralized training with decentralized execution (CTDE) paradigm. We evaluate VM3-AC on several games that require coordination, and numerical results show that it outperforms MADDPG and other MARL algorithms on these tasks.
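The two ingredients named in the abstract, a shared latent variable that makes the agents' actions statistically dependent and a variational lower bound on their mutual information, can be illustrated on a toy discrete example. The sketch below is not the VM3-AC algorithm. It assumes two agents with two actions each, a uniform binary latent variable, and hand-picked policy tables, and every name in it (`joint_from_latent`, `variational_lower_bound`, `q_crude`, and so on) is hypothetical. It uses the standard bound I(a1; a2) >= H(a1) + E[log q(a1 | a2)], which holds for any distribution q and is tight when q equals the true posterior p(a1 | a2).

```python
import math


def joint_from_latent(p_z, pi1, pi2):
    """Joint p(a1, a2) = sum_z p(z) * pi1(a1 | z) * pi2(a2 | z).

    pi1[z][a] and pi2[z][a] are per-agent policies conditioned on the latent z.
    """
    n1, n2 = len(pi1[0]), len(pi2[0])
    joint = [[0.0] * n2 for _ in range(n1)]
    for z, pz in enumerate(p_z):
        for a1 in range(n1):
            for a2 in range(n2):
                joint[a1][a2] += pz * pi1[z][a1] * pi2[z][a2]
    return joint


def marginals(joint):
    """Return the marginals p(a1) and p(a2) of a joint table."""
    p1 = [sum(row) for row in joint]
    p2 = [sum(row[a2] for row in joint) for a2 in range(len(joint[0]))]
    return p1, p2


def mutual_information(joint):
    """Exact mutual information I(a1; a2) in nats."""
    p1, p2 = marginals(joint)
    mi = 0.0
    for a1, row in enumerate(joint):
        for a2, p in enumerate(row):
            if p > 0:
                mi += p * math.log(p / (p1[a1] * p2[a2]))
    return mi


def variational_lower_bound(joint, q):
    """H(a1) + E_{p(a1, a2)}[log q(a1 | a2)], with q indexed as q[a2][a1]."""
    p1, _ = marginals(joint)
    entropy = -sum(p * math.log(p) for p in p1 if p > 0)
    expected_log_q = sum(
        p * math.log(q[a2][a1])
        for a1, row in enumerate(joint)
        for a2, p in enumerate(row)
        if p > 0
    )
    return entropy + expected_log_q


# Assumed toy setup: a uniform binary latent shared by both agents.
p_z = [0.5, 0.5]
follows_latent = [[0.9, 0.1], [0.1, 0.9]]  # policy that depends on z
ignores_latent = [[0.5, 0.5], [0.5, 0.5]]  # policy that does not depend on z

# Both agents condition on the shared latent: their actions become correlated.
joint_shared = joint_from_latent(p_z, follows_latent, follows_latent)
mi_shared = mutual_information(joint_shared)

# Agent 1 ignores the latent: the actions are independent and the MI vanishes.
joint_indep = joint_from_latent(p_z, ignores_latent, follows_latent)
mi_indep = mutual_information(joint_indep)

# A crude guess for q(a1 | a2) yields a valid but loose lower bound.
q_crude = [[0.6, 0.4], [0.4, 0.6]]
bound_crude = variational_lower_bound(joint_shared, q_crude)

# The true posterior p(a1 | a2) makes the bound tight.
_, p2 = marginals(joint_shared)
q_exact = [[joint_shared[a1][a2] / p2[a2] for a1 in range(2)] for a2 in range(2)]
bound_exact = variational_lower_bound(joint_shared, q_exact)
```

In this toy setting the mutual information is strictly positive only when both policies depend on the shared latent. The gap between `bound_crude` and `mi_shared` shows why a variational distribution would be trained to approximate the posterior rather than fixed by hand.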
