Influence-Based Multi-Agent Exploration

10/12/2019
by Tonghan Wang, et al.

Intrinsically motivated reinforcement learning aims to address the exploration challenge in sparse-reward tasks. However, the study of exploration methods in transition-dependent multi-agent settings is largely absent from the literature. We aim to take a step towards solving this problem. Exploiting the role of interaction in the coordinated behaviors of agents, we present two exploration methods: exploration via information-theoretic influence (EITI) and exploration via decision-theoretic influence (EDTI). EITI uses mutual information to capture one agent's influence on the transition dynamics of other agents. EDTI uses a novel intrinsic reward, called Value of Interaction (VoI), to characterize and quantify the influence of one agent's behavior on the expected returns of other agents. By optimizing the EITI or EDTI objective as a regularizer, agents are encouraged to coordinate their exploration and learn policies that optimize team performance. We show how to optimize these regularizers so that they can be easily integrated with policy gradient reinforcement learning. The resulting update rule draws a connection between coordinated exploration and intrinsic reward distribution. Finally, we empirically demonstrate the significant strength of our methods in a variety of multi-agent scenarios.
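To make the information-theoretic notion of influence concrete, the following is a minimal sketch, not the authors' implementation. It assumes a small tabular setting and estimates a conditional mutual information I(X; Y | Z) from observed samples with a simple plug-in (count-based) estimator. The mapping of variables is an assumption made for illustration: Z stands for the influenced agent's own state-action pair, X for the other agent's state-action pair, and Y for the influenced agent's next state. The function name and the toy data are hypothetical.

```python
from collections import Counter
from math import log


def conditional_mutual_information(samples):
    """Plug-in estimate of I(X; Y | Z) in nats from a list of (z, x, y) tuples.

    Illustrative assumption: z is the influenced agent's own (state, action),
    x is the other agent's (state, action), y is the influenced agent's next state.
    """
    n = len(samples)
    c_zxy = Counter(samples)
    c_zx = Counter((z, x) for z, x, _ in samples)
    c_zy = Counter((z, y) for z, _, y in samples)
    c_z = Counter(z for z, _, _ in samples)

    total = 0.0
    for (z, x, y), k in c_zxy.items():
        # p(z,x,y) * log[ p(x,y|z) / (p(x|z) * p(y|z)) ], written with raw counts.
        total += (k / n) * log(k * c_z[z] / (c_zx[(z, x)] * c_zy[(z, y)]))
    return total


# Toy data: in `dependent`, the next state y copies the other agent's choice x;
# in `independent`, y does not depend on x at all.
dependent = [(0, 0, 0), (0, 1, 1)] * 50
independent = [(0, x, y) for x in (0, 1) for y in (0, 1)] * 25

influence_dependent = conditional_mutual_information(dependent)
influence_independent = conditional_mutual_information(independent)
```

In this toy setup the estimate is large when the other agent's choice determines the next state and vanishes when it has no effect, which is the intuition behind using mutual information as a measure of influence. A practical method would need an estimator that scales beyond small tabular counts.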


