Intrinsically-Motivated Goal-Conditioned Reinforcement Learning in Multi-Agent Environments

by Elías Masquil, et al.

How can a population of reinforcement learning agents autonomously learn a diversity of cooperative tasks in a shared environment? In the single-agent paradigm, goal-conditioned policies have been combined with intrinsic motivation mechanisms to endow agents with the ability to master a wide diversity of autonomously discovered goals. Transferring this idea to cooperative multi-agent systems (MAS) poses a challenge: intrinsically motivated agents that sample goals independently focus on a shared cooperative goal only with low probability, which impairs their learning performance. In this work, we propose a new learning paradigm for modeling such settings, the Decentralized Intrinsically Motivated Skill Acquisition Problem (Dec-IMSAP), and employ it to solve cooperative navigation tasks. Agents in a Dec-IMSAP are trained in a fully decentralized way, in contrast to previous contributions in multi-goal MAS that rely on a centralized goal-selection mechanism. Our empirical analysis indicates that a sufficient condition for efficiently learning a diversity of cooperative tasks is that the group aligns its goals, i.e., the agents pursue the same cooperative goal and learn to coordinate their actions through specialization. We introduce the Goal-coordination game, a fully decentralized emergent communication algorithm in which goal alignment emerges from the maximization of individual rewards in multi-goal cooperative environments, and show that it matches the performance of a centralized training baseline that guarantees aligned goals. To our knowledge, this is the first contribution addressing the problem of intrinsically motivated multi-agent goal exploration in a decentralized training paradigm.
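To make the goal-alignment problem concrete, the sketch below estimates how often independently sampling agents happen to pick the same goal. It assumes each agent draws uniformly from a finite goal set, which is an illustrative assumption and not the goal-sampling scheme of the paper; the function names and the agent and goal counts are hypothetical. Under that assumption, the probability that all N agents agree on one of G goals is G^(1-N).

```python
import random


def alignment_probability(num_agents: int, num_goals: int) -> float:
    """Exact chance that all agents pick the same goal.

    Assumes independent, uniform sampling over num_goals goals
    (an illustrative assumption, not the paper's sampling scheme).
    """
    return (1.0 / num_goals) ** (num_agents - 1)


def simulate_alignment(num_agents: int, num_goals: int,
                       episodes: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity."""
    rng = random.Random(seed)
    aligned = 0
    for _ in range(episodes):
        # Each agent samples its own goal without seeing the others' choices.
        goals = [rng.randrange(num_goals) for _ in range(num_agents)]
        if len(set(goals)) == 1:
            aligned += 1
    return aligned / episodes


if __name__ == "__main__":
    # Hypothetical setting: a few agents, six candidate goals.
    for n in (2, 3, 4):
        exact = alignment_probability(n, 6)
        estimate = simulate_alignment(n, 6)
        print(f"agents={n} exact={exact:.4f} simulated={estimate:.4f}")
```

Under these assumptions the alignment probability shrinks geometrically with the number of agents, which is consistent with the abstract's point that independent goal sampling rarely yields a shared cooperative goal.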




