Multi-Agent Cooperation via Unsupervised Learning of Joint Intentions

07/05/2023
by Shanqi Liu, et al.

Cooperative multi-agent reinforcement learning (MARL) is widely used to address complex coordination tasks. Value decomposition methods are popular in MARL, but they struggle on tasks with non-monotonic returns, which restricts their general applicability. Our work highlights the significance of joint intentions in cooperation, which can overcome non-monotonic problems and increase the interpretability of the learning process. To this end, we present a novel MARL method that leverages learnable joint intentions. Our method employs a hierarchical framework consisting of a joint intention policy and a behavior policy to formulate the optimal cooperative policy. The joint intentions are learned autonomously in a latent space through unsupervised learning, and they make the method adaptable to different agent configurations. Our results demonstrate significant performance improvements on both the StarCraft micromanagement benchmark and challenging MAgent domains, showcasing the effectiveness of our method in learning meaningful joint intentions.
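To make the non-monotonicity limitation concrete, the following minimal sketch fits an additive value decomposition to a small two-agent payoff matrix and shows that per-agent greedy action selection misses the best joint action. The payoff matrix, the function names, and the uniform least-squares fit are illustrative assumptions; none of them are taken from the paper, and the additive model stands in for value decomposition methods only as their simplest monotonic case.

```python
# Illustrative two-agent payoff matrix (an assumption, not from the paper).
# Rows are agent 1's actions, columns are agent 2's actions.
# The best joint action (0, 0) is surrounded by heavy penalties,
# so the return is non-monotonic in each agent's individual action.
PAYOFF = [
    [8, -12, -12],
    [-12, 0, 0],
    [-12, 0, 0],
]


def additive_fit(payoff):
    """Least-squares fit of Q_tot(a1, a2) ~ q1[a1] + q2[a2].

    With every joint action weighted equally, the solution is
    row mean + column mean - grand mean.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    grand_mean = sum(sum(row) for row in payoff) / (n_rows * n_cols)
    q1 = [sum(row) / n_cols for row in payoff]
    q2 = [
        sum(payoff[i][j] for i in range(n_rows)) / n_rows - grand_mean
        for j in range(n_cols)
    ]
    return q1, q2


def greedy_joint_action(q1, q2):
    """Each agent independently picks the action maximizing its own utility."""
    a1 = max(range(len(q1)), key=q1.__getitem__)
    a2 = max(range(len(q2)), key=q2.__getitem__)
    return a1, a2


def best_joint_action(payoff):
    """Exhaustive search over joint actions, for comparison."""
    pairs = [(i, j) for i in range(len(payoff)) for j in range(len(payoff[0]))]
    return max(pairs, key=lambda p: payoff[p[0]][p[1]])


q1, q2 = additive_fit(PAYOFF)
greedy = greedy_joint_action(q1, q2)
best = best_joint_action(PAYOFF)
print("greedy:", greedy, "payoff:", PAYOFF[greedy[0]][greedy[1]])
print("best:  ", best, "payoff:", PAYOFF[best[0]][best[1]])
```

In this toy setting the decomposed utilities steer both agents away from the penalized row and column, so they settle on a joint action with payoff 0 rather than 8. This is the kind of coordination failure that a shared signal, such as a joint intention, is meant to avoid.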


research
02/14/2023

Adaptive Value Decomposition with Greedy Marginal Contribution Computation for Cooperative Multi-Agent Reinforcement Learning

Real-world cooperation often requires intensive coordination among agent...
research
03/29/2022

Multi-Agent Asynchronous Cooperation with Hierarchical Reinforcement Learning

Hierarchical multi-agent reinforcement learning (MARL) has shown a signi...
research
10/16/2019

MAVEN: Multi-Agent Variational Exploration

Centralised training with decentralised execution is an important settin...
research
03/19/2020

Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

In many real-world settings, a team of agents must coordinate its behavi...
research
05/28/2022

Multi-agent Databases via Independent Learning

Machine learning is rapidly being used in database research to improve t...
research
02/21/2021

Dealing with Non-Stationarity in Multi-Agent Reinforcement Learning via Trust Region Decomposition

Non-stationarity is one thorny issue in multi-agent reinforcement learni...
research
12/08/2021

Greedy-based Value Representation for Optimal Coordination in Multi-agent Reinforcement Learning

Due to the representation limitation of the joint Q value function, mult...
