Policy Diversity for Cooperative Agents

08/28/2023
by   Mingxi Tan, et al.

Standard cooperative multi-agent reinforcement learning (MARL) methods aim to find the optimal team cooperative policy to complete a task. However, there may exist multiple different ways of cooperating, and domain experts often need access to these alternatives. Identifying a set of significantly different policies can therefore reduce the task complexity for them. Unfortunately, there is a general lack of effective policy diversity approaches specifically designed for the multi-agent domain. In this work, we propose a method called Moment-Matching Policy Diversity to address this gap. The method generates team policies that differ to varying degrees by formalizing the difference between team policies as the difference in the actions of selected agents across policies. Theoretically, we show that our method is a simple way to implement a constrained optimization problem that regularizes the difference between two trajectory distributions using the maximum mean discrepancy. We demonstrate the effectiveness of our approach on a challenging team-based shooter.
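The abstract refers to the maximum mean discrepancy (MMD) between two trajectory distributions. As a rough illustration of that quantity only, and not of the authors' Moment-Matching Policy Diversity method, the sketch below computes the standard biased estimate of squared MMD between two sample sets with a Gaussian kernel. The function names, the `bandwidth` parameter, and the assumption that each trajectory has already been flattened into a fixed-length feature vector are illustrative choices, not details taken from the paper.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of x and the rows of y."""
    # Pairwise squared Euclidean distances, shape (len(x), len(y)).
    sq_dists = (
        np.sum(x ** 2, axis=1)[:, None]
        + np.sum(y ** 2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased estimate of squared MMD between two sample sets.

    x and y are arrays of shape (n, d) and (m, d). Here each row is assumed
    to be a flattened trajectory feature vector (a hypothetical encoding).
    """
    k_xx = gaussian_kernel(x, x, bandwidth).mean()
    k_yy = gaussian_kernel(y, y, bandwidth).mean()
    k_xy = gaussian_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-ins for trajectory features sampled from two team policies.
    policy_a = rng.normal(loc=0.0, size=(200, 4))
    policy_b = rng.normal(loc=2.0, size=(200, 4))
    print("MMD^2 between policy_a and policy_b:", mmd_squared(policy_a, policy_b))
```

A larger value indicates that the two sample sets are easier to tell apart under the chosen kernel, which is the sense in which a diversity objective could use it as a measure of difference between policies. The result depends on the kernel and bandwidth, so both would need to be tuned for real trajectory data.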

research
08/24/2019

Universal Policies to Learn Them All

We explore a collaborative and cooperative multi-agent reinforcement lea...
research
05/31/2019

Diversity-Inducing Policy Gradient: Using Maximum Mean Discrepancy to Find a Set of Diverse Policies

Standard reinforcement learning methods aim to master one way of solving...
research
03/07/2021

Adaptive Agent Architecture for Real-time Human-Agent Teaming

Teamwork is a set of interrelated reasoning, actions and behaviors of te...
research
06/01/2022

Policy Diagnosis via Measuring Role Diversity in Cooperative Multi-agent RL

Cooperative multi-agent reinforcement learning (MARL) is making rapid pr...
research
11/10/2021

DeCOM: Decomposed Policy for Constrained Cooperative Multi-Agent Reinforcement Learning

In recent years, multi-agent reinforcement learning (MARL) has presented...
research
06/05/2020

Logical Team Q-learning: An approach towards factored policies in cooperative MARL

We address the challenge of learning factored policies in cooperative MA...
