Heterogeneous Multi-Agent Reinforcement Learning via Mirror Descent Policy Optimization

08/13/2023
by Mohammad Mehdi Nasiri, et al.

This paper extends the Mirror Descent method to cooperative Multi-Agent Reinforcement Learning (MARL) settings in which agents have varying abilities and individual policies. The proposed Heterogeneous-Agent Mirror Descent Policy Optimization (HAMDPO) algorithm uses the multi-agent advantage decomposition lemma to enable efficient policy updates for each agent while ensuring overall performance improvements. By iteratively updating agent policies through an approximate solution of the trust-region problem, HAMDPO guarantees stability and improves performance. Moreover, HAMDPO handles both continuous and discrete action spaces for heterogeneous agents in various MARL problems. We evaluate HAMDPO on Multi-Agent MuJoCo and StarCraft II tasks, where it outperforms state-of-the-art algorithms such as HATRPO and HAPPO. These results suggest that HAMDPO is a promising approach for cooperative MARL problems and could potentially be extended to other challenging problems in MARL.
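To make the idea of sequential, per-agent mirror descent updates concrete, here is a minimal toy sketch. It is not the authors' HAMDPO implementation: it assumes a single-state cooperative game with two agents, tabular policies, and exact advantages, and it uses the well-known closed-form solution of a KL-regularized step (new policy proportional to the old policy times the exponentiated advantage). The reward matrix, step size, and all function names are hypothetical and chosen only for illustration.

```python
import math
import random

def expected_reward(reward, policies):
    """Expected joint reward under two independent categorical policies."""
    p1, p2 = policies
    return sum(p1[i] * p2[j] * reward[i][j]
               for i in range(len(p1)) for j in range(len(p2)))

def mirror_descent_step(policy, advantages, step_size):
    """Closed-form KL-regularized step: new(a) is proportional to old(a) * exp(step_size * A(a))."""
    weights = [p * math.exp(step_size * adv) for p, adv in zip(policy, advantages)]
    total = sum(weights)
    return [w / total for w in weights]

def agent_advantages(reward, policies, agent):
    """Exact advantage of each action of `agent`, averaged over the other agent's current policy."""
    p1, p2 = policies
    if agent == 0:
        q = [sum(p2[j] * reward[i][j] for j in range(len(p2))) for i in range(len(p1))]
    else:
        q = [sum(p1[i] * reward[i][j] for i in range(len(p1))) for j in range(len(p2))]
    baseline = sum(p * v for p, v in zip(policies[agent], q))
    return [v - baseline for v in q]

def sequential_update(reward, policies, step_size, rng):
    """Update agents one at a time in a random order; later agents see earlier agents' new policies."""
    policies = [list(p) for p in policies]
    order = [0, 1]
    rng.shuffle(order)
    for agent in order:
        adv = agent_advantages(reward, policies, agent)
        policies[agent] = mirror_descent_step(policies[agent], adv, step_size)
    return policies

# Hypothetical cooperative game: both agents are rewarded for matching actions,
# and matching on action 1 pays more than matching on action 0.
reward = [[1.0, 0.0], [0.0, 2.0]]
policies = [[0.5, 0.5], [0.5, 0.5]]
rng = random.Random(0)

history = [expected_reward(reward, policies)]
for _ in range(50):
    policies = sequential_update(reward, policies, step_size=0.5, rng=rng)
    history.append(expected_reward(reward, policies))

print(f"initial expected reward: {history[0]:.3f}")
print(f"final expected reward:   {history[-1]:.3f}")
```

In this toy setting each individual update cannot decrease the expected joint reward, because the other agent's policy is held fixed while one agent takes its exponentiated-advantage step. The full algorithm described in the abstract works with parameterized policies and an approximate solution of the trust-region problem, which this sketch does not attempt to reproduce.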

research
08/26/2022

CH-MARL: A Multimodal Benchmark for Cooperative, Heterogeneous Multi-Agent Reinforcement Learning

We propose a multimodal (vision-and-language) benchmark for cooperative ...
research
03/24/2023

Learning Reward Machines in Cooperative Multi-Agent Tasks

This paper presents a novel approach to Multi-Agent Reinforcement Learni...
research
03/02/2023

GHQ: Grouped Hybrid Q Learning for Heterogeneous Cooperative Multi-agent Reinforcement Learning

Previous deep multi-agent reinforcement learning (MARL) algorithms have ...
research
12/14/2022

SMACv2: An Improved Benchmark for Cooperative Multi-Agent Reinforcement Learning

The availability of challenging benchmarks has played a key role in the ...
research
05/15/2023

Task-Oriented Communication Design at Scale

With countless promising applications in various domains such as IoT and...
research
12/29/2021

DeepHAM: A Global Solution Method for Heterogeneous Agent Models with Aggregate Shocks

We propose an efficient, reliable, and interpretable global solution met...
research
02/21/2021

Dealing with Non-Stationarity in Multi-Agent Reinforcement Learning via Trust Region Decomposition

Non-stationarity is one thorny issue in multi-agent reinforcement learni...
