Boosting Value Decomposition via Unit-Wise Attentive State Representation for Cooperative Multi-Agent Reinforcement Learning

05/12/2023
by Qingpeng Zhao, et al.

In cooperative multi-agent reinforcement learning (MARL), environmental stochasticity and uncertainty grow exponentially with the number of agents, which makes it difficult to learn a compact latent representation from partial observations that can support value decomposition. To tackle this issue, we propose a simple yet powerful method, the UNit-wise attentive State Representation (UNSR), which alleviates partial observability and efficiently promotes coordination. In UNSR, each agent learns a compact and disentangled unit-wise state representation produced by transformer blocks and uses it to compute its local action-value function. UNSR then boosts value decomposition by feeding these representations into a multi-head attention mechanism in the mixing network, yielding efficient credit assignment and an effective reasoning path between the individual value functions and the joint value function. Experimental results demonstrate that our method achieves superior performance and data efficiency compared to strong baselines on the StarCraft II micromanagement challenge. Additional ablation experiments identify the key factors contributing to UNSR's performance.
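
The full architectural details are in the paper, but the abstract's description can be illustrated with a minimal PyTorch sketch: a per-agent transformer block that encodes unit-wise observation tokens into a compact representation and a local Q-value head, followed by a multi-head-attention mixing network that combines the agents' chosen Q-values into a joint value. The class names (UnitWiseEncoder, AttentiveMixer), layer sizes, the mean-pooling over unit tokens, and the absolute-value weighting used for monotonic mixing are all assumptions made for illustration, not the paper's exact design.

```python
# Hypothetical sketch of the two components described in the abstract.
# Layer sizes, tokenisation, and the mixing rule are illustrative assumptions.
import torch
import torch.nn as nn


class UnitWiseEncoder(nn.Module):
    """Encodes an agent's observation, split into per-unit tokens, with a
    transformer block; returns a compact state representation and Q-values."""

    def __init__(self, unit_dim: int, n_actions: int, embed_dim: int = 64,
                 n_heads: int = 4):
        super().__init__()
        self.embed = nn.Linear(unit_dim, embed_dim)
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=n_heads, dim_feedforward=2 * embed_dim,
            batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=1)
        self.q_head = nn.Linear(embed_dim, n_actions)

    def forward(self, unit_tokens: torch.Tensor):
        # unit_tokens: (batch, n_units, unit_dim), one token per visible unit
        h = self.encoder(self.embed(unit_tokens))   # (batch, n_units, d)
        state_repr = h.mean(dim=1)                  # pooled compact representation
        q_values = self.q_head(state_repr)          # (batch, n_actions)
        return state_repr, q_values


class AttentiveMixer(nn.Module):
    """Mixes per-agent chosen Q-values into a joint value, with multi-head
    attention over the agents' representations producing credit weights."""

    def __init__(self, embed_dim: int = 64, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim, n_heads, batch_first=True)
        self.weight_head = nn.Linear(embed_dim, 1)
        self.bias_head = nn.Linear(embed_dim, 1)

    def forward(self, agent_reprs: torch.Tensor, chosen_qs: torch.Tensor):
        # agent_reprs: (batch, n_agents, d), chosen_qs: (batch, n_agents)
        attended, _ = self.attn(agent_reprs, agent_reprs, agent_reprs)
        # Absolute value keeps the mixing monotonic in each agent's Q-value.
        w = torch.abs(self.weight_head(attended)).squeeze(-1)
        b = self.bias_head(attended.mean(dim=1)).squeeze(-1)
        return (w * chosen_qs).sum(dim=-1) + b      # joint value, shape (batch,)


if __name__ == "__main__":
    batch, n_agents, n_units, unit_dim, n_actions = 8, 3, 5, 10, 6
    encoder = UnitWiseEncoder(unit_dim, n_actions)  # parameters shared by agents
    mixer = AttentiveMixer()
    obs = torch.randn(batch, n_agents, n_units, unit_dim)
    reprs, qs = [], []
    for i in range(n_agents):
        r, q = encoder(obs[:, i])
        reprs.append(r)
        qs.append(q.max(dim=-1).values)             # greedy local action value
    q_tot = mixer(torch.stack(reprs, dim=1), torch.stack(qs, dim=1))
    print(q_tot.shape)                              # torch.Size([8])
```

Running the script prints the shape of the joint value, one scalar per batch element; sharing encoder parameters across agents is a common MARL choice assumed here for brevity.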

Related research

08/15/2022 · Transformer-based Value Function Decomposition for Cooperative Multi-agent Reinforcement Learning in StarCraft
The StarCraft II Multi-Agent Challenge (SMAC) was created to be a challe...

06/16/2017 · Value-Decomposition Networks For Cooperative Multi-Agent Learning
We study the problem of cooperative multi-agent reinforcement learning w...

09/08/2023 · Leveraging World Model Disentanglement in Value-Based Multi-Agent Reinforcement Learning
In this paper, we propose a novel model-based multi-agent reinforcement ...

02/04/2023 · Dual Self-Awareness Value Decomposition Framework without Individual Global Max for Cooperative Multi-Agent Reinforcement Learning
Value decomposition methods have gradually become popular in the coopera...

12/08/2021 · Greedy-based Value Representation for Optimal Coordination in Multi-agent Reinforcement Learning
Due to the representation limitation of the joint Q value function, mult...

02/14/2023 · Adaptive Value Decomposition with Greedy Marginal Contribution Computation for Cooperative Multi-Agent Reinforcement Learning
Real-world cooperation often requires intensive coordination among agent...

07/08/2022 · Interaction Pattern Disentangling for Multi-Agent Reinforcement Learning
Deep cooperative multi-agent reinforcement learning has demonstrated its...
