Shapley Counterfactual Credits for Multi-Agent Reinforcement Learning

06/01/2021
by Jiahui Li, et al.

Centralized Training with Decentralized Execution (CTDE) is a popular paradigm in cooperative Multi-Agent Reinforcement Learning (MARL) and is widely used in real applications. One of the major challenges in training is credit assignment, which aims to deduce the contribution of each agent from the global reward. Existing credit assignment methods either decompose the joint value function into individual value functions or measure the impact of local observations and actions on the global value function. These approaches do not thoroughly consider the complicated interactions among multiple agents, leading to unsuitable credit assignment and, in turn, mediocre results on MARL tasks. We propose Shapley Counterfactual Credit Assignment, a novel method for explicit credit assignment that accounts for coalitions of agents. Specifically, the Shapley Value and its desirable properties are leveraged in deep MARL to credit any combination of agents, which lets us estimate the individual credit of each agent. The main technical difficulty is the computational complexity of the Shapley Value, which grows factorially with the number of agents. We therefore approximate it via Monte Carlo sampling, which reduces the sample complexity while maintaining effectiveness. We evaluate our method on StarCraft II benchmarks across different scenarios. Our method significantly outperforms existing cooperative MARL algorithms and achieves state-of-the-art results, with especially large margins on the more difficult tasks.
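
To make the Monte Carlo idea concrete, here is a minimal sketch of generic permutation-sampling Shapley estimation. It is not the paper's estimator: the function names (`monte_carlo_shapley`, `toy_value`), the agent weights, and the synergy bonus are all hypothetical, and `toy_value` merely stands in for whatever coalition value the method actually uses. Each sampled ordering adds agents one at a time and credits each agent with its marginal contribution, so the cost grows with the number of samples rather than with the factorial number of orderings.

```python
import random
from typing import Callable, FrozenSet, List


def monte_carlo_shapley(
    n_agents: int,
    value_fn: Callable[[FrozenSet[int]], float],
    n_samples: int = 1000,
    seed: int = 0,
) -> List[float]:
    """Estimate each agent's Shapley value by sampling random agent orderings."""
    rng = random.Random(seed)
    credits = [0.0] * n_agents
    order = list(range(n_agents))
    for _ in range(n_samples):
        rng.shuffle(order)  # one random permutation of the agents
        coalition: FrozenSet[int] = frozenset()
        prev_value = value_fn(coalition)
        for agent in order:
            coalition = coalition | {agent}
            cur_value = value_fn(coalition)
            # Marginal contribution of `agent` to the agents that preceded it.
            credits[agent] += cur_value - prev_value
            prev_value = cur_value
    return [c / n_samples for c in credits]


def toy_value(coalition: FrozenSet[int]) -> float:
    """Hypothetical coalition value: additive weights plus a synergy bonus
    when agents 0 and 1 are both present (an assumption for illustration)."""
    weights = [1.0, 2.0, 3.0]
    value = sum(weights[a] for a in coalition)
    if 0 in coalition and 1 in coalition:
        value += 2.0
    return value


if __name__ == "__main__":
    estimates = monte_carlo_shapley(3, toy_value, n_samples=2000)
    print(estimates)
```

In this toy game the synergy bonus is split between agents 0 and 1, while agent 2 receives only its own weight. The estimated credits also sum to the value of the full coalition, because the marginal contributions along each sampled ordering telescope.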


research
10/10/2022

Learning Credit Assignment for Cooperative Reinforcement Learning

Cooperative multi-agent policy gradient (MAPG) algorithms have recently ...
research
06/02/2022

RACA: Relation-Aware Credit Assignment for Ad-Hoc Cooperation in Multi-Agent Deep Reinforcement Learning

In recent years, reinforcement learning has faced several challenges in ...
research
09/22/2021

Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning

Cooperative multi-agent reinforcement learning (MARL) faces significant ...
research
02/09/2022

Revisiting QMIX: Discriminative Credit Assignment by Gradient Entropy Regularization

In cooperative multi-agent systems, agents jointly take actions and rece...
research
05/25/2022

QGNN: Value Function Factorisation with Graph Neural Networks

In multi-agent reinforcement learning, the use of a global objective is ...
research
02/10/2020

Q-value Path Decomposition for Deep Multiagent Reinforcement Learning

Recently, deep multiagent reinforcement learning (MARL) has become a hig...
research
11/23/2022

Contrastive Identity-Aware Learning for Multi-Agent Value Decomposition

Value Decomposition (VD) aims to deduce the contributions of agents for ...
