Learning Cooperative Multi-Agent Policies with Partial Reward Decoupling

12/23/2021
by Benjamin Freed, et al.

One of the preeminent obstacles to scaling multi-agent reinforcement learning (MARL) to large numbers of agents is assigning credit to individual agents' actions. In this paper, we address this credit assignment problem with an approach that we call partial reward decoupling (PRD), which attempts to decompose large cooperative multi-agent RL problems into decoupled subproblems involving subsets of agents, thereby simplifying credit assignment. We empirically demonstrate that decomposing the RL problem using PRD in an actor-critic algorithm results in lower-variance policy gradient estimates, which in turn improve data efficiency, learning stability, and asymptotic performance across a wide array of multi-agent RL tasks, compared to various other actor-critic approaches. Additionally, we relate our approach to counterfactual multi-agent policy gradient (COMA), a state-of-the-art MARL algorithm, and empirically show that our approach outperforms COMA by making better use of the information in agents' reward streams and by enabling recent advances in advantage estimation to be used.
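As a rough illustration of the decoupling idea, the sketch below contrasts a shared team return with a per-agent return that sums only the rewards of agents in that agent's subset. It is a toy under stated assumptions, not the paper's algorithm: the abstract does not say how the subsets are identified, so the fixed boolean matrix `relevant` and the helper names `team_returns` and `decoupled_returns` are hypothetical.

```python
from typing import List


def team_returns(rewards: List[List[float]], gamma: float) -> List[float]:
    """Discounted return of the summed team reward; every agent gets the same signal."""
    n_agents = len(rewards[0])
    g = 0.0
    for step in reversed(rewards):
        g = sum(step) + gamma * g
    return [g] * n_agents


def decoupled_returns(
    rewards: List[List[float]], relevant: List[List[bool]], gamma: float
) -> List[float]:
    """Per-agent discounted return counting only rewards of agents marked relevant.

    relevant[i][j] is a hypothetical, hand-specified flag meaning that agent j's
    reward is included in agent i's learning signal.
    """
    n_agents = len(relevant)
    out = [0.0] * n_agents
    for step in reversed(rewards):
        for i in range(n_agents):
            partial = sum(r for j, r in enumerate(step) if relevant[i][j])
            out[i] = partial + gamma * out[i]
    return out


# Toy data: 3 timesteps, 3 agents; agents 0 and 1 interact, agent 2 acts alone.
rewards = [[1.0, 0.0, 5.0], [0.0, 2.0, -3.0], [1.0, 1.0, 4.0]]
relevant = [
    [True, True, False],
    [True, True, False],
    [False, False, True],
]

team = team_returns(rewards, gamma=1.0)
decoupled = decoupled_returns(rewards, relevant, gamma=1.0)
print(team)       # [11.0, 11.0, 11.0]
print(decoupled)  # [5.0, 5.0, 6.0]
```

In this toy, the learning signal for agents 0 and 1 no longer fluctuates with agent 2's rewards, which is the intuition behind the lower-variance gradient estimates the abstract reports.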

research
10/16/2021

Local Advantage Actor-Critic for Robust Multi-Agent Deep Reinforcement Learning

Policy gradient methods have become popular in multi-agent reinforcement...
research
08/02/2019

Health-Informed Policy Gradients for Multi-Agent Reinforcement Learning

This paper proposes a definition of system health in the context of mult...
research
04/01/2020

Counterfactual Multi-Agent Reinforcement Learning with Graph Convolution Communication

We consider a fully cooperative multi-agent system where agents cooperat...
research
10/10/2022

Learning Credit Assignment for Cooperative Reinforcement Learning

Cooperative multi-agent policy gradient (MAPG) algorithms have recently ...
research
07/11/2019

Rethink Global Reward Game and Credit Assignment in Multi-agent Reinforcement Learning

Cooperative game is a critical research area in multi-agent reinforcemen...
research
02/24/2020

Scalable Multi-Agent Inverse Reinforcement Learning via Actor-Attention-Critic

Multi-agent adversarial inverse reinforcement learning (MA-AIRL) is a re...
research
09/27/2019

Multi-Agent Actor-Critic with Hierarchical Graph Attention Network

Most previous studies on multi-agent reinforcement learning focus on der...
