More Centralized Training, Still Decentralized Execution: Multi-Agent Conditional Policy Factorization

09/26/2022
by Jiangxing Wang, et al.

In cooperative multi-agent reinforcement learning (MARL), combining value decomposition with actor-critic methods enables agents to learn stochastic policies, which are better suited to partially observable environments. Because the goal is to learn local policies that permit decentralized execution, agents are commonly assumed to be independent of each other, even during centralized training. However, this assumption may prevent agents from learning the optimal joint policy. To address this problem, we explicitly incorporate the dependency among agents into centralized training. Although this leads to the optimal joint policy, that policy may not be factorizable for decentralized execution. Nevertheless, we show theoretically that from such a joint policy we can always derive another joint policy that achieves the same optimality and can be factorized for decentralized execution. To this end, we propose multi-agent conditional policy factorization (MACPF), which makes training more centralized while still enabling decentralized execution. We empirically evaluate MACPF on various cooperative MARL tasks and demonstrate that it achieves better performance or faster convergence than the baselines.
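The abstract's central claim can be illustrated with a toy example. The sketch below is not the MACPF algorithm; it is a minimal, hypothetical one-step game (the payoff matrix, function names, and probabilities are all assumptions made for illustration). It shows a joint policy with dependency between agents, the loss incurred when that policy is naively replaced by the product of its marginals, and a different, fully factorized joint policy that reaches the same expected reward.

```python
# Toy one-step cooperative game: two agents, two actions each.
# Hypothetical payoff (an assumption): the team is rewarded only when actions match.
REWARD = [[1.0, 0.0],
          [0.0, 1.0]]

def expected_reward(joint):
    """Expected team reward under a joint action distribution joint[a1][a2]."""
    return sum(joint[a1][a2] * REWARD[a1][a2]
               for a1 in range(2) for a2 in range(2))

def conditional_joint(pi1, pi2_given_a1):
    """Joint policy with dependency: pi(a1, a2) = pi1(a1) * pi2(a2 | a1)."""
    return [[pi1[a1] * pi2_given_a1[a1][a2] for a2 in range(2)]
            for a1 in range(2)]

def independent_joint(pi1, pi2):
    """Factorized joint policy: pi(a1, a2) = pi1(a1) * pi2(a2)."""
    return [[pi1[a1] * pi2[a2] for a2 in range(2)] for a1 in range(2)]

def marginals(joint):
    """Per-agent marginal action distributions of a joint policy."""
    pi1 = [sum(joint[a1]) for a1 in range(2)]
    pi2 = [sum(joint[a1][a2] for a1 in range(2)) for a2 in range(2)]
    return pi1, pi2

# Agent 1 randomizes uniformly; agent 2 copies agent 1's action.
dependent = conditional_joint([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]])

# Dropping the dependency by taking the product of marginals loses reward.
naive = independent_joint(*marginals(dependent))

# A different joint policy that is factorized and equally good:
# both agents deterministically pick action 0.
factorized = independent_joint([1.0, 0.0], [1.0, 0.0])

print("dependent :", expected_reward(dependent))
print("naive     :", expected_reward(naive))
print("factorized:", expected_reward(factorized))
```

In this toy case the dependent policy and the hand-picked factorized policy both attain the maximum reward, while the product of marginals does not. How MACPF actually derives the factorized policy during training is described in the full paper and is not reproduced here.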


research
10/07/2019

Decentralized Multi-Agent Actor-Critic with Generative Inference

Recent multi-agent actor-critic methods have utilized centralized traini...
research
11/06/2022

Decentralized Policy Optimization

The study of decentralized learning or independent learning in cooperati...
research
07/12/2022

Towards Global Optimality in Cooperative MARL with Sequential Transformation

Policy learning in multi-agent reinforcement learning (MARL) is challeng...
research
05/27/2023

Is Centralized Training with Decentralized Execution Framework Centralized Enough for MARL?

Centralized Training with Decentralized Execution (CTDE) has recently em...
research
10/22/2018

Multi-Agent Actor-Critic with Generative Cooperative Policy Network

We propose an efficient multi-agent reinforcement learning approach to d...
research
01/25/2019

Distributed Policy Iteration for Scalable Approximation of Cooperative Multi-Agent Policies

Decision making in multi-agent systems (MAS) is a great challenge due to...
research
02/21/2021

Dealing with Non-Stationarity in Multi-Agent Reinforcement Learning via Trust Region Decomposition

Non-stationarity is one thorny issue in multi-agent reinforcement learni...
