Many Agent Reinforcement Learning Under Partial Observability

06/17/2021
by Keyang He, et al.

Renewed interest in multi-agent reinforcement learning (MARL) has produced an impressive array of techniques that leverage deep reinforcement learning, primarily actor-critic architectures, and that can be applied to a limited range of settings in terms of observability and communication. However, a continuing limitation of much of this work is the curse of dimensionality in representations based on joint actions, whose number grows exponentially with the number of agents. In this paper, we focus squarely on this challenge of scalability. We apply the key insight of action anonymity, which leads to permutation invariance of joint actions, to two recently presented deep MARL algorithms, MADDPG and IA2C, and compare these instantiations to another recent technique that leverages action anonymity, viz., mean-field MARL. Using a recently introduced pragmatic domain, we show that our instantiations can learn the optimal behavior in a broader class of agent networks than the mean-field method.
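To make the scalability argument concrete, the following is a minimal sketch of how action anonymity shrinks the representation. It is not the paper's implementation. It assumes that only the number of agents choosing each action matters, not which agent chose it. The function names `configuration`, `num_joint_actions`, and `num_configurations` are hypothetical and introduced only for illustration.

```python
from collections import Counter
from math import comb


def configuration(joint_action, num_actions):
    """Map a joint action (one action index per agent) to a count vector.

    Assumption: under action anonymity, agent identities are irrelevant,
    so any permutation of the joint action yields the same count vector.
    """
    counts = Counter(joint_action)
    return tuple(counts.get(a, 0) for a in range(num_actions))


def num_joint_actions(num_agents, num_actions):
    """Number of distinct joint actions: exponential in the number of agents."""
    return num_actions ** num_agents


def num_configurations(num_agents, num_actions):
    """Number of distinct count vectors (multisets of size num_agents),
    given by the stars-and-bars formula; polynomial in the number of agents
    for a fixed number of actions."""
    return comb(num_agents + num_actions - 1, num_actions - 1)


# Two joint actions that differ only in which agent chose which action
# map to the same configuration.
a = configuration((0, 2, 2, 1), num_actions=3)
b = configuration((2, 1, 0, 2), num_actions=3)
print(a, b, a == b)

# Compare the size of the two representations for 10 agents and 3 actions.
print(num_joint_actions(10, 3), num_configurations(10, 3))
```

For 10 agents with 3 actions each, there are 3^10 joint actions, whereas the count-vector representation has only C(12, 2) entries. How such a configuration is fed into a critic or actor is specific to each algorithm and is not shown here.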


research
05/18/2021

Permutation Invariant Policy Optimization for Mean-Field Multi-Agent Reinforcement Learning: A Principled Approach

Multi-agent reinforcement learning (MARL) becomes more challenging in th...
research
05/15/2022

RoMFAC: A Robust Mean-Field Actor-Critic Reinforcement Learning against Adversarial Perturbations on States

Deep reinforcement learning methods for multi-agent systems make optimal...
research
04/25/2023

Partially Observable Mean Field Multi-Agent Reinforcement Learning Based on Graph-Attention

Traditional multi-agent reinforcement learning algorithms are difficultl...
research
03/06/2022

Depthwise Convolution for Multi-Agent Communication with Enhanced Mean-Field Approximation

Multi-agent settings remain a fundamental challenge in the reinforcement...
research
09/09/2021

On the Approximation of Cooperative Heterogeneous Multi-Agent Reinforcement Learning (MARL) using Mean Field Control (MFC)

Mean field control (MFC) is an effective way to mitigate the curse of di...
research
10/31/2019

PIC: Permutation Invariant Critic for Multi-Agent Deep Reinforcement Learning

Sample efficiency and scalability to a large number of agents are two im...
research
01/11/2021

Solving Common-Payoff Games with Approximate Policy Iteration

For artificially intelligent learning systems to have widespread applica...
