SCRIMP: Scalable Communication for Reinforcement- and Imitation-Learning-Based Multi-Agent Pathfinding

03/01/2023
by Yutong Wang et al.

Trading off performance guarantees in favor of scalability, the Multi-Agent Path Finding (MAPF) community has recently started to embrace Multi-Agent Reinforcement Learning (MARL), where agents learn to collaboratively generate individual, collision-free (but often suboptimal) paths. Scalability is usually achieved by assuming a local field of view (FOV) around the agents, helping scale to arbitrary world sizes. However, this assumption significantly limits the amount of information available to the agents, making it difficult for them to enact the type of joint maneuvers needed in denser MAPF tasks. In this paper, we propose SCRIMP, where agents learn individual policies from even very small (down to 3x3) FOVs by relying on a highly scalable global/local communication mechanism based on a modified transformer. We further equip agents with a state-value-based tie-breaking strategy to improve performance in symmetric situations, and introduce intrinsic rewards to encourage exploration while mitigating the long-term credit assignment problem. Empirical evaluations indicate that SCRIMP achieves higher performance and better scalability than other state-of-the-art learning-based MAPF planners with larger FOVs, and in many cases even yields performance similar to that of a classical centralized planner. Ablation studies further validate the effectiveness of our proposed techniques. Finally, we show through high-fidelity simulations in Gazebo that our trained model can be directly deployed on real robots for online MAPF.
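To make the core idea concrete, the sketch below illustrates the kind of transformer-based inter-agent communication the abstract describes: each agent encodes its small local FOV into a token, all agents exchange information through self-attention, and a value head produces the per-agent state value that can drive tie-breaking. This is a minimal, hypothetical PyTorch sketch, not the authors' implementation; the class name CommBlock, all layer sizes, and the 5-action output are illustrative assumptions.

```python
# Minimal sketch (not the SCRIMP code) of transformer-based communication
# over small local-FOV observations. All shapes and sizes are assumptions.
import torch
import torch.nn as nn

class CommBlock(nn.Module):
    """Each agent encodes its 3x3 FOV into a token; agents then exchange
    information via multi-head self-attention before acting."""
    def __init__(self, fov_channels=3, fov_size=3, d_model=64, n_heads=4):
        super().__init__()
        self.encode = nn.Linear(fov_channels * fov_size * fov_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=128,
                                           batch_first=True)
        self.comm = nn.TransformerEncoder(layer, num_layers=1)
        self.policy = nn.Linear(d_model, 5)  # e.g. N/E/S/W/wait
        self.value = nn.Linear(d_model, 1)   # per-agent state value

    def forward(self, fovs):
        # fovs: (batch, n_agents, channels, H, W) local observations
        tokens = self.encode(fovs.flatten(2))   # (batch, n_agents, d_model)
        mixed = self.comm(tokens)               # agents attend to each other
        return self.policy(mixed), self.value(mixed).squeeze(-1)

# Usage: 8 agents, each with a 3-channel 3x3 FOV.
logits, values = CommBlock()(torch.randn(1, 8, 3, 3, 3))
# Illustrative tie-breaking heuristic: if two agents request the same cell,
# the agent with the lower predicted state value could yield.
```

Because attention operates over agent tokens rather than the whole map, a mechanism of this shape scales with the number of agents rather than the world size, which is why very small FOVs can remain viable.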

Related research:

Is Centralized Training with Decentralized Execution Framework Centralized Enough for MARL? (05/27/2023)
Centralized Training with Decentralized Execution (CTDE) has recently em...

Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning (09/22/2021)
Cooperative multi-agent reinforcement learning (MARL) faces significant ...

Coordinated Exploration via Intrinsic Rewards for Multi-Agent Reinforcement Learning (05/28/2019)
Sparse rewards are one of the most important challenges in reinforcement...

Scalable Communication for Multi-Agent Reinforcement Learning via Transformer-Based Email Mechanism (01/05/2023)
Communication can impressively improve cooperation in multi-agent reinfo...

PRIMAL2: Pathfinding via Reinforcement and Imitation Multi-Agent Learning – Lifelong (10/16/2020)
Multi-agent path finding (MAPF) is an indispensable component of large-s...

Learning Selective Communication for Multi-Agent Path Finding (09/12/2021)
Learning communication via deep reinforcement learning (RL) or imitation...

α^α-Rank: Scalable Multi-agent Evaluation through Evolution (09/25/2019)
Although challenging, strategy profile evaluation in large connected lea...
