PRIMAL2: Pathfinding via Reinforcement and Imitation Multi-Agent Learning – Lifelong

10/16/2020
by Mehul Damani, et al.

Multi-agent path finding (MAPF) is an indispensable component of large-scale robot deployments in numerous domains ranging from airport management to warehouse automation. In particular, this work addresses lifelong MAPF (LMAPF), an online variant of the problem where agents are immediately assigned a new goal upon reaching their current one, in dense and highly structured environments typical of real-world warehouse operations. Effectively solving LMAPF in such environments requires expensive coordination between agents as well as frequent replanning, a daunting task for existing coupled and decoupled approaches alike. With the purpose of achieving considerable agent coordination without any compromise on reactivity and scalability, we introduce PRIMAL2, a distributed reinforcement learning framework for LMAPF where agents learn fully decentralized policies to reactively plan paths online in a partially observable world. We extend our previous work, which was effective in low-density, sparsely occupied worlds, to highly structured and constrained worlds by identifying behaviors and conventions that improve implicit agent coordination, and by enabling their learning through the construction of a novel local agent observation and various training aids. We present extensive results of PRIMAL2 in both MAPF and LMAPF environments with up to 1024 agents and compare its performance to complete state-of-the-art planners. We experimentally observe that agents successfully learn to follow ideal conventions and can exhibit selfless coordinated maneuvers that maximize joint rewards. We find that not only does PRIMAL2 significantly surpass our previous work, it is also able to perform on par with, and even outperform, state-of-the-art planners in terms of throughput.
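
To make the lifelong setting described above concrete, the following is a minimal sketch of an LMAPF loop on an empty grid: each agent receives a fresh goal the moment it reaches its current one, and throughput is measured as goals reached per timestep (a common convention assumed here; the abstract does not define it). The greedy policy, the sequential move order, and all names (`greedy_step`, `sample_goal`, `run_lifelong`) are hypothetical placeholders for illustration only. They are not PRIMAL2's learned policy or its environment.

```python
import random


def greedy_step(pos, goal):
    """Placeholder policy (assumption): step one cell toward the goal, x first."""
    (x, y), (gx, gy) = pos, goal
    if x != gx:
        return (x + (1 if gx > x else -1), y)
    if y != gy:
        return (x, y + (1 if gy > y else -1))
    return pos


def sample_goal(rng, size, exclude):
    """Draw a random grid cell different from `exclude`."""
    while True:
        cell = (rng.randrange(size), rng.randrange(size))
        if cell != exclude:
            return cell


def run_lifelong(num_agents=4, size=8, steps=200, seed=0):
    """Run the lifelong loop; return (throughput, final agent positions)."""
    rng = random.Random(seed)
    cells = [(x, y) for x in range(size) for y in range(size)]
    positions = rng.sample(cells, num_agents)  # distinct start cells
    goals = [sample_goal(rng, size, p) for p in positions]
    goals_reached = 0
    for _ in range(steps):
        # Agents move one at a time, which rules out vertex and swap conflicts.
        for i in range(num_agents):
            nxt = greedy_step(positions[i], goals[i])
            others = positions[:i] + positions[i + 1:]
            if nxt not in others:  # wait in place if the target cell is occupied
                positions[i] = nxt
            if positions[i] == goals[i]:
                goals_reached += 1
                # Lifelong variant: assign a new goal immediately.
                goals[i] = sample_goal(rng, size, positions[i])
    return goals_reached / steps, positions


if __name__ == "__main__":
    throughput, _ = run_lifelong()
    print(f"throughput: {throughput:.2f} goals per timestep")
```

A greedy policy like this one can deadlock when two agents face each other head-on, which is exactly the kind of failure that coordination conventions in dense, structured maps are meant to avoid.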


