Q-Mixing Network for Multi-Agent Pathfinding in Partially Observable Grid Environments

08/13/2021
by Vasilii Davydov, et al.

In this paper, we consider the problem of multi-agent navigation in partially observable grid environments. This problem is challenging for centralized planning approaches, as they typically rely on full knowledge of the environment. We suggest a reinforcement learning approach in which the agents first learn policies that map observations to actions and then follow these policies to reach their goals. Learning cooperative behavior is itself a challenge, since in many cases agents need to yield to each other to accomplish a mission; to address it, we use a mixing Q-network that complements the learning of individual policies. In the experimental evaluation, we show that this approach leads to plausible results and scales well to a large number of agents.
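The abstract does not spell out the mixing architecture, so the following is only a minimal sketch of the general idea behind a mixing Q-network, assuming a QMIX-style monotonic mixer: per-agent Q-values are combined into a joint value using state-conditioned, non-negative weights. The one-layer form, the function name `make_mixer`, and the random parameters are illustrative assumptions, not the authors' implementation.

```python
import random


def make_mixer(n_agents, state_dim, seed=0):
    """Build a toy one-layer monotonic mixer (hypothetical, for illustration)."""
    rng = random.Random(seed)
    # Assumed "hypernetwork" parameters: linear maps from the global state
    # to one mixing weight per agent and to a scalar bias.
    weight_params = [
        [rng.uniform(-1.0, 1.0) for _ in range(state_dim)]
        for _ in range(n_agents)
    ]
    bias_params = [rng.uniform(-1.0, 1.0) for _ in range(state_dim)]

    def mix(agent_qs, state):
        # The absolute value keeps every mixing weight non-negative, so the
        # joint value never decreases when any single agent's Q-value rises.
        weights = [
            abs(sum(p * s for p, s in zip(row, state)))
            for row in weight_params
        ]
        bias = sum(p * s for p, s in zip(bias_params, state))
        return sum(w * q for w, q in zip(weights, agent_qs)) + bias

    return mix


mix = make_mixer(n_agents=3, state_dim=4)
state = [0.5, -1.0, 0.25, 2.0]

q_low = mix([1.0, 0.2, -0.3], state)
q_high = mix([1.5, 0.2, -0.3], state)  # raise only agent 0's Q-value
monotonic = q_high >= q_low
```

Under this monotonicity assumption, each agent can act greedily on its own Q-value at execution time while training still optimizes a shared joint value, which is one way a mixer can encourage yielding behavior without centralized planning.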


