Centralizing State-Values in Dueling Networks for Multi-Robot Reinforcement Learning Mapless Navigation

12/16/2021
by Enrico Marchesini, et al.

We study the problem of multi-robot mapless navigation in the popular Centralized Training and Decentralized Execution (CTDE) paradigm. This problem is challenging when each robot considers its path without explicitly sharing observations with other robots, which can lead to non-stationarity issues in Deep Reinforcement Learning (DRL). The typical CTDE algorithm factorizes the joint action-value function into individual ones to favor cooperation and achieve decentralized execution. Such factorization involves constraints (e.g., monotonicity) that limit the emergence of novel behaviors in an individual, as each agent is trained starting from a joint action-value. In contrast, we propose a novel architecture for CTDE that uses a centralized state-value network to compute a joint state-value, which is used to inject global state information into the value-based updates of the agents. Consequently, each model computes its gradient update for the weights considering the overall state of the environment. Our idea follows the insights of Dueling Networks: a separate estimation of the joint state-value both improves sample efficiency and tells each robot whether the global state is (or is not) valuable. Experiments in a robotic navigation task with 2, 4, and 8 robots confirm the superior performance of our approach over prior CTDE methods (e.g., VDN, QMIX).
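
To make the idea of a shared joint state-value concrete, here is a minimal sketch. It assumes the standard Dueling Networks aggregation, Q = V + A - mean(A), with a single joint state-value shared by all agents; the abstract does not give the exact update rule, so the paper's method may differ. The function names and all numbers are hypothetical and serve only as illustration.

```python
from statistics import fmean


def dueling_q_values(joint_state_value, advantages):
    """Combine a shared joint state-value with one agent's advantages.

    Assumed aggregation (standard dueling form): Q = V + A - mean(A).
    """
    mean_adv = fmean(advantages)
    return [joint_state_value + a - mean_adv for a in advantages]


def greedy_action(advantages):
    """Pick the action with the largest advantage.

    The state-value shifts every action by the same amount, so the greedy
    choice depends only on the agent's own advantages.
    """
    return max(range(len(advantages)), key=lambda i: advantages[i])


# Hypothetical values: one joint state-value from a centralized network,
# and per-agent advantages computed from each robot's local observation.
joint_v = 1.5
agent_advantages = [
    [0.2, -0.1, 0.5],  # robot 0
    [0.0, 0.3, -0.3],  # robot 1
]

# Training-time Q-values include the global state information.
q_per_agent = [dueling_q_values(joint_v, adv) for adv in agent_advantages]

# Execution-time actions need only local advantages.
actions = [greedy_action(adv) for adv in agent_advantages]
```

Under this assumption the joint state-value only matters during training, where it enters each agent's value target. At execution time every robot can act on its own advantages, which is what keeps execution decentralized in this sketch.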

