Decentralized Graph-Based Multi-Agent Reinforcement Learning Using Reward Machines

09/30/2021
by Jueming Hu, et al.

In multi-agent reinforcement learning (MARL), it is challenging for a collection of agents to learn complex, temporally extended tasks. The difficulties lie in computational complexity and in learning the high-level ideas behind reward functions. We study the graph-based Markov Decision Process (MDP), in which the dynamics of neighboring agents are coupled. We use a reward machine (RM) to encode each agent's task and expose the internal structure of its reward function. An RM can describe high-level knowledge and encode non-Markovian reward functions. To tackle computational complexity, we propose a decentralized learning algorithm called decentralized graph-based reinforcement learning using reward machines (DGRM). It equips each agent with a localized policy, allowing agents to make decisions independently based on the information available to them. DGRM uses the actor-critic structure, and we introduce the tabular Q-function for discrete state problems. We show that the dependency of the Q-function on other agents decreases exponentially as the distance between them increases. Furthermore, the complexity of DGRM is related to the local information size of the largest κ-hop neighborhood, and DGRM can find an O(ρ^{κ+1})-approximation of a stationary point of the objective function. To further improve efficiency, we also propose the deep DGRM algorithm, which uses deep neural networks to approximate the Q-function and policy function in order to solve large-scale or continuous state problems. The effectiveness of the proposed DGRM algorithm is evaluated in two case studies: UAV package delivery and COVID-19 pandemic mitigation. Experimental results show that local information is sufficient for DGRM and that agents can accomplish complex tasks with the help of RMs. DGRM improves the global accumulated reward by 119% compared to the baseline in the case of COVID-19 pandemic mitigation.
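To make the idea of a reward machine concrete, the following is a minimal illustrative sketch in Python, not the authors' implementation. It assumes an RM is a finite-state machine whose transitions are triggered by high-level events and emit rewards. The class name `RewardMachine`, the helper `make_delivery_rm`, the state names, and the event labels `"pickup"` and `"deliver"` are all hypothetical and chosen only to echo the package delivery case study.

```python
class RewardMachine:
    """Illustrative finite-state reward machine (hypothetical, not from the paper)."""

    def __init__(self, initial_state, transitions, terminal_states=()):
        # transitions maps (rm_state, event) -> (next_rm_state, reward)
        self.initial_state = initial_state
        self.transitions = dict(transitions)
        self.terminal_states = frozenset(terminal_states)
        self.state = initial_state

    def reset(self):
        """Return to the initial RM state at the start of an episode."""
        self.state = self.initial_state

    def step(self, event):
        """Advance on a high-level event and return the reward it emits.

        Events with no listed transition leave the RM state unchanged
        and yield zero reward (an assumption of this sketch).
        """
        next_state, reward = self.transitions.get(
            (self.state, event), (self.state, 0.0)
        )
        self.state = next_state
        return reward

    def done(self):
        """True once the RM has reached a terminal state."""
        return self.state in self.terminal_states


def make_delivery_rm():
    """Hypothetical task: pick up a package, then deliver it."""
    return RewardMachine(
        initial_state="u0",
        transitions={
            ("u0", "pickup"): ("u1", 0.0),       # package collected, no reward yet
            ("u1", "deliver"): ("u_done", 1.0),  # reward only after a prior pickup
        },
        terminal_states={"u_done"},
    )


rm = make_delivery_rm()
rewards = [rm.step(e) for e in ["deliver", "pickup", "deliver"]]
print(rewards, rm.done())
```

The same `"deliver"` event earns a reward only if `"pickup"` happened earlier, so the reward depends on history rather than on the current environment state alone. Tracking the RM state alongside the environment state is what lets such a non-Markovian reward be handled by standard RL methods.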

research
10/31/2021

Decentralized Multi-Agent Reinforcement Learning: An Off-Policy Method

We discuss the problem of decentralized multi-agent reinforcement learni...
research
12/05/2019

Scalable Reinforcement Learning of Localized Policies for Multi-Agent Networked Systems

We study reinforcement learning (RL) in a setting with a network of agen...
research
12/16/2021

Learning to Share in Multi-Agent Reinforcement Learning

In this paper, we study the problem of networked multi-agent reinforceme...
research
05/30/2022

Learning Open Domain Multi-hop Search Using Reinforcement Learning

We propose a method to teach an automated agent to learn how to search f...
research
09/15/2019

Exploiting Fast Decaying and Locality in Multi-Agent MDP with Tree Dependence Structure

This paper considers a multi-agent Markov Decision Process (MDP), where ...
research
05/31/2022

Hierarchies of Reward Machines

Reward machines (RMs) are a recent formalism for representing the reward...
research
03/19/2023

Boundary-aware Supervoxel-level Iteratively Refined Interactive 3D Image Segmentation with Multi-agent Reinforcement Learning

Interactive segmentation has recently been explored to effectively and e...
