Scalable Planning in Multi-Agent MDPs

03/29/2021
by Dinuka Sahabandu, et al.

Multi-agent Markov Decision Processes (MMDPs) arise in a variety of applications, including target tracking, control of multi-robot swarms, and multiplayer games. A key challenge in MMDPs is that the state and action spaces grow exponentially in the number of agents, making computation of an optimal policy intractable for medium- to large-scale problems. One property that has been exploited to mitigate this complexity is transition independence, in which each agent's transition probabilities are independent of the states and actions of other agents. Transition independence enables factorization of the MMDP and computation of local agent policies, but it does not hold for arbitrary MMDPs. In this paper, we propose an approximate transition dependence property, called δ-transition dependence, and develop a metric for quantifying how far an MMDP deviates from transition independence. Our definition of δ-transition dependence recovers transition independence as a special case when δ is zero. We develop an algorithm that runs in time polynomial in the number of agents and achieves a provable bound on the global optimum when the reward functions are monotone increasing and submodular in the agent actions. We evaluate our approach on two case studies, namely, multi-robot control and a multi-agent patrolling example.
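To make the factorization concrete, the following is a minimal sketch of what transition independence means for a joint transition probability, and of why it reduces the size of the model. It is not the paper's algorithm and does not implement δ-transition dependence. The two-agent tables, the probabilities in them, and the names `local`, `joint_transition`, and `table_sizes` are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical example: two agents, each with 2 local states and 2 local actions.
# local[i][(s_i, a_i)] is agent i's distribution over its next local state.
local = [
    {(0, 0): [0.9, 0.1], (0, 1): [0.2, 0.8], (1, 0): [0.5, 0.5], (1, 1): [0.0, 1.0]},
    {(0, 0): [0.7, 0.3], (0, 1): [0.4, 0.6], (1, 0): [1.0, 0.0], (1, 1): [0.3, 0.7]},
]


def joint_transition(local, s, a, s_next):
    """P(s' | s, a) under transition independence.

    Each agent's next local state depends only on its own local state and
    action, so the joint probability is the product of the local terms.
    """
    p = 1.0
    for i, table in enumerate(local):
        p *= table[(s[i], a[i])][s_next[i]]
    return p


def table_sizes(n_agents, n_states=2, n_actions=2):
    """Number of entries in a full joint transition table vs. factored tables."""
    joint = (n_states ** n_agents) * (n_actions ** n_agents) * (n_states ** n_agents)
    factored = n_agents * n_states * n_actions * n_states
    return joint, factored


if __name__ == "__main__":
    # Joint probability of moving from state (0, 1) to (1, 0) under action (1, 0).
    print(joint_transition(local, (0, 1), (1, 0), (1, 0)))

    # The factored model still defines a valid distribution over joint next states.
    total = sum(joint_transition(local, (0, 0), (0, 0), sn)
                for sn in product(range(2), repeat=2))
    print(total)

    # Exponential growth of the joint table vs. linear growth of the factored tables.
    print(table_sizes(10))
```

In this sketch, the joint table for ten agents would need 2^30 entries, while the factored representation needs only 80, which is the kind of saving that transition independence makes possible when it holds.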

research
04/24/2023

Model-Free Learning and Optimal Policy Design in Multi-Agent MDPs Under Probabilistic Agent Dropout

This work studies a multi-agent Markov decision process (MDP) that can u...
research
01/13/2023

Mean-Field Control based Approximation of Multi-Agent Reinforcement Learning in Presence of a Non-decomposable Shared Global State

Mean Field Control (MFC) is a powerful approximation tool to solve large...
research
09/15/2019

Exploiting Fast Decaying and Locality in Multi-Agent MDP with Tree Dependence Structure

This paper considers a multi-agent Markov Decision Process (MDP), where ...
research
03/24/2015

Individual Planning in Agent Populations: Exploiting Anonymity and Frame-Action Hypergraphs

Interactive partially observable Markov decision processes (I-POMDP) pro...
research
06/25/2020

Distributed Policy Synthesis of Multi-Agent Systems With Graph Temporal Logic Specifications

We study the distributed synthesis of policies for multi-agent systems t...
research
12/03/2018

A Unified Approach to Dynamic Decision Problems with Asymmetric Information - Part I: Non-Strategic Agents

We study a general class of dynamic multi-agent decision problems with a...
research
07/11/2022

Cluster-Based Control of Transition-Independent MDPs

This work studies the ability of a third-party influencer to control the...