Information Gathering in Decentralized POMDPs by Policy Graph Improvement

02/26/2019
by Mikko Lauri, et al.

Decentralized policies for information gathering are required when multiple autonomous agents are deployed to collect data about a phenomenon of interest without the ability to communicate. Decentralized partially observable Markov decision processes (Dec-POMDPs) are a general, principled model well suited to such decentralized multiagent decision-making problems. In this paper, we investigate Dec-POMDPs for decentralized information gathering problems. An optimal solution of a Dec-POMDP maximizes the expected sum of rewards over time. To encourage information gathering, we set the reward as a function of the agents' state information, for example the negative Shannon entropy. We prove that if the reward is convex, then the finite-horizon value function of the corresponding Dec-POMDP is also convex. We propose the first heuristic algorithm for information gathering Dec-POMDPs, and empirically demonstrate its effectiveness by solving problems an order of magnitude larger than the previous state of the art.
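
As a small illustration of the reward mentioned in the abstract, the sketch below computes the negative Shannon entropy of a discrete belief (a probability distribution over states) and checks the convexity inequality on one pair of beliefs. The function names `negative_entropy` and `mix` and the example beliefs are assumptions made for illustration; this is not the paper's algorithm, only a numerical look at the convex reward it refers to.

```python
import math

def negative_entropy(belief):
    """Negative Shannon entropy (in nats) of a discrete belief.

    Terms with zero probability are skipped, following the usual
    convention that 0 * log(0) = 0.
    """
    return sum(p * math.log(p) for p in belief if p > 0.0)

def mix(b1, b2, lam):
    """Convex combination lam * b1 + (1 - lam) * b2 of two beliefs."""
    return [lam * p + (1.0 - lam) * q for p, q in zip(b1, b2)]

# Hypothetical beliefs over four states.
uniform = [0.25, 0.25, 0.25, 0.25]   # maximal uncertainty
certain = [1.0, 0.0, 0.0, 0.0]       # state known exactly

# The reward is lowest for the uniform belief and highest (zero)
# when the state is known, so it favors informative beliefs.
reward_uniform = negative_entropy(uniform)   # equals -log(4)
reward_certain = negative_entropy(certain)   # equals 0

# Convexity: f(lam*b1 + (1-lam)*b2) <= lam*f(b1) + (1-lam)*f(b2).
lam = 0.5
lhs = negative_entropy(mix(uniform, certain, lam))
rhs = lam * reward_uniform + (1.0 - lam) * reward_certain
is_convex_here = lhs <= rhs

print(reward_uniform, reward_certain, is_convex_here)
```

A single pair of beliefs does not prove convexity, but it shows the shape of the inequality: the reward of a mixed belief never exceeds the mixture of the rewards.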

research
03/24/2015

Geometry and Determinism of Optimal Stationary Control in Partially Observable Markov Decision Processes

It is well known that for any finite state Markov decision process (MDP)...
research
01/16/2013

The Complexity of Decentralized Control of Markov Decision Processes

Planning for distributed agents with partial state information is consid...
research
02/12/2014

Planning for Decentralized Control of Multiple Robots Under Uncertainty

We describe a probabilistic framework for synthesizing control policies ...
research
07/04/2012

MAA*: A Heuristic Search Algorithm for Solving Decentralized POMDPs

We present multi-agent A* (MAA*), the first complete and optimal heurist...
research
08/06/2019

Online Planning for Decentralized Stochastic Control with Partial History Sharing

In decentralized stochastic control, standard approaches for sequential ...
research
10/25/2021

HSVI fo zs-POSGs using Concavity, Convexity and Lipschitz Properties

Dynamic programming and heuristic search are at the core of state-of-the...
research
06/09/2021

Information Avoidance and Overvaluation in Sequential Decision Making under Epistemic Constraints

Decision makers involved in the management of civil assets and systems u...
