Influence-Optimistic Local Values for Multiagent Planning --- Extended Version

02/18/2015
by   Frans A. Oliehoek, et al.
Recent years have seen the development of methods for multiagent planning under uncertainty that scale to tens or even hundreds of agents. However, most of these methods either make restrictive assumptions about the problem domain or provide approximate solutions without any guarantees on quality. Methods in the former category typically build on heuristic search using upper bounds on the value function. Unfortunately, no techniques exist to compute such upper bounds for problems with non-factored value functions. To allow for meaningful benchmarking through measurable quality guarantees on a very general class of problems, this paper introduces a family of influence-optimistic upper bounds for factored decentralized partially observable Markov decision processes (Dec-POMDPs) that do not have factored value functions. Intuitively, we derive bounds on very large multiagent planning problems by subdividing them into sub-problems and, for each sub-problem, making optimistic assumptions about the influence that the rest of the system will exert on it. We numerically compare the different upper bounds and demonstrate how we can achieve a non-trivial guarantee that a heuristic solution for problems with hundreds of agents is close to optimal. Furthermore, we provide evidence that the upper bounds may improve the effectiveness of heuristic influence search, and discuss further potential applications to multiagent planning.
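The intuition behind an influence-optimistic bound can be illustrated with a toy sketch. This is not the paper's algorithm: the two sub-problems, the value tables `V_A` and `V_B`, the policy set, and the rule that each sub-problem's incoming influence equals the other's policy are all hypothetical, chosen only to show why summing optimistic local values gives an upper bound on the best joint value and how that bound yields a quality guarantee for a heuristic solution.

```python
from itertools import product

# Hypothetical local value tables: (own policy, incoming influence) -> value.
V_A = {(0, 0): 1.0, (0, 1): 3.0, (1, 0): 2.0, (1, 1): 2.5}
V_B = {(0, 0): 4.0, (0, 1): 1.0, (1, 0): 2.0, (1, 1): 3.0}
POLICIES = (0, 1)


def optimistic_local_value(table):
    """Best local value, assuming the rest of the system exerts the most
    favourable influence on this sub-problem."""
    return max(table.values())


def upper_bound(tables):
    """Sum of optimistic local values over all sub-problems."""
    return sum(optimistic_local_value(t) for t in tables)


def joint_value(pi_a, pi_b):
    """True joint value. Toy assumption: the influence on each sub-problem
    is simply the policy chosen by the other one."""
    return V_A[(pi_a, pi_b)] + V_B[(pi_b, pi_a)]


bound = upper_bound([V_A, V_B])
# Brute force is only feasible here because the toy problem is tiny.
optimum = max(joint_value(a, b) for a, b in product(POLICIES, repeat=2))

# Pretend a heuristic planner returned the joint policy (0, 0).
heuristic = joint_value(0, 0)
guarantee = heuristic / bound  # fraction of optimal that is certified

print(f"upper bound = {bound}, optimum = {optimum}")
print(f"heuristic = {heuristic}, certified fraction >= {guarantee:.3f}")
```

In this toy instance the bound is loose (7.0 versus a true optimum of 5.5) because each sub-problem assumes a favourable influence that the other cannot deliver at the same time. The bound can still certify that the heuristic achieves at least 5/7 of the optimal value without the optimum ever being computed, which is the kind of guarantee the abstract refers to for problems too large to solve exactly.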


