Shielding in Resource-Constrained Goal POMDPs

11/28/2022
by Michal Ajdarów, et al.

We consider partially observable Markov decision processes (POMDPs) modeling an agent that needs a supply of a certain resource (e.g., electricity stored in batteries) to operate correctly. The resource is consumed by the agent's actions and can be replenished only in certain states. The agent aims to minimize the expected cost of reaching some goal while preventing resource exhaustion, a problem we call resource-constrained goal optimization (RSGO). We take a two-step approach to the RSGO problem. First, using formal-methods techniques, we design an algorithm that computes a shield for a given scenario: a procedure that observes the agent and prevents it from using actions that might eventually lead to resource exhaustion. Second, we augment the POMCP heuristic search algorithm for POMDP planning with our shields to obtain an algorithm that solves the RSGO problem. We implement our algorithm and present experiments showing its applicability to benchmarks from the literature.
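To make the idea of a shield concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a strong simplification: a fully observable, deterministic toy model in which entering a reload state resets the resource to capacity. The model (`TRANSITIONS`, `RELOAD_STATES`, `CAPACITY`) and the helper names `min_safe_levels` and `allowed_actions` are hypothetical and chosen only for illustration. The sketch computes, for each state, the least resource level from which exhaustion can be avoided forever, and then lets the shield permit only actions that keep the agent at or above that level.

```python
from math import inf

# Hypothetical toy model: TRANSITIONS[state][action] = (next_state, resource_cost)
TRANSITIONS = {
    "base": {"go": ("field", 2)},
    "field": {"back": ("base", 2), "far": ("remote", 3)},
    "remote": {"back": ("field", 3)},
}
RELOAD_STATES = {"base"}  # assumption: entering these resets the level to capacity
CAPACITY = 6


def min_safe_levels(transitions, reload_states, capacity):
    """Return (need, usable): need[s] is the least level at s from which the
    agent can keep reaching usable reload states without running out."""
    usable = set(reload_states)
    while True:
        need = {s: inf for s in transitions}
        # Bellman-Ford-style relaxation; paths costing more than capacity are dropped.
        for _ in range(len(transitions)):
            for s, actions in transitions.items():
                for nxt, cost in actions.values():
                    after = 0 if nxt in usable else need[nxt]
                    if cost + after <= capacity:
                        need[s] = min(need[s], cost + after)
        # A reload state is useless if even a full tank cannot reach the next reload.
        bad = {s for s in usable if need[s] == inf}
        if not bad:
            return need, usable
        usable -= bad


def allowed_actions(transitions, need, usable, state, level):
    """Shield: keep only actions after which the agent is still safe."""
    allowed = []
    for action, (nxt, cost) in transitions[state].items():
        required = 0 if nxt in usable else need[nxt]
        if level - cost >= required:
            allowed.append(action)
    return allowed


need, usable = min_safe_levels(TRANSITIONS, RELOAD_STATES, CAPACITY)
print(need)
print(allowed_actions(TRANSITIONS, need, usable, "field", 4))
```

In a planner such as POMCP, a shield of this kind would restrict which actions the search may select or simulate. The partially observable setting treated in the paper additionally has to account for the agent's uncertainty about its current state, which this sketch ignores.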


