Distribution-based objectives for Markov Decision Processes

04/25/2018
by S. Akshay, et al.

We consider distribution-based objectives for Markov Decision Processes (MDPs). This class of objectives gives rise to an interesting trade-off between full and partial information: as with full observation, the strategy in the MDP can depend on the current state of the system, but as with partial information, it must account for all states simultaneously. In this paper, we focus on two safety problems that arise naturally in this context, namely existential and universal safety. Given an MDP A and a closed and convex polytope H of probability distributions over the states of A, the existential safety problem asks whether there exist a distribution d in H and a strategy for A such that, starting from d, repeatedly applying this strategy keeps the distribution in H forever. The universal safety problem asks whether for every distribution in H there exists such a strategy for A that keeps the distribution in H forever. We prove that both problems are decidable, with tight complexity bounds: existential safety is PTIME-complete, while universal safety is co-NP-complete. We then compare these results with the existential and universal safety problems for Rabin's probabilistic finite-state automata (PFAs), the subclass of Partially Observable MDPs with no observations. Unlike for MDPs, strategies for PFAs cannot depend on the state. In sharp contrast to the PTIME result, we show that existential safety for PFAs is undecidable, both for polytopes H with closed boundaries and for those with open boundaries. On the other hand, universal safety for PFAs turns out to be decidable in EXPTIME, with a co-NP lower bound. Finally, we show that an alternative representation of the input polytope allows us to improve the complexity of universal safety for both MDPs and PFAs.
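To make the distribution-transformer view concrete, below is a minimal Python sketch in which every concrete value is an illustrative assumption, not taken from the paper: the toy two-state MDP delta, the strategy sigma, the polytope bounds A_H and b_H, and the 50-step horizon are all invented for the example. A memoryless state-dependent strategy maps each state s to an action sigma(s); one step of the induced dynamics sends a distribution d to d'(s') = sum_s d(s) * delta(s, sigma(s))(s'); and membership in a polytope H given as {d : A_H d <= b_H} is a linear-constraint check.

    import numpy as np

    # Toy 2-state MDP; delta[a][s, s'] = Pr(next state s' | state s, action a).
    # All numbers here are illustrative assumptions, not from the paper.
    delta = {
        "a": np.array([[0.6, 0.4],
                       [0.2, 0.8]]),
        "b": np.array([[0.5, 0.5],
                       [0.4, 0.6]]),
    }

    # A memoryless state-dependent strategy: state -> action. The paper's
    # strategies may vary over time; a fixed map keeps the sketch short.
    sigma = {0: "a", 1: "b"}

    def step(d, sigma):
        # One distribution-transformer step:
        # d'(s') = sum_s d(s) * delta[sigma(s)][s, s'].
        d_next = np.zeros_like(d)
        for s, mass in enumerate(d):
            d_next += mass * delta[sigma[s]][s]
        return d_next

    # Polytope H = {d : A_H @ d <= b_H}; here "mass in state 0 lies in [0.4, 0.7]".
    A_H = np.array([[ 1.0, 0.0],
                    [-1.0, 0.0]])
    b_H = np.array([0.7, -0.4])

    def in_H(d, tol=1e-9):
        return bool(np.all(A_H @ d <= b_H + tol))

    d = np.array([0.6, 0.4])  # initial distribution, chosen inside H
    for t in range(50):       # finitely many steps only simulate "forever"
        if not in_H(d):
            print(f"left H at step {t}: {d}")
            break
        d = step(d, sigma)
    else:
        print("stayed in H for all 50 steps; d =", d)

With these numbers the state-0 mass evolves as d0' = 0.2*d0 + 0.4, so the run converges toward the distribution [0.5, 0.5] and stays in H throughout. Of course, simulating finitely many steps only suggests safety: deciding whether some strategy keeps the distribution in H forever, and for which starting distributions, is exactly the existential/universal safety question the paper settles.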


