Tableaux for Policy Synthesis for MDPs with PCTL* Constraints

06/30/2017
by   Peter Baumgartner, et al.

Markov decision processes (MDPs) are the standard formalism for modelling sequential decision making in stochastic environments. Policy synthesis addresses the problem of how to control or limit the decisions an agent makes so that a given specification is met. In this paper we consider PCTL*, the probabilistic counterpart of CTL*, as the specification language. Because the policy synthesis problem for PCTL* is undecidable in general, we restrict attention to policies whose memory of the execution history is finitely bounded a priori. Surprisingly, no algorithm for policy synthesis in this natural and expressive framework has been developed so far. We close this gap and describe a tableau-based algorithm that, given an MDP and a PCTL* specification, non-deterministically derives a system of (possibly nonlinear) equalities and inequalities. The solutions of this system, if any, describe the desired (stochastic) policies. Our main result is the correctness of the method, that is, its soundness, completeness and termination.
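To illustrate why the constraint system over policies can be nonlinear, here is a minimal sketch on a toy example. It is not the paper's tableau algorithm. The one-state MDP, the names `MDP`, `reach_goal_probability` and `satisfies`, and the specification "reach `goal` with probability at least 0.7" are all assumptions made for illustration. A memoryless stochastic policy picks action `a` with probability `x`; the reachability value `v` then obeys `v * (1/2 + x/2) = 2/5 + x/10`, an equation that is bilinear in the unknowns `v` and `x`.

```python
from fractions import Fraction as F

# Hypothetical toy MDP (not taken from the paper):
# MDP[state][action] = {successor: probability}.
# 'goal' and 'fail' are absorbing and therefore have no entries.
MDP = {
    "s0": {
        "a": {"goal": F(1, 2), "fail": F(1, 2)},
        "b": {"s0": F(1, 2), "goal": F(2, 5), "fail": F(1, 10)},
    },
}


def reach_goal_probability(x):
    """Probability of eventually reaching 'goal' from s0 under a memoryless
    stochastic policy that picks 'a' with probability x and 'b' with 1 - x."""
    policy = {"a": x, "b": 1 - x}
    # One-step probabilities of the Markov chain induced by the policy.
    step = {}
    for action, weight in policy.items():
        for successor, prob in MDP["s0"][action].items():
            step[successor] = step.get(successor, 0) + weight * prob
    # The value v satisfies v = step[goal] + step[s0] * v; solve for v.
    return step.get("goal", 0) / (1 - step.get("s0", 0))


def satisfies(x, bound=F(7, 10)):
    """Check the assumed specification: reach 'goal' with probability >= bound."""
    return reach_goal_probability(x) >= bound


if __name__ == "__main__":
    for x in (F(0), F(1, 5), F(1, 2), F(1)):
        print(x, reach_goal_probability(x), satisfies(x))
```

In this toy instance the specification holds exactly for policies with `x <= 1/5`, so the solution set of the constraint system is a whole family of stochastic policies rather than a single one.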

research
05/31/2021

LTL-Constrained Steady-State Policy Synthesis

Decision-making policies for agents are often synthesized with the const...
research
02/02/2021

Stability-Constrained Markov Decision Processes Using MPC

In this paper, we consider solving discounted Markov Decision Processes ...
research
09/23/2020

LTLf Synthesis on Probabilistic Systems

Many systems are naturally modeled as Markov Decision Processes (MDPs), ...
research
07/16/2020

Strengthening Deterministic Policies for POMDPs

The synthesis problem for partially observable Markov decision processes...
research
06/30/2020

Verification of indefinite-horizon POMDPs

The verification problem in MDPs asks whether, for any policy resolving ...
research
02/13/2018

Parameter and Insertion Function Co-synthesis for Opacity Enhancement in Parametric Stochastic Discrete Event Systems

Opacity is a property that characterizes the system's capability to keep...
research
09/15/2021

Synthesizing Policies That Account For Human Execution Errors Caused By State-Aliasing In Markov Decision Processes

When humans are given a policy to execute, there can be policy execution...
