LTL-Constrained Steady-State Policy Synthesis

05/31/2021
by Jan Křetínský, et al.

Decision-making policies for agents are often synthesized under the constraint that a formal specification of behaviour is satisfied. Here we focus on infinite-horizon properties. On the one hand, Linear Temporal Logic (LTL) is a popular formalism for qualitative specifications. On the other hand, Steady-State Policy Synthesis (SSPS) has recently received considerable attention as it provides a more quantitative and more behavioural perspective on specifications, in terms of the frequency with which states are visited. Finally, rewards provide a classic framework for quantitative properties. In this paper, we study Markov decision processes (MDPs) with a specification combining all three types. The derived policy maximizes the reward among all policies that satisfy the LTL specification with a given probability and adhere to the steady-state constraints. To this end, we provide a unified solution that reduces the multi-type specification to a multi-dimensional long-run average reward. This is enabled by Limit-Deterministic Büchi Automata (LDBA), recently studied in the context of LTL model checking on MDPs, and allows for an elegant solution through a simple linear programme. The algorithm also extends to general ω-regular properties and runs in time polynomial in the sizes of the MDP and the LDBA.
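
To make the phrase "a simple linear programme" more concrete, the following is a minimal sketch of the textbook occupation-measure LP for long-run average reward with steady-state bounds. It is not the programme from the paper: it assumes a unichain MDP and omits both the LTL constraint and the transient behaviour, which the abstract says are handled through the LDBA. All symbols are illustrative assumptions: $x_{s,a}$ is the long-run frequency of taking action $a$ in state $s$, $P(s \mid s', a')$ is the transition probability, $r(s,a)$ is the reward, and $[\ell_s, u_s]$ is the steady-state bound on a constrained state $s \in S_{\mathrm{ss}}$.

```latex
\begin{aligned}
\max_{x \ge 0} \quad
  & \sum_{s \in S} \sum_{a \in A(s)} x_{s,a}\, r(s,a) \\
\text{s.t.} \quad
  & \sum_{a \in A(s)} x_{s,a}
    = \sum_{s' \in S} \sum_{a' \in A(s')} x_{s',a'}\, P(s \mid s', a')
    && \forall s \in S \\
  & \sum_{s \in S} \sum_{a \in A(s)} x_{s,a} = 1 \\
  & \ell_s \le \sum_{a \in A(s)} x_{s,a} \le u_s
    && \forall s \in S_{\mathrm{ss}}
\end{aligned}
```

Under these assumptions, a randomized stationary policy can be read off a feasible solution as $\pi(a \mid s) = x_{s,a} / \sum_{a'} x_{s,a'}$ for every state with $\sum_{a'} x_{s,a'} > 0$. The number of variables and constraints is linear in the number of state-action pairs, which is in line with the polynomial running time stated in the abstract.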


research
06/05/2021

Controller Synthesis for Omega-Regular and Steady-State Specifications

Given a Markov decision process (MDP) and a linear-time (ω-regular or LT...
research
12/03/2020

Verifiable Planning in Expected Reward Multichain MDPs

The planning domain has experienced increased interest in the formal syn...
research
06/30/2017

Tableaux for Policy Synthesis for MDPs with PCTL* Constraints

Markov decision processes (MDPs) are the standard formalism for modellin...
research
07/09/2018

Entropy Maximization for Markov Decision Processes Under Temporal Logic Constraints

We study the problem of synthesizing a policy that maximizes the entropy...
research
05/26/2023

MULTIGAIN 2.0: MDP controller synthesis for multiple mean-payoff, LTL and steady-state constraints

We present MULTIGAIN 2.0, a major extension to the controller synthesis ...
research
03/16/2023

Reinforcement Learning for Omega-Regular Specifications on Continuous-Time MDP

Continuous-time Markov decision processes (CTMDPs) are canonical models ...
research
08/09/2014

POMDPs under Probabilistic Semantics

We consider partially observable Markov decision processes (POMDPs) with...
