
Verifiable Planning in Expected Reward Multichain MDPs
The planning domain has seen increased interest in the formal synthesis of decision-making policies. Such formal synthesis typically entails finding a policy that satisfies specifications expressed in a well-defined logic, such as Linear Temporal Logic (LTL) or Computation Tree Logic (CTL), among others. While these logics are powerful and expressive in their capacity to capture desirable agent behavior, their value is limited when deriving policies that must satisfy certain types of asymptotic behavior. In particular, we are interested in specifying constraints on the steady-state behavior of an agent, which captures the proportion of time the agent spends in each state as it interacts with its environment over an indefinite period of time. This is sometimes called the average or expected behavior of the agent. In this paper, we explore the steady-state planning problem of deriving a decision-making policy for an agent such that constraints on its steady-state behavior are satisfied. We propose a linear programming solution for the general case of multichain Markov decision processes (MDPs), and we prove that optimal solutions to the proposed programs yield stationary policies with rigorous guarantees of behavior.
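To make the linear-programming idea concrete, the following is a minimal sketch of steady-state planning on a toy two-state MDP, using SciPy's `linprog`. It covers only the simple recurrent (unichain) case, not the full multichain formulation the paper develops; the MDP numbers, the 30% steady-state constraint on state 0, and all variable names are illustrative assumptions, not from the paper. The decision variables are the long-run state-action frequencies x(s, a), the equality constraints enforce stationarity and normalization, and the inequality constraint bounds the proportion of time spent in state 0.

```python
import numpy as np
from scipy.optimize import linprog

# Toy 2-state, 2-action MDP (hypothetical numbers for illustration).
# P[s, a, s'] = transition probability, r[s, a] = immediate reward.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])
r = np.array([[1.0, 0.0],
              [0.0, 2.0]])
nS, nA = 2, 2

# Occupation-measure variables x[s, a], flattened to length nS * nA.
idx = lambda s, a: s * nA + a

# Stationarity (flow balance): for each state s',
#   sum_a x(s', a) - sum_{s, a} P[s, a, s'] * x(s, a) = 0
A_eq = np.zeros((nS + 1, nS * nA))
b_eq = np.zeros(nS + 1)
for sp in range(nS):
    for a in range(nA):
        A_eq[sp, idx(sp, a)] += 1.0
    for s in range(nS):
        for a in range(nA):
            A_eq[sp, idx(s, a)] -= P[s, a, sp]
# Normalization: the occupation measure sums to 1.
A_eq[nS, :] = 1.0
b_eq[nS] = 1.0

# Assumed steady-state constraint for this example: the agent must
# spend at least 30% of its time in state 0, i.e.
#   sum_a x(0, a) >= 0.3, written as -sum_a x(0, a) <= -0.3.
A_ub = np.zeros((1, nS * nA))
for a in range(nA):
    A_ub[0, idx(0, a)] = -1.0
b_ub = np.array([-0.3])

# Maximize long-run average reward sum_{s,a} r[s,a] * x[s,a]
# (linprog minimizes, so negate the objective).
res = linprog(-r.flatten(), A_ub=A_ub, b_ub=b_ub,
              A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * (nS * nA))
x = res.x.reshape(nS, nA)

# Recover a stationary randomized policy pi(a | s) = x(s, a) / sum_a x(s, a).
pi = x / x.sum(axis=1, keepdims=True)
print("steady-state distribution:", x.sum(axis=1))
print("policy:", pi)
```

The key design point, which carries over to the multichain setting, is that steady-state constraints are linear in the occupation measure, so the whole synthesis problem stays a linear program; the multichain case in the paper requires additional variables to handle multiple recurrent classes.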