Anytime State-Based Solution Methods for Decision Processes with non-Markovian Rewards

12/12/2012
by Sylvie Thiébaux, et al.

A popular approach to solving a decision process with non-Markovian rewards (NMRDP) is to exploit a compact representation of the reward function to automatically translate the NMRDP into an equivalent Markov decision process (MDP) amenable to our favorite MDP solution method. The contribution of this paper is a representation of non-Markovian reward functions and a translation into an MDP aimed at making the best possible use of state-based anytime algorithms as the solution method. By explicitly constructing and exploring only parts of the state space, these algorithms are able to trade computation time for policy quality, and have proven quite effective in dealing with large MDPs. Our representation extends future linear temporal logic (FLTL) to express rewards. Our translation has the effect of embedding model-checking in the solution method. It results in an MDP of the minimal size achievable without stepping outside the anytime framework, and consequently in better policies by the deadline.
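
The construction described in the abstract can be pictured as a product of the base state space with a finite summary of reward-relevant history. Below is a minimal Python sketch of that idea only, not the paper's FLTL machinery: the Memory values, the progress and reward helpers, and the toy "reward the first time goal holds after trigger" condition are all illustrative assumptions standing in for formula progression over the compact reward representation. What it shows is that once reward depends only on the expanded (state, memory) pair, the process is an ordinary MDP, and an anytime forward-search solver can expand e-states lazily, building only the reachable part of the product.

```python
# Minimal sketch (illustrative, not the paper's $FLTL construction):
# make a toy non-Markovian reward Markovian by pairing each base state
# with a finite "memory" that summarises the relevant history.
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple

State = FrozenSet[str]   # a base NMRDP state: the set of true propositions
Memory = str             # finite history summary: "waiting", "armed", "paid"

def progress(mem: Memory, s: State) -> Memory:
    """Update the history summary after observing state s.

    Toy non-Markovian reward: pay once, the first time 'goal' holds
    at some point after 'trigger' has held.
    """
    if mem == "paid":
        return "paid"                 # reward already delivered, absorbing
    if mem == "armed" and "goal" in s:
        return "paid"                 # goal reached after trigger: deliver reward
    if "trigger" in s:
        return "armed"                # trigger seen, start watching for goal
    return mem

def reward(mem_before: Memory, mem_after: Memory) -> float:
    """Transition reward in the expanded MDP: depends only on the memory change."""
    return 1.0 if (mem_before != "paid" and mem_after == "paid") else 0.0

@dataclass(frozen=True)
class EState:
    """Expanded (e-)state of the equivalent MDP."""
    base: State
    mem: Memory

def expand(e: EState, successors: Dict[State, float]) -> Dict[Tuple[EState, float], float]:
    """Expand one e-state on the fly, as a state-based anytime solver would.

    `successors` maps base successor states to transition probabilities for
    some action; the result maps (successor e-state, reward) to probability.
    Only reachable e-states are ever constructed, so the expansion stays lazy.
    """
    out = {}
    for s_next, prob in successors.items():
        mem_next = progress(e.mem, s_next)
        out[(EState(s_next, mem_next), reward(e.mem, mem_next))] = prob
    return out

if __name__ == "__main__":
    start = EState(base=frozenset(), mem="waiting")
    # One stochastic action from the start state.
    step = expand(start, {frozenset({"trigger"}): 0.8, frozenset(): 0.2})
    for (e_next, r), p in step.items():
        print(f"p={p:.1f} reward={r} mem={e_next.mem}")
```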


Related research

10/19/2012
Implementation and Comparison of Solution Methods for Decision Processes with Non-Markovian Rewards
This paper examines a number of solution methods for decision processes ...

09/11/2011
Decision-Theoretic Planning with non-Markovian Rewards
A decision process in which rewards depend on history rather than merely...

09/26/2020
Online Learning of Non-Markovian Reward Models
There are situations in which an agent should receive rewards only after...

11/19/2021
Expert-Guided Symmetry Detection in Markov Decision Processes
Learning a Markov Decision Process (MDP) from a fixed batch of trajector...

07/10/2020
Efficient MDP Analysis for Selfish-Mining in Blockchains
A proof of work (PoW) blockchain protocol distributes rewards to its par...

03/02/2020
Learning and Solving Regular Decision Processes
Regular Decision Processes (RDPs) are a recently introduced model that e...

02/18/2022
SMC4PEP: Stochastic Model Checking of Product Engineering Processes
Product Engineering Processes (PEPs) are used for describing complex pro...
