Specifying Non-Markovian Rewards in MDPs Using LDL on Finite Traces (Preliminary Version)

06/25/2017
by Ronen Brafman et al.

In Markov Decision Processes (MDPs), the reward obtained in a state depends on the properties of the last state and action. This state dependency makes it difficult to reward more interesting long-term behaviors, such as always closing a door after it has been opened, or providing coffee only following a request. Extending MDPs to handle such non-Markovian reward functions was the subject of two previous lines of work, both using variants of LTL to specify the reward function and then compiling the new model back into a Markovian model. Building on recent progress in temporal logics over finite traces, we adopt LDLf for specifying non-Markovian rewards and provide an elegant automata construction for building a Markovian model, which extends that of previous work and offers strong minimality and compositionality guarantees.
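As a concrete illustration of the idea (a minimal sketch, not the paper's construction), the non-Markovian reward "providing coffee only following a request" can be tracked by a small automaton over the propositions request and coffee; pairing the automaton's state with the MDP state gives a product model in which the reward is Markovian again. In the Python sketch below, the automaton is written by hand as a stand-in for one compiled from an LDLf formula, and the names dfa_step, ProductState, and product_step are illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass

# Hand-written reward automaton (stand-in for one compiled from an LDLf formula).
# States: 0 = no pending request, 1 = a request is pending.
def dfa_step(q: int, props: frozenset) -> tuple[int, float]:
    """Advance the reward automaton on one trace symbol; return (next state, reward)."""
    if q == 0:
        return (1, 0.0) if "request" in props else (0, 0.0)
    # q == 1: a request is pending; reward only when coffee is delivered now.
    if "coffee" in props:
        return 0, 1.0
    return 1, 0.0

@dataclass(frozen=True)
class ProductState:
    mdp_state: str   # original MDP state
    dfa_state: int   # automaton memory that restores the Markov property

def product_step(s: ProductState, props: frozenset, next_mdp_state: str):
    """One transition of the product model: the reward now depends only on
    the current product state and the propositions observed in this step."""
    q_next, reward = dfa_step(s.dfa_state, props)
    return ProductState(next_mdp_state, q_next), reward

if __name__ == "__main__":
    trace = [frozenset(), frozenset({"request"}),
             frozenset({"coffee"}), frozenset({"coffee"})]
    s = ProductState("kitchen", 0)
    for props in trace:
        s, r = product_step(s, props, "kitchen")
        print(sorted(props), "-> automaton state", s.dfa_state, "reward", r)
    # Only the coffee that follows a request earns reward; the second coffee does not.
```

The point of the sketch is that only the automaton's state needs to be remembered: once that state is folded into the state space, the reward is a function of the current (product) state alone, so standard MDP solution methods apply unchanged, which is what the compilation-based approaches described above aim for.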


Related research

Learning Non-Markovian Reward Models in MDPs (01/25/2020)
There are situations in which an agent should receive rewards only after...

Effect of Reward Function Choices in MDPs with Value-at-Risk (12/07/2016)
This paper studies Value-at-Risk (VaR) problems in short- and long-horiz...

On the Expressivity of Multidimensional Markov Reward (07/22/2023)
We consider the expressivity of Markov rewards in sequential decision ma...

Regret-based Reward Elicitation for Markov Decision Processes (05/09/2012)
The specification of a Markov decision process (MDP) can be difficult. Re...

Learning and Solving Regular Decision Processes (03/02/2020)
Regular Decision Processes (RDPs) are a recently introduced model that e...

Distribution Estimation in Discounted MDPs via a Transformation (04/16/2018)
Although the general deterministic reward function in MDPs takes three a...

State-Visitation Fairness in Average-Reward MDPs (02/14/2021)
Fairness has emerged as an important concern in automated decision-makin...
