Learning Probabilistic Reward Machines from Non-Markovian Stochastic Reward Processes

07/09/2021
by Alvaro Velasquez, et al.

The success of reinforcement learning in typical settings is, in part, predicated on underlying Markovian assumptions on the reward signal by which an agent learns optimal policies. In recent years, the use of reward machines has relaxed this assumption by enabling a structured representation of non-Markovian rewards. In particular, such representations can be used to augment the state space of the underlying decision process, thereby facilitating non-Markovian reinforcement learning. However, these reward machines cannot capture the semantics of stochastic reward signals. In this paper, we make progress on this front by introducing probabilistic reward machines (PRMs) as a representation of non-Markovian stochastic rewards. We present an algorithm to learn PRMs from the underlying decision process as well as to learn the PRM representation of a given decision-making policy.
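To make the idea of state-space augmentation concrete, the following is a minimal sketch of a reward machine with stochastic transitions. It is an illustration only, not the formal PRM definition or the learning algorithm from the paper: the class name `ProbabilisticRewardMachine`, the transition-table layout, the labels `"coffee"` and `"office"`, and the convention that unlisted labels leave the machine state unchanged with zero reward are all assumptions made for this example.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ProbabilisticRewardMachine:
    """Hypothetical PRM sketch: a finite-state machine over event labels."""
    initial_state: str
    # Assumed layout: (machine_state, label) -> list of
    # (probability, next_machine_state, reward) outcomes.
    transitions: dict
    state: str = field(init=False)

    def __post_init__(self):
        self.state = self.initial_state

    def reset(self):
        self.state = self.initial_state

    def step(self, label, rng=random):
        """Sample a transition for the observed label and return its reward."""
        outcomes = self.transitions.get((self.state, label))
        if outcomes is None:
            # Assumption: labels with no listed transition are ignored.
            return 0.0
        weights = [p for p, _, _ in outcomes]
        _, next_state, reward = rng.choices(outcomes, weights=weights, k=1)[0]
        self.state = next_state
        return reward


def augmented_state(env_state, prm):
    """Pair the environment state with the machine state (product state)."""
    return (env_state, prm.state)


# Illustrative task: fetch coffee, then deliver it to the office.
# The machine dispenses coffee only 90% of the time, so the reward for
# reaching the office depends on history and on a stochastic outcome.
coffee_prm = ProbabilisticRewardMachine(
    initial_state="u0",
    transitions={
        ("u0", "coffee"): [(0.9, "u1", 0.0), (0.1, "u0", 0.0)],
        ("u1", "office"): [(1.0, "u2", 1.0)],
    },
)
```

An agent that conditions its policy on `augmented_state(env_state, coffee_prm)` rather than on `env_state` alone sees a reward that is Markovian with respect to the product state, which is the sense in which the machine state "remembers" the relevant history.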


research
03/22/2023

Reinforcement Learning with Exogenous States and Rewards

Exogenous state variables and rewards can slow reinforcement learning by...
research
02/11/2019

Stochastic Reinforcement Learning

In reinforcement learning episodes, the rewards and punishments are ofte...
research
02/28/2023

Policy Dispersion in Non-Markovian Environment

Markov Decision Process (MDP) presents a mathematical framework to formu...
research
04/20/2022

Joint Learning of Reward Machines and Policies in Environments with Partially Known Semantics

We study the problem of reinforcement learning for a task encoded by a r...
research
07/29/2019

Reinforcement with Fading Memories

We study the effect of imperfect memory on decision making in the contex...
research
06/07/2021

Reconciling Rewards with Predictive State Representations

Predictive state representations (PSRs) are models of controlled non-Mar...
research
08/14/2023

Omega-Regular Reward Machines

Reinforcement learning (RL) is a powerful approach for training agents t...
