Stochastic Reinforcement Learning

02/11/2019
by Nikki Lijing Kuang, et al.

In reinforcement learning episodes, rewards and punishments are often non-deterministic, and stochastic elements invariably govern the underlying situation. Such stochastic elements are often numerous and cannot be known in advance, and they tend to obscure the underlying reward and punishment patterns. Indeed, if stochastic elements were absent, the same outcome would occur every time and the learning problems involved could be greatly simplified. In addition, in most practical situations, the cost of an observation that yields either a reward or a punishment can be significant, and one would wish to arrive at the correct learning conclusion at minimum cost. In this paper, we present a stochastic approach to reinforcement learning which explicitly models the variability present in the learning environment and the cost of observation. Criteria and rules for learning success are quantitatively analyzed, and probabilities of exceeding the observation cost bounds are also obtained.
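To make the trade-off between noisy outcomes and observation cost concrete, the following is a minimal sketch and not the paper's method. It assumes Gaussian reward noise with a known standard deviation, a naive stopping rule (stop once the running mean is `z` standard errors away from zero), and a fixed cost per observation. All function and parameter names are hypothetical. The second function estimates, by simulation, the probability that the total observation cost exceeds a given budget.

```python
import math
import random


def observations_until_decision(true_mean, noise_sd, z=2.0, max_obs=10_000, rng=random):
    """Draw noisy rewards until the running mean is z standard errors from zero.

    Returns (number of observations used, True if the outcome is judged a reward).
    Assumption: noise_sd is known to the learner; the rule is deliberately naive.
    """
    total = 0.0
    for n in range(1, max_obs + 1):
        total += rng.gauss(true_mean, noise_sd)
        mean = total / n
        # Require at least two observations before deciding.
        if n >= 2 and abs(mean) > z * noise_sd / math.sqrt(n):
            return n, mean > 0
    # Give up at the cap and report the sign of the accumulated reward.
    return max_obs, total > 0


def prob_cost_exceeds(budget, cost_per_obs, true_mean, noise_sd, trials=2000, seed=0):
    """Monte Carlo estimate of P(total observation cost > budget)."""
    rng = random.Random(seed)  # seeded so repeated calls are reproducible
    exceed = 0
    for _ in range(trials):
        n, _ = observations_until_decision(true_mean, noise_sd, rng=rng)
        if n * cost_per_obs > budget:
            exceed += 1
    return exceed / trials


if __name__ == "__main__":
    # Without noise the decision is immediate; with noise it may take many draws.
    print(observations_until_decision(1.0, 0.0))
    print(prob_cost_exceeds(budget=20.0, cost_per_obs=1.0, true_mean=0.5, noise_sd=2.0))
```

With `noise_sd=0.0` the rule stops at the earliest allowed point, which mirrors the remark that learning is greatly simplified when stochastic elements are absent. As the noise grows relative to the mean, more observations are typically needed and the estimated probability of exceeding the budget rises.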


