Deep Ordinal Reinforcement Learning

05/06/2019
by Alexander Zap, et al.
Reinforcement learning usually relies on numerical rewards, which have convenient properties but also come with drawbacks and difficulties. Rewards on an ordinal scale (ordinal rewards) are an alternative that has received increasing attention in recent years. In this paper, we present and motivate a general approach for adapting reinforcement learning problems to ordinal rewards. Using Q-learning as an example, we show how to convert common reinforcement learning algorithms into ordinal variants, and we introduce Ordinal Deep Q-Networks, which adapt deep reinforcement learning to ordinal rewards. We evaluate on problems provided by the OpenAI Gym framework and show that our ordinal variants perform comparably to their numerical counterparts on a number of problems. We also give first evidence that our ordinal variant can produce better results for problems with less engineered, simpler-to-design reward signals.

research
02/23/2016

Lens depth function and k-relative neighborhood graph: versatile tools for ordinal data analysis

In recent years it has become popular to study machine learning problems...
research
01/14/2019

Ordinal Monte Carlo Tree Search

In many problem settings, most notably in game playing, an agent receive...
research
11/03/2016

Quantile Reinforcement Learning

In reinforcement learning, the standard criterion to evaluate policies i...
research
04/04/2022

The Cardinal Complexity of Comparison-based Online Algorithms

We consider ordinal online problems, i.e., those tasks that only depend ...
research
03/07/2020

Convergence of Q-value in case of Gaussian rewards

In this paper, as a study of reinforcement learning, we converge the Q f...
research
02/11/2019

Stochastic Reinforcement Learning

In reinforcement learning episodes, the rewards and punishments are ofte...
research
10/23/2018

OCAPIS: R package for Ordinal Classification And Preprocessing In Scala

Ordinal Data are those where a natural order exist between the labels. T...
