Conjugated Discrete Distributions for Distributional Reinforcement Learning

12/14/2021
by Björn Lindenberg, et al.

In this work we build upon recent advances in reinforcement learning for finite Markov processes. A common approach among existing algorithms, both single-actor and distributed, is to either clip rewards or apply a transformation to the Q-functions in order to handle the wide range of magnitudes of real discounted returns. We show theoretically that one of the most successful of these methods may fail to yield an optimal policy when the process is non-deterministic. As a solution, we argue that distributional reinforcement learning can remedy this situation completely. By introducing a conjugated distributional operator, we can handle a large class of transformations of real returns with guaranteed theoretical convergence. We propose an approximating single-actor algorithm based on this operator that trains agents directly on unaltered rewards, using a proper distributional metric given by the Cramér distance. To evaluate its performance in a stochastic setting, we train agents on a suite of 55 Atari 2600 games using sticky actions and obtain state-of-the-art performance compared to other well-known algorithms in the Dopamine framework.
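
For readers unfamiliar with the Cramér distance mentioned above, the following is a minimal sketch of how it can be computed between two discrete distributions that share a sorted set of atoms. It uses the standard definition, the square root of the integrated squared difference between the two cumulative distribution functions. The function name `cramer_distance` and its arguments are illustrative assumptions; this is not the loss implementation used in the paper.

```python
from itertools import accumulate
from math import sqrt


def cramer_distance(p, q, atoms):
    """Cramér (l2) distance between two discrete distributions.

    p, q  : probability vectors over the same atoms (assumed to sum to 1)
    atoms : strictly increasing support points shared by p and q
    """
    if not (len(p) == len(q) == len(atoms)):
        raise ValueError("p, q and atoms must have the same length")

    # Cumulative distribution functions evaluated at each atom.
    cdf_p = list(accumulate(p))
    cdf_q = list(accumulate(q))

    # Both CDFs are step functions, so the squared gap is constant on each
    # interval [atoms[i], atoms[i + 1]) and vanishes after the last atom.
    total = 0.0
    for i in range(len(atoms) - 1):
        gap = cdf_p[i] - cdf_q[i]
        total += gap * gap * (atoms[i + 1] - atoms[i])
    return sqrt(total)


# All mass on the lowest atom versus all mass on the highest atom.
d = cramer_distance([1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 2.0])
```

In this example the CDF gap equals 1 on both unit-width intervals, so `d` is the square root of 2. In practice the squared distance is often preferred as a training loss because it avoids the square root; whether the paper does so is not stated in the abstract.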


research
02/22/2018

An Analysis of Categorical Distributional Reinforcement Learning

Distributional approaches to value-based reinforcement learning model th...
research
10/27/2017

Distributional Reinforcement Learning with Quantile Regression

In reinforcement learning an agent interacts with the environment by tak...
research
04/21/2022

Revisiting Gaussian mixture critics in off-policy reinforcement learning: a sample-based approach

Actor-critic algorithms that make use of distributional policy evaluatio...
research
04/23/2018

Distributed Distributional Deterministic Policy Gradients

This work adopts the very successful distributional perspective on reinf...
research
02/08/2019

Distributional reinforcement learning with linear function approximation

Despite many algorithmic advances, our theoretical understanding of prac...
research
03/24/2020

Distributional Reinforcement Learning with Ensembles

It is well-known that ensemble methods often provide enhanced performanc...
research
05/24/2022

Distributional Hamilton-Jacobi-Bellman Equations for Continuous-Time Reinforcement Learning

Continuous-time reinforcement learning offers an appealing formalism for...
