An Analysis of Categorical Distributional Reinforcement Learning

02/22/2018
by   Mark Rowland, et al.
0

Distributional approaches to value-based reinforcement learning model the entire distribution of returns, rather than just their expected values, and have recently been shown to yield state-of-the-art empirical performance. This was demonstrated by the recently proposed C51 algorithm, based on categorical distributional reinforcement learning (CDRL) [Bellemare et al., 2017]. However, the theoretical properties of CDRL algorithms are not yet well understood. In this paper, we introduce a framework to analyse CDRL algorithms, establish the importance of the projected distributional Bellman operator in distributional RL, draw fundamental connections between CDRL and the Cramér distance, and give a proof of convergence for sample-based categorical distributional reinforcement learning algorithms.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
02/06/2022

Exploration with Multi-Sample Target Values for Distributional Reinforcement Learning

Distributional reinforcement learning (RL) aims to learn a value-network...
research
04/27/2023

One-Step Distributional Reinforcement Learning

Reinforcement learning (RL) allows an agent interacting sequentially wit...
research
03/24/2020

Distributional Reinforcement Learning with Ensembles

It is well-known that ensemble methods often provide enhanced performanc...
research
02/08/2019

Distributional reinforcement learning with linear function approximation

Despite many algorithmic advances, our theoretical understanding of prac...
research
12/14/2021

Conjugated Discrete Distributions for Distributional Reinforcement Learning

In this work we continue to build upon recent advances in reinforcement ...
research
03/27/2020

A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms

We present a distributional approach to theoretical analyses of reinforc...
research
01/30/2019

A Comparative Analysis of Expected and Distributional Reinforcement Learning

Since their introduction a year ago, distributional approaches to reinfo...

Please sign up or login with your details

Forgot password? Click here to reset