Distributional Reinforcement Learning with Maximum Mean Discrepancy

07/24/2020
by Thanh Tang Nguyen, et al.

Distributional reinforcement learning (RL) has achieved state-of-the-art performance in Atari games by recasting traditional RL as a distribution estimation problem: it explicitly estimates the probability distribution of the total return instead of its expectation. The bottleneck in distributional RL lies in estimating this distribution, since return distributions are infinite-dimensional and one must resort to an approximate representation. Most existing methods learn a set of predefined statistic functionals of the return distributions, which requires involved projections to maintain the order statistics. We take a different perspective based on deterministic sampling: we approximate the return distributions with a set of deterministic particles that are not attached to any predefined statistic functional, which allows us to approximate the return distributions freely. Learning is then interpreted as the evolution of these particles so that a distance between the return distribution and its target distribution is minimized. This objective is realized via the maximum mean discrepancy (MMD) distance, which in turn leads to a simpler loss amenable to backpropagation. Experiments on the suite of Atari 2600 games show that our algorithm outperforms the standard distributional RL baselines and sets a new record in the Atari games for non-distributed agents.
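
To make the particle-based objective concrete, the following is a minimal sketch of a squared MMD between a set of return particles and a set of target particles. It is not the authors' implementation: the Gaussian kernel, the `bandwidth` parameter, the biased (V-statistic) estimator, and the function names `gaussian_kernel`, `mmd_squared`, and `bellman_targets` are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth=1.0):
    """Pairwise Gaussian kernel between two 1-D particle arrays (assumed kernel)."""
    diff = x[:, None] - y[None, :]  # shape (len(x), len(y)) via broadcasting
    return np.exp(-diff ** 2 / (2.0 * bandwidth ** 2))


def mmd_squared(particles, targets, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two particle sets."""
    k_xx = gaussian_kernel(particles, particles, bandwidth).mean()
    k_yy = gaussian_kernel(targets, targets, bandwidth).mean()
    k_xy = gaussian_kernel(particles, targets, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy


def bellman_targets(reward, gamma, next_particles):
    """Hypothetical helper: shift and scale next-state particles into target particles."""
    return reward + gamma * next_particles


# Usage: compare current particles with targets built from next-state particles.
current = np.array([0.0, 1.0, 2.0])
targets = bellman_targets(reward=1.0, gamma=0.99, next_particles=np.array([0.5, 1.5, 2.5]))
loss = mmd_squared(current, targets)
print(loss)
```

In a learning setup the current particles would be the outputs of a network and the loss would be minimized by gradient descent with the targets held fixed; the NumPy version above only evaluates the quantity and does not differentiate it.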

02/01/2022

Distributional Reinforcement Learning via Sinkhorn Iterations

Distributional reinforcement learning (RL) is a class of state-of-the-ar...
11/05/2019

Fully Parameterized Quantile Function for Distributional Reinforcement Learning

Distributional Reinforcement Learning (RL) differs from traditional RL i...
10/07/2021

Towards Understanding Distributional Reinforcement Learning: Regularization, Optimization, Acceleration and Sinkhorn Algorithm

Distributional reinforcement learning (RL) is a class of state-of-the-ar...
02/21/2019

Statistics and Samples in Distributional Reinforcement Learning

We present a unifying framework for designing and analysing distribution...
10/26/2021

Distributional Reinforcement Learning for Multi-Dimensional Reward Functions

A growing trend for value-based reinforcement learning (RL) algorithms i...
06/06/2021

Distributional Reinforcement Learning with Unconstrained Monotonic Neural Networks

The distributional reinforcement learning (RL) approach advocates for re...
06/11/2018

The Potential of the Return Distribution for Exploration in RL

This paper studies the potential of the return distribution for explorat...

Code Repositories

mmdrl

Code holder for https://arxiv.org/abs/2007.12354

