Invariance to Quantile Selection in Distributional Continuous Control

12/29/2022
by Felix Grün, et al.

In recent years, distributional reinforcement learning has produced many state-of-the-art results. Increasingly sample-efficient distributional algorithms for the discrete action domain have been developed; they differ primarily in how they parameterize their approximations of value distributions and in how they quantify the differences between those distributions. In this work, we transfer three of the best-known and most successful of those algorithms (QR-DQN, IQN, and FQF) to the continuous action domain by extending two powerful actor-critic algorithms (TD3 and SAC) with distributional critics. We investigate whether the relative performance of the methods in the discrete action space carries over to the continuous case. To that end, we compare them empirically on the PyBullet implementations of a set of continuous control tasks. Our results indicate qualitative invariance regarding the number and placement of distributional atoms in the deterministic, continuous action setting.
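
As a rough illustration of what "number and placement of distributional atoms" refers to, the sketch below contrasts two ways of choosing quantile fractions and scores predicted quantiles with a quantile Huber loss. It is a minimal, standard-library-only sketch and not the authors' implementation: the function names are hypothetical, the fixed midpoints follow the QR-DQN convention, the uniformly sampled fractions follow the IQN convention, and the learned fraction proposal used by FQF is omitted. The loss is divided by `kappa`, as in the IQN formulation; with `kappa = 1` this matches the QR-DQN form.

```python
import random


def fixed_midpoint_fractions(n):
    """QR-DQN-style placement: n evenly spaced quantile midpoints (2i + 1) / (2n)."""
    return [(2 * i + 1) / (2 * n) for i in range(n)]


def sampled_fractions(n, rng):
    """IQN-style placement: n fractions drawn uniformly from [0, 1), sorted for readability."""
    return sorted(rng.random() for _ in range(n))


def quantile_huber_loss(pred_quantiles, taus, target_samples, kappa=1.0):
    """Quantile Huber loss, summed over predicted quantiles and averaged over targets.

    pred_quantiles[i] is the critic's estimate of the taus[i]-quantile of the return;
    target_samples are (hypothetical) samples of the target return distribution.
    """
    total = 0.0
    for theta, tau in zip(pred_quantiles, taus):
        for z in target_samples:
            u = z - theta  # TD error between a target sample and a predicted quantile
            if abs(u) <= kappa:
                huber = 0.5 * u * u
            else:
                huber = kappa * (abs(u) - 0.5 * kappa)
            # Asymmetric weight: under-estimates are penalized by tau,
            # over-estimates by (1 - tau).
            weight = abs(tau - (1.0 if u < 0 else 0.0))
            total += weight * huber / kappa
    return total / len(target_samples)


if __name__ == "__main__":
    rng = random.Random(0)
    targets = [rng.gauss(0.0, 1.0) for _ in range(256)]  # toy return samples
    for n in (4, 32):
        taus = fixed_midpoint_fractions(n)
        preds = [0.0] * n  # an untrained critic that predicts 0 for every quantile
        print(n, quantile_huber_loss(preds, taus, targets))
```

Changing `n`, or swapping `fixed_midpoint_fractions` for `sampled_fractions`, changes only which quantiles the critic is asked to represent; the loss itself is unchanged.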


