QUOTA: The Quantile Option Architecture for Reinforcement Learning

11/05/2018
by Shangtong Zhang, et al.

In this paper, we propose the Quantile Option Architecture (QUOTA) for exploration, building on recent advances in distributional reinforcement learning (RL). In QUOTA, decision making is based on quantiles of a value distribution rather than only its mean. QUOTA provides a new dimension for exploration by exploiting both the optimism and the pessimism of a value distribution. We demonstrate the performance advantage of QUOTA in both challenging video games and physical robot simulators.
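To make the idea of acting on quantiles rather than the mean concrete, here is a minimal sketch. It is not the QUOTA algorithm itself: the function name `greedy_action`, the array layout, and the toy numbers are all assumptions made for illustration. It only shows how a low quantile (pessimism) and a high quantile (optimism) can pick different actions from the same estimated value distributions.

```python
import numpy as np

def greedy_action(quantiles, quantile_index=None):
    """Pick the greedy action from per-action quantile estimates.

    quantiles: array of shape (num_actions, num_quantiles); each row holds
        quantile estimates of one action's value distribution, sorted in
        ascending order (hypothetical layout, assumed for this sketch).
    quantile_index: if None, act on the mean of the quantiles; otherwise
        act on the single quantile at that column index.
    """
    quantiles = np.asarray(quantiles, dtype=float)
    if quantile_index is None:
        scores = quantiles.mean(axis=1)        # mean-based decision
    else:
        scores = quantiles[:, quantile_index]  # quantile-based decision
    return int(np.argmax(scores))

# Toy estimates: action 0 has a narrow value distribution,
# action 1 has a wide one with a slightly higher mean.
toy_quantiles = np.array([
    [0.9, 1.0, 1.1],
    [-1.0, 1.2, 3.0],
])

mean_choice = greedy_action(toy_quantiles)            # acts on the mean
pessimistic_choice = greedy_action(toy_quantiles, 0)  # acts on the lowest quantile
optimistic_choice = greedy_action(toy_quantiles, 2)   # acts on the highest quantile
```

With these toy numbers, the lowest quantile favors the narrow, safer action, while the mean and the highest quantile favor the wide, riskier one. How QUOTA chooses which quantile to act on is described in the full paper and is not reproduced here.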


research
06/13/2022

IGN : Implicit Generative Networks

In this work, we build recent advances in distributional reinforcement l...
research
11/05/2019

Fully Parameterized Quantile Function for Distributional Reinforcement Learning

Distributional Reinforcement Learning (RL) differs from traditional RL i...
research
05/13/2019

Distributional Reinforcement Learning for Efficient Exploration

In distributional reinforcement learning (RL), the estimated distributio...
research
05/28/2023

The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation

We study the problem of temporal-difference-based policy evaluation in r...
research
05/26/2023

Distributional Reinforcement Learning with Dual Expectile-Quantile Regression

Successful applications of distributional reinforcement learning with qu...
research
07/28/2020

Munchausen Reinforcement Learning

Bootstrapping is a core mechanism in Reinforcement Learning (RL). Most a...
research
10/01/2021

A Cramér Distance perspective on Non-crossing Quantile Regression in Distributional Reinforcement Learning

Distributional reinforcement learning (DRL) extends the value-based appr...
