Distributional Reinforcement Learning with Quantile Regression

10/27/2017
by   Will Dabney, et al.
0

In reinforcement learning an agent interacts with the environment by taking actions and observing the next state and reward. When sampled probabilistically, these state transitions, rewards, and actions can all induce randomness in the observed long-term return. Traditionally, reinforcement learning algorithms average over this randomness to estimate the value function. In this paper, we build on recent work advocating a distributional approach to reinforcement learning in which the distribution over returns is modeled explicitly instead of only estimating the mean. That is, we examine methods of learning the value distribution instead of the value function. We give results that close a number of gaps between the theoretical and algorithmic results given by Bellemare, Dabney, and Munos (2017). First, we extend existing results to the approximate distribution setting. Second, we present a novel distributional reinforcement learning algorithm consistent with our theoretical formulation. Finally, we evaluate this new algorithm on the Atari 2600 games, observing that it significantly outperforms many of the recent improvements on DQN, including the related distributional algorithm C51.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
02/21/2022

DQMIX: A Distributional Perspective on Multi-Agent Reinforcement Learning

In cooperative multi-agent tasks, a team of agents jointly interact with...
research
07/21/2017

A Distributional Perspective on Reinforcement Learning

In this paper we argue for the fundamental importance of the value distr...
research
05/24/2018

Meta-Gradient Reinforcement Learning

The goal of reinforcement learning algorithms is to estimate and/or opti...
research
08/28/2022

Normality-Guided Distributional Reinforcement Learning for Continuous Control

Learning a predictive model of the mean return, or value function, plays...
research
05/30/2022

GLDQN: Explicitly Parameterized Quantile Reinforcement Learning for Waste Reduction

We study the problem of restocking a grocery store's inventory with peri...
research
12/14/2021

Conjugated Discrete Distributions for Distributional Reinforcement Learning

In this work we continue to build upon recent advances in reinforcement ...
research
02/08/2019

Distributional reinforcement learning with linear function approximation

Despite many algorithmic advances, our theoretical understanding of prac...

Please sign up or login with your details

Forgot password? Click here to reset