Distributional Reinforcement Learning for Efficient Exploration

05/13/2019
by Borislav Mavrin, et al.

In distributional reinforcement learning (RL), the estimated distribution of the value function models both the parametric and intrinsic uncertainties. We propose a novel and efficient exploration method for deep RL that has two components. The first is a decaying schedule to suppress the intrinsic uncertainty. The second is an exploration bonus calculated from the upper quantiles of the learned distribution. In Atari 2600 games, our method outperforms QR-DQN in 12 out of 14 hard games (achieving a 483% average gain in cumulative rewards over QR-DQN across 49 games, with a big win in Venture). We also compared our algorithm with QR-DQN in a challenging 3D driving simulator (CARLA). Results show that our algorithm achieves near-optimal safety rewards twice as fast as QR-DQN.
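
The two components described above can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything specific here is an assumption: the schedule shape c * sqrt(log t / t), the bonus defined as the root of the squared deviations of the upper-half quantiles from the median, and the names `decay_schedule`, `upper_quantile_bonus`, `select_action`, and the constant `c` are all hypothetical choices for illustration, not the authors' confirmed implementation.

```python
import numpy as np


def decay_schedule(t, c=1.0):
    """Assumed schedule c * sqrt(log t / t); it shrinks toward 0 for large t."""
    t = max(int(t), 1)  # guard against t = 0
    return c * np.sqrt(np.log(t) / t)


def upper_quantile_bonus(quantiles):
    """Per-action bonus computed from the upper half of the quantile estimates.

    quantiles: array of shape (num_actions, N) holding N quantile
    estimates of the return for each action.
    """
    q = np.sort(np.asarray(quantiles, dtype=float), axis=-1)
    n = q.shape[-1]
    median = np.median(q, axis=-1, keepdims=True)
    upper = q[..., n // 2:]  # quantiles at or above the middle
    # Assumed form: squared deviation of the upper quantiles from the median.
    truncated_var = np.sum((upper - median) ** 2, axis=-1) / (2 * n)
    return np.sqrt(truncated_var)


def select_action(quantiles, t, c=1.0):
    """Greedy action on mean value plus the decayed upper-quantile bonus."""
    q = np.asarray(quantiles, dtype=float)
    q_values = q.mean(axis=-1)
    scores = q_values + decay_schedule(t, c) * upper_quantile_bonus(q)
    return int(np.argmax(scores))


# Action 0: slightly higher mean, no spread.
# Action 1: lower mean, but a heavy upper tail.
example = np.array([[1.1, 1.1, 1.1, 1.1],
                    [-1.0, 0.0, 0.0, 5.0]])
early_choice = select_action(example, t=2)
late_choice = select_action(example, t=10**9)
```

In this toy example the bonus favors the action with the heavier upper tail early in training, while late in training the decayed coefficient leaves the choice to the mean value alone.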


