Deep Exploration via Bootstrapped DQN

02/15/2016
by Ian Osband, et al.

Efficient exploration in complex environments remains a major challenge for reinforcement learning. We propose bootstrapped DQN, a simple algorithm that explores in a computationally and statistically efficient manner through use of randomized value functions. Unlike dithering strategies such as epsilon-greedy exploration, bootstrapped DQN carries out temporally-extended (or deep) exploration; this can lead to exponentially faster learning. We demonstrate these benefits in complex stochastic MDPs and in the large-scale Arcade Learning Environment. Bootstrapped DQN substantially improves learning times and performance across most Atari games.
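The mechanism behind bootstrapped DQN is an ensemble of K value-function heads trained on (approximately) bootstrapped data: at the start of each episode one head is sampled and the agent acts greedily with respect to it until the episode ends. The tabular sketch below illustrates only that idea; the paper itself uses a shared deep Q-network with K output heads, and the environment interface, head count, and mask probability here are illustrative assumptions rather than the paper's configuration.

```python
import random

import numpy as np

# Bootstrapped exploration sketch: an ensemble of K value-function "heads",
# each kept as its own Q-table over a small discrete MDP. At the start of
# every episode one head is sampled uniformly and followed greedily for the
# whole episode, giving temporally-extended (deep) exploration rather than
# per-step dithering like epsilon-greedy.

K = 10           # number of bootstrapped heads (illustrative choice)
N_STATES = 20
N_ACTIONS = 4
GAMMA = 0.99
ALPHA = 0.1
MASK_PROB = 0.5  # probability that a head trains on a given transition

# Randomly initialised heads so that their value estimates disagree early on.
q_heads = [np.random.randn(N_STATES, N_ACTIONS) * 0.1 for _ in range(K)]


def run_episode(env):
    """Run one episode, acting greedily under a single sampled head.

    `env` is assumed to expose `reset() -> state` and
    `step(action) -> (next_state, reward, done)` with integer states/actions.
    """
    k = random.randrange(K)                        # sample one head for the episode
    state = env.reset()
    done = False
    while not done:
        action = int(np.argmax(q_heads[k][state])) # greedy w.r.t. the sampled head
        next_state, reward, done = env.step(action)
        # Bootstrap mask: each head only learns from a random subset of
        # transitions, so the ensemble stays diverse.
        for q in q_heads:
            if random.random() < MASK_PROB:
                target = reward if done else reward + GAMMA * np.max(q[next_state])
                q[state, action] += ALPHA * (target - q[state, action])
        state = next_state
```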

Related research

Deep Exploration via Randomized Value Functions (03/22/2017)
We study the use of randomized value functions to guide deep exploration...

Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models (07/03/2015)
Achieving efficient and scalable exploration in complex domains poses a ...

Benchmarking Bonus-Based Exploration Methods on the Arcade Learning Environment (08/06/2019)
This paper provides an empirical evaluation of recently developed explor...

Generalization and Exploration via Randomized Value Functions (02/04/2014)
We propose randomized least-squares value iteration (RLSVI) -- a new rei...

On Optimistic versus Randomized Exploration in Reinforcement Learning (06/13/2017)
We discuss the relative merits of optimistic and randomized approaches t...

Angrier Birds: Bayesian reinforcement learning (01/06/2016)
We train a reinforcement learner to play a simplified version of the gam...

Randomized Value Functions via Multiplicative Normalizing Flows (06/06/2018)
Randomized value functions offer a promising approach towards the challe...
