
Efficient Exploration through Bayesian Deep Q-Networks

by Kamyar Azizzadenesheli, et al.

We propose Bayesian Deep Q-Network (BDQN), a practical Thompson sampling based reinforcement learning (RL) algorithm. Thompson sampling allows for targeted exploration in high dimensions through posterior sampling, but is usually computationally expensive. We address this limitation by introducing uncertainty only at the output layer of the network through a Bayesian Linear Regression (BLR) model. This layer can be trained with fast closed-form updates, and samples can be drawn efficiently from its Gaussian posterior. We apply our method to a wide range of Atari games in the Arcade Learning Environment. Since BDQN carries out more efficient exploration, it reaches higher rewards substantially faster than a key baseline, the Double Deep Q-Network (DDQN).
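The core mechanism described above can be sketched in a few lines: fit a closed-form Gaussian posterior over the last-layer weights for each action, then act greedily with respect to one posterior sample (Thompson sampling). The sketch below is illustrative only, assuming synthetic features and targets; names such as `phi`, `sigma_noise`, and `prior_var` are not from the paper's code.

```python
import numpy as np

def blr_posterior(phi, y, sigma_noise=1.0, prior_var=10.0):
    """Closed-form Gaussian posterior over last-layer weights w,
    given penultimate-layer features phi (n x d) and regression targets y (n,)."""
    d = phi.shape[1]
    # Posterior precision combines the data term and an isotropic Gaussian prior.
    precision = phi.T @ phi / sigma_noise**2 + np.eye(d) / prior_var
    cov = np.linalg.inv(precision)
    mean = cov @ phi.T @ y / sigma_noise**2
    return mean, cov

def thompson_sample_action(phi_state, means, covs, rng):
    """Draw one weight sample per action from its posterior and act greedily."""
    q_samples = [phi_state @ rng.multivariate_normal(m, c)
                 for m, c in zip(means, covs)]
    return int(np.argmax(q_samples))

rng = np.random.default_rng(0)
n, d, n_actions = 200, 8, 4
phi = rng.normal(size=(n, d))  # stand-in for learned DQN features
# One BLR posterior per action, fit to synthetic Q-targets.
means, covs = zip(*[
    blr_posterior(phi, phi @ rng.normal(size=d) + 0.1 * rng.normal(size=n))
    for _ in range(n_actions)
])
a = thompson_sample_action(rng.normal(size=d), means, covs, rng)
```

Because the posterior update is a single `d x d` matrix inverse per action, the exploration overhead stays negligible compared with training the convolutional feature layers.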



