Langevin DQN

02/17/2020
by   Vikranth Dwaracherla, et al.

Algorithms that tackle deep exploration – an important challenge in reinforcement learning – have relied on epistemic uncertainty representation through ensembles or other hypermodels, exploration bonuses, or visitation count distributions. An open question is whether deep exploration can be achieved by an incremental reinforcement learning algorithm that tracks a single point estimate, without the additional complexity required to account for epistemic uncertainty. We answer this question in the affirmative. In particular, we develop Langevin DQN, a variation of DQN that differs only in perturbing parameter updates with Gaussian noise, and demonstrate through a computational study that the algorithm achieves deep exploration. We also provide an intuition for why Langevin DQN performs deep exploration.
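To make the idea of "perturbing parameter updates with Gaussian noise" concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: it assumes a linear Q-function, a single-transition squared TD loss, and the noise scaling of standard stochastic gradient Langevin dynamics (standard deviation sqrt(2 * step_size * temperature)). The names `td_loss_grad`, `langevin_step`, `step_size`, and `temperature` are hypothetical and chosen only for illustration; the abstract does not specify the exact update.

```python
import numpy as np


def td_loss_grad(theta, phi, reward, phi_next, gamma):
    """Gradient of 0.5 * (target - theta @ phi)**2 with respect to theta.

    Assumes a linear Q-function Q(s, a) = theta @ phi(s, a) and treats the
    bootstrapped target as a constant, as in DQN-style semi-gradient updates.
    phi_next is the feature vector of the greedy next state-action pair.
    """
    target = reward + gamma * (theta @ phi_next)
    td_error = target - theta @ phi
    return -td_error * phi


def langevin_step(theta, grad, step_size, temperature, rng):
    """One gradient step perturbed by isotropic Gaussian noise.

    Assumption: the noise scale follows standard stochastic gradient
    Langevin dynamics, sqrt(2 * step_size * temperature). With
    temperature = 0 this reduces to a plain gradient step.
    """
    noise = rng.standard_normal(theta.shape)
    return theta - step_size * grad + np.sqrt(2.0 * step_size * temperature) * noise


# Usage on one hypothetical transition with 3-dimensional features.
rng = np.random.default_rng(0)
theta = np.zeros(3)
phi = np.array([1.0, 0.0, 0.0])       # features of the current (s, a)
phi_next = np.array([0.0, 1.0, 0.0])  # features of the greedy next (s', a')

grad = td_loss_grad(theta, phi, reward=1.0, phi_next=phi_next, gamma=0.99)
theta_plain = langevin_step(theta, grad, step_size=0.1, temperature=0.0, rng=rng)
theta_noisy = langevin_step(theta, grad, step_size=0.1, temperature=0.01, rng=rng)
```

The only difference between the two calls is the injected noise term, which is the sense in which such an update still tracks a single point estimate rather than an ensemble.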


research
03/07/2023

Exploration via Epistemic Value Estimation

How to efficiently explore in reinforcement learning is an open problem....
research
05/04/2018

Exploration by Distributional Reinforcement Learning

We propose a framework based on distributional reinforcement learning an...
research
06/15/2020

Efficient Model-Based Reinforcement Learning through Optimistic Policy Search and Planning

Model-based reinforcement learning algorithms with probabilistic dynamic...
research
05/23/2019

Estimating Risk and Uncertainty in Deep Reinforcement Learning

This paper demonstrates a novel method for separately estimating aleator...
research
07/31/2018

Count-Based Exploration with the Successor Representation

The problem of exploration in reinforcement learning is well-understood ...
research
01/12/2021

Deep Gaussian Denoiser Epistemic Uncertainty and Decoupled Dual-Attention Fusion

Following the performance breakthrough of denoising networks, improvemen...
research
10/30/2022

Planning to the Information Horizon of BAMDPs via Epistemic State Abstraction

The Bayes-Adaptive Markov Decision Process (BAMDP) formalism pursues the...
