On Optimistic versus Randomized Exploration in Reinforcement Learning

06/13/2017
by Ian Osband, et al.

We discuss the relative merits of optimistic and randomized approaches to exploration in reinforcement learning. Optimistic approaches presented in the literature apply an optimistic boost to the value estimate at each state-action pair and select actions that are greedy with respect to the resulting optimistic value function. Randomized approaches sample from among statistically plausible value functions and select actions that are greedy with respect to the random sample. Prior computational experience suggests that randomized approaches can lead to far more statistically efficient learning. We present two simple analytic examples that elucidate why this is the case. In principle, there should be optimistic approaches that fare well relative to randomized approaches, but that would require intractable computation. Optimistic approaches that have been proposed in the literature sacrifice statistical efficiency for the sake of computational efficiency. Randomized approaches, on the other hand, may enable simultaneous statistical and computational efficiency.
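The two action-selection rules described above can be contrasted in a minimal sketch. It is an illustration under stated assumptions, not the paper's analytic examples: it considers a single state, treats uncertainty about each action's value as an independent Gaussian with a given mean and standard deviation, and uses a hypothetical bonus scale `beta` for the optimistic boost. The function names and numbers are invented for the illustration.

```python
import random


def optimistic_action(means, stds, beta=2.0):
    """Greedy action with respect to boosted values.

    Assumption: the optimistic boost is beta * std for each action;
    beta is a hypothetical tuning parameter, not taken from the paper.
    """
    boosted = [m + beta * s for m, s in zip(means, stds)]
    return max(range(len(boosted)), key=boosted.__getitem__)


def randomized_action(means, stds, rng):
    """Greedy action with respect to one random draw of plausible values.

    Assumption: plausible values are modeled as independent Gaussians.
    """
    sample = [rng.gauss(m, s) for m, s in zip(means, stds)]
    return max(range(len(sample)), key=sample.__getitem__)


# Illustrative (made-up) value estimates and uncertainties for three actions.
means = [1.0, 0.8, 0.2]
stds = [0.05, 0.3, 0.1]

rng = random.Random(0)
optimistic_choice = optimistic_action(means, stds)

# The randomized rule spreads its choices across plausible actions.
counts = [0] * len(means)
for _ in range(1000):
    counts[randomized_action(means, stds, rng)] += 1

print("optimistic choice:", optimistic_choice)
print("randomized choice counts:", counts)
```

In this sketch the optimistic rule is deterministic given the estimates and always picks the action with the largest boosted value, while the randomized rule picks each action roughly in proportion to how often it looks best under the assumed uncertainty.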

research
03/22/2017

Deep Exploration via Randomized Value Functions

We study the use of randomized value functions to guide deep exploration...
research
02/04/2014

Generalization and Exploration via Randomized Value Functions

We propose randomized least-squares value iteration (RLSVI) -- a new rei...
research
10/05/2021

Dropout Q-Functions for Doubly Efficient Reinforcement Learning

Randomized ensemble double Q-learning (REDQ) has recently achieved state...
research
02/15/2016

Deep Exploration via Bootstrapped DQN

Efficient exploration in complex environments remains a major challenge ...
research
06/07/2019

Worst-Case Regret Bounds for Exploration via Randomized Value Functions

This paper studies a recent proposal to use randomized value functions t...
research
06/08/2018

Randomized Prior Functions for Deep Reinforcement Learning

Dealing with uncertainty is essential for efficient reinforcement learni...
research
04/28/2021

Optimal Stopping via Randomized Neural Networks

This paper presents new machine learning approaches to approximate the s...