Model-Based Uncertainty in Value Functions

02/24/2023
by   Carlos E. Luis, et al.
0

We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning. In particular, we focus on characterizing the variance over values induced by a distribution over MDPs. Previous work upper bounds the posterior variance over values by solving a so-called uncertainty Bellman equation, but the over-approximation may result in inefficient exploration. We propose a new uncertainty Bellman equation whose solution converges to the true posterior variance over values and explicitly characterizes the gap in previous work. Moreover, our uncertainty quantification technique is easily integrated into common exploration strategies and scales naturally beyond the tabular setting by using standard deep reinforcement learning architectures. Experiments in difficult exploration tasks, both in tabular and continuous control settings, show that our sharper uncertainty estimates improve sample-efficiency.

READ FULL TEXT

page 9

page 21

page 23

page 24

research
04/30/2023

Posterior Sampling for Deep Reinforcement Learning

Despite remarkable successes, deep reinforcement learning algorithms rem...
research
08/12/2023

Value-Distributional Model-Based Reinforcement Learning

Quantifying uncertainty about a policy's long-term performance is import...
research
09/15/2017

The Uncertainty Bellman Equation and Exploration

We consider the exploration/exploitation problem in reinforcement learni...
research
01/05/2022

Sample Efficient Deep Reinforcement Learning via Uncertainty Estimation

In model-free deep reinforcement learning (RL) algorithms, using noisy v...
research
11/21/2019

Sample-Efficient Reinforcement Learning with Maximum Entropy Mellowmax Episodic Control

Deep networks have enabled reinforcement learning to scale to more compl...
research
10/06/2021

Residual Overfit Method of Exploration

Exploration is a crucial aspect of bandit and reinforcement learning alg...
research
02/08/2020

Inferential Induction: Joint Bayesian Estimation of MDPs and Value Functions

Bayesian reinforcement learning (BRL) offers a decision-theoretic soluti...

Please sign up or login with your details

Forgot password? Click here to reset