Estimating Risk and Uncertainty in Deep Reinforcement Learning

05/23/2019
by William R. Clements, et al.

This paper demonstrates a novel method for separately estimating aleatoric risk and epistemic uncertainty in deep reinforcement learning. Aleatoric risk, which arises from inherently stochastic environments or agents, must be accounted for in the design of risk-sensitive algorithms. Epistemic uncertainty, which stems from limited data, is important both for risk sensitivity and for efficiently exploring an environment. We first present a Bayesian framework for learning the return distribution in reinforcement learning, which provides theoretical foundations for quantifying both types of uncertainty. Based on this framework, we show that the disagreement between only two neural networks is sufficient to produce a low-variance estimate of the epistemic uncertainty on the return distribution, thus providing a simple and computationally cheap uncertainty metric. We present experiments that illustrate our method and some of its applications.
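The core idea above can be sketched in a few lines. This is a minimal, hypothetical illustration, not the paper's exact algorithm: it assumes two independently initialized networks that each predict a set of return quantiles for a state, uses their disagreement as the epistemic uncertainty estimate, and uses the spread of the averaged quantile predictions as a proxy for aleatoric risk. All names (`make_net`, `N_QUANTILES`, etc.) are invented for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 4    # assumed state dimensionality for this toy example
N_QUANTILES = 8  # number of return quantiles each network predicts

def make_net(rng):
    """Randomly initialized two-layer MLP: state -> return quantiles."""
    w1 = rng.normal(scale=0.5, size=(STATE_DIM, 32))
    w2 = rng.normal(scale=0.5, size=(32, N_QUANTILES))
    return lambda s: np.maximum(s @ w1, 0.0) @ w2  # ReLU hidden layer

# Two independently initialized networks (in practice, trained on the
# same transitions; here left untrained purely for illustration).
net_a = make_net(rng)
net_b = make_net(rng)

state = rng.normal(size=(STATE_DIM,))
q_a, q_b = net_a(state), net_b(state)  # predicted return quantiles

# Epistemic uncertainty: mean squared disagreement between the two nets.
epistemic = np.mean((q_a - q_b) ** 2)

# Aleatoric risk proxy: spread of the averaged return distribution.
q_mean = 0.5 * (q_a + q_b)
aleatoric = np.var(q_mean)

print(f"epistemic={epistemic:.3f}  aleatoric={aleatoric:.3f}")
```

As the networks are trained and converge toward each other, the disagreement term shrinks, while the spread of the predicted quantiles reflects randomness intrinsic to the environment.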

