Sample Efficient Deep Reinforcement Learning via Uncertainty Estimation

01/05/2022
by Vincent Mai, et al.

In model-free deep reinforcement learning (RL) algorithms, using noisy value estimates to supervise policy evaluation and optimization is detrimental to sample efficiency. Because this noise is heteroscedastic, its effects can be mitigated by applying uncertainty-based weights in the optimization process. Previous methods rely on sampled ensembles, which do not capture all aspects of uncertainty. We provide a systematic analysis of the sources of uncertainty in the noisy supervision that occurs in RL, and introduce inverse-variance RL, a Bayesian framework that combines probabilistic ensembles and Batch Inverse Variance weighting. We propose a method in which two complementary uncertainty estimation methods account for both the Q-value and the environment stochasticity to better mitigate the negative impacts of noisy supervision. Our results show significant improvement in sample efficiency on discrete and continuous control tasks.
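To make the idea of uncertainty-based weighting concrete, the following is a minimal sketch of an inverse-variance weighted squared-error loss over a batch of value targets. It is an illustration under stated assumptions, not the authors' implementation: the function name `inverse_variance_weighted_mse`, the fixed `epsilon` term, and the assumption that a variance estimate is already available for each target are all hypothetical choices made for this example.

```python
def inverse_variance_weighted_mse(predictions, targets, target_variances, epsilon=0.1):
    """Squared-error loss in which each sample is weighted by 1 / (variance + epsilon).

    Hypothetical helper for illustration. `target_variances` holds an assumed
    per-sample estimate of the noise variance of each target; `epsilon` is an
    assumed fixed constant that keeps weights bounded when a variance is near zero.
    """
    if not (len(predictions) == len(targets) == len(target_variances)):
        raise ValueError("all inputs must have the same length")

    # Noisier targets (larger variance) receive smaller raw weights.
    raw_weights = [1.0 / (variance + epsilon) for variance in target_variances]

    # Normalize over the batch so the weights sum to one.
    total = sum(raw_weights)
    weights = [w / total for w in raw_weights]

    # Weighted sum of squared errors.
    return sum(
        w * (p - t) ** 2
        for w, p, t in zip(weights, predictions, targets)
    )


# Equal variances: the loss reduces to the ordinary mean squared error.
uniform_loss = inverse_variance_weighted_mse([0.0, 0.0], [1.0, 3.0], [1.0, 1.0])

# The second target is far noisier, so its large error is almost ignored.
weighted_loss = inverse_variance_weighted_mse([0.0, 0.0], [1.0, 3.0], [0.0, 1e6])
```

With equal variances every sample contributes equally, while a target with a very large variance estimate has almost no influence on the loss. This is the sense in which the weighting reduces the impact of noisy supervision.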


