Reward Estimation for Variance Reduction in Deep Reinforcement Learning

05/09/2018
by Joshua Romoff, et al.

In reinforcement learning (RL), stochastic environments can make learning a policy difficult due to high variance in the returns. As such, variance reduction methods have been investigated in prior work, such as advantage estimation and control variates. Here, we propose to learn a separate reward estimator and use it to train the value function, in order to reduce the variance caused by a noisy reward signal. This yields a theoretical reduction in variance in the tabular case, as well as empirical improvements in both the tabular and function approximation settings in environments where rewards are stochastic. To demonstrate this, we evaluate a modified version of Advantage Actor-Critic (A2C) on variants of Atari games.
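The core idea admits a short sketch. The following is a minimal, hypothetical tabular illustration (not the authors' exact algorithm; the names `R_hat`, `alpha_r`, and the environment sizes are assumptions made for the example): a running estimate of the expected reward is learned from the noisy observed rewards and substituted for the sampled reward in the TD target used to train the value function.

```python
import numpy as np

# Hypothetical tabular sketch of reward estimation for variance reduction.
# Instead of bootstrapping the value function with the noisy sampled reward r,
# we maintain a separate running estimate R_hat(s, a) of the expected reward
# and use it in the TD target, which can reduce variance when rewards are
# stochastic.

n_states, n_actions = 10, 2
gamma, alpha_v, alpha_r = 0.99, 0.1, 0.1

V = np.zeros(n_states)                    # value-function estimates
R_hat = np.zeros((n_states, n_actions))   # learned reward estimator

def update(s, a, r, s_next, done):
    # Move the reward estimate toward the observed (noisy) reward.
    R_hat[s, a] += alpha_r * (r - R_hat[s, a])
    # Use the *estimated* reward, not the sampled one, in the TD target.
    target = R_hat[s, a] + (0.0 if done else gamma * V[s_next])
    V[s] += alpha_v * (target - V[s])

# Example: a single transition with a noisy reward.
rng = np.random.default_rng(0)
update(s=3, a=1, r=1.0 + rng.normal(scale=2.0), s_next=4, done=False)
```

In a deep RL setting such as A2C, the same idea would replace the tabular `R_hat` with a learned reward network, but the update structure above is only an illustrative sketch under those assumptions.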



Related research

12/11/2019 · Doubly Robust Off-Policy Actor-Critic Algorithms for Reinforcement Learning
We study the problem of off-policy critic evaluation in several variants...

06/10/2018 · Distributional Advantage Actor-Critic
In traditional reinforcement learning, an agent maximizes the reward col...

05/24/2019 · Adaptive Symmetric Reward Noising for Reinforcement Learning
Recent reinforcement learning algorithms, though achieving impressive re...

05/07/2021 · Reward prediction for representation learning and reward shaping
One of the fundamental challenges in reinforcement learning (RL) is the ...

10/21/2021 · Can Q-learning solve Multi Armed Bantids?
When a reinforcement learning (RL) method has to decide between several ...

06/04/2018 · TD or not TD: Analyzing the Role of Temporal Differencing in Deep Reinforcement Learning
Our understanding of reinforcement learning (RL) has been shaped by theo...

06/16/2021 · Unbiased Methods for Multi-Goal Reinforcement Learning
In multi-goal reinforcement learning (RL) settings, the reward for each ...
