Stochastic Variance Reduction for Deep Q-learning

05/20/2019
by Wei-Ye Zhao, et al.

Recent advances in deep reinforcement learning have achieved human-level performance on a variety of real-world applications. However, current algorithms still suffer from poor gradient estimates with excessive variance, resulting in unstable training and poor sample efficiency. In this paper, we propose an optimization strategy for deep Q-learning that utilizes stochastic variance reduced gradient (SVRG) techniques. In extensive experiments on the Atari domain, our method outperforms the deep Q-learning baselines on 18 out of 20 games.
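The core SVRG idea the abstract refers to, replacing a plain stochastic gradient with a variance-reduced estimate anchored at a periodic full-gradient snapshot, can be sketched on a simple least-squares objective. Everything below (function name, data, hyperparameters) is illustrative and not the authors' deep Q-learning setup:

```python
import numpy as np

def svrg(X, y, lr=0.01, epochs=50, seed=0):
    """Minimal SVRG sketch for 0.5 * mean((X @ w - y)^2)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)

    def grad_i(w, i):
        # Per-sample gradient of 0.5 * (x_i @ w - y_i)^2
        return (X[i] @ w - y[i]) * X[i]

    for _ in range(epochs):
        w_snap = w.copy()
        # Full gradient at the snapshot: the anchor that cancels noise
        full_grad = (X.T @ (X @ w_snap - y)) / n
        for _ in range(n):
            i = rng.integers(n)
            # Variance-reduced estimate: unbiased, and its variance
            # shrinks as w approaches w_snap
            g = grad_i(w, i) - grad_i(w_snap, i) + full_grad
            w -= lr * g
    return w

# Illustrative usage on synthetic noiseless data
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = np.arange(1.0, 6.0)
y = X @ w_true
w_hat = svrg(X, y)
```

The same correction term can in principle be applied to the minibatch gradients of a Q-network loss, which is the direction the paper pursues.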

Related research

- 07/25/2020: Variance Reduction for Deep Q-Learning using Stochastic Recursive Gradient. "Deep Q-learning algorithms often suffer from poor gradient estimations w..."
- 09/14/2020: Variance-Reduced Off-Policy Memory-Efficient Policy Search. "Off-policy policy optimization is a challenging problem in reinforcement..."
- 10/14/2019: On the Reduction of Variance and Overestimation of Deep Q-Learning. "The breakthrough of deep Q-Learning on different types of environments r..."
- 05/29/2019: Variance Reduction for Evolution Strategies via Structured Control Variates. "Evolution Strategies (ES) are a powerful class of blackbox optimization ..."
- 12/11/2018: On the Ineffectiveness of Variance Reduced Optimization for Deep Learning. "The application of stochastic variance reduction to optimization has sho..."
- 11/15/2018: Asynchronous Stochastic Composition Optimization with Variance Reduction. "Composition optimization has drawn a lot of attention in a wide variety ..."
- 07/06/2018: Variance Reduction for Reinforcement Learning in Input-Driven Environments. "We consider reinforcement learning in input-driven environments, where a..."
