Accelerated Methods for Deep Reinforcement Learning

03/07/2018
by   Adam Stooke, et al.

Deep reinforcement learning (RL) has achieved many recent successes, yet experiment turn-around time remains a key bottleneck in research and in practice. We investigate how to optimize existing deep RL algorithms for modern computers, specifically for a combination of CPUs and GPUs. We confirm that both policy gradient and Q-value learning algorithms can be adapted to learn using many parallel simulator instances. We further find it possible to train using batch sizes considerably larger than are standard, without negatively affecting sample complexity or final performance. We leverage these facts to build a unified framework for parallelization that dramatically hastens experiments in both classes of algorithm. All neural network computations use GPUs, accelerating both data collection and training. Our results include using an entire NVIDIA DGX-1 to learn successful strategies in Atari games in single-digit minutes, using both synchronous and asynchronous algorithms.
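The idea of learning from many parallel simulator instances with batched action selection can be illustrated with a minimal sketch. This is not the authors' framework: `ToyEnv`, `batched_policy`, and `collect` are hypothetical names, the environment is a toy one-dimensional stand-in for an Atari emulator, and the policy is a fixed rule where a real system would run a single neural network forward pass on a GPU.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for one simulator instance."""

    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.state = 0.0

    def reset(self):
        self.state = self.rng.uniform(-1.0, 1.0)
        return self.state

    def step(self, action):
        # Action 1 moves the state right, action 0 moves it left.
        self.state += 0.1 if action == 1 else -0.1
        reward = -abs(self.state)
        done = abs(self.state) > 2.0
        if done:
            self.reset()
        return self.state, reward, done


def batched_policy(observations):
    """Select actions for all environments in a single call.

    Assumption: in a real setup this would be one batched forward
    pass of a neural network, rather than one call per environment.
    """
    return [0 if obs > 0 else 1 for obs in observations]


def collect(envs, horizon):
    """Step all environments in lockstep and gather one training batch."""
    observations = [env.reset() for env in envs]
    batch = []
    for _ in range(horizon):
        actions = batched_policy(observations)  # one inference call per step
        next_observations = []
        for env, obs, act in zip(envs, observations, actions):
            new_obs, reward, done = env.step(act)
            batch.append((obs, act, reward, done))
            next_observations.append(new_obs)
        observations = next_observations
    return batch


envs = [ToyEnv(seed=i) for i in range(8)]
batch = collect(envs, horizon=5)
print(len(batch))  # 8 environments x 5 steps = 40 transitions
```

In this sketch the batch size grows with the number of environments, which is the kind of larger-than-standard batch the abstract refers to. Whether learning stays sample-efficient at that size is the paper's empirical claim, and the sketch does not demonstrate it.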

research
12/03/2018

AsyncQVI: Asynchronous-Parallel Q-Value Iteration for Reinforcement Learning with Near-Optimal Sample Complexity

In this paper, we propose AsyncQVI: Asynchronous-Parallel Q-value Iterat...
research
01/09/2018

Distributed Deep Reinforcement Learning: Learn how to play Atari games in 21 minutes

We present a study in Distributed Deep Reinforcement Learning (DDRL) foc...
research
09/22/2021

MEPG: A Minimalist Ensemble Policy Gradient Framework for Deep Reinforcement Learning

Ensemble reinforcement learning (RL) aims to mitigate instability in Q-l...
research
05/19/2017

Atari games and Intel processors

The asynchronous nature of the state-of-the-art reinforcement learning a...
research
07/19/2019

GPU-Accelerated Atari Emulation for Reinforcement Learning

We designed and implemented a CUDA port of the Atari Learning Environmen...
research
12/10/2021

Edge-Compatible Reinforcement Learning for Recommendations

Most reinforcement learning (RL) recommendation systems designed for edg...
