High-Throughput Synchronous Deep RL

12/17/2020
by   Iou-Jen Liu, et al.
0

Deep reinforcement learning (RL) is computationally demanding and requires processing of many data points. Synchronous methods enjoy training stability while having lower data throughput. In contrast, asynchronous methods achieve high throughput but suffer from stability issues and lower sample efficiency due to `stale policies.' To combine the advantages of both methods we propose High-Throughput Synchronous Deep Reinforcement Learning (HTS-RL). In HTS-RL, we perform learning and rollouts concurrently, devise a system design which avoids `stale policies' and ensure that actors interact with environment replicas in an asynchronous manner while maintaining full determinism. We evaluate our approach on Atari games and the Google Research Football environment. Compared to synchronous baselines, HTS-RL is 2-6× faster. Compared to state-of-the-art asynchronous methods, HTS-RL has competitive throughput and consistently achieves higher average episode rewards.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
05/23/2023

Proximal Policy Gradient Arborescence for Quality Diversity Reinforcement Learning

Training generally capable agents that perform well in unseen dynamic en...
research
11/30/2019

IMPACT: Importance Weighted Asynchronous Architectures with Clipped Target Networks

The practical usage of reinforcement learning agents is often bottleneck...
research
12/28/2019

SLM Lab: A Comprehensive Benchmark and Modular Software Framework for Reproducible Deep Reinforcement Learning

We introduce SLM Lab, a software framework for reproducible reinforcemen...
research
10/24/2019

XPipe: Efficient Pipeline Model Parallelism for Multi-GPU DNN Training

We propose XPipe, an efficient asynchronous pipeline model parallelism a...
research
03/03/2019

Asynchronous Episodic Deep Deterministic Policy Gradient: Towards Continuous Control in Computationally Complex Environments

Deep Deterministic Policy Gradient (DDPG) has been proved to be a succes...
research
12/10/2020

An Efficient Asynchronous Method for Integrating Evolutionary and Gradient-based Policy Search

Deep reinforcement learning (DRL) algorithms and evolution strategies (E...
research
12/26/2019

Make TCP Great (again?!) in Cellular Networks: A Deep Reinforcement Learning Approach

Can we instead of designing just another new TCP, design a TCP plug-in w...

Please sign up or login with your details

Forgot password? Click here to reset