WALL-E: An Efficient Reinforcement Learning Research Framework

01/18/2019
by Tianbing Xu, et al.

An RL system's run time splits into two parts: experience collection and policy learning. When rollouts require a large number of samples, experience collection becomes the dominant bottleneck, so it pays to speed up rollout generation with a multi-process architecture. Our framework, dubbed WALL-E, runs multiple rollout samplers in parallel to generate experience rapidly. With these parallel samplers we observe not only faster convergence but also higher average returns. For example, on the MuJoCo HalfCheetah-v2 task, with N = 10 parallel sampler processes we achieve a much higher average return than a single-process architecture.
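The framework's source is not reproduced here, so the following is only a minimal sketch of the parallel-sampler pattern the abstract describes, written with Python's multiprocessing module. The names rollout_worker and collect_rollouts, the stand-in policy and reward computations, and the observation/action dimensions are illustrative assumptions rather than WALL-E's actual API; a real run would step a MuJoCo environment such as HalfCheetah-v2 inside each worker.

```python
# Sketch: N rollout sampler processes collecting experience in parallel.
# Names and the stand-in environment are hypothetical, not WALL-E's API.
import multiprocessing as mp
import numpy as np


def rollout_worker(worker_id, policy_params, steps_per_worker, queue):
    """One sampler process: roll out the current policy and push its trajectory."""
    rng = np.random.default_rng(worker_id)
    obs = rng.standard_normal(17)               # 17-dim observation, as in HalfCheetah-v2
    trajectory = []
    for _ in range(steps_per_worker):
        action = np.tanh(policy_params @ obs)   # stand-in for the learned policy
        reward = -float(np.sum(action ** 2))    # stand-in for the env.step(action) reward
        trajectory.append((obs, action, reward))
        obs = rng.standard_normal(17)           # stand-in for the next observation
    queue.put((worker_id, trajectory))


def collect_rollouts(policy_params, num_workers=10, steps_per_worker=1000):
    """Launch N sampler processes in parallel and gather their experience."""
    queue = mp.Queue()
    workers = [
        mp.Process(target=rollout_worker,
                   args=(i, policy_params, steps_per_worker, queue))
        for i in range(num_workers)
    ]
    for w in workers:
        w.start()
    batches = [queue.get() for _ in workers]    # blocks until every sampler reports
    for w in workers:
        w.join()
    return batches


if __name__ == "__main__":
    params = np.zeros((6, 17))                  # 6 action dims x 17 obs dims (HalfCheetah-v2)
    experience = collect_rollouts(params, num_workers=10, steps_per_worker=200)
    print(f"collected {len(experience)} rollout batches")
```

Because each sampler process fills its own trajectory independently, the wall-clock cost of collecting N x steps_per_worker transitions is roughly that of a single worker's rollout, which is the speedup the abstract attributes to the N = 10 configuration.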

