AcceRL: Policy Acceleration Framework for Deep Reinforcement Learning

by Hongjie Zhang, et al.
Sichuan Normal University

Deep reinforcement learning has achieved great success in various fields thanks to its strong decision-making ability. However, policy learning requires a large amount of training time, which in turn consumes considerable energy. Inspired by the redundancy of neural networks, we propose AcceRL, a lightweight parallel training framework based on neural network compression, to accelerate policy learning while preserving policy quality. Specifically, AcceRL speeds up experience collection by flexibly combining various neural network compression methods. AcceRL consists of five components: Actor, Learner, Compressor, Corrector, and Monitor. The Actor uses the Compressor to compress the Learner's policy network and interacts with the environment using the compressed network. The generated experiences are then transformed by the Corrector with off-policy methods such as V-trace and Retrace, and the corrected experiences are fed to the Learner for policy learning. We believe this is the first general reinforcement learning framework that incorporates multiple neural network compression techniques. Extensive experiments conducted in Gym show that AcceRL reduces the time cost of the actor by about 2.0x to 4.13x compared to traditional methods. Furthermore, AcceRL reduces the whole training time by about 29.8% while keeping the same policy quality.
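The Corrector step can be illustrated with a minimal sketch of V-trace targets, following the definition in the IMPALA paper rather than AcceRL's own code, which is not shown here. It assumes the behaviour policy is the Actor's compressed network and the target policy is the Learner's full network. The function name, argument names, and default clipping thresholds are all hypothetical, and episode termination is ignored for brevity.

```python
import math


def vtrace_targets(behaviour_logp, target_logp, rewards, values,
                   bootstrap_value, gamma=0.99, rho_bar=1.0, c_bar=1.0):
    """Compute V-trace value targets for one trajectory of length n.

    behaviour_logp[t]: log-probability of the taken action under the policy
        that generated the data (assumed here: the compressed actor network).
    target_logp[t]: log-probability of the same action under the policy
        being learned (assumed here: the learner's full network).
    values[t]: the learner's value estimate V(x_t).
    bootstrap_value: V(x_n), the estimate for the state after the last step.
    """
    n = len(rewards)
    # Importance ratios pi(a_t | x_t) / mu(a_t | x_t).
    ratios = [math.exp(t - b) for t, b in zip(target_logp, behaviour_logp)]
    rhos = [min(rho_bar, r) for r in ratios]  # clipped weights for TD errors
    cs = [min(c_bar, r) for r in ratios]      # clipped trace coefficients
    next_values = list(values[1:]) + [bootstrap_value]
    # Importance-weighted temporal-difference errors.
    deltas = [rhos[t] * (rewards[t] + gamma * next_values[t] - values[t])
              for t in range(n)]
    # Backward recursion:
    # v_t - V(x_t) = delta_t + gamma * c_t * (v_{t+1} - V(x_{t+1})).
    targets = [0.0] * n
    acc = 0.0
    for t in reversed(range(n)):
        acc = deltas[t] + gamma * cs[t] * acc
        targets[t] = values[t] + acc
    return targets
```

When the two policies agree on every action, the ratios are all 1 and the targets reduce to ordinary n-step returns. When the compressed policy chooses actions that the learner's policy considers very unlikely, the ratios shrink toward 0 and the targets stay close to the current value estimates.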


A DPDK-Based Acceleration Method for Experience Sampling of Distributed Reinforcement Learning

A computing cluster that interconnects multiple compute nodes is used to...

Dynamic Sparse Training for Deep Reinforcement Learning

Deep reinforcement learning has achieved significant success in many dec...

DNS: Determinantal Point Process Based Neural Network Sampler for Ensemble Reinforcement Learning

Application of ensemble of neural networks is becoming an imminent tool ...

Learning to Explore with Meta-Policy Gradient

The performance of off-policy learning, including deep Q-learning and de...

IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures

In this work we aim to solve a large collection of tasks using a single ...

Gossip-based Actor-Learner Architectures for Deep Reinforcement Learning

Multi-simulator training has contributed to the recent success of Deep R...

RL4health: Crowdsourcing Reinforcement Learning for Knee Replacement Pathway Optimization

Joint replacement is the most common inpatient surgical treatment in the...