Accelerated Policy Learning with Parallel Differentiable Simulation

04/14/2022
by Jie Xu, et al.

Deep reinforcement learning can generate complex control policies, but it requires large amounts of training data to work effectively. Recent work has attempted to address this issue by leveraging differentiable simulators. However, inherent problems such as local minima and exploding or vanishing numerical gradients prevent these methods from being generally applied to control tasks with complex contact-rich dynamics, such as humanoid locomotion in classical RL benchmarks. In this work, we present a high-performance differentiable simulator and a new policy learning algorithm, Short-Horizon Actor-Critic (SHAC), that can effectively leverage simulation gradients, even in the presence of non-smoothness. Our learning algorithm alleviates problems with local minima through a smooth critic function, avoids vanishing or exploding gradients through a truncated learning window, and allows many physical environments to be run in parallel. We evaluate our method on classical RL control tasks and show substantial improvements in sample efficiency and wall-clock time over state-of-the-art RL and differentiable simulation-based algorithms. In addition, we demonstrate the scalability of our method by applying it to the challenging high-dimensional problem of muscle-actuated locomotion with a large action space, achieving a more than 17x reduction in training time over the best-performing established RL algorithm.
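The three ingredients named in the abstract (a smooth critic, a truncated learning window, and many environments stepped together) can be illustrated with a toy sketch. This is not the authors' implementation. Everything here is an assumption made for illustration: the function name `shac_style_toy`, the 1D dynamics `x' = x + dt * a`, the one-parameter linear policy `a = theta * x`, the quadratic critic `V(x) = -c * x**2`, and all hyperparameters. The simulation gradient is computed with a hand-written forward-mode tangent rather than backpropagation, and the critic is fit by plain regression onto the bootstrapped short-horizon return, which is a simplification.

```python
import random


def shac_style_toy(num_envs=64, horizon=8, iters=200, windows_per_episode=4,
                   dt=0.1, gamma=0.99, lr_policy=0.05, lr_critic=0.05, seed=0):
    """Toy short-horizon actor-critic on a differentiable 1D system.

    Hypothetical setup: state x, action a = theta * x, dynamics
    x' = x + dt * a, reward -(x')**2, critic V(x) = -c * x**2.
    """
    rng = random.Random(seed)

    def sample_states():
        return [rng.uniform(-1.0, 1.0) for _ in range(num_envs)]

    theta = 0.0  # policy parameter
    c = 0.0      # critic parameter
    xs = sample_states()

    for it in range(iters):
        grad_theta = 0.0
        grad_c = 0.0
        next_xs = []
        # This loop stands in for a batch of environments stepped in parallel.
        for x0 in xs:
            # dx is d(x)/d(theta). Resetting it to zero at the start of each
            # window truncates the gradient to the last `horizon` steps.
            x, dx = x0, 0.0
            ret, dret = 0.0, 0.0
            disc = 1.0
            for _ in range(horizon):
                a = theta * x
                da = x + theta * dx          # product rule on theta * x
                x = x + dt * a               # differentiable dynamics step
                dx = dx + dt * da            # tangent of the dynamics step
                ret += disc * (-(x * x))
                dret += disc * (-2.0 * x * dx)
                disc *= gamma
            # The smooth critic summarizes everything beyond the window.
            ret += disc * (-c * x * x)
            dret += disc * (-2.0 * c * x * dx)
            grad_theta += dret / num_envs
            # Critic regression: move V(x0) toward the bootstrapped return,
            # treating the return as a fixed target.
            v0 = -c * x0 * x0
            grad_c += 2.0 * (v0 - ret) * (-(x0 * x0)) / num_envs
            next_xs.append(x)

        theta += lr_policy * grad_theta  # gradient ascent on the return
        c -= lr_critic * grad_c          # gradient descent on the critic loss

        # Windows within an episode continue from where the last one ended.
        if (it + 1) % windows_per_episode == 0:
            xs = sample_states()
        else:
            xs = next_xs

    return theta, c
```

Calling `theta, c = shac_style_toy()` returns a negative `theta`, meaning the learned policy pushes the state toward zero. The design point the toy tries to convey is that gradients flow through at most `horizon` simulation steps, while the critic term stands in for the return beyond that window.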


