Leveraging Reward Gradients For Reinforcement Learning in Differentiable Physics Simulations

03/06/2022
by Sean Gillen, et al.

In recent years, fully differentiable rigid body physics simulators have been developed that can simulate a wide range of robotic systems. In the context of reinforcement learning for control, these simulators in principle let algorithms use analytic gradients of the reward function directly. To date, however, these gradients have proved extremely challenging to use, and methods built on them have been outclassed by algorithms that use no gradient information at all. In this work we present a novel algorithm, cross entropy analytic policy gradients, that leverages these gradients to outperform state-of-the-art deep reinforcement learning on a set of challenging nonlinear control problems.
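
To make "analytic gradients of the reward function" concrete, here is a minimal Python sketch. It is not the paper's cross entropy analytic policy gradients algorithm, and it does not use any real simulator. It assumes a toy 1-D point mass, a linear feedback policy, and a quadratic reward, all hypothetical choices made for illustration. Hand-derived sensitivities stand in for the automatic differentiation that a differentiable simulator would provide, and a finite-difference estimate is included only as a cross-check.

```python
def rollout(theta, x0=1.0, v0=0.0, dt=0.05, steps=100):
    """Simulate a toy point mass; return (total reward, d(total reward)/d(theta)).

    Assumed setup (hypothetical, for illustration only):
      policy   u = -k1 * x - k2 * v, with theta = (k1, k2)
      dynamics semi-implicit Euler: v' = v + dt * u, x' = x + dt * v'
      reward   r = -(x**2 + 0.1 * u**2), summed over the rollout
    """
    k1, k2 = theta
    x, v = x0, v0
    dx = [0.0, 0.0]  # sensitivities dx/dk1, dx/dk2
    dv = [0.0, 0.0]  # sensitivities dv/dk1, dv/dk2
    total, grad = 0.0, [0.0, 0.0]
    for _ in range(steps):
        u = -k1 * x - k2 * v
        # Chain rule: direct dependence on theta plus dependence through the state.
        du = [-k1 * dx[0] - k2 * dv[0] - x,
              -k1 * dx[1] - k2 * dv[1] - v]
        total += -(x * x + 0.1 * u * u)
        for i in range(2):
            grad[i] += -(2.0 * x * dx[i] + 0.2 * u * du[i])
        # Step the dynamics and propagate the sensitivities with the same update rule.
        v = v + dt * u
        dv = [dv[i] + dt * du[i] for i in range(2)]
        x = x + dt * v
        dx = [dx[i] + dt * dv[i] for i in range(2)]
    return total, grad


def finite_difference(theta, eps=1e-5):
    """Central-difference estimate of the same gradient, used only as a check."""
    estimate = []
    for i in range(len(theta)):
        hi, lo = list(theta), list(theta)
        hi[i] += eps
        lo[i] -= eps
        estimate.append((rollout(hi)[0] - rollout(lo)[0]) / (2.0 * eps))
    return estimate


def ascent_step(theta, lr=1e-3):
    """One plain gradient-ascent update of the policy parameters."""
    _, grad = rollout(theta)
    return tuple(t + lr * g for t, g in zip(theta, grad))
```

Calling `rollout((1.0, 0.5))` returns the total reward together with its exact gradient from a single pass, whereas `finite_difference` needs two extra rollouts per parameter. That cost difference is the basic appeal of differentiable simulation. The sketch does not capture the difficulties the abstract refers to; it only shows where the gradient comes from.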

research
08/05/2021

Deep Reinforcement Learning for Continuous Docking Control of Autonomous Underwater Vehicles: A Benchmarking Study

Docking control of an autonomous underwater vehicle (AUV) is a task that...
research
12/21/2020

Difference Rewards Policy Gradients

Policy gradient methods have become one of the most popular classes of a...
research
10/24/2022

Benchmarking Deformable Object Manipulation with Differentiable Physics

Deformable Object Manipulation (DOM) is of significant importance to bot...
research
04/14/2022

Accelerated Policy Learning with Parallel Differentiable Simulation

Deep reinforcement learning can generate complex control policies, but r...
research
05/26/2019

Interactive Differentiable Simulation

Intelligent agents need a physical understanding of the world to predict...
research
09/20/2019

How Much Do Unstated Problem Constraints Limit Deep Robotic Reinforcement Learning?

Deep Reinforcement Learning is a promising paradigm for robotic control ...
research
03/02/2023

Co-learning Planning and Control Policies Using Differentiable Formal Task Constraints

This paper presents a hierarchical reinforcement learning algorithm cons...