RMP2: A Structured Composable Policy Class for Robot Learning

03/10/2021
by Anqi Li, et al.

We consider the problem of learning motion policies for acceleration-based robotics systems with a structured policy class specified by RMPflow. RMPflow is a multi-task control framework that has been successfully applied in many robotics problems. Using RMPflow as a structured policy class in learning has several benefits, such as sufficient expressiveness, the flexibility to inject different levels of prior knowledge, and the ability to transfer policies between robots. However, implementing a system for end-to-end learning of RMPflow policies faces several computational challenges. In this work, we re-examine the message passing algorithm of RMPflow and propose a more efficient alternative algorithm, called RMP2, that uses modern automatic differentiation tools (such as TensorFlow and PyTorch) to compute RMPflow policies. Our new design retains the strengths of RMPflow while bringing in advantages from automatic differentiation, including 1) easy programming interfaces for designing complex transformations; 2) support for general directed acyclic graph (DAG) transformation structures; 3) end-to-end differentiability for policy learning; and 4) improved computational efficiency. Because of these features, RMP2 can be treated as a structured policy class for efficient robot learning that is suitable for encoding domain knowledge. Our experiments show that using the structured policy class given by RMP2 can improve policy performance and safety in reinforcement learning tasks for goal reaching in cluttered space.
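To make the idea of computing an RMPflow-style policy with automatic differentiation concrete, here is a minimal sketch. It is not the RMP2 algorithm or its API: it uses JAX rather than TensorFlow or PyTorch, and the task map (a planar two-link arm), the attractor gains, and all function names are assumptions chosen for illustration. It applies the standard RMPflow pullback, f_root = Jᵀ(f − M J̇q̇) and M_root = JᵀMJ, followed by the resolve step q̈ = M_root⁺ f_root, with the Jacobian and the curvature term J̇q̇ obtained by automatic differentiation instead of hand-derived expressions.

```python
import jax
import jax.numpy as jnp


def task_map(q):
    """Hypothetical task map: end-effector position of a planar 2-link arm
    with unit link lengths."""
    return jnp.array([
        jnp.cos(q[0]) + jnp.cos(q[0] + q[1]),
        jnp.sin(q[0]) + jnp.sin(q[0] + q[1]),
    ])


def attractor_rmp(x, xd, goal):
    """Hypothetical leaf RMP in natural form (f, M): a PD goal attractor
    with an identity metric. The gains 4.0 and 2.0 are illustrative."""
    M = jnp.eye(2)
    a_des = -4.0 * (x - goal) - 2.0 * xd
    return M @ a_des, M


def curvature(q, qd):
    """Curvature term Jdot @ qd, computed as the directional derivative of
    q -> J(q) @ qd along qd (a nested Jacobian-vector product)."""
    velocity = lambda q_: jax.jvp(task_map, (q_,), (qd,))[1]
    return jax.jvp(velocity, (q,), (qd,))[1]


def pullback(q, qd, goal):
    """Pull the leaf RMP back to the configuration space (the root)."""
    # Pushforward: task-space position and velocity from one JVP.
    x, xd = jax.jvp(task_map, (q,), (qd,))
    J = jax.jacfwd(task_map)(q)
    f, M = attractor_rmp(x, xd, goal)
    f_root = J.T @ (f - M @ curvature(q, qd))
    M_root = J.T @ M @ J
    return f_root, M_root


def rmp_policy(q, qd, goal):
    """Resolve: joint acceleration from the root RMP via a pseudoinverse."""
    f_root, M_root = pullback(q, qd, goal)
    return jnp.linalg.pinv(M_root) @ f_root
```

With a single leaf and a non-singular arm configuration, the joint acceleration returned by `rmp_policy` reproduces the attractor's desired task-space acceleration, that is, J q̈ + J̇q̇ equals the PD command. With several leaves, the pulled-back forces and metrics would be summed at the root before the resolve step.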


research
08/29/2020

How does the structure embedded in learning policy affect learning quadruped locomotion?

Reinforcement learning (RL) is a popular data-driven method that has dem...
research
01/24/2019

Decoupling feature extraction from policy learning: assessing benefits of state representation learning in goal based robotics

Scaling end-to-end reinforcement learning to control real robots from vi...
research
11/16/2018

RMPflow: A Computational Graph for Automatic Motion Policy Generation

We develop a novel policy synthesis algorithm, RMPflow, based on geometr...
research
07/02/2013

Multi-Task Policy Search

Learning policies that generalize across multiple tasks is an important ...
research
06/26/2019

Regularized Hierarchical Policies for Compositional Transfer in Robotics

The successful application of flexible, general learning algorithms -- s...
research
07/08/2019

Graph Policy Gradients for Large Scale Robot Control

In this paper, we consider the problem of learning policies to control a...
