Self-Imitation Learning by Planning

03/25/2021
by Sha Luo, et al.

Imitation learning (IL) enables robots to acquire skills quickly by transferring expert knowledge, and it is widely used in reinforcement learning (RL) to initialize exploration. However, in long-horizon motion planning tasks, a key challenge in deploying IL and RL methods is generating and collecting massive, broadly distributed data so that these methods generalize effectively. In this work, we address this problem with our proposed approach, self-imitation learning by planning (SILP), in which demonstration data are collected automatically by planning on the states visited by the current policy. SILP is inspired by the observation that successfully visited states in the early reinforcement learning stage are collision-free nodes for a graph-search-based motion planner, so we can plan over the robot's own trials and relabel them as demonstrations for policy learning. Because the demonstrations are self-generated, the human operator is relieved of the laborious data preparation that IL and RL methods otherwise require for complex motion planning tasks. The evaluation results show that SILP achieves higher success rates and better sample efficiency than the selected baselines, and that the policy learned in simulation performs well in a real-world placement task with changing goals and obstacles.
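
The idea of planning over visited states and relabeling the result as a demonstration can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Euclidean `max_step` connection rule, the use of Dijkstra's algorithm as the graph search, and the choice of displacement as the action are all assumptions made for illustration. In particular, the sketch treats any two visited states within `max_step` of each other as connectable without an edge collision check, which a real planner would need.

```python
import heapq
import math


def plan_over_visited(states, start, goal, max_step):
    """Dijkstra over visited states; returns the index path start -> goal, or []."""
    dist = {start: 0.0}
    parent = {}
    queue = [(0.0, start)]
    while queue:
        d, i = heapq.heappop(queue)
        if i == goal:
            break
        if d > dist[i]:
            continue  # stale queue entry
        for j in range(len(states)):
            step = math.dist(states[i], states[j])
            # Assumption: nearby visited states are connectable (no edge collision check).
            if j != i and step <= max_step and d + step < dist.get(j, math.inf):
                dist[j] = d + step
                parent[j] = i
                heapq.heappush(queue, (d + step, j))
    if goal not in dist:
        return []
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1]


def relabel_as_demonstration(states, path):
    """Turn an index path into (state, action, next_state) tuples.

    Assumption: the action is the displacement between consecutive states.
    """
    demo = []
    for i, j in zip(path, path[1:]):
        action = tuple(b - a for a, b in zip(states[i], states[j]))
        demo.append((states[i], action, states[j]))
    return demo


# A wandering 2-D trial: the policy detours before reaching the last state.
trial = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0), (2.0, 0.0)]
path = plan_over_visited(trial, start=0, goal=len(trial) - 1, max_step=1.0)
demo = relabel_as_demonstration(trial, path)
```

In this toy trial the planner skips the detour through the second and third states, so the relabeled demonstration is shorter than the trajectory the policy actually followed. Such tuples could then be stored in a demonstration buffer for policy learning; how they are weighted against ordinary RL experience is left open here.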


