Reinforcement Learning in Robotic Motion Planning by Combined Experience-based Planning and Self-Imitation Learning

06/11/2023
by Sha Luo, et al.

High-quality, representative data is essential for both Imitation Learning (IL)- and Reinforcement Learning (RL)-based motion planning. On real robots, it is difficult to collect enough suitable data, whether as demonstrations for IL or as experience for RL, because of safety considerations in environments with obstacles. We address this challenge with the self-imitation learning by planning plus (SILP+) algorithm, which efficiently embeds experience-based planning into the learning architecture to mitigate the data-collection problem. The planner generates demonstrations from states successfully visited by the current RL policy, and the policy improves by learning from these demonstrations. This removes the need for human expert operators to collect the demonstrations required by IL and also improves RL performance. Experimental results show that SILP+ achieves better training efficiency and a higher, more stable success rate in complex motion planning tasks than several other methods. Extensive tests on physical robots illustrate the effectiveness of SILP+ in a physical setting.
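To make the idea of planning over visited states concrete, the following is a minimal toy sketch and not the SILP+ implementation. It assumes 2D point states, uses Dijkstra's algorithm over a radius graph as a stand-in for the experience-based planner, and treats the displacement between consecutive states as the action. The names `plan_over_visited`, `to_demonstration`, `edge_is_free`, and `radius` are hypothetical, and the learning update that would consume the demonstrations is omitted.

```python
import heapq
import math


def plan_over_visited(visited, start, goal, edge_is_free, radius=1.5):
    """Dijkstra over a graph whose nodes are states the policy already visited.

    Two states are connected when they lie within `radius` of each other and
    `edge_is_free` (a hypothetical collision check) accepts the segment.
    Returns the list of states from start to goal, or None if no path exists.
    """
    nodes = list(dict.fromkeys([start, *visited, goal]))  # de-duplicate, keep order
    dist = {start: 0.0}
    parent = {}
    queue = [(0.0, start)]
    while queue:
        d, u = heapq.heappop(queue)
        if u == goal:
            break
        if d > dist.get(u, math.inf):
            continue  # stale queue entry
        for v in nodes:
            if v == u:
                continue
            step = math.dist(u, v)
            if step <= radius and edge_is_free(u, v):
                nd = d + step
                if nd < dist.get(v, math.inf):
                    dist[v] = nd
                    parent[v] = u
                    heapq.heappush(queue, (nd, v))
    if goal not in dist:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1]


def to_demonstration(path):
    """Turn a state path into (state, action, next_state) tuples.

    Assumption: the action is the displacement between consecutive states.
    """
    return [
        (s, tuple(b - a for a, b in zip(s, s_next)), s_next)
        for s, s_next in zip(path, path[1:])
    ]


# States visited during a wandering rollout (made-up toy data).
visited = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (2, 2), (3, 2), (3, 3)]
path = plan_over_visited(visited, (0, 0), (3, 3), lambda a, b: True)
demo_buffer = to_demonstration(path)

# Same states, but any edge touching (1, 1) is treated as blocked.
blocked = plan_over_visited(
    visited, (0, 0), (3, 3), lambda a, b: (1, 1) not in (a, b)
)
```

In this sketch the planner shortcuts the wandering rollout into a shorter path through the same states, and the resulting transitions would be stored in a demonstration buffer for the policy to imitate.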


research
03/25/2021

Self-Imitation Learning by Planning

Imitation learning (IL) enables robots to acquire skills quickly by tran...
research
04/07/2022

3D Perception based Imitation Learning under Limited Demonstration for Laparoscope Control in Robotic Surgery

Automatic laparoscope motion control is fundamentally important for surg...
research
05/31/2017

The Atari Grand Challenge Dataset

Recent progress in Reinforcement Learning (RL), fueled by its combinatio...
research
05/26/2021

What data do we need for training an AV motion planner?

We investigate what grade of sensor data is required for training an imi...
research
01/23/2023

Learning to View: Decision Transformers for Active Object Detection

Active perception describes a broad class of techniques that couple plan...
research
04/03/2018

Learning to Search via Self-Imitation

We study the problem of learning a good search policy. To do so, we prop...
research
07/10/2019

RL-RRT: Kinodynamic Motion Planning via Learning Reachability Estimators from RL Policies

This paper addresses two challenges facing sampling-based kinodynamic mo...
