Disturbance-Injected Robust Imitation Learning with Task Achievement

05/09/2022
by Hirotaka Tahara, et al.

Robust imitation learning using disturbance injections overcomes the issue of limited variation in demonstrations. However, these methods assume that demonstrations are optimal and that policy stabilization can be learned via simple augmentations. In real-world scenarios, demonstrations are often of diverse quality, and disturbance injection then learns sub-optimal policies that fail to replicate the desired behavior. To address this issue, this paper proposes a novel imitation learning framework that combines policy robustification with optimal demonstration learning. Specifically, this combined approach forces policy learning and disturbance injection optimization to focus mainly on learning from high task achievement demonstrations, while utilizing low achievement ones to decrease the number of samples needed. The effectiveness of the proposed method is verified through experiments on an excavation task, both in simulation and on a real robot, resulting in high-achieving policies that are more stable and robust to diverse-quality demonstrations. In addition, the method weights sub-optimal demonstrations rather than eliminating them, so all collected data is used, which yields practical data-efficiency benefits.
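The idea of weighting demonstrations by task achievement, rather than discarding the poor ones, can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a linear policy fitted by weighted least squares and Gaussian action noise as the disturbance, and the names `weighted_bc_fit`, `inject_disturbance`, `achievement`, and `sigma` are hypothetical.

```python
import numpy as np


def weighted_bc_fit(states, actions, achievement, ridge=1e-6):
    """Fit a linear policy a = s @ W by achievement-weighted least squares.

    Assumption: `achievement` holds one non-negative score per sample;
    higher scores pull the fit toward that sample's action.
    """
    S = np.asarray(states, dtype=float)       # shape (N, state_dim)
    A = np.asarray(actions, dtype=float)      # shape (N, action_dim)
    w = np.asarray(achievement, dtype=float)  # shape (N,)
    w = w / w.sum()                           # normalize weights to sum to 1
    Sw = S * w[:, None]                       # row-wise weighting of states
    gram = S.T @ Sw + ridge * np.eye(S.shape[1])
    return np.linalg.solve(gram, Sw.T @ A)    # shape (state_dim, action_dim)


def inject_disturbance(actions, sigma, rng):
    """Add zero-mean Gaussian noise to the actions that are executed.

    Assumption: the clean action is still the one stored as the label,
    so the data also covers states reached after a perturbation.
    """
    actions = np.asarray(actions, dtype=float)
    return actions + rng.normal(0.0, sigma, size=actions.shape)


# Toy data: the same states demonstrated twice, once well and once poorly.
s = np.linspace(-1.0, 1.0, 50).reshape(-1, 1)
states = np.vstack([s, s])
actions = np.vstack([2.0 * s, -1.0 * s])  # good demo: a = 2s, poor demo: a = -s
achievement = np.concatenate([np.full(50, 0.9), np.full(50, 0.1)])

W_weighted = weighted_bc_fit(states, actions, achievement)
W_uniform = weighted_bc_fit(states, actions, np.ones(100))
```

In this toy setting the weighted fit lands near 0.9 * 2 + 0.1 * (-1) = 1.7, close to the high-achievement behavior, while the uniform fit averages the two demonstrations to about 0.5. The low-achievement samples still contribute to the estimate instead of being dropped.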


