Bayesian Disturbance Injection: Robust Imitation Learning of Flexible Policies

03/25/2021
by Hanbit Oh, et al.

Scenarios in which humans must choose among multiple seemingly optimal actions are commonplace; however, standard imitation learning often fails to capture this behavior. Instead, an over-reliance on replicating expert actions induces inflexible and unstable policies, leading to poor generalizability when deployed. To address this problem, this paper presents the first imitation learning framework that incorporates Bayesian variational inference for learning flexible non-parametric multi-action policies, while simultaneously robustifying the policies against sources of error by introducing and optimizing disturbances to create a richer demonstration dataset. This combined approach forces the policy to adapt to challenging situations, enabling stable multi-action policies to be learned efficiently. The effectiveness of the proposed method is evaluated through simulations and real-robot experiments on a table-sweep task using a 6-DOF UR3 robotic arm. Results show that, through improved flexibility and robustness, both learning performance and control safety are better than those of comparison methods.
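
The abstract combines two ideas: (i) disturbance injection, where noise is added to the expert's actions during demonstration collection and its scale is optimized so the dataset covers the error-prone states a learner is likely to visit, and (ii) a flexible non-parametric multi-action policy fit by Bayesian variational inference. The sketch below illustrates only the disturbance-injection loop; it is a minimal sketch in the spirit of the described approach, not the authors' implementation. The names env, expert, and fit_policy are hypothetical placeholders, the env.step interface is assumed, and the variational multi-action policy is hidden behind fit_policy.

```python
import numpy as np

def collect_demo(env, expert, noise_cov, horizon=100):
    """Roll out the expert while injecting zero-mean Gaussian noise into its actions."""
    states, actions = [], []
    s = env.reset()                                           # hypothetical env interface
    for _ in range(horizon):
        a_expert = expert(s)                                  # expert's intended action
        noise = np.random.multivariate_normal(np.zeros(len(a_expert)), noise_cov)
        states.append(s)
        actions.append(a_expert)                              # supervise on the clean action...
        s, done = env.step(a_expert + noise)                  # ...but execute the disturbed one
        if done:
            break
    return np.asarray(states), np.asarray(actions)

def fit_noise_cov(policy, states, expert_actions, reg=1e-6):
    """Estimate the disturbance covariance from the learner's deviation from the
    expert, so the injected noise mimics the errors the learner tends to make."""
    errors = np.asarray([policy(s) for s in states]) - expert_actions
    return np.cov(errors, rowvar=False) + reg * np.eye(errors.shape[1])

def train(env, expert, fit_policy, act_dim, rounds=5):
    """Alternate between noisy demonstration collection, policy fitting, and
    re-estimation of the disturbance scale."""
    noise_cov = 1e-3 * np.eye(act_dim)                        # start with small disturbances
    all_s, all_a = [], []
    for _ in range(rounds):
        s, a = collect_demo(env, expert, noise_cov)
        all_s.append(s)
        all_a.append(a)
        policy = fit_policy(np.concatenate(all_s), np.concatenate(all_a))
        noise_cov = fit_noise_cov(policy, s, a)               # adapt noise to the learner's errors
    return policy
```

Supervising on the clean expert action while executing the disturbed one is what exposes recovery behavior from off-distribution states; in the paper, the policy class itself is additionally made multi-modal so that several equally good expert choices are not averaged into an unstable compromise.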

