Robust Imitation of a Few Demonstrations with a Backwards Model

10/17/2022
by Jung Yeon Park, et al.

Behavior cloning of expert demonstrations can learn optimal policies more sample-efficiently than reinforcement learning. However, the cloned policy extrapolates poorly to states outside the demonstration data, which leads to covariate shift (the agent drifting away from the demonstrations) and compounding errors. In this work, we tackle this issue by extending the region of attraction around the demonstrations so that the agent can learn how to get back onto the demonstrated trajectories if it veers off course. We train a generative backwards dynamics model and generate short imagined trajectories from states in the demonstrations. By imitating both the demonstrations and these model rollouts, the agent learns the demonstrated paths and how to get back onto them. With optimal or near-optimal demonstrations, the learned policy will be both optimal and robust to deviations, with a wider region of attraction. On continuous control domains, we evaluate robustness when starting from initial states unseen in the demonstration data. While both our method and other imitation learning baselines can solve the tasks for initial states in the training distribution, our method is considerably more robust to different initial states.
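The data-augmentation idea in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation: the abstract does not specify the model architecture, rollout length, or training procedure, so everything below is an assumption made for illustration. In particular, `forward_step` is a made-up 2D point dynamics, `sample_backward` is a hand-written stand-in for the learned generative backwards model, and `horizon` and `per_state` are hypothetical parameters.

```python
import random


def forward_step(state, action):
    """Toy dynamics (an assumption): the action is a 2D displacement."""
    return (state[0] + action[0], state[1] + action[1])


def sample_backward(next_state, rng, max_step=0.2):
    """Hypothetical stand-in for a learned generative backwards model.

    Samples an action and returns the predecessor state that reaches
    `next_state` under `forward_step`, together with that action.
    """
    action = (rng.uniform(-max_step, max_step), rng.uniform(-max_step, max_step))
    prev_state = (next_state[0] - action[0], next_state[1] - action[1])
    return prev_state, action


def backward_rollouts(demo_states, horizon, per_state, rng):
    """Generate short imagined trajectories that lead into demonstration states."""
    pairs = []
    for target in demo_states:
        for _ in range(per_state):
            state = target
            for _ in range(horizon):
                prev_state, action = sample_backward(state, rng)
                # Taking `action` in `prev_state` moves one step back toward the demo.
                pairs.append((prev_state, action))
                state = prev_state
    return pairs


rng = random.Random(0)
# One toy demonstration: move along the x-axis in steps of 0.1.
demo = [((0.1 * i, 0.0), (0.1, 0.0)) for i in range(10)]
demo_states = [state for state, _ in demo]

imagined = backward_rollouts(demo_states, horizon=3, per_state=5, rng=rng)
dataset = demo + imagined  # behavior cloning would then fit a policy on both
```

Each imagined pair labels an off-demonstration state with an action that moves one step closer to a demonstrated state, which is the sense in which imitating these rollouts widens the region of attraction. Fitting the policy itself is omitted here; any supervised learner over `dataset` would fill that role in this sketch.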


