Concurrent Training Improves the Performance of Behavioral Cloning from Observation

08/03/2020
by Zachary W. Robertson, et al.

Learning from demonstration is widely used as an efficient way for robots to acquire new skills. However, it typically requires that demonstrations provide full access to the state and action sequences. In contrast, learning from observation offers a way to utilize unlabeled demonstrations (e.g., video) to perform imitation learning. One such approach is behavioral cloning from observation (BCO). The original implementation of BCO proceeds by first learning an inverse dynamics model and then using that model to estimate action labels, thereby reducing the problem to behavioral cloning. However, existing approaches to BCO require a large number of initial interactions in the first step. Here, we provide a novel theoretical analysis of BCO, introduce a modification, BCO*, and show that in the semi-supervised setting, BCO* can concurrently improve its estimates of both the inverse dynamics model and the expert policy. This result allows us to eliminate the dependence on initial interactions and dramatically improve the sample complexity of BCO. We evaluate the effectiveness of our algorithm through experiments on various benchmark domains. The results demonstrate that concurrent training not only improves on the performance of BCO but also results in performance that is competitive with state-of-the-art imitation learning methods such as GAIL and Value-Dice.
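To make the two-step structure of the original BCO pipeline concrete, here is a minimal sketch in Python. It illustrates only what the abstract describes (fit an inverse dynamics model from the agent's own interactions, use it to label state-only expert transitions, then run behavioral cloning); it is not the authors' implementation and does not implement BCO*. The toy dynamics `s' = s + a`, the linear expert `a = -0.5 s`, the linear models, and every name in the code are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def step(s, a):
    # Hypothetical toy dynamics (an assumption, not from the paper): s' = s + a
    return s + a

# Step 1: interact with random actions and fit a linear inverse dynamics
# model a ~ w . [s, s'] by least squares.
s = rng.normal(size=200)
a = rng.normal(size=200)
s_next = step(s, a)
w_idm, *_ = np.linalg.lstsq(np.column_stack([s, s_next]), a, rcond=None)

# Step 2: state-only expert demonstrations. The expert acts with a = -0.5 * s,
# but only (s, s') pairs are observed; the learned model infers the actions.
expert_s = rng.normal(size=100)
expert_s_next = step(expert_s, -0.5 * expert_s)
inferred_a = np.column_stack([expert_s, expert_s_next]) @ w_idm

# Step 3: behavioral cloning on the inferred labels with a linear policy a = k * s.
k, *_ = np.linalg.lstsq(expert_s[:, None], inferred_a, rcond=None)
policy_gain = float(k[0])
print("recovered policy gain:", policy_gain)
```

In this noise-free linear setting the inverse dynamics model is exact, so the cloned policy gain matches the expert's. The random-interaction data in step 1 plays the role of the initial interactions whose cost the abstract says BCO* is designed to remove.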

