Data Quality in Imitation Learning

06/04/2023
by Suneel Belkhale, et al.

In supervised learning, the question of data quality and curation has been overshadowed in recent years by increasingly powerful and expressive models that can ingest internet-scale data. In offline learning for robotics, however, we simply lack internet-scale data, so high-quality datasets are a necessity. This is especially true in imitation learning (IL), a sample-efficient paradigm for robot learning using expert demonstrations. Policies learned through IL suffer from state distribution shift at test time due to compounding errors in action prediction, which leads to unseen states from which the policy cannot recover. Instead of designing new algorithms to address distribution shift, an alternative perspective is to develop new ways of assessing and curating datasets. There is growing evidence that the same IL algorithm can perform substantially differently across datasets. This calls for a formalism for defining metrics of "data quality" that can then be leveraged for data curation. In this work, we take the first step toward formalizing data quality for imitation learning through the lens of distribution shift: a high-quality dataset encourages the policy to stay in distribution at test time. We propose two fundamental properties that shape the quality of a dataset: i) action divergence: the mismatch between the expert and learned policy at certain states; and ii) transition diversity: the noise present in the system for a given state and action. We investigate the combined effect of these two properties in imitation learning theoretically, and we empirically analyze models trained on a variety of data sources. We show that state diversity is not always beneficial, and we demonstrate how action divergence and transition diversity interact in practice.

