Feedback in Imitation Learning: The Three Regimes of Covariate Shift

02/04/2021
by Jonathan Spencer, et al.

Imitation learning practitioners have often noted that conditioning policies on previous actions leads to a dramatic divergence between held-out error and the learner's performance in situ. Interactive approaches can provably address this divergence, but they require repeated querying of a demonstrator. Recent work identifies this divergence as stemming from a "causal confound" in predicting the current action, and seeks to ablate causally confounded aspects of the current state using tools from causal inference. In this work, we argue instead that this divergence is simply another manifestation of covariate shift, exacerbated in particular by feedback between decisions and input features. The learner often comes to rely on features that are strongly predictive of decisions but are also subject to strong covariate shift. Our work demonstrates a broad class of problems where this shift can be mitigated, both theoretically and practically, by taking advantage of a simulator without any further querying of the expert demonstrator. We analyze existing benchmarks used to test imitation learning approaches and find that they are realizable and simple, and thus insufficient for capturing the harder regimes of error compounding seen in real-world decision-making problems. In surprising contrast with previous literature, but consistent with our theory, we find that naive behavioral cloning provides excellent results on these benchmarks. We detail the need for new standardized benchmarks that capture the phenomena seen in robotics problems.
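
To make the failure mode described above concrete, here is a minimal toy sketch. It is our own illustration, not code or an experiment from the paper: the two-state environment, switch probability, observation noise, and the two hand-coded policies are all assumptions. It contrasts a policy that copies the previous action (a feature that is strongly predictive under expert demonstrations but subject to feedback-induced covariate shift) with a policy that uses a noisy but causally relevant observation.

```python
import numpy as np

rng = np.random.default_rng(0)
HORIZON, SWITCH_P, OBS_NOISE = 200, 0.05, 0.15  # illustrative settings, not the paper's benchmarks

def step(x):
    """Advance the latent state (it occasionally flips) and emit a noisy observation of it."""
    if rng.random() < SWITCH_P:
        x = 1 - x
    obs = x if rng.random() > OBS_NOISE else 1 - x
    return x, obs

def expert_data(n_episodes=200):
    """(observation, previous expert action, expert action) triples; the expert plays a = x."""
    data = []
    for _ in range(n_episodes):
        x = int(rng.integers(2)); prev_a = x
        for _ in range(HORIZON):
            x, obs = step(x)
            data.append((obs, prev_a, x))
            prev_a = x
        # under the expert, the previous action always equals the previous latent state
    return data

def heldout_acc(policy, data):
    """Standard supervised ("held-out") accuracy on expert state-action pairs."""
    return float(np.mean([policy(obs, prev) == a for obs, prev, a in data]))

def onpolicy_acc(policy, n_episodes=200):
    """Per-step agreement with the expert when the policy's own actions feed back as inputs."""
    scores = []
    for _ in range(n_episodes):
        x = int(rng.integers(2)); prev_a = x; correct = 0
        for _ in range(HORIZON):
            x, obs = step(x)
            a = policy(obs, prev_a)
            correct += (a == x)
            prev_a = a  # feedback: the learner now conditions on its *own* previous action
        scores.append(correct / HORIZON)
    return float(np.mean(scores))

copy_prev = lambda obs, prev_a: prev_a  # leans on the strongly predictive, shift-prone feature
use_obs   = lambda obs, prev_a: obs     # leans on the noisy but causally relevant observation

data = expert_data()
for name, pi in [("copy_prev", copy_prev), ("use_obs", use_obs)]:
    print(f"{name:9s}  held-out: {heldout_acc(pi, data):.3f}  on-policy: {onpolicy_acc(pi):.3f}")
```

In this toy setting, copy_prev scores roughly 1 - SWITCH_P on held-out expert data, beating use_obs, but once it is rolled out its first mistake is never corrected and its agreement with the expert decays toward chance, while use_obs is essentially unaffected by the distribution shift.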


Related research

02/06/2023
DITTO: Offline Imitation Learning with World Models
We propose DITTO, an offline imitation learning algorithm which uses wor...

06/23/2014
Reinforcement and Imitation Learning via Interactive No-Regret Learning
Recent work has demonstrated that problems-- particularly imitation lear...

06/06/2021
Mitigating Covariate Shift in Imitation Learning via Offline Data Without Great Coverage
This paper studies offline Imitation Learning (IL) where an agent learns...

06/04/2023
Data Quality in Imitation Learning
In supervised learning, the question of data quality and curation has be...

05/28/2019
Causal Confusion in Imitation Learning
Behavioral cloning reduces policy learning to supervised learning by tra...

11/13/2020
Grasping with Chopsticks: Combating Covariate Shift in Model-free Imitation Learning for Fine Manipulation
Billions of people use chopsticks, a simple yet versatile tool, for fine...

02/17/2021
Fully General Online Imitation Learning
In imitation learning, imitators and demonstrators are policies for pick...
