Causal Confusion in Imitation Learning

05/28/2019
by Pim de Haan, et al.

Behavioral cloning reduces policy learning to supervised learning by training a discriminative model to predict expert actions given observations. Such discriminative models are non-causal: the training procedure is unaware of the causal structure of the interaction between the expert and the environment. We point out that ignoring causality is particularly damaging because of the distributional shift in imitation learning. In particular, it leads to a counter-intuitive "causal confusion" phenomenon: access to more information can yield worse performance. We investigate how this problem arises, and propose a solution to combat it through targeted interventions (either environment interaction or expert queries) to determine the correct causal model. We show that causal confusion occurs in several benchmark control domains as well as realistic driving settings, and validate our solution against DAgger and other baselines and ablations.
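The claim that more information can hurt is easy to reproduce in a toy setting. The sketch below is an illustration only, not the paper's experiments or method: it assumes a one-dimensional linear problem in which the expert's action equals a hidden state, the learner sees a noisy observation of that state, and an extra "nuisance" input tracks the expert's action in the demonstrations but not once the distribution shifts. All names (`make_data`, `fit`, `mse`, the noise levels) are hypothetical choices made for this example.

```python
import random

def make_data(n, nuisance_tracks_expert, rng):
    """Return (observation, nuisance, expert_action) triples."""
    rows = []
    for _ in range(n):
        state = rng.gauss(0.0, 1.0)
        action = state                          # the expert acts on the true cause
        obs = state + rng.gauss(0.0, 0.5)       # noisy view of the true cause
        if nuisance_tracks_expert:
            # In demonstrations the nuisance input is a near-copy of the action.
            nuisance = action + rng.gauss(0.0, 0.1)
        else:
            # After the shift it carries no information about the action.
            nuisance = rng.gauss(0.0, 1.0)
        rows.append((obs, nuisance, action))
    return rows

def fit(rows, use_nuisance):
    """Least-squares weights (w_obs, w_nuisance), no intercept."""
    sxx = sum(o * o for o, _, _ in rows)
    sxy = sum(o * a for o, _, a in rows)
    if not use_nuisance:
        return (sxy / sxx, 0.0)
    szz = sum(z * z for _, z, _ in rows)
    sxz = sum(o * z for o, z, _ in rows)
    szy = sum(z * a for _, z, a in rows)
    det = sxx * szz - sxz * sxz                 # 2x2 normal equations, Cramer's rule
    return ((sxy * szz - szy * sxz) / det, (szy * sxx - sxy * sxz) / det)

def mse(rows, w):
    """Mean squared error of the linear policy w on the given rows."""
    return sum((a - w[0] * o - w[1] * z) ** 2 for o, z, a in rows) / len(rows)

rng = random.Random(0)
demos = make_data(5000, True, rng)      # expert demonstrations
shifted = make_data(5000, False, rng)   # stand-in for the shifted test distribution

w_causal = fit(demos, use_nuisance=False)   # sees only the noisy cause
w_full = fit(demos, use_nuisance=True)      # also sees the nuisance input

train_causal, train_full = mse(demos, w_causal), mse(demos, w_full)
shift_causal, shift_full = mse(shifted, w_causal), mse(shifted, w_full)

print("train error  causal-only:", train_causal, " full:", train_full)
print("shift error  causal-only:", shift_causal, " full:", shift_full)
```

On the demonstrations the policy with access to both inputs fits better, because it leans on the nuisance input; under the shifted distribution the ordering reverses. This only mimics distributional shift with a second sampled dataset; it does not simulate closed-loop rollouts or the interventions proposed in the abstract.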


research
07/29/2023

Initial State Interventions for Deconfounded Imitation Learning

Imitation learning suffers from causal confusion. This phenomenon occurs...
research
10/27/2021

Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning

Behavioral cloning has proven to be effective for learning sequential de...
research
12/07/2021

Causal Imitative Model for Autonomous Driving

Imitation learning is a powerful approach for learning autonomous drivin...
research
02/02/2022

Causal Imitation Learning under Temporally Correlated Noise

We develop algorithms for imitation learning from policy data that was c...
research
02/26/2023

Diffusion Model-Augmented Behavioral Cloning

Imitation learning addresses the challenge of learning by observing an e...
research
05/21/2018

Imitating Latent Policies from Observation

We describe a novel approach to imitation learning that infers latent po...
research
02/04/2021

Feedback in Imitation Learning: The Three Regimes of Covariate Shift

Imitation learning practitioners have often noted that conditioning poli...
