Causal Imitation Learning under Temporally Correlated Noise

02/02/2022
by   Gokul Swamy, et al.
0

We develop algorithms for imitation learning from policy data that was corrupted by temporally correlated noise in expert actions. When noise affects multiple timesteps of recorded data, it can manifest as spurious correlations between states and actions that a learner might latch on to, leading to poor policy performance. To break up these spurious correlations, we apply modern variants of the instrumental variable regression (IVR) technique of econometrics, enabling us to recover the underlying policy without requiring access to an interactive expert. In particular, we present two techniques, one of a generative-modeling flavor (DoubIL) that can utilize access to a simulator, and one of a game-theoretic flavor (ResiduIL) that can be run entirely offline. We find both of our algorithms compare favorably to behavioral cloning on simulated control tasks.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
08/08/2020

Non-Adversarial Imitation Learning and its Connections to Adversarial Methods

Many modern methods for imitation learning and inverse reinforcement lea...
research
08/03/2022

Sequence Model Imitation Learning with Unobserved Contexts

We consider imitation learning problems where the expert has access to a...
research
05/28/2019

Causal Confusion in Imitation Learning

Behavioral cloning reduces policy learning to supervised learning by tra...
research
04/25/2022

Imitation Learning from Observations under Transition Model Disparity

Learning to perform tasks by leveraging a dataset of expert observations...
research
02/04/2022

SMODICE: Versatile Offline Imitation Learning via State Occupancy Matching

We propose State Matching Offline DIstribution Correction Estimation (SM...
research
05/30/2022

Minimax Optimal Online Imitation Learning via Replay Estimation

Online imitation learning is the problem of how best to mimic expert dem...
research
07/29/2023

Initial State Interventions for Deconfounded Imitation Learning

Imitation learning suffers from causal confusion. This phenomenon occurs...

Please sign up or login with your details

Forgot password? Click here to reset