Sequential Causal Imitation Learning with Unobserved Confounders

08/12/2022
by Daniel Kumor, et al.

"Monkey see monkey do" is an age-old adage, referring to naïve imitation without a deep understanding of a system's underlying mechanics. Indeed, if a demonstrator has access to information unavailable to the imitator (monkey), such as a different set of sensors, then no matter how perfectly the imitator models its perceived environment (See), attempting to reproduce the demonstrator's behavior (Do) can lead to poor outcomes. Imitation learning in the presence of a mismatch between demonstrator and imitator has been studied in the literature under the rubric of causal imitation learning (Zhang et al., 2020), but existing solutions are limited to single-stage decision-making. This paper investigates the problem of causal imitation learning in sequential settings, where the imitator must make multiple decisions per episode. We develop a graphical criterion that is necessary and sufficient for determining the feasibility of causal imitation, providing conditions under which an imitator can match a demonstrator's performance despite differing capabilities. Finally, we provide an efficient algorithm for determining imitability and corroborate our theory with simulations.


research
06/01/2023

Causal Imitability Under Context-Specific Independence Relations

Drawbacks of ignoring the causal mechanisms when performing imitation le...
research
08/12/2022

Causal Imitation Learning with Unobserved Confounders

One of the common ways children learn is by mimicking adults. Imitation ...
research
11/04/2022

Deconfounded Imitation Learning

Standard imitation learning can fail when the expert demonstrators have ...
research
05/23/2019

Action Assembly: Sparse Imitation Learning for Text Based Games with Combinatorial Action Spaces

We propose a computationally efficient algorithm that combines compresse...
research
05/22/2018

Maximum Causal Tsallis Entropy Imitation Learning

In this paper, we propose a novel maximum causal Tsallis entropy (MCTE) ...
research
03/02/2020

Causal Transfer for Imitation Learning and Decision Making under Sensor-shift

Learning from demonstrations (LfD) is an efficient paradigm to train AI ...
research
04/27/2023

Learning Environment for the Air Domain (LEAD)

A substantial part of fighter pilot training is simulation-based and inv...
