Cross-domain Imitation from Observations

05/20/2021
by   Dripta S. Raychaudhuri, et al.

Imitation learning seeks to circumvent the difficulty of designing proper reward functions for training agents by utilizing expert behavior. With environments modeled as Markov Decision Processes (MDPs), most existing imitation algorithms are contingent on the availability of expert demonstrations in the same MDP as the one in which a new imitation policy is to be learned. In this paper, we study the problem of how to imitate tasks when there are discrepancies between the expert and agent MDPs. These discrepancies across domains could include differing dynamics, viewpoint, or morphology; we present a novel framework to learn correspondences across such domains. Importantly, in contrast to prior works, we learn this correspondence from unpaired and unaligned trajectories that contain only states in the expert domain. To do this, we impose a cycle-consistency constraint on both the state space and a domain-agnostic latent space. In addition, we enforce consistency on the temporal position of states via a normalized position estimator function, to align the trajectories across the two domains. Once this correspondence is found, we can directly transfer the demonstrations from one domain to the other and use them for imitation. Experiments across a wide variety of challenging domains demonstrate the efficacy of our approach.
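
The two constraints named in the abstract can be illustrated with a toy example. What follows is a minimal sketch, not the paper's implementation: it assumes hand-written maps between a 2-D agent state space and a scaled expert state space, and it omits the latent-space term. All names (`normalized_positions`, `cycle_loss`, `position_consistency_loss`, `to_expert`, `to_agent`, `expert_position`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def normalized_positions(traj_len):
    """Temporal index of each state in a trajectory, scaled to [0, 1]."""
    if traj_len == 1:
        return np.zeros(1)
    return np.arange(traj_len) / (traj_len - 1)

def cycle_loss(states, forward, backward):
    """Mean squared error between states and their round trip
    through the other domain (forward, then backward)."""
    reconstructed = backward(forward(states))
    return float(np.mean((states - reconstructed) ** 2))

def position_consistency_loss(states, forward, position_estimator):
    """Penalize mapped states whose estimated temporal position in the
    target domain differs from their position in the source trajectory."""
    target = normalized_positions(len(states))
    predicted = position_estimator(forward(states))
    return float(np.mean((predicted - target) ** 2))

# Toy setup (an assumption for illustration): the expert domain is the
# agent domain with every coordinate scaled by 2.
to_expert = lambda s: 2.0 * s          # agent state -> expert state
to_agent = lambda s: s / 2.0           # expert state -> agent state
expert_position = lambda s: s[:, 0] / 2.0  # stand-in position estimator

# Agent trajectory moving along the first axis from 0 to 1 in 5 steps.
agent_traj = np.stack([np.linspace(0.0, 1.0, 5), np.zeros(5)], axis=1)

good_cycle = cycle_loss(agent_traj, to_expert, to_agent)
bad_cycle = cycle_loss(agent_traj, to_expert, lambda s: s)  # wrong inverse
pos_loss = position_consistency_loss(agent_traj, to_expert, expert_position)
```

In this toy setup the correct inverse map drives the cycle loss to zero, while a mismatched inverse leaves a positive residual. In a learned system the maps and the position estimator would be trainable functions and these losses would be minimized jointly.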


research
10/07/2021

Cross-Domain Imitation Learning via Optimal Transport

Cross-domain imitation learning studies how to leverage expert demonstra...
research
09/30/2019

Cross Domain Imitation Learning

We study the question of how to imitate tasks across domains with discre...
research
01/31/2020

Domain-Adversarial and -Conditional State Space Model for Imitation Learning

State representation learning (SRL) in partially observable Markov decis...
research
09/24/2022

Learn what matters: cross-domain imitation learning with task-relevant embeddings

We study how an autonomous agent learns to perform a task from demonstra...
research
02/27/2020

State-only Imitation with Transition Dynamics Mismatch

Imitation Learning (IL) is a popular paradigm for training agents to ach...
research
10/02/2019

Learning Calibratable Policies using Programmatic Style-Consistency

We study the important and challenging problem of controllable generatio...
research
09/13/2021

Cross Domain Robot Imitation with Invariant Representation

Animals are able to imitate each others' behavior, despite their differe...
