Task-Relevant Adversarial Imitation Learning

10/02/2019
by Konrad Zolna, et al.

We show that a critical problem in adversarial imitation from high-dimensional sensory data is the tendency of discriminator networks to distinguish agent and expert behaviour using task-irrelevant features beyond the control of the agent. We analyze this problem in detail and propose a solution as well as several baselines that outperform standard Generative Adversarial Imitation Learning (GAIL). Our proposed solution, Task-Relevant Adversarial Imitation Learning (TRAIL), uses a constrained optimization objective to overcome task-irrelevant features. Comprehensive experiments show that TRAIL can solve challenging manipulation tasks from pixels by imitating human operators, where other agents such as behaviour cloning (BC), standard GAIL, improved GAIL variants including our newly proposed baselines, and Deterministic Policy Gradients from Demonstrations (DPGfD) fail to find solutions, even when the other agents have access to task reward.

research
06/22/2022

Latent Policies for Adversarial Imitation Learning

This paper considers learning robot locomotion and manipulation tasks fr...
research
06/19/2023

SeMAIL: Eliminating Distractors in Visual Imitation via Separated Models

Model-based imitation learning (MBIL) is a popular reinforcement learnin...
research
04/03/2023

Generative Adversarial Neuroevolution for Control Behaviour Imitation

There is a recent surge in interest for imitation learning, with large h...
research
03/08/2021

Domain-Robust Visual Imitation Learning with Mutual Information Constraints

Human beings are able to understand objectives and learn by simply obser...
research
09/02/2022

Co-Imitation: Learning Design and Behaviour by Imitation

The co-adaptation of robots has been a long-standing research endeavour ...
research
07/08/2021

Imitation by Predicting Observations

Imitation learning enables agents to reuse and adapt the hard-won expert...
research
03/02/2020

Causal Transfer for Imitation Learning and Decision Making under Sensor-shift

Learning from demonstrations (LfD) is an efficient paradigm to train AI ...
