Unlabeled Imperfect Demonstrations in Adversarial Imitation Learning

02/13/2023
by Yunke Wang, et al.

Adversarial imitation learning has become a widely used imitation learning framework. The discriminator is typically trained with expert demonstrations and policy trajectories as examples of two classes (positive and negative, respectively), and the policy is then expected to produce trajectories that are indistinguishable from the expert demonstrations. In the real world, however, collected expert demonstrations are likely to be imperfect: only an unknown fraction of them are optimal. Instead of treating imperfect expert demonstrations as absolutely positive or negative, we investigate unlabeled imperfect expert demonstrations as they are. We develop a positive-unlabeled adversarial imitation learning algorithm that dynamically samples expert demonstrations that match well the trajectories of the constantly optimized agent policy. The trajectories of an initial agent policy may be closer to the non-optimal expert demonstrations, but within the adversarial imitation learning framework the agent policy is optimized to fool the discriminator and thus to produce trajectories similar to the optimal expert demonstrations. Theoretical analysis shows that our method learns from imperfect demonstrations in a self-paced way. Experimental results on the MuJoCo and RoboSuite platforms demonstrate the effectiveness of our method from different aspects.
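As a rough illustration of the positive-unlabeled idea mentioned in the abstract, the sketch below computes a generic non-negative positive-unlabeled risk for a discriminator. It is not taken from the paper: the function names, the sigmoid surrogate loss, and the `prior` parameter (an assumed fraction of positives hidden in the unlabeled set) are all assumptions made for illustration, and the abstract does not say which trajectories play the positive role and which the unlabeled role.

```python
import numpy as np


def sigmoid_loss(z):
    """Surrogate loss l(z) = sigmoid(-z), computed stably via logaddexp."""
    return np.exp(-np.logaddexp(0.0, z))


def nn_pu_risk(scores_pos, scores_unl, prior):
    """Generic non-negative positive-unlabeled risk (illustrative only).

    scores_pos: discriminator outputs on samples treated as positive.
    scores_unl: discriminator outputs on samples treated as unlabeled.
    prior:      ASSUMED fraction of positives inside the unlabeled set.
    """
    scores_pos = np.asarray(scores_pos, dtype=float)
    scores_unl = np.asarray(scores_unl, dtype=float)

    # Risk of classifying the positive samples as positive.
    risk_pos = prior * sigmoid_loss(scores_pos).mean()

    # Negative-class risk estimated from unlabeled data, with the
    # contribution of the hidden positives subtracted out.
    risk_neg = (sigmoid_loss(-scores_unl).mean()
                - prior * sigmoid_loss(-scores_pos).mean())

    # Clip at zero so the estimated negative risk never goes below zero.
    return risk_pos + max(0.0, risk_neg)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pos = rng.normal(loc=2.0, size=64)   # hypothetical discriminator scores
    unl = rng.normal(loc=0.0, size=64)   # hypothetical discriminator scores
    print(nn_pu_risk(pos, unl, prior=0.3))
```

In an adversarial imitation learning loop, a quantity like this would stand in for the usual binary cross-entropy discriminator loss; how the unlabeled samples are reweighted or resampled over training is specific to the paper and is not reproduced here.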

research
02/02/2020

Combating False Negatives in Adversarial Imitation Learning

In adversarial imitation learning, a discriminator is trained to differe...
research
07/20/2023

On Combining Expert Demonstrations in Imitation Learning via Optimal Transport

Imitation learning (IL) seeks to teach agents specific tasks through exp...
research
03/01/2023

MEGA-DAgger: Imitation Learning with Multiple Imperfect Experts

Imitation learning has been widely applied to various autonomous systems...
research
01/29/2022

Robust Imitation Learning from Corrupted Demonstrations

We consider offline Imitation Learning from corrupted demonstrations whe...
research
09/23/2021

Semi-Supervised Imitation Learning with Mixed Qualities of Demonstrations for Autonomous Driving

In this paper, we consider the problem of autonomous driving using imita...
research
04/03/2018

Learning to Search via Self-Imitation

We study the problem of learning a good search policy. To do so, we prop...
research
06/07/2023

Divide and Repair: Using Options to Improve Performance of Imitation Learning Against Adversarial Demonstrations

We consider the problem of learning to perform a task from demonstration...