Combating False Negatives in Adversarial Imitation Learning

02/02/2020
by Konrad Zolna, et al.

In adversarial imitation learning, a discriminator is trained to differentiate agent episodes from expert demonstrations representing the desired behavior. However, as the trained policy learns to be more successful, the negative examples (the ones produced by the agent) become increasingly similar to expert ones. Even though some of the agent's trajectories accomplish the task successfully, the discriminator is still trained to output low values for them. We hypothesize that this inconsistent training signal can impede the discriminator's learning and consequently lead to worse overall performance of the agent. As the first contribution of this paper, we present experimental evidence for this hypothesis, showing that these 'false negatives' (i.e., successful agent episodes) significantly hinder adversarial imitation learning. We then propose a method to alleviate the impact of false negatives and test it on the BabyAI environment. This method consistently improves sample efficiency over the baselines by at least an order of magnitude.
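
To make the false-negative problem concrete, here is a minimal sketch of a discriminator loss in which expert episodes are labeled 1 and agent episodes are labeled 0. The abstract does not describe the proposed method, so the filtering step below is only an illustrative assumption (it presumes a per-episode success flag is available) and should not be read as the authors' approach. All names (`bce_discriminator_loss`, `drop_successful_negatives`) and all scores are hypothetical.

```python
import math


def bce_discriminator_loss(expert_scores, agent_scores):
    """Mean binary cross-entropy with experts labeled 1 and agent episodes labeled 0.

    Scores are discriminator outputs in (0, 1); higher means "looks like an expert".
    """
    eps = 1e-12  # guards against log(0)
    positive_terms = [-math.log(max(s, eps)) for s in expert_scores]
    negative_terms = [-math.log(max(1.0 - s, eps)) for s in agent_scores]
    terms = positive_terms + negative_terms
    return sum(terms) / len(terms)


def drop_successful_negatives(agent_scores, agent_success):
    """Hypothetical mitigation: exclude successful agent episodes from the negatives."""
    return [s for s, ok in zip(agent_scores, agent_success) if not ok]


# Made-up discriminator outputs for illustration only.
expert_scores = [0.9, 0.8]
agent_scores = [0.2, 0.85, 0.1]        # the second episode looks expert-like
agent_success = [False, True, False]   # ...because it actually solved the task

naive_loss = bce_discriminator_loss(expert_scores, agent_scores)
filtered_loss = bce_discriminator_loss(
    expert_scores, drop_successful_negatives(agent_scores, agent_success)
)
print(f"naive loss:    {naive_loss:.4f}")
print(f"filtered loss: {filtered_loss:.4f}")
```

In the naive loss, the successful agent episode is penalized for receiving a high score even though it matches the desired behavior, which is the inconsistent training signal the abstract describes. Dropping such episodes is just one conceivable way to remove that conflict.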

