Fail-Safe Generative Adversarial Imitation Learning

03/03/2022
by Philipp Geiger, et al.

For flexible yet safe imitation learning (IL), we propose a modular approach that combines a generative imitator policy with a safety layer. The combined policy has an explicit overall density and gradient, can therefore be trained end-to-end using generative adversarial IL (GAIL), and comes with theoretical worst-case safety and robustness guarantees. The safety layer's exact density is obtained by using a countable non-injective gluing of piecewise differentiable injections together with the change-of-variables formula. The safe set (into which the safety layer maps) is inferred by sampling actions and their potential future fail-safe fallback continuations, together with Lipschitz continuity and convexity arguments. We also provide theoretical bounds showing the advantage of using the safety layer already during training (imitation error linear in the horizon) compared to using it only at test time (imitation error quadratic in the horizon). In an experiment on challenging real-world driver interaction data, we empirically demonstrate the tractability, safety and imitation performance of our approach.
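To make the density argument concrete, the following is a minimal one-dimensional sketch of how a non-injective map glued from injective pieces can still have an exact density via the change-of-variables formula. It is not the paper's implementation: the class `AffinePiece`, the functions `safe_layer`, `safe_density` and `base_density`, the safe interval [1, 2) and the toy base density p(x) = x/8 on [0, 4) are all hypothetical choices made for illustration. Each piece is an affine injection, so the output density at y is the sum, over all pieces whose image contains y, of the base density at the preimage divided by the absolute slope of that piece.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class AffinePiece:
    """One injective piece x -> scale * x + shift on the domain [lo, hi). Hypothetical helper."""
    lo: float
    hi: float
    scale: float
    shift: float

    def contains(self, x: float) -> bool:
        return self.lo <= x < self.hi

    def forward(self, x: float) -> float:
        return self.scale * x + self.shift

    def inverse(self, y: float) -> float:
        return (y - self.shift) / self.scale


# Toy gluing (an assumption, not from the paper): every piece maps into the safe set [1, 2).
PIECES = (
    AffinePiece(0.0, 1.0, 1.0, 1.0),  # unsafe [0, 1): shift up into [1, 2)
    AffinePiece(1.0, 2.0, 1.0, 0.0),  # already safe: identity
    AffinePiece(2.0, 4.0, 0.5, 0.0),  # unsafe [2, 4): squeeze into [1, 2)
)


def base_density(x: float) -> float:
    """Toy unconstrained action density p(x) = x / 8 on [0, 4); it integrates to 1."""
    return x / 8.0 if 0.0 <= x < 4.0 else 0.0


def safe_layer(x: float, pieces: Sequence[AffinePiece] = PIECES) -> float:
    """Map a raw action to a safe action using the piece whose domain contains it."""
    for piece in pieces:
        if piece.contains(x):
            return piece.forward(x)
    raise ValueError(f"action {x} lies outside all piece domains")


def safe_density(
    y: float,
    density: Callable[[float], float] = base_density,
    pieces: Sequence[AffinePiece] = PIECES,
) -> float:
    """Exact density of safe_layer(X) at y: sum over pieces of p(g^-1(y)) * |d g^-1 / dy|."""
    total = 0.0
    for piece in pieces:
        x = piece.inverse(y)
        if piece.contains(x):  # y lies in the image of this piece
            total += density(x) / abs(piece.scale)
    return total
```

In this toy setup the raw actions 0.5, 1.5 and 3.0 all map to the same safe action 1.5, so the map is not injective, yet `safe_density` stays exact because each preimage contributes its own term. Since the density is an explicit function of the base density, it could in principle be differentiated through when the base density comes from a learned generative policy.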

research
07/22/2018

EnsembleDAgger: A Bayesian Approach to Safe Imitation Learning

While imitation learning is often used in robotics, this approach often ...
research
09/18/2017

DropoutDAgger: A Bayesian Approach to Safe Imitation Learning

While imitation learning is becoming common practice in robotics, this a...
research
01/22/2020

Safety Considerations in Deep Control Policies with Probabilistic Safety Barrier Certificates

Recent advances in Deep Machine Learning have shown promise in solving c...
research
07/09/2021

Adversarial Mixture Density Networks: Learning to Drive Safely from Collision Data

Imitation learning has been widely used to learn control policies for au...
research
06/30/2021

Robust Generative Adversarial Imitation Learning via Local Lipschitzness

We explore methodologies to improve the robustness of generative adversa...
research
01/11/2019

On the Global Convergence of Imitation Learning: A Case for Linear Quadratic Regulator

We study the global convergence of generative adversarial imitation lear...
research
06/26/2023

Imitation with Spatial-Temporal Heatmap: 2nd Place Solution for NuPlan Challenge

This paper presents our 2nd place solution for the NuPlan Challenge 2023...
