EnsembleDAgger: A Bayesian Approach to Safe Imitation Learning

07/22/2018
by Kunal Menda, et al.

While imitation learning is widely used in robotics, it often suffers from data mismatch and compounding errors. DAgger is an iterative algorithm that addresses these issues by aggregating training data from both the expert and novice policies, but it does not consider safety. We present a probabilistic extension to DAgger that attempts to quantify the confidence of the novice policy as a proxy for safety. Our method, EnsembleDAgger, approximates a Gaussian process (GP) using an ensemble of neural networks. Using the variance of the ensemble's predictions as a measure of confidence, we compute a decision rule that captures how much we doubt the novice and thus determines when it is safe to allow the novice to act. With this approach, we aim to maximize the novice's share of actions while constraining the probability of failure. We demonstrate improved safety and learning performance compared to other DAgger variants and classic imitation learning on an inverted pendulum and in the MuJoCo HalfCheetah environment.
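
The decision rule described above can be illustrated with a minimal Python sketch. The abstract does not give the exact form of the rule, so everything here is an assumption made for illustration: the names `tau` (a bound on the discrepancy between the novice's mean action and the expert's action) and `chi` (a bound on the ensemble variance, the "doubt"), the use of both bounds together, and the toy linear policies standing in for neural networks. It shows only the gating step, not the data aggregation or retraining loop of DAgger.

```python
from statistics import fmean, pvariance
from typing import Callable, List, Sequence, Tuple

# A policy maps a state vector to an action vector (hypothetical type alias).
Policy = Callable[[Sequence[float]], Sequence[float]]


def ensemble_stats(ensemble: Sequence[Policy],
                   state: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Per-dimension mean and population variance of the members' actions."""
    actions = [member(state) for member in ensemble]
    per_dim = list(zip(*actions))  # group the members' outputs by action dimension
    mean = [fmean(dim) for dim in per_dim]
    var = [pvariance(dim) for dim in per_dim]
    return mean, var


def choose_action(ensemble: Sequence[Policy], expert: Policy,
                  state: Sequence[float], tau: float,
                  chi: float) -> Tuple[List[float], bool]:
    """Return (action, novice_acts). Assumed rule: the novice acts only if its
    mean action is close to the expert's AND the ensemble variance is small."""
    mean, var = ensemble_stats(ensemble, state)
    expert_action = list(expert(state))
    discrepancy = sum((m - e) ** 2 for m, e in zip(mean, expert_action))
    doubt = max(var)  # worst-case variance over action dimensions (assumption)
    novice_acts = discrepancy <= tau and doubt <= chi
    return (mean if novice_acts else expert_action), novice_acts


def make_linear_policy(gain: float) -> Policy:
    """Toy stand-in for one ensemble member: action = -gain * state."""
    return lambda state: [-gain * s for s in state]


# Toy ensemble whose members disagree more as the state grows in magnitude.
ensemble = [make_linear_policy(g) for g in (0.9, 1.0, 1.1)]
expert = make_linear_policy(1.0)

near_action, near_novice = choose_action(ensemble, expert, [0.1], tau=0.05, chi=0.01)
far_action, far_novice = choose_action(ensemble, expert, [5.0], tau=0.05, chi=0.01)
```

In this toy setup the members agree closely near the origin, so the novice is allowed to act at state `[0.1]`; at state `[5.0]` their disagreement exceeds `chi`, so control falls back to the expert. Tightening `tau` and `chi` hands more decisions to the expert, which is one way to picture the trade-off between the novice's share of actions and the probability of failure.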


