Closing the Closed-Loop Distribution Shift in Safe Imitation Learning

02/18/2021
by Stephen Tu, et al.

Commonly used optimization-based control strategies, such as model-predictive controllers and control Lyapunov/barrier function based controllers, often enjoy provable stability, robustness, and safety properties. However, implementing such approaches requires solving optimization problems online at high frequencies, which may not be possible on resource-constrained commodity hardware. Furthermore, how to extend the safety guarantees of such approaches to systems that use rich perceptual sensing modalities, such as cameras, remains unclear. In this paper, we address this gap by treating safe optimization-based control strategies as experts in an imitation learning problem, and train a learned policy that can be cheaply evaluated at run-time and that provably satisfies the same safety guarantees as the expert. In particular, we propose Constrained Mixing Iterative Learning (CMILe), a novel on-policy robust imitation learning algorithm that integrates ideas from stochastic mixing iterative learning, constrained policy optimization, and nonlinear robust control. Our approach controls the errors introduced both by the learning task of imitating the expert and by the distribution shift inherent in deviating from the original expert policy. We show the value of using tools from nonlinear robust control to impose stability constraints on learned policies through sample-complexity bounds that are independent of the task time-horizon. We demonstrate the usefulness of CMILe through extensive experiments, including training a provably safe perception-based controller using a state-feedback-based expert.
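For intuition, the "mixing" component of such algorithms can be sketched as a DAgger/SMILe-style on-policy loop in which the executed policy gradually shifts from the expert to the learner while the expert relabels the visited states. The sketch below is illustrative only: `env`, `expert`, and `fit` are hypothetical interfaces (not from the paper), and CMILe's constrained policy optimization step and robust-control stability certificates are deliberately not reproduced here.

```python
import numpy as np


def rollout_mixture(env, learner, expert, beta, horizon, rng):
    """Roll out a per-step mixture policy: with probability `beta` execute
    the expert's action, otherwise the learner's. Returns visited states.

    `env` is a hypothetical interface with reset() -> state and
    step(action) -> (state, done)."""
    states, s = [], env.reset()
    for _ in range(horizon):
        states.append(s)
        action = expert(s) if rng.random() < beta else learner(s)
        s, done = env.step(action)
        if done:
            break
    return states


def mixing_imitation(env, expert, fit, n_iters=10, horizon=200,
                     decay=0.5, seed=0):
    """DAgger/SMILe-style on-policy loop with decaying expert mixing.

    `fit(states, actions)` is assumed to return a policy (a callable
    state -> action) trained by supervised regression onto expert labels.
    CMILe additionally enforces safety constraints during this fit and
    certifies stability with nonlinear robust-control tools; both steps
    are omitted from this sketch.
    """
    rng = np.random.default_rng(seed)
    data_s, data_a = [], []
    learner = expert  # before any data is collected, fall back to the expert
    for k in range(n_iters):
        beta = decay ** k  # expert influence decays across iterations
        visited = rollout_mixture(env, learner, expert, beta, horizon, rng)
        data_s.extend(visited)
        data_a.extend(expert(s) for s in visited)  # expert relabels on-policy states
        learner = fit(np.array(data_s), np.array(data_a))
    return learner
```

Decaying the expert's share of executed actions is what keeps the training distribution close to the states the learned policy will actually visit, which is the distribution-shift issue the paper's title refers to.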


Related research

03/24/2021
On Imitation Learning of Linear Control Policies: Enforcing Stability and Robustness Constraints via LMI Conditions
When applying imitation learning techniques to fit a policy from expert ...

05/30/2022
TaSIL: Taylor Series Imitation Learning
We propose Taylor Series Imitation Learning (TaSIL), a simple augmentati...

10/17/2022
Model Predictive Control via On-Policy Imitation Learning
In this paper, we leverage the rapid advances in imitation learning, a t...

09/18/2017
DropoutDAgger: A Bayesian Approach to Safe Imitation Learning
While imitation learning is becoming common practice in robotics, this a...

09/14/2021
Reactive and Safe Road User Simulations using Neural Barrier Certificates
Reactive and safe agent modelings are important for nowadays traffic sim...

04/24/2023
Synthesizing Stable Reduced-Order Visuomotor Policies for Nonlinear Systems via Sums-of-Squares Optimization
We present a method for synthesizing dynamic, reduced-order output-feedb...

07/22/2018
EnsembleDAgger: A Bayesian Approach to Safe Imitation Learning
While imitation learning is often used in robotics, this approach often ...
