Sim-to-Lab-to-Real: Safe Reinforcement Learning with Shielding and Generalization Guarantees

01/20/2022
by Kai-Chieh Hsu, et al.

Safety is a critical component of autonomous systems and remains a challenge for learning-based policies to be utilized in the real world. In particular, policies learned using reinforcement learning often fail to generalize to novel environments due to unsafe behavior. In this paper, we propose Sim-to-Lab-to-Real to safely close the reality gap. To improve safety, we apply a dual policy setup where a performance policy is trained using the cumulative task reward and a backup (safety) policy is trained by solving the reach-avoid Bellman Equation based on Hamilton-Jacobi reachability analysis. In Sim-to-Lab transfer, we apply a supervisory control scheme to shield unsafe actions during exploration; in Lab-to-Real transfer, we leverage the Probably Approximately Correct (PAC)-Bayes framework to provide lower bounds on the expected performance and safety of policies in unseen environments. We empirically study the proposed framework for ego-vision navigation in two types of indoor environments including a photo-realistic one. We also demonstrate strong generalization performance through hardware experiments in real indoor spaces with a quadrupedal robot. See https://sites.google.com/princeton.edu/sim-to-lab-to-real for supplementary material.
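
The supervisory control scheme mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact shielding criterion, so everything below is an assumption made for illustration: the function name `shielded_action`, the convention that a safety-critic value above `threshold` marks a proposed action as unsafe, and the toy one-dimensional policies and critic. It shows only the general dual-policy idea, in which the performance policy proposes an action and the backup (safety) policy overrides it when the critic flags the proposal.

```python
from typing import Callable, Tuple

Obs = float      # toy observation: distance to a wall, in metres (hypothetical)
Action = float   # toy action: signed forward step, in metres (hypothetical)


def shielded_action(
    obs: Obs,
    performance_policy: Callable[[Obs], Action],
    backup_policy: Callable[[Obs], Action],
    safety_critic: Callable[[Obs, Action], float],
    threshold: float = 0.0,
) -> Tuple[Action, bool]:
    """Return (action, was_shielded).

    Assumed convention: safety_critic(obs, action) > threshold means the
    proposed action is predicted to lead to a safety violation.
    """
    proposed = performance_policy(obs)
    if safety_critic(obs, proposed) > threshold:
        # Override the unsafe proposal with the backup (safety) policy.
        return backup_policy(obs), True
    return proposed, False


# Toy stand-ins for the two learned policies and the learned safety critic.
def go_forward(obs: Obs) -> Action:
    return 1.0            # always step 1 m toward the wall


def back_off(obs: Obs) -> Action:
    return -1.0           # always step 1 m away from the wall


def toy_critic(obs: Obs, action: Action) -> float:
    return action - obs   # positive when the step would reach past the wall


far = shielded_action(5.0, go_forward, back_off, toy_critic)
near = shielded_action(0.5, go_forward, back_off, toy_critic)
```

In this toy setup, `far` keeps the performance policy's action because the wall is 5 m away, while `near` is replaced by the backup action because a 1 m step from 0.5 m would cross the wall. In the setting the abstract describes, the critic would presumably be learned from the reach-avoid Bellman equation rather than hand-written as it is here.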

research
11/17/2020

Reachability-based Trajectory Safeguard (RTS): A Safe and Fast Reinforcement Learning Safety Layer for Continuous Control

Reinforcement Learning (RL) algorithms have achieved remarkable performa...
research
05/01/2019

An Efficient Reachability-Based Framework for Provably Safe Autonomous Navigation in Unknown Environments

Real-world autonomous vehicles often operate in a priori unknown environ...
research
02/28/2020

Probably Approximately Correct Vision-Based Planning using Motion Primitives

This paper presents a deep reinforcement learning approach for synthesiz...
research
08/05/2020

Generalization Guarantees for Multi-Modal Imitation Learning

Control policies from imitation learning can often fail to generalize to...
research
09/11/2020

Embodied Visual Navigation with Automatic Curriculum Learning in Real Environments

We present NavACL, a method of automatic curriculum learning tailored to...
research
11/16/2021

Stronger Generalization Guarantees for Robot Learning by Combining Generative Models and Real-World Data

We are motivated by the problem of learning policies for robotic systems...
research
10/16/2020

Robot Navigation in Constrained Pedestrian Environments using Reinforcement Learning

Navigating fluently around pedestrians is a necessary capability for mob...
