Verifiably Safe Exploration for End-to-End Reinforcement Learning

07/02/2020
by Nathan Hunt, et al.

Deploying deep reinforcement learning in safety-critical settings requires developing algorithms that obey hard constraints during exploration. This paper contributes a first approach toward enforcing formal safety constraints on end-to-end policies with visual inputs. Our approach draws on recent advances in object detection and automated reasoning for hybrid dynamical systems. The approach is evaluated on a novel benchmark that emphasizes the challenge of safely exploring in the presence of hard constraints. Our benchmark draws from several proposed problem sets for safe learning and includes problems that emphasize challenges such as reward signals that are not aligned with safety constraints. On each of these benchmark problems, our algorithm completely avoids unsafe behavior while remaining competitive at optimizing for as much reward as is safe. We also prove that our method of enforcing the safety constraints preserves all safe policies from the original environment.
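The idea of enforcing a hard constraint during exploration while leaving safe behavior untouched can be illustrated with a toy action filter. The sketch below is not the paper's algorithm: the one-dimensional world, the braking-distance monitor `is_safe`, the `shield` function, and all constants are hypothetical. It also assumes the agent and obstacle positions are already available, as if produced by an object detector, which is not modeled here.

```python
import random

# Hypothetical 1-D world: an agent at position x with velocity v >= 0
# drives toward a fixed obstacle. All names and constants are assumptions.
ACTIONS = [-1.0, 0.0, 1.0]   # brake, coast, accelerate
MAX_BRAKE = 1.0              # strongest available deceleration
DT = 1.0                     # length of one control step


def step(x, v, accel):
    """Advance the toy dynamics by one step (velocity never goes negative)."""
    v_next = max(0.0, v + accel * DT)
    return x + v_next * DT, v_next


def is_safe(x, v, obstacle_x, accel):
    """Assumed monitor: after taking `accel`, the agent must still be able
    to stop short of the obstacle by braking as hard as possible."""
    x_next, v_next = step(x, v, accel)
    stopping_distance = v_next ** 2 / (2 * MAX_BRAKE)
    return x_next + stopping_distance < obstacle_x


def shield(proposed, x, v, obstacle_x):
    """Pass safe actions through unchanged; otherwise fall back to braking."""
    if is_safe(x, v, obstacle_x, proposed):
        return proposed
    return -MAX_BRAKE


def run_episode(obstacle_x=10.0, steps=200, seed=0):
    """Explore with a random policy behind the shield; return visited positions."""
    rng = random.Random(seed)
    x, v = 0.0, 0.0
    positions = []
    for _ in range(steps):
        action = shield(rng.choice(ACTIONS), x, v, obstacle_x)
        x, v = step(x, v, action)
        positions.append(x)
    return positions
```

Because `shield` returns the proposed action whenever the monitor accepts it, any policy that never proposes an unsafe action behaves identically with or without the filter. In this toy setting that mirrors the abstract's claim that safe policies are preserved.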

research
02/24/2021

Towards Safe Continuing Task Reinforcement Learning

Safety is a critical feature of controller design for physical systems. ...
research
02/26/2020

Cautious Reinforcement Learning with Logical Constraints

This paper presents the concept of an adaptive safe padding that forces ...
research
09/30/2022

Safe Exploration Method for Reinforcement Learning under Existence of Disturbance

Recent rapid developments in reinforcement learning algorithms have been...
research
10/02/2022

Safe Reinforcement Learning From Pixels Using a Stochastic Latent Representation

We address the problem of safe reinforcement learning from pixel observa...
research
11/27/2017

AI Safety Gridworlds

We present a suite of reinforcement learning environments illustrating v...
research
03/20/2020

Safe Reinforcement Learning of Control-Affine Systems with Vertex Networks

This paper focuses on finding reinforcement learning policies for contro...
research
12/20/2021

Safe multi-agent deep reinforcement learning for joint bidding and maintenance scheduling of generation units

This paper proposes a safe reinforcement learning algorithm for generati...