Joint Synthesis of Safety Certificate and Safe Control Policy using Constrained Reinforcement Learning

11/15/2021
by Haitong Ma, et al.

Safety is a major consideration when controlling complex dynamical systems with reinforcement learning (RL), and a safety certificate can provide provable safety guarantees. A valid safety certificate is an energy function that assigns low energy to safe states and for which there exists a corresponding safe control policy that always allows the energy to dissipate. The safety certificate and the safe control policy are closely related, and both are challenging to synthesize. Existing learning-based studies therefore treat one of them as prior knowledge in order to learn the other, which limits their applicability to systems with general unknown dynamics. This paper proposes a novel approach that simultaneously synthesizes the energy-function-based safety certificate and learns the safe control policy with constrained reinforcement learning (CRL). We do not rely on prior knowledge of either an available model-based controller or a perfect safety certificate. In particular, we formulate a loss function that optimizes the safety certificate parameters by minimizing the occurrence of energy increases. By adding this optimization procedure as an outer loop to Lagrangian-based CRL, we jointly update the policy and safety certificate parameters and prove that they converge to their respective local optima: the optimal safe policy and a valid safety certificate. We evaluate our algorithms on multiple safety-critical benchmark environments. The results show that the proposed algorithm learns provably safe policies with no constraint violation. The validity, or feasibility, of the synthesized safety certificate is also verified numerically.
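As a rough illustration of two ingredients the abstract names, the Python sketch below shows a loss that penalizes energy increases along observed transitions and a Lagrange-multiplier update of the kind used in Lagrangian-based constrained RL. Everything here is an assumption made for illustration and is not taken from the paper: the scalar state, the toy energy function `energy` with its single parameter `k`, the hinge form of the penalty, and the names `energy_increase_loss` and `update_multiplier`.

```python
def energy(state, k):
    """Hypothetical energy function phi(s) = k * |s| - 1 (illustrative only)."""
    return k * abs(state) - 1.0


def energy_increase_loss(transitions, k):
    """Mean hinge penalty over (s, s_next) pairs.

    A pair contributes only when the energy rises from s to s_next,
    so the loss is zero when the energy never increases.
    """
    penalties = [
        max(0.0, energy(s_next, k) - energy(s, k))
        for s, s_next in transitions
    ]
    return sum(penalties) / len(penalties)


def update_multiplier(lam, violation, step_size):
    """Projected dual ascent step: raise lambda on violation, keep it >= 0."""
    return max(0.0, lam + step_size * violation)


# Two toy transitions: the first lowers the energy, the second raises it.
transitions = [(1.0, 0.5), (0.5, 1.0)]
loss = energy_increase_loss(transitions, k=2.0)
lam = update_multiplier(0.0, loss, step_size=0.1)
```

In a joint scheme of the kind the abstract describes, a loss of this type would drive the certificate parameter (here `k`) in an outer loop, while the multiplier and policy are updated inside the constrained RL loop. The actual loss, energy function parameterization, and update schedule are given in the full text.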



research
11/25/2021

Learn Zero-Constraint-Violation Policy in Model-Free Constrained Reinforcement Learning

In the trial-and-error mechanism of reinforcement learning (RL), a notor...
research
09/23/2022

Synthesize Efficient Safety Certificates for Learning-Based Safe Control using Magnitude Regularization

Energy-function-based safety certificates can provide provable safety gu...
research
03/17/2023

Zero-shot Transferable and Persistently Feasible Safe Control for High Dimensional Systems by Consistent Abstraction

Safety is critical in robotic tasks. Energy function based methods have ...
research
02/14/2019

Verifiably Safe Off-Model Reinforcement Learning

The desire to use reinforcement learning in safety-critical settings has...
research
08/17/2020

Runtime-Safety-Guided Policy Repair

We study the problem of policy repair for learning-based control policie...
research
06/30/2020

It's Time to Play Safe: Shield Synthesis for Timed Systems

Erroneous behaviour in safety critical real-time systems may inflict ser...
research
01/28/2022

Towards Safe Reinforcement Learning with a Safety Editor Policy

We consider the safe reinforcement learning (RL) problem of maximizing u...
