Safe-Critical Modular Deep Reinforcement Learning with Temporal Logic through Gaussian Processes and Control Barrier Functions

09/07/2021
by   Mingyu Cai, et al.
13

Reinforcement learning (RL) is a promising approach and has limited success towards real-world applications, because ensuring safe exploration or facilitating adequate exploitation is a challenges for controlling robotic systems with unknown models and measurement uncertainties. Such a learning problem becomes even more intractable for complex tasks over continuous space (state-space and action-space). In this paper, we propose a learning-based control framework consisting of several aspects: (1) linear temporal logic (LTL) is leveraged to facilitate complex tasks over an infinite horizons which can be translated to a novel automaton structure; (2) we propose an innovative reward scheme for RL-agent with the formal guarantee such that global optimal policies maximize the probability of satisfying the LTL specifications; (3) based on a reward shaping technique, we develop a modular policy-gradient architecture utilizing the benefits of automaton structures to decompose overall tasks and facilitate the performance of learned controllers; (4) by incorporating Gaussian Processes (GPs) to estimate the uncertain dynamic systems, we synthesize a model-based safeguard using Exponential Control Barrier Functions (ECBFs) to address problems with high-order relative degrees. In addition, we utilize the properties of LTL automatons and ECBFs to construct a guiding process to further improve the efficiency of exploration. Finally, we demonstrate the effectiveness of the framework via several robotic environments. And we show such an ECBF-based modular deep RL algorithm achieves near-perfect success rates and guard safety with a high probability confidence during training.

READ FULL TEXT

page 1

page 15

page 16

page 17

research
03/21/2019

End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks

Reinforcement Learning (RL) algorithms have found limited success beyond...
research
01/26/2020

Tractable Reinforcement Learning of Signal Temporal Logic Objectives

Signal temporal logic (STL) is an expressive language to specify time-bo...
research
03/23/2019

Temporal Logic Guided Safe Reinforcement Learning Using Control Barrier Functions

Using reinforcement learning to learn control policies is a challenge wh...
research
11/10/2020

Model-based Reinforcement Learning from Signal Temporal Logic Specifications

Techniques based on Reinforcement Learning (RL) are increasingly being u...
research
01/28/2022

Overcoming Exploration: Deep Reinforcement Learning in Complex Environments from Temporal Logic Specifications

We present a Deep Reinforcement Learning (DRL) algorithm for a task-guid...
research
06/06/2023

Value Functions are Control Barrier Functions: Verification of Safe Policies using Control Theory

Guaranteeing safe behaviour of reinforcement learning (RL) policies pose...
research
10/30/2020

Learning Stable Normalizing-Flow Control for Robotic Manipulation

Reinforcement Learning (RL) of robotic manipulation skills, despite its ...

Please sign up or login with your details

Forgot password? Click here to reset