Temporal Logic Guided Safe Reinforcement Learning Using Control Barrier Functions

03/23/2019
by Xiao Li, et al.

Learning control policies with reinforcement learning is challenging when the task is complex and has a potentially long horizon. Ensuring adequate yet safe exploration is also crucial when controlling physical systems. In this paper, we use temporal logic to facilitate the specification and learning of complex tasks. We combine temporal logic with control Lyapunov functions to improve exploration, and we incorporate control barrier functions to safeguard the exploration and deployment process. We develop a flexible and learnable system that allows users to specify task objectives and constraints in different forms and at various levels. The framework can also take advantage of known system dynamics and handle unknown environmental dynamics by integrating model-free learning with model-based planning.
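
As a rough illustration of how a control barrier function can safeguard exploration, the sketch below filters the action of an arbitrary policy so that a one-dimensional state never exceeds an upper bound. This is not the paper's implementation. The single-integrator dynamics (x' = u), the barrier h(x) = x_max - x, the gain alpha, and the names `cbf_filter` and `rollout` are all assumptions chosen for illustration.

```python
def cbf_filter(x, u_nominal, x_max=1.0, alpha=1.0):
    """Return the action closest to u_nominal that satisfies the CBF condition.

    Assumed setup (hypothetical, for illustration only):
      dynamics  x' = u
      barrier   h(x) = x_max - x, safe set {x : h(x) >= 0}
    The CBF condition dh/dt >= -alpha * h(x) reduces to u <= alpha * (x_max - x),
    so the minimally invasive safe action is a simple clamp.
    """
    h = x_max - x
    u_bound = alpha * h
    return min(u_nominal, u_bound)


def rollout(x0, policy, steps=50, dt=0.1, x_max=1.0, alpha=1.0):
    """Simulate the filtered policy with forward-Euler steps.

    With alpha * dt <= 1 the discrete update keeps h(x) >= 0 at every step.
    """
    x = x0
    trajectory = [x]
    for _ in range(steps):
        u = cbf_filter(x, policy(x), x_max, alpha)
        x = x + dt * u
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    # An aggressive exploratory policy that always pushes toward the bound.
    states = rollout(0.0, lambda x: 5.0)
    print(max(states))
```

The filter leaves the policy's action untouched whenever it already satisfies the barrier condition, so exploration is restricted only near the boundary of the safe set.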

research
04/16/2021

Safe Exploration in Model-based Reinforcement Learning using Control Barrier Functions

This paper studies the problem of developing an approximate dynamic prog...
research
09/07/2021

Safe-Critical Modular Deep Reinforcement Learning with Temporal Logic through Gaussian Processes and Control Barrier Functions

Reinforcement learning (RL) is a promising approach and has limited succ...
research
01/28/2022

Overcoming Exploration: Deep Reinforcement Learning in Complex Environments from Temporal Logic Specifications

We present a Deep Reinforcement Learning (DRL) algorithm for a task-guid...
research
03/29/2021

Model-Based Safe Policy Search from Signal Temporal Logic Specifications Using Recurrent Neural Networks

We propose a policy search approach to learn controllers from specificat...
research
03/26/2021

Model-Free Learning of Safe yet Effective Controllers

In this paper, we study the problem of learning safe control policies th...
research
04/19/2023

Model Based Reinforcement Learning for Personalized Heparin Dosing

A key challenge in sequential decision making is optimizing systems safe...
