Cautious Reinforcement Learning with Logical Constraints

02/26/2020
by Mohammadhosein Hasanbeig, et al.

This paper presents the concept of an adaptive safe padding that forces Reinforcement Learning (RL) to synthesize optimal control policies while ensuring safety during the learning process. We express the safety requirements as a temporal logic formula. Requiring the RL agent to stay safe during learning might limit exploration in some safety-critical cases; however, we show that the proposed architecture automatically handles the trade-off between efficient progress in exploration and strict safety. We also provide theoretical guarantees on the convergence of the algorithm. Finally, experimental results showcase the performance of the proposed method.
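To make the idea concrete, below is a minimal sketch of safe-padded tabular Q-learning on a toy one-dimensional grid. Everything here is an assumption for illustration, not the paper's construction: the environment, the `violates_safety` predicate standing in for checking the temporal-logic safety formula, and the visit-count schedule that shrinks the padding are all hypothetical.

```python
import random
from collections import defaultdict

# Toy setting (illustrative, not the paper's benchmark): a 1-D grid of
# states 0..9 where state 9 violates the safety formula ("always avoid
# the hazard") and state 8, right next to it, is the rewarding goal.
N_STATES = 10
HAZARD = N_STATES - 1
GOAL = N_STATES - 2
ACTIONS = (-1, +1)  # move left / move right

def clamp(s):
    return min(max(s, 0), N_STATES - 1)

def step(state, action):
    """Noisy transition: with probability 0.1 the action slips (no move)."""
    if random.random() < 0.1:
        action = 0
    nxt = clamp(state + action)
    return nxt, (1.0 if nxt == GOAL else 0.0)

def violates_safety(state):
    """Stand-in for checking the temporal-logic safety formula."""
    return state == HAZARD

visits = defaultdict(int)  # experience counter driving the adaptive padding

def safe_actions(state):
    """Allow only actions whose successor stays outside the padded region.

    The padding starts two cells wide around the hazard and shrinks as the
    current state is visited more often -- a hypothetical schedule that
    mimics 'cautious first, bolder with experience'.
    """
    margin = max(0, 2 - visits[state] // 20)
    ok = [a for a in ACTIONS if abs(clamp(state + a) - HAZARD) > margin]
    if not ok:  # padding too tight here: only forbid hard violations
        ok = [a for a in ACTIONS if not violates_safety(clamp(state + a))]
    return ok

Q = defaultdict(float)
alpha, gamma, eps = 0.5, 0.95, 0.1

for episode in range(500):
    s = 0
    for _ in range(50):
        visits[s] += 1
        acts = safe_actions(s)
        if random.random() < eps:
            a = random.choice(acts)                     # explore within the safe set
        else:
            a = max(acts, key=lambda act: Q[(s, act)])  # exploit within the safe set
        s2, r = step(s, a)
        target = r + gamma * max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (target - Q[(s, a)])
        s = s2
```

Under this illustrative schedule the agent first learns in the left part of the grid and only approaches the goal, which initially sits inside the padding, after accumulating enough experience: exactly the exploration-versus-safety trade-off the abstract describes.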


