Joint Learning of Policy with Unknown Temporal Constraints for Safe Reinforcement Learning

04/30/2023
by Lunet Yifru, et al.

In many real-world applications, safety constraints for reinforcement learning (RL) algorithms are either unknown or not explicitly defined. We propose a framework that concurrently learns safety constraints and optimal RL policies in such environments, supported by theoretical guarantees. Our approach combines a logically-constrained RL algorithm with an evolutionary algorithm that synthesizes signal temporal logic (STL) specifications. The framework is underpinned by theorems that establish the convergence of the joint learning process and bound the error between the discovered policy and the true optimal policy. We demonstrate the framework in grid-world environments, where it identifies both acceptable safety constraints and RL policies, and where the results show the effectiveness of our theorems in practice.
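To make the notion of an STL safety specification concrete, the following is a minimal sketch of how a candidate constraint could be scored on a grid-world trajectory using the standard quantitative (robustness) semantics of the "always" and "eventually" operators. It is not the authors' implementation: the function names, the example trajectory, the unsafe cell, and the choice of Manhattan distance as the monitored signal are all assumptions made for illustration.

```python
# Illustrative sketch only: names, trajectory, and unsafe cell are hypothetical.

def robustness_always_gt(signal, threshold, start=0, end=None):
    """Robustness of G_[start,end](signal > threshold): the worst-case margin."""
    stop = len(signal) if end is None else end + 1
    return min(x - threshold for x in signal[start:stop])


def robustness_eventually_gt(signal, threshold, start=0, end=None):
    """Robustness of F_[start,end](signal > threshold): the best-case margin."""
    stop = len(signal) if end is None else end + 1
    return max(x - threshold for x in signal[start:stop])


def manhattan_distance(a, b):
    """Grid distance between two (row, column) cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


# Hypothetical agent trajectory and unsafe cell in a 3x3 grid world.
trajectory = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
unsafe_cell = (1, 1)

# Monitored signal: distance from the agent to the unsafe cell at each step.
distances = [manhattan_distance(cell, unsafe_cell) for cell in trajectory]

# Candidate constraint "always stay off the unsafe cell" (distance > 0).
# A positive value means the trajectory satisfies the constraint.
safe_margin = robustness_always_gt(distances, 0)

# A stricter candidate "always keep distance > 1" is only marginally met or violated.
strict_margin = robustness_always_gt(distances, 1)
```

In an evolutionary search over specifications, a score of this kind could serve as one ingredient of a fitness function for ranking candidate constraints; how the paper actually defines fitness is not stated in this abstract.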


