Dynamic Shielding for Reinforcement Learning in Black-Box Environments

07/27/2022
by Masaki Waga, et al.

It is challenging to use reinforcement learning (RL) in cyber-physical systems due to the lack of safety guarantees during learning. Although there have been various proposals to reduce undesired behaviors during learning, most of these techniques require prior system knowledge, which limits their applicability. This paper aims to reduce undesired behaviors during learning without requiring any prior system knowledge. We propose dynamic shielding: an extension of shielding, a model-based safe RL technique, that uses automata learning. Dynamic shielding constructs an approximate system model in parallel with RL using a variant of the RPNI algorithm, and suppresses undesired exploration by means of a shield constructed from the learned model. Through this combination, potentially unsafe actions can be foreseen before the agent experiences them. Experiments show that our dynamic shield significantly decreases the number of undesired events during training.
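To make the idea of a shield learned alongside the agent concrete, here is a minimal Python sketch. It is not the paper's method: the RPNI-based automaton learning is replaced by a plain lookup table keyed by observation, and the names `TableShield`, `record`, and `allowed` are hypothetical. The sketch only shows the interface between learner and shield, where experience updates a model and the model filters the actions offered to the agent.

```python
class TableShield:
    """Toy stand-in for a learned shield (assumption: not the paper's RPNI variant)."""

    def __init__(self):
        # observation -> set of actions that were seen to cause an undesired event
        self.unsafe = {}

    def record(self, observation, action, violated):
        """Update the model from one step of experience."""
        if violated:
            self.unsafe.setdefault(observation, set()).add(action)

    def allowed(self, observation, actions):
        """Return the actions not known to be unsafe in this observation."""
        blocked = self.unsafe.get(observation, set())
        safe = [a for a in actions if a not in blocked]
        # If every action is blocked, fall back to the full set so the agent can still act.
        return safe if safe else list(actions)


shield = TableShield()
shield.record("near_edge", "left", violated=True)
shield.record("near_edge", "right", violated=False)
print(shield.allowed("near_edge", ["left", "right"]))  # ['right']
print(shield.allowed("middle", ["left", "right"]))     # ['left', 'right']
```

Unlike this table, which only blocks an action after it has failed in the same observation, the learned automaton described in the abstract is meant to generalize so that unsafe actions can be foreseen before the agent experiences them.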

research
04/13/2023

Model-based Dynamic Shielding for Safe and Efficient Multi-Agent Reinforcement Learning

Multi-Agent Reinforcement Learning (MARL) discovers policies that maximi...
research
07/27/2023

Approximate Model-Based Shielding for Safe Reinforcement Learning

Reinforcement learning (RL) has shown great potential for solving comple...
research
10/03/2022

Probabilistic Safeguard for Reinforcement Learning Using Safety Index Guided Gaussian Process Models

Safety is one of the biggest concerns to applying reinforcement learning...
research
04/15/2022

Safe Reinforcement Learning Using Black-Box Reachability Analysis

Reinforcement learning (RL) is capable of sophisticated motion planning ...
research
03/18/2022

Privacy-Preserving Reinforcement Learning Beyond Expectation

Cyber and cyber-physical systems equipped with machine learning algorith...
research
03/02/2023

Data-efficient, Explainable and Safe Payload Manipulation: An Illustration of the Advantages of Physical Priors in Model-Predictive Control

Machine Learning methods, such as those from the Reinforcement Learning ...
research
11/20/2020

Nested Mixture of Experts: Cooperative and Competitive Learning of Hybrid Dynamical System

Model-based reinforcement learning (MBRL) algorithms can attain signific...
