System III: Learning with Domain Knowledge for Safety Constraints

04/23/2023
by Fazl Barez, et al.

Reinforcement learning agents naturally learn from extensive exploration, but exploration is costly and can be unsafe in safety-critical domains. This paper proposes a novel framework for incorporating domain knowledge to guide safe exploration and boost sample efficiency. Previous approaches impose constraints, such as regularisation parameters in neural networks, that rely on large sample sets and are often unsuitable for safety-critical domains, where agents should almost always avoid unsafe actions. In our approach, called System III, which is inspired by psychologists' notions of the brain's System I and System II, we represent domain expert knowledge of safety in the form of first-order logic. We evaluate the satisfaction of these constraints via p-norms in state vector space. In our formulation, constraints are analogous to hazards, objects, and regions of state that have to be avoided during exploration. We evaluate the effectiveness of the proposed method on OpenAI's Gym and Safety-Gym environments. In all tasks, including classic Control and Safety Games, we show that our approach results in safer exploration and improved sample efficiency.
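As an illustration only, the idea of scoring a "stay away from hazards" constraint with a p-norm over state vectors might be sketched as follows. This is not the paper's formulation: the function names, the linear scaling of distance by a hazard radius, the use of a minimum to stand in for "for all hazards", and the reward penalty are all assumptions made for this sketch.

```python
from typing import Sequence


def p_norm(vec: Sequence[float], p: float = 2.0) -> float:
    """Return the p-norm of a vector; p = float('inf') gives the max norm."""
    if p == float("inf"):
        return max(abs(x) for x in vec)
    return sum(abs(x) ** p for x in vec) ** (1.0 / p)


def hazard_satisfaction(agent_pos, hazard_pos, radius, p=2.0):
    """Degree in [0, 1] to which 'agent is outside this hazard' holds.

    Assumed scoring: 1.0 at or beyond `radius`, falling linearly to 0.0
    at the hazard centre.
    """
    diff = [a - h for a, h in zip(agent_pos, hazard_pos)]
    return min(1.0, p_norm(diff, p) / radius)


def constraint_satisfaction(agent_pos, hazards, radius, p=2.0):
    """Score 'for all hazards h: avoid(h)' as the minimum over hazards (assumed)."""
    return min(hazard_satisfaction(agent_pos, h, radius, p) for h in hazards)


def shaped_reward(reward, satisfaction, weight=1.0):
    """Hypothetical shaping: subtract a penalty proportional to the violation."""
    return reward - weight * (1.0 - satisfaction)


if __name__ == "__main__":
    hazards = [(0.0, 0.0), (3.0, 4.0)]
    for pos in [(3.0, 0.0), (1.0, 0.0)]:
        sat = constraint_satisfaction(pos, hazards, radius=2.0)
        print(pos, sat, shaped_reward(1.0, sat, weight=2.0))
```

In this sketch a position outside every hazard radius scores full satisfaction, while a position halfway into a hazard scores 0.5. Changing `p` changes the shape of the region treated as unsafe (for example, a square rather than a disc under the max norm).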


research
09/28/2022

Guiding Safe Exploration with Weakest Preconditions

In reinforcement learning for safety-critical settings, it is often desi...
research
09/30/2022

Safe Exploration Method for Reinforcement Learning under Existence of Disturbance

Recent rapid developments in reinforcement learning algorithms have been...
research
01/02/2021

Context-Aware Safe Reinforcement Learning for Non-Stationary Environments

Safety is a critical concern when deploying reinforcement learning agent...
research
05/19/2019

Leveraging Semantic Embeddings for Safety-Critical Applications

Semantic Embeddings are a popular way to represent knowledge in the fiel...
research
05/01/2022

Deep Learning with Logical Constraints

In recent years, there has been an increasing interest in exploiting log...
research
07/10/2023

Probabilistic Counterexample Guidance for Safer Reinforcement Learning (Extended Version)

Safe exploration aims at addressing the limitations of Reinforcement Lea...
research
04/21/2023

Approximate Shielding of Atari Agents for Safe Exploration

Balancing exploration and conservatism in the constrained setting is an ...
