Safety Aware Reinforcement Learning (SARL)

10/06/2020
by Santiago Miret, et al.

As reinforcement learning agents become increasingly integrated into complex, real-world environments, designing for safety becomes a critical consideration. We focus on scenarios where agents can cause undesired side effects while executing a policy on a primary task. Since multiple tasks can be defined for a given environment's dynamics, two challenges arise. First, we need to abstract a concept of safety that applies broadly to the environment, independent of the specific task being executed. Second, we need a mechanism for this abstracted notion of safety to modulate the actions of agents executing different policies so as to minimize their side effects. In this work, we propose Safety Aware Reinforcement Learning (SARL), a framework in which a virtual safe agent modulates the actions of a main reward-based agent to minimize side effects. The safe agent learns a task-independent notion of safety for a given environment. The main agent is then trained with a regularization loss given by the distance between the native action probabilities of the two agents. Since the safe agent effectively abstracts a task-independent notion of safety via its action probabilities, it can be ported to modulate multiple policies solving different tasks within the given environment without further training. We contrast this with solutions that rely on task-specific regularization metrics, and test our framework on the SafeLife Suite, based on Conway's Game of Life, which comprises a number of complex tasks in dynamic environments. We show that our solution matches the performance of solutions relying on task-specific side-effect penalties on both the primary and safety objectives, while additionally providing the benefits of generalizability and portability.
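The regularization described above can be sketched in a few lines. The abstract specifies only "the distance between the native action probabilities of the two agents"; the choice of KL divergence as that distance, and the names `sarl_loss` and `beta`, are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def softmax(logits):
    # Convert raw logits to action probabilities (numerically stable).
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q), used here as one possible distance between
    # the two agents' action distributions.
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def sarl_loss(task_loss, task_logits, safe_logits, beta=0.1):
    """Total loss for the main agent: the primary task loss plus a
    penalty for deviating from the safe agent's action distribution.
    `beta` (assumed name) trades off reward against safety."""
    p_task = softmax(task_logits)
    p_safe = softmax(safe_logits)  # safe agent is frozen; no gradient here
    return task_loss + beta * kl_divergence(p_task, p_safe)
```

Because the safe agent enters only through its action probabilities, the same frozen safe agent can regularize main agents trained on different tasks in the same environment, which is the portability claim the abstract makes.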


