AI Safety Gridworlds

11/27/2017
by   Jan Leike, et al.
0

We present a suite of reinforcement learning environments illustrating various safety properties of intelligent agents. These problems include safe interruptibility, avoiding side effects, absent supervisor, reward gaming, safe exploration, as well as robustness to self-modification, distributional shift, and adversaries. To measure compliance with the intended safe behavior, we equip each environment with a performance function that is hidden from the agent. This allows us to categorize AI safety problems into robustness and specification problems, depending on whether the performance function corresponds to the observed reward function. We evaluate A2C and Rainbow, two recent deep reinforcement learning agents, on our environments and show that they are not able to solve them satisfactorily.

READ FULL TEXT
research
02/18/2019

Parenting: Safe Reinforcement Learning from Human Input

Autonomous agents trained via reinforcement learning present numerous sa...
research
12/03/2019

SafeLife 1.0: Exploring Side Effects in Complex Environments

We present SafeLife, a publicly available reinforcement learning environ...
research
01/14/2021

Evaluating the Robustness of Collaborative Agents

In order for agents trained by deep reinforcement learning to work along...
research
11/03/2018

SafeRoute: Learning to Navigate Streets Safely in an Urban Environment

Recent studies show that 85 avoid harassment and assault. Despite this, ...
research
06/21/2016

Concrete Problems in AI Safety

Rapid progress in machine learning and artificial intelligence (AI) has ...
research
09/14/2017

Towards personalized human AI interaction - adapting the behavior of AI agents using neural signatures of subjective interest

Reinforcement Learning AI commonly uses reward/penalty signals that are ...
research
07/02/2020

Verifiably Safe Exploration for End-to-End Reinforcement Learning

Deploying deep reinforcement learning in safety-critical settings requir...

Please sign up or login with your details

Forgot password? Click here to reset