Parenting: Safe Reinforcement Learning from Human Input

02/18/2019
by Christopher Frye, et al.

Autonomous agents trained via reinforcement learning present numerous safety concerns: reward hacking, negative side effects, and unsafe exploration, among others. In the context of near-future autonomous agents, operating in environments where humans understand the existing dangers, human involvement in the learning process has proved a promising approach to AI Safety. Here we demonstrate that a precise framework for learning from human input, loosely inspired by the way humans parent children, solves a broad class of safety problems in this context. We show that our Parenting algorithm solves these problems in the relevant AI Safety gridworlds of Leike et al. (2017), that an agent can learn to outperform its parent as it "matures", and that policies learnt through Parenting are generalisable to new environments.

