Safe Reinforcement Learning through Meta-learned Instincts

05/06/2020
by Djordje Grbic, et al.
An important goal in reinforcement learning is to create agents that can quickly adapt to new goals while avoiding situations that might cause damage to themselves or their environments. One way agents learn is through exploration mechanisms, which are needed to discover new policies. However, in deep reinforcement learning, exploration is normally done by injecting noise in the action space. While performing well in many domains, this setup has the inherent risk that the noisy actions performed by the agent lead to unsafe states in the environment. Here we introduce a novel approach called Meta-Learned Instinctual Networks (MLIN) that allows agents to safely learn during their lifetime while avoiding potentially hazardous states. At the core of the approach is a plastic network trained through reinforcement learning and an evolved "instinctual" network, which does not change during the agent's lifetime but can modulate the noisy output of the plastic network. We test our idea on a simple 2D navigation task with no-go zones, in which the agent has to learn to approach new targets during deployment. MLIN outperforms standard meta-trained networks and allows agents to learn to navigate to new targets without colliding with any of the no-go zones. These results suggest that meta-learning augmented with an instinctual network is a promising new approach for safe AI, which may enable progress in this area on a variety of different domains.
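The interplay between the two networks can be illustrated with a small sketch. The abstract does not say how the instinctual network modulates the plastic network's output, so the multiplicative gate used here is an assumption, not the authors' formulation. Both "networks" are hand-written stand-ins: `plastic_policy` does no learning, and `instinct_gate` is a fixed distance rule in place of an evolved network. All function names, the circular no-go zone, and the `margin` parameter are hypothetical.

```python
import math
import random


def plastic_policy(position, target):
    """Stand-in for the plastic network: a unit step toward the target.

    A real plastic network would be updated by reinforcement learning
    during the agent's lifetime; this placeholder does not learn.
    """
    dx, dy = target[0] - position[0], target[1] - position[1]
    norm = math.hypot(dx, dy) or 1.0  # avoid division by zero at the target
    return (dx / norm, dy / norm)


def instinct_gate(position, hazard_center, hazard_radius, margin=1.0):
    """Stand-in for the fixed instinctual network (assumed gating rule).

    Returns a gate in [0, 1]: 0 at or inside the no-go zone boundary,
    rising linearly to 1 once the agent is `margin` away from it.
    """
    dist = math.hypot(position[0] - hazard_center[0],
                      position[1] - hazard_center[1])
    clearance = dist - hazard_radius
    return min(1.0, max(0.0, clearance / margin))


def modulated_action(position, target, hazard_center, hazard_radius,
                     noise_std, rng):
    """Add exploration noise to the policy action, then apply the gate."""
    ax, ay = plastic_policy(position, target)
    noisy = (ax + rng.gauss(0.0, noise_std), ay + rng.gauss(0.0, noise_std))
    gate = instinct_gate(position, hazard_center, hazard_radius)
    return (gate * noisy[0], gate * noisy[1])


if __name__ == "__main__":
    rng = random.Random(0)
    # Far from the no-go zone the noisy action passes through unchanged;
    # at the zone boundary the gate suppresses it completely.
    print(modulated_action((10.0, 0.0), (12.0, 0.0), (0.0, 0.0), 1.0, 0.1, rng))
    print(modulated_action((1.0, 0.0), (5.0, 5.0), (0.0, 0.0), 1.0, 0.1, rng))
```

In this sketch the gate depends only on the agent's position, so the exploration noise can be arbitrarily large without moving the agent into the no-go zone from its boundary. This mirrors the idea of an instinct that stays fixed while the policy adapts.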
