MESA: Offline Meta-RL for Safe Adaptation and Fault Tolerance

12/07/2021
by Michael Luo, et al.

Safe exploration is critical for using reinforcement learning (RL) in risk-sensitive environments. Recent work learns risk measures that estimate the probability of violating constraints and then uses them to enforce safety. However, learning such risk measures requires significant interaction with the environment, resulting in excessive constraint violations during learning. Furthermore, these measures do not transfer easily to new environments. We cast safe exploration as an offline meta-RL problem, where the objective is to leverage examples of safe and unsafe behavior across a range of environments to quickly adapt learned risk measures to a new environment with previously unseen dynamics. We then propose MEta-learning for Safe Adaptation (MESA), an approach for meta-learning a risk measure for safe RL. Simulation experiments across 5 continuous control domains suggest that MESA can leverage offline data from a range of different environments to reduce constraint violations in unseen environments by up to a factor of 2 while maintaining task performance. See https://tinyurl.com/safe-meta-rl for code and supplementary material.
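To make the idea of a learned risk measure concrete, here is a minimal tabular sketch in Python. It is not the MESA implementation: the abstract does not give the model, loss, or meta-learning procedure, so everything here is an assumption for illustration. The names `td_update`, `offline_data`, and `new_env_data`, the three-element transition format, and the hyperparameter values are all hypothetical. The sketch fits a risk value per state as the discounted probability of a future constraint violation from offline transitions, then adapts it with a few transitions from an environment whose dynamics differ. The adaptation step is plain pretrain-then-fine-tune, which stands in for meta-learning and is much simpler.

```python
from collections import defaultdict


def td_update(risk, transitions, gamma=0.9, lr=0.5, sweeps=50):
    """Fit risk[s] as a discounted probability of a future constraint violation.

    Assumed (hypothetical) data format: each transition is a tuple
    (state, next_state, violated), where `violated` is True when the step
    ends in a constraint violation.
    """
    for _ in range(sweeps):
        for state, next_state, violated in transitions:
            # Target is 1 on a violation, otherwise the discounted risk of the next state.
            target = 1.0 if violated else gamma * risk[next_state]
            risk[state] += lr * (target - risk[state])
    return risk


# Offline data pooled from training environments: state 1 leads to a violation.
offline_data = [(0, 1, False), (1, 2, True)]
pretrained = td_update(defaultdict(float), offline_data)

# A new environment with different dynamics: state 1 now leads to a safe state 3.
new_env_data = [(1, 3, False)]
adapted = td_update(defaultdict(float, pretrained), new_env_data, sweeps=5)

print("risk before adaptation:", dict(pretrained))
print("risk after adaptation: ", dict(adapted))
```

In this toy setting the risk estimate for state 1 drops after only a few updates on data from the new environment, while states not covered by the new data keep their pretrained values. A learned risk measure would typically replace the table with a function approximator over continuous states and actions.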

research
09/10/2022

Safe Reinforcement Learning with Contrastive Risk Prediction

As safety violations can lead to severe consequences in real-world robot...
research
10/29/2020

Recovery RL: Safe Reinforcement Learning with Learned Recovery Zones

Safety remains a central obstacle preventing widespread use of RL in the...
research
10/08/2021

Offline Meta-Reinforcement Learning for Industrial Insertion

Reinforcement learning (RL) can in principle make it possible for robots...
research
05/06/2020

Safe Reinforcement Learning through Meta-learned Instincts

An important goal in reinforcement learning is to create agents that can...
research
07/07/2020

Meta-active Learning in Probabilistically-Safe Optimization

Learning to control a safety-critical system with latent dynamics (e.g. ...
research
03/14/2022

Safe adaptation in multiagent competition

Achieving the capability of adapting to ever-changing environments is a ...
research
07/10/2021

LS3: Latent Space Safe Sets for Long-Horizon Visuomotor Control of Iterative Tasks

Reinforcement learning (RL) algorithms have shown impressive success in ...
