Conservative Safety Critics for Exploration

10/27/2020
by Homanga Bharadhwaj, et al.

Safe exploration presents a major challenge in reinforcement learning (RL): when active data collection requires deploying partially trained policies, we must ensure that these policies avoid catastrophically unsafe regions while still enabling trial-and-error learning. In this paper, we target the problem of safe exploration in RL by learning a conservative safety estimate of environment states through a critic, and provably upper bound the likelihood of catastrophic failures at every training iteration. We theoretically characterize the tradeoff between safety and policy improvement, show that the safety constraints are satisfied with high probability during training, and derive provable convergence guarantees for our approach, which is asymptotically no worse than standard RL. We demonstrate the efficacy of the proposed approach on a suite of challenging navigation, manipulation, and locomotion tasks. Empirically, we show that it achieves competitive task performance while incurring significantly lower catastrophic failure rates during training than prior methods. Videos are available at https://sites.google.com/view/conservative-safety-critics/home

