Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning

12/14/2021
by Yecheng Jason Ma, et al.

Reinforcement Learning (RL) agents in the real world must satisfy safety constraints in addition to maximizing a reward objective. Model-based RL algorithms hold promise for reducing unsafe real-world actions: they can synthesize policies that obey all constraints using simulated samples from a learned model. However, imperfect models can result in real-world constraint violations even for actions that are predicted to satisfy all constraints. We propose Conservative and Adaptive Penalty (CAP), a model-based safe RL framework that accounts for potential modeling errors by capturing model uncertainty and adaptively exploiting it to balance the reward and the cost objectives. First, CAP inflates predicted costs using an uncertainty-based penalty. Theoretically, we show that policies satisfying this conservative cost constraint are guaranteed to also be feasible in the true environment, and that this in turn guarantees the safety of all intermediate solutions during RL training. Second, CAP adaptively tunes this penalty during training using true cost feedback from the environment. We evaluate this conservative and adaptive penalty-based approach for model-based safe RL extensively on state- and image-based environments. Our results demonstrate substantial gains in sample efficiency while incurring fewer constraint violations than prior safe RL algorithms. Code is available at: https://github.com/Redrew/CAP
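The abstract describes two mechanisms: inflating model-predicted costs by an uncertainty-based penalty, and adapting that penalty coefficient from true cost feedback. The following Python sketch illustrates those two steps only; it is not the released CAP implementation, and the names kappa, cost_limit, and learning_rate are illustrative assumptions.

import numpy as np

# Minimal sketch (not the authors' code) of CAP's two ingredients:
# (1) a conservative cost that inflates model-predicted costs by an
#     uncertainty-based penalty, and
# (2) an adaptive update of the penalty coefficient kappa driven by
#     true cost feedback from the environment.

def conservative_cost(predicted_cost, model_uncertainty, kappa):
    """Inflate the model's predicted cost by kappa times its uncertainty."""
    return predicted_cost + kappa * model_uncertainty

def update_kappa(kappa, observed_episode_cost, cost_limit, learning_rate=0.01):
    """Adapt kappa from true cost feedback: increase it when the real
    environment cost exceeds the limit (the model was too optimistic),
    decrease it otherwise; keep it non-negative."""
    violation = observed_episode_cost - cost_limit
    return max(0.0, kappa + learning_rate * violation)

# Toy usage: plan with conservative costs, then adapt kappa after a rollout.
kappa = 1.0
predicted_costs = np.array([0.2, 0.5, 0.1])    # from the learned model
uncertainties = np.array([0.05, 0.20, 0.02])   # e.g., ensemble disagreement
print(conservative_cost(predicted_costs, uncertainties, kappa))

observed_episode_cost = 12.0                   # true cost from the environment
cost_limit = 10.0
kappa = update_kappa(kappa, observed_episode_cost, cost_limit)
print(kappa)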

Related research

03/24/2023 - Safe and Sample-efficient Reinforcement Learning for Clustered Dynamic Environments
This study proposes a safe and sample-efficient reinforcement learning (...

10/15/2020 - Safe Model-based Reinforcement Learning with Robust Cross-Entropy Method
This paper studies the safe reinforcement learning (RL) problem without ...

02/20/2020 - From Stateless to Stateful Priorities: Technical Report
We present the notion of stateful priorities for imposing precise restri...

10/14/2022 - Safe Model-Based Reinforcement Learning with an Uncertainty-Aware Reachability Certificate
Safe reinforcement learning (RL) that solves constraint-satisfactory pol...

04/18/2023 - Safe reinforcement learning with self-improving hard constraints for multi-energy management systems
Safe reinforcement learning (RL) with hard constraint guarantees is a pr...

08/26/2021 - Model-based Chance-Constrained Reinforcement Learning via Separated Proportional-Integral Lagrangian
Safety is essential for reinforcement learning (RL) applied in the real ...

10/27/2020 - Conservative Safety Critics for Exploration
Safe exploration presents a major challenge in reinforcement learning (R...
