Safe Continuous Control with Constrained Model-Based Policy Optimization

04/14/2021
by Moritz A. Zanger, et al.

Applying reinforcement learning (RL) algorithms in real-world domains often requires adherence to safety constraints, a need that is difficult to address given the asymptotic nature of the classic RL optimization objective. In contrast to the traditional RL objective, safe exploration considers the maximization of expected returns under safety constraints expressed in expected cost returns. We introduce a model-based safe exploration algorithm for constrained high-dimensional control to address the often prohibitively high sample complexity of model-free safe exploration algorithms. Further, we provide theoretical and empirical analyses of the implications of model usage for constrained policy optimization problems and introduce a practical algorithm that accelerates policy search with model-generated data. The need for accurate estimates of a policy's constraint satisfaction conflicts with accumulating model errors. We address this issue by quantifying model uncertainty as the expected Kullback-Leibler divergence between predictions of an ensemble of probabilistic dynamics models and constraining this error measure, which results in an adaptive resampling scheme and dynamically limited rollout horizons. We evaluate this approach on several simulated constrained robot locomotion tasks with high-dimensional action and state spaces. Our empirical studies find that our algorithm reaches model-free performance with a 10- to 20-fold reduction in training samples while maintaining approximately the constraint satisfaction levels of model-free methods.
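
To make the uncertainty measure in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes each ensemble member predicts a diagonal Gaussian over the next state, estimates disagreement as the average pairwise KL divergence between members, and cuts a model rollout once the accumulated disagreement exceeds a budget. The function names, the pairwise averaging, and the cumulative-budget rule (`kl_budget`, `max_horizon`) are illustrative assumptions; the abstract does not specify these details.

```python
import numpy as np


def gaussian_kl(mean_p, std_p, mean_q, std_q):
    """KL(p || q) between diagonal Gaussians, summed over the last axis."""
    var_p, var_q = std_p ** 2, std_q ** 2
    per_dim = (
        np.log(std_q / std_p)
        + (var_p + (mean_p - mean_q) ** 2) / (2.0 * var_q)
        - 0.5
    )
    return per_dim.sum(axis=-1)


def ensemble_disagreement(means, stds):
    """Average pairwise KL between ensemble members (assumed estimator).

    means, stds: arrays of shape (n_models, batch, state_dim).
    Returns an array of shape (batch,).
    """
    n_models = means.shape[0]
    total = np.zeros(means.shape[1])
    for i in range(n_models):
        for j in range(n_models):
            if i != j:
                total += gaussian_kl(means[i], stds[i], means[j], stds[j])
    return total / (n_models * (n_models - 1))


def truncated_rollout_length(disagreement_per_step, kl_budget, max_horizon):
    """Count the rollout steps whose cumulative disagreement stays within
    kl_budget, capped at max_horizon (hypothetical truncation rule)."""
    cumulative = np.cumsum(disagreement_per_step)
    within_budget = int(np.searchsorted(cumulative, kl_budget, side="right"))
    return min(within_budget, max_horizon)


# Two models that predict unit-variance Gaussians with means 0 and 1.
means = np.array([[[0.0]], [[1.0]]])
stds = np.ones_like(means)
print(ensemble_disagreement(means, stds))

# Per-step disagreement along one imagined rollout.
print(truncated_rollout_length(np.array([0.1, 0.2, 0.4, 0.8]), 0.75, 10))
```

Under these assumptions, states where the ensemble members agree permit longer model rollouts, while states with high disagreement shorten them, which is the qualitative behavior the abstract describes as dynamically limited rollout horizons.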

research
10/14/2022

Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm

During initial iterations of training in most Reinforcement Learning (RL...
research
08/25/2023

Learn With Imagination: Safe Set Guided State-wise Constrained Policy Optimization

Deep reinforcement learning (RL) excels in various control tasks, yet th...
research
06/16/2018

BaRC: Backward Reachability Curriculum for Robotic Reinforcement Learning

Model-free Reinforcement Learning (RL) offers an attractive approach to ...
research
10/15/2020

Safe Model-based Reinforcement Learning with Robust Cross-Entropy Method

This paper studies the safe reinforcement learning (RL) problem without ...
research
08/01/2020

Learning with Safety Constraints: Sample Complexity of Reinforcement Learning for Constrained MDPs

Many physical systems have underlying safety considerations that require...
research
03/05/2021

Automatic Exploration Process Adjustment for Safe Reinforcement Learning with Joint Chance Constraint Satisfaction

In reinforcement learning (RL) algorithms, exploratory control inputs ar...
research
07/21/2022

Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning

Optimizing noisy functions online, when evaluating the objective require...
