Safe Model-based Reinforcement Learning with Robust Cross-Entropy Method

10/15/2020
by Zuxin Liu, et al.

This paper studies the safe reinforcement learning (RL) problem without assumptions about prior knowledge of the system dynamics and the constraint function. We employ an uncertainty-aware neural network ensemble model to learn the dynamics, and we infer the unknown constraint function through indicator constraint violation signals. We use model predictive control (MPC) as the basic control framework and propose the robust cross-entropy method (RCE) to optimize the control sequence considering the model uncertainty and constraints. We evaluate our methods in the Safety Gym environment. The results show that our approach achieves better constraint satisfaction than baseline safe RL methods while maintaining good task performance. Additionally, we are able to achieve several orders of magnitude better sample efficiency when compared to constrained model-free RL approaches. The code is available at https://github.com/liuzuxin/safe-mbrl.
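The planning loop described above can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes a toy 1-D task, replaces the neural network ensemble with a hypothetical set of fixed drift terms (`DRIFTS`), and reads "robust" as scoring each candidate plan by its worst-case reward and worst-case violation count across the ensemble. The elite-selection rule (rank feasible plans by reward when enough exist, otherwise rank all plans by fewest violations) is likewise an assumption, borrowed from the general constrained cross-entropy idea. All names and constants are illustrative.

```python
import numpy as np

GOAL, LIMIT = 1.0, 0.8                   # toy task: approach GOAL, never exceed LIMIT
DRIFTS = np.array([-0.02, 0.0, 0.02])    # stand-in "ensemble": one drift per model


def evaluate(plans, x0=0.0):
    """Roll out each plan under every ensemble member.

    plans: (N, H) array of action sequences.
    Returns worst-case reward and worst-case violation count, each of shape (N,).
    """
    # steps[m, n, t] = action t of plan n plus the drift of model m
    steps = plans[None, :, :] + DRIFTS[:, None, None]
    positions = x0 + np.cumsum(steps, axis=2)
    rewards = -np.abs(positions - GOAL).sum(axis=2)    # (M, N)
    violations = (positions > LIMIT).sum(axis=2)       # (M, N)
    return rewards.min(axis=0), violations.max(axis=0)


def robust_cem(horizon=5, pop=200, n_elite=20, iters=30, seed=0):
    """Return the best plan found that is safe under every ensemble member."""
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(horizon), 0.5 * np.ones(horizon)
    best, best_reward = None, -np.inf
    for _ in range(iters):
        plans = np.clip(rng.normal(mean, std, size=(pop, horizon)), -1.0, 1.0)
        reward, cost = evaluate(plans)
        feasible = np.flatnonzero(cost == 0)
        if feasible.size > 0:
            # remember the best plan that no model predicts to be unsafe
            i = feasible[np.argmax(reward[feasible])]
            if reward[i] > best_reward:
                best, best_reward = plans[i].copy(), reward[i]
        if feasible.size >= n_elite:
            # enough safe plans: rank the feasible ones by worst-case reward
            elite = feasible[np.argsort(-reward[feasible])[:n_elite]]
        else:
            # too few safe plans: rank all plans by fewest worst-case violations
            elite = np.argsort(cost, kind="stable")[:n_elite]
        # refit the sampling distribution to the elites (small floor on std)
        mean = plans[elite].mean(axis=0)
        std = plans[elite].std(axis=0) + 1e-3
    return best
```

In an MPC setting, only the first action of the returned plan would be executed before re-planning from the next observed state. Taking the worst case over ensemble members is one conservative way to account for model uncertainty; averaging over members would be a less cautious alternative.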


