Safe Learning-based Gradient-free Model Predictive Control Based on Cross-entropy Method

by Lei Zheng, et al.

This paper proposes a safe, learning-based model predictive control (MPC) framework for optimizing nonlinear systems with a gradient-free objective function under uncertain environmental disturbances. The framework combines a learning-based MPC with an auxiliary controller that intervenes minimally. The learning-based MPC augments the prior nominal model with incremental Gaussian processes to learn the uncertain disturbances. The cross-entropy method (CEM) serves as the sampling-based optimizer for the MPC with a gradient-free objective function. The minimal-intervention controller is built from a control Lyapunov function and a control barrier function; it guides the sampling process and keeps the system safe with high probability. On a simulated quadrotor, the proposed algorithm achieves safe and adaptive control in trajectory tracking and obstacle avoidance tasks under uncertain wind disturbances.
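To illustrate how CEM can act as a sampling-based optimizer over a control sequence, here is a minimal sketch. It is not the authors' implementation: it leaves out the Gaussian process model and the Lyapunov/barrier-guided sampling entirely. The function names (`cem_minimize`, `rollout_cost`), the toy single-integrator dynamics, the non-smooth cost, and all parameter values are assumptions chosen only for illustration.

```python
import random


def cem_minimize(cost, dim, iters=30, pop=200, n_elite=20, init_std=1.0, seed=0):
    """Minimize cost(u) over u in R^dim with a basic cross-entropy method.

    Hypothetical helper for illustration; uses a diagonal Gaussian.
    """
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [init_std] * dim
    for _ in range(iters):
        # Sample candidate control sequences from the current distribution.
        samples = [
            [rng.gauss(m, s) for m, s in zip(mean, std)] for _ in range(pop)
        ]
        # Keep the lowest-cost candidates (the elite set); no gradients needed.
        elites = sorted(samples, key=cost)[:n_elite]
        # Refit the mean and standard deviation to the elites.
        mean = [sum(e[i] for e in elites) / n_elite for i in range(dim)]
        std = [
            (sum((e[i] - mean[i]) ** 2 for e in elites) / n_elite) ** 0.5 + 1e-6
            for i in range(dim)
        ]
    return mean


def rollout_cost(u_seq, x0=0.0, target=1.0, dt=0.1):
    """Assumed toy model: 1-D single integrator x' = u with an absolute-value cost."""
    x, total = x0, 0.0
    for u in u_seq:
        x = x + dt * u
        total += abs(x - target) + 0.01 * abs(u)
    return total


# Plan a 5-step control sequence that drives x from 0 towards 1.
plan = cem_minimize(rollout_cost, dim=5)
```

In a receding-horizon loop, only the first element of `plan` would be applied before re-planning from the new state. The optimizer only ever evaluates `cost`, which is why this style of search suits objectives without usable gradients.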



