
Suboptimality analysis of receding horizon quadratic control with unknown linear systems and its applications in learning-based control

by Shengling Shi, et al.
ETH Zurich
Delft University of Technology

For a receding-horizon controller with a known system and an approximate terminal value function, it is well known that increasing the prediction horizon can improve control performance. However, when the prediction model is inexact, a larger prediction horizon also causes the prediction error to propagate and accumulate. In this work, we analyze how this trade-off among the modeling error, the terminal value function error, and the prediction horizon affects the performance of a nominal receding-horizon linear quadratic (LQ) controller. By developing a novel perturbation result for the Riccati difference equation, we obtain a performance upper bound which suggests that, in many cases, the prediction horizon should be either 1 or infinity to improve control performance, depending on the relative magnitudes of the modeling error and the terminal value function error. The obtained suboptimality bound is also applied to provide end-to-end performance guarantees, e.g., regret bounds, for nominal receding-horizon LQ controllers in a learning-based setting.
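The mechanism the abstract refers to can be sketched numerically: a nominal receding-horizon LQ controller computes its first-step feedback gain by iterating the Riccati difference equation backward over the prediction horizon, starting from an approximate terminal value matrix, using the (possibly inexact) prediction model. The sketch below is illustrative only, not the paper's method; the system matrices, the error magnitude `eps`, and the terminal matrix `P_term` are hypothetical choices made for the example.

```python
import numpy as np

def rh_lq_gain(A, B, Q, R, P_term, N):
    """First-step feedback gain of a receding-horizon LQ controller:
    iterate the Riccati difference equation backward N steps from the
    (approximate) terminal value matrix P_term, using model (A, B)."""
    P = P_term
    for _ in range(N):
        # K is computed from the current cost-to-go P; the last K computed
        # (after N-1 updates of P) is the gain applied at the first step.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ A - A.T @ P @ B @ K
    return K

def closed_loop_cost(A_true, B_true, K, Q, R, x0, T=200):
    """Approximate infinite-horizon cost of u = -K x on the TRUE system."""
    x, cost = x0, 0.0
    for _ in range(T):
        u = -K @ x
        cost += float(x.T @ Q @ x + u.T @ R @ u)
        x = A_true @ x + B_true @ u
    return cost

# True system vs. an inexact prediction model (hypothetical values).
A_true = np.array([[1.0, 0.2], [0.0, 1.0]])
B_true = np.array([[0.0], [1.0]])
eps = 0.05                              # modeling error magnitude
A_hat = A_true + eps * np.ones_like(A_true)
Q, R = np.eye(2), np.eye(1)
P_term = 10.0 * np.eye(2)               # approximate terminal value function
x0 = np.array([[1.0], [0.0]])

# Longer horizons refine the terminal-value approximation but let the
# modeling error in A_hat propagate further -- the trade-off analyzed above.
for N in (1, 5, 20):
    K = rh_lq_gain(A_hat, B_true, Q, R, P_term, N)
    print(f"N = {N:2d}:  closed-loop cost = {closed_loop_cost(A_true, B_true, K, Q, R, x0):.3f}")
```

Sweeping `eps` and `P_term` in this toy setup gives a feel for the regimes the bound distinguishes: when the terminal value error dominates, longer horizons help; when the modeling error dominates, short horizons can win.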
