Safe Adaptive Learning-based Control for Constrained Linear Quadratic Regulators with Regret Guarantees

10/31/2021
by   Yingying Li, et al.

We study the adaptive control of an unknown linear system with a quadratic cost function, subject to safety constraints on both the states and the actions. The challenges of this problem arise from the tension among safety, exploration, performance, and computation. To address these challenges, we propose a polynomial-time algorithm that guarantees feasibility and constraint satisfaction with high probability under proper conditions. Our algorithm is implemented on a single trajectory and does not require system restarts. Further, we analyze the regret of our learning algorithm compared to the optimal safe linear controller with known model information. The proposed algorithm achieves Õ(T^(2/3)) regret, where T is the number of stages and Õ(·) absorbs logarithmic terms in T.
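The setting in the abstract can be illustrated with a toy scalar example. This is a minimal sketch only and not the paper's algorithm: the scalar dynamics x_{t+1} = a*x_t + b*u_t + w_t, the fixed gain `k`, the bounded exploration noise `eta`, the box constraints `x_max` and `u_max`, and the function names `simulate` and `least_squares` are all assumptions made for illustration. It runs one trajectory without restarts, checks the state and action constraints at every stage, accumulates the quadratic cost, and then estimates the unknown parameters by ordinary least squares.

```python
import random

def simulate(a, b, k, steps, x_max, u_max,
             q=1.0, r=1.0, w_max=0.05, eta=0.1, seed=0):
    """Run u_t = -k*x_t + exploration noise on x_{t+1} = a*x_t + b*u_t + w_t.

    All parameters are hypothetical; disturbance and exploration noise
    are drawn uniformly from bounded intervals.
    """
    rng = random.Random(seed)
    x, cost, safe = 0.0, 0.0, True
    data = []  # (x_t, u_t, x_{t+1}) triples from a single trajectory
    for _ in range(steps):
        u = -k * x + rng.uniform(-eta, eta)
        # Safety means both the state and the action stay inside their boxes.
        safe = safe and abs(x) <= x_max and abs(u) <= u_max
        cost += q * x * x + r * u * u
        x_next = a * x + b * u + rng.uniform(-w_max, w_max)
        data.append((x, u, x_next))
        x = x_next
    return data, cost, safe

def least_squares(data):
    """Solve the 2x2 normal equations for (a, b) in x_next ~ a*x + b*u."""
    sxx = sum(x * x for x, u, y in data)
    sxu = sum(x * u for x, u, y in data)
    suu = sum(u * u for x, u, y in data)
    sxy = sum(x * y for x, u, y in data)
    suy = sum(u * y for x, u, y in data)
    det = sxx * suu - sxu * sxu
    a_hat = (sxy * suu - suy * sxu) / det
    b_hat = (suy * sxx - sxy * sxu) / det
    return a_hat, b_hat

if __name__ == "__main__":
    data, cost, safe = simulate(a=0.9, b=0.5, k=0.8, steps=2000,
                                x_max=1.0, u_max=1.0)
    a_hat, b_hat = least_squares(data)
    print("constraints satisfied:", safe)
    print("estimates:", a_hat, b_hat)
    print("average cost per stage:", cost / len(data))
```

In this toy run the exploration noise is what makes the least-squares problem well posed, since without it u_t would be an exact multiple of x_t. The same noise also pushes the state toward the constraint boundary, which is a small instance of the tension between safety and exploration that the abstract mentions.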

research
05/23/2018

Regret Bounds for Robust Adaptive Control of the Linear Quadratic Regulator

We consider adaptive control of the Linear Quadratic Regulator (LQR), wh...
research
09/26/2018

Safely Learning to Control the Constrained Linear Quadratic Regulator

We study the constrained linear quadratic regulator with unknown dynamic...
research
01/05/2017

Learning local trajectories for high precision robotic tasks : application to KUKA LBR iiwa Cartesian positioning

To ease the development of robot learning in industry, two conditions ne...
research
01/18/2022

Safe Online Bid Optimization with Return-On-Investment and Budget Constraints subject to Uncertainty

In online marketing, the advertisers' goal is usually a tradeoff between...
research
06/19/2020

Learning Controllers for Unstable Linear Quadratic Regulators from a Single Trajectory

We present the first approach for learning – from a single trajectory – ...
research
11/14/2021

Safe Online Convex Optimization with Unknown Linear Safety Constraints

We study the problem of safe online convex optimization, where the actio...
research
06/17/2022

Thompson Sampling Achieves Õ(√(T)) Regret in Linear Quadratic Control

Thompson Sampling (TS) is an efficient method for decision-making under ...
