Optimal Dynamic Regret in LQR Control

06/18/2022 ∙ by Dheeraj Baby, et al.

We consider the problem of nonstochastic control with a sequence of quadratic losses, i.e., LQR control. We provide an efficient online algorithm that achieves an optimal dynamic (policy) regret of Õ(max{n^{1/3} 𝒯𝒱(M_{1:n})^{2/3}, 1}), where 𝒯𝒱(M_{1:n}) is the total variation of any oracle sequence of Disturbance Action policies parameterized by M_1,...,M_n, chosen in hindsight to cater to unknown nonstationarity. This rate improves on the best known rate of Õ(√(n (𝒯𝒱(M_{1:n})+1))) for general convex losses, and we prove that it is information-theoretically optimal for LQR. The main technical components include the reduction of LQR to online linear regression with delayed feedback due to Foster and Simchowitz (2020), as well as a new proper learning algorithm with an optimal Õ(n^{1/3}) dynamic regret on a family of "minibatched" quadratic losses, which could be of independent interest.
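To make the comparison of the two rates concrete, the following is a minimal sketch that evaluates both bounds numerically, ignoring the constants and logarithmic factors hidden by Õ. The function names are hypothetical, and the scalar `total_variation` helper is only an illustrative stand-in: the abstract does not state which norm is used on the policy parameters M_t.

```python
import math


def total_variation(seq):
    """Sum of absolute successive differences of a scalar sequence.

    Assumption: a scalar stand-in for TV(M_{1:n}); the norm used on the
    policy parameters M_t is not specified in the abstract.
    """
    return sum(abs(b - a) for a, b in zip(seq, seq[1:]))


def lqr_rate(n, tv):
    """max{n^(1/3) * TV^(2/3), 1}, with constants and log factors dropped."""
    return max(n ** (1.0 / 3.0) * tv ** (2.0 / 3.0), 1.0)


def convex_rate(n, tv):
    """sqrt(n * (TV + 1)), with constants and log factors dropped."""
    return math.sqrt(n * (tv + 1.0))


if __name__ == "__main__":
    n = 1000
    # A piecewise-constant comparator that jumps once, so its TV is 1.
    comparator = [0.0] * (n // 2) + [1.0] * (n // 2)
    tv = total_variation(comparator)
    print("TV:", tv)
    print("LQR rate (up to logs):", lqr_rate(n, tv))
    print("General convex rate (up to logs):", convex_rate(n, tv))
```

With a stationary comparator (TV = 0), the first bound collapses to a constant while the second still grows as √n, which is one way to read the improvement claimed above.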


research
∙ 03/19/2019

Online Non-Convex Learning: Following the Perturbed Leader is Optimal

We study the problem of online learning with non-convex losses, where th...
research
∙ 11/06/2021

Dynamic Regret Minimization for Control of Non-stationary Linear Dynamical Systems

We consider the problem of controlling a Linear Quadratic Regulator (LQR...
research
∙ 01/07/2013

Dynamical Models and Tracking Regret in Online Convex Programming

This paper describes a new online convex optimization method which incor...
research
∙ 09/30/2020

Adaptive Online Estimation of Piecewise Polynomial Trends

We consider the framework of non-stationary stochastic optimization [Bes...
research
∙ 06/08/2019

Online Forecasting of Total-Variation-bounded Sequences

We consider the problem of online forecasting of sequences of length n w...
research
∙ 10/19/2019

On Adaptivity in Information-constrained Online Learning

We study how to adapt to smoothly-varying (`easy') environments in well-...
research
∙ 10/17/2022

Regret Bounds for Learning Decentralized Linear Quadratic Regulator with Partially Nested Information Structure

We study the problem of learning decentralized linear quadratic regulato...
