Regret Bounds for Learning Decentralized Linear Quadratic Regulator with Partially Nested Information Structure

10/17/2022
by Lintao Ye, et al.
We study the problem of learning a decentralized linear quadratic regulator under a partially nested information constraint when the system model is unknown a priori. We propose an online learning algorithm that adaptively designs a control policy as new data samples from a single system trajectory become available. The algorithm design combines a disturbance-feedback representation of state-feedback controllers with online convex optimization with memory and delayed feedback. We show that the online algorithm yields a controller that satisfies the desired information constraint and achieves an expected regret that scales as √T with the time horizon T.
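To make the phrase "disturbance-feedback representation" concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a known model (A, B), a centralized controller with no information constraint, and a fixed gain sequence M; the function names and the example matrices are hypothetical and chosen only for illustration. The control input is a linear function of past disturbances, which are recovered from the observed states.

```python
import numpy as np

def rollout_disturbance_feedback(A, B, M, w):
    """Simulate x_{t+1} = A x_t + B u_t + w_t with u_t = sum_k M[k] @ w_{t-1-k}.

    Assumption: (A, B) are known here; in the learning setting they would
    be replaced by estimates.
    """
    T, n = w.shape
    m = B.shape[1]
    x = np.zeros((T + 1, n))
    u = np.zeros((T, m))
    w_hat = np.zeros((T, n))  # disturbances recovered from observed states
    for t in range(T):
        # The input depends only on disturbances recovered before time t.
        for k in range(len(M)):
            if t - 1 - k >= 0:
                u[t] += M[k] @ w_hat[t - 1 - k]
        x[t + 1] = A @ x[t] + B @ u[t] + w[t]
        # Recover w_t from the state transition that was just observed.
        w_hat[t] = x[t + 1] - A @ x[t] - B @ u[t]
    return x, u, w_hat

def quadratic_cost(x, u, Q, R):
    """Sum of stage costs x_t' Q x_t + u_t' R u_t over the horizon."""
    return sum(x[t] @ Q @ x[t] + u[t] @ R @ u[t] for t in range(len(u)))

# Hypothetical two-state example with B = I and a single gain M[0] = -A,
# so each input cancels the propagated effect of the previous disturbance.
rng = np.random.default_rng(0)
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.eye(2)
M = [-A]
w = rng.normal(size=(50, 2))
x, u, w_hat = rollout_disturbance_feedback(A, B, M, w)
cost = quadratic_cost(x, u, np.eye(2), np.eye(2))
```

Because the input is linear in the gains M and the stage cost is quadratic, the cost is convex in M, which is what makes online convex optimization applicable. An information constraint could be imposed in a sketch like this by restricting which entries of each M[k] are allowed to be nonzero, though the specific structure used in the paper is not given in the abstract.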
