Online Control of Unknown Time-Varying Dynamical Systems

by Edgar Minasyan, et al.

We study online control of time-varying linear systems with unknown dynamics in the nonstochastic control model. At a high level, we demonstrate that this setting is qualitatively harder than that of either unknown time-invariant or known time-varying dynamics, and complement our negative results with algorithmic upper bounds in regimes where sublinear regret is possible. More specifically, we study regret bounds with respect to common classes of policies: Disturbance Action (SLS), Disturbance Response (Youla), and linear feedback policies. While these three classes are essentially equivalent for LTI systems, we demonstrate that these equivalences break down for time-varying systems. We prove a lower bound that no algorithm can obtain sublinear regret with respect to the first two classes unless a certain measure of system variability also scales sublinearly in the horizon. Furthermore, we show that offline planning over the state linear feedback policies is NP-hard, suggesting hardness of the online learning problem. On the positive side, we give an efficient algorithm that attains a sublinear regret bound against the class of Disturbance Response policies up to the aforementioned system variability term. In fact, our algorithm enjoys sublinear adaptive regret bounds, which is a strictly stronger metric than standard regret and is more appropriate for time-varying systems. We sketch extensions to Disturbance Action policies and partial observation, and propose an inefficient algorithm for regret against linear state feedback policies.
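
For readers unfamiliar with the metric, the relation between standard and adaptive regret can be sketched as below. The notation ($c_t$ for the per-step cost, $(x_t, u_t)$ for state and control, $\Pi$ for the comparator policy class, $T$ for the horizon) is assumed here for illustration and is not taken from the paper, which may define these quantities differently.

```latex
% Standard regret over the full horizon T against a comparator class \Pi
% (illustrative notation, not the paper's)
\mathrm{Regret}_T
  = \sum_{t=1}^{T} c_t\bigl(x_t^{\mathrm{alg}}, u_t^{\mathrm{alg}}\bigr)
  - \min_{\pi \in \Pi} \sum_{t=1}^{T} c_t\bigl(x_t^{\pi}, u_t^{\pi}\bigr)

% Adaptive regret: the worst regret over any contiguous interval [r, s],
% with the best comparator chosen separately for each interval
\mathrm{AdaptiveRegret}_T
  = \max_{1 \le r \le s \le T}
    \left(
      \sum_{t=r}^{s} c_t\bigl(x_t^{\mathrm{alg}}, u_t^{\mathrm{alg}}\bigr)
      - \min_{\pi \in \Pi} \sum_{t=r}^{s} c_t\bigl(x_t^{\pi}, u_t^{\pi}\bigr)
    \right)
```

Taking $r = 1$ and $s = T$ recovers standard regret, so any bound on adaptive regret implies the same bound on standard regret. Because the comparator may change from one interval to the next, the adaptive metric tracks a system whose best policy drifts over time.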




Adaptive Regret for Control of Time-Varying Dynamics

We consider regret minimization for online control with time-varying lin...

Black-Box Control for Linear Dynamical Systems

We consider the problem of controlling an unknown linear time-invariant ...

Learning to Control under Time-Varying Environment

This paper investigates the problem of regret minimization in linear tim...

Implications of Regret on Stability of Linear Dynamical Systems

The setting of an agent making decisions under uncertainty and under dyn...

Non-asymptotic System Identification for Linear Systems with Nonlinear Policies

This paper considers a single-trajectory system identification problem f...

Augmented Lagrangian Methods for Time-varying Constrained Online Convex Optimization

In this paper, we consider online convex optimization (OCO) with time-va...

Towards a Dimension-Free Understanding of Adaptive Linear Control

We study the problem of adaptive control of the linear quadratic regulat...
