Competitive Control

07/28/2021 ∙ Gautam Goel, et al. ∙ California Institute of Technology

We consider control from the perspective of competitive analysis. Unlike much prior work on learning-based control, which focuses on minimizing regret against the best controller selected in hindsight from some specific class, we focus on designing an online controller which competes against a clairvoyant offline optimal controller. A natural performance metric in this setting is the competitive ratio, which is the ratio between the cost incurred by the online controller and the cost incurred by the offline optimal controller. Using operator-theoretic techniques from robust control, we derive a computationally efficient state-space description of the controller with optimal competitive ratio in both finite-horizon and infinite-horizon settings. We extend competitive control to nonlinear systems using Model Predictive Control (MPC) and present numerical experiments which show that our competitive controller can significantly outperform standard $H_2$ and $H_\infty$ controllers in the MPC setting.


I Introduction

The central question in control theory is how to regulate the behavior of an evolving system perturbed by an external disturbance by dynamically adjusting a control signal. Traditionally, controllers have been designed to optimize performance under the assumption that the disturbance is drawn from some specific class of disturbances. For example, in $H_2$ control the disturbance is assumed to be generated by a stochastic process and the controller is designed to minimize the expected cost, while in $H_\infty$ control the disturbance is assumed to be generated adversarially and the controller is designed to minimize the worst-case cost. This approach suffers from an obvious drawback: if the controller encounters a disturbance which falls outside of the class the controller was designed to handle, the controller's performance may be poor. In fact, the loss in performance can be arbitrarily large, as shown in [4].

This observation naturally motivates the design of adaptive controllers which dynamically adjust their control strategy as they sequentially observe the disturbances instead of blindly following a prescribed strategy. The design of such controllers has attracted much recent attention in the online learning community (e.g. [1, 14, 5]), mostly from the perspective of policy regret. In this framework, the online controller is designed to minimize regret against the best controller selected in hindsight from some time-invariant comparator class, such as the class of static state-feedback policies or the class of disturbance-action policies introduced in [1]. The resulting controllers are adaptive in the sense that they seek to minimize cost without making a priori assumptions about how the disturbances are generated.

In this paper, we take a somewhat different approach to adaptive control: we focus on designing a controller which minimizes the competitive ratio

$$\sup_{w} \frac{\mathrm{ALG}(w)}{\mathrm{OPT}(w)},$$

where $\mathrm{ALG}(w)$ is the control cost incurred by the online controller in response to the disturbance $w$ and $\mathrm{OPT}(w)$ is the cost incurred by a clairvoyant offline optimal controller. The clairvoyant offline optimal controller is the controller which selects the globally optimal sequence of control actions given perfect knowledge of the disturbance in advance; the cost incurred by the offline optimal controller is a lower bound on the cost incurred by any controller, causal or noncausal. A controller whose competitive ratio is bounded above by $\gamma$ offers the following guarantee: the cost it incurs is always at most a factor of $\gamma$ higher than the cost that could have been counterfactually incurred by any other controller, irrespective of how the disturbance is generated. Competitive ratio is a multiplicative analog of dynamic regret; the problem of obtaining controllers with optimal dynamic regret was recently considered in [8, 17, 9].

We emphasize the key distinction between policy regret and competitive ratio: policy regret compares the performance of the online controller to the best fixed controller selected in hindsight from some class, whereas competitive ratio compares the performance of the online controller to the optimal dynamic sequence of control actions, without reference to any specific class of controllers. We believe the competitive ratio formulation of online control we consider in this paper compares favorably to the policy regret formulation in two ways. First, it is more general: instead of imposing a priori some parametric structure on the controller we learn (e.g. state-feedback policies, disturbance-action policies, etc.), which may or may not be appropriate for the given control task, we compete with the globally optimal clairvoyant controller, with no artificial constraints. Second, and more importantly, the controllers we obtain are more robust to changes in the environment. Consider, for example, a scenario in which the disturbances are picked from a probability distribution whose mean varies over time. When the mean is near zero, an $H_2$ controller will perform well, since $H_2$ controllers are tuned for zero-mean stochastic noise. Conversely, when the mean is far from zero, an $H_\infty$ controller will perform well, since $H_\infty$ controllers are designed to be robust to large disturbances. No fixed controller will perform well over the entire time horizon, and hence any online algorithm which tries to converge to a single, time-invariant controller will incur high cumulative cost. A controller which competes against the optimal dynamic sequence of control actions, however, is not constrained to converge to any fixed controller, and hence can potentially outperform standard regret-minimizing control algorithms when the environment is non-stationary.

I-A Contributions of this paper

We derive the controller with optimal competitive ratio, resolving an open problem in the learning and control literature first posed in [11]. Our competitive controller is a drop-in replacement for standard $H_2$ and $H_\infty$ controllers and can be used anywhere these controllers are used; it also uses the same computational resources as the $H_\infty$-optimal controller, up to a constant factor. The key idea in our derivation is to reduce competitive control to $H_\infty$ control. Given an $n$-dimensional linear dynamical system driven by a disturbance $w$, we show how to construct a synthetic $2n$-dimensional linear system and a synthetic disturbance $\hat{w}$ such that the $H_\infty$-optimal controller in the synthetic system driven by $\hat{w}$ selects the control actions which minimize competitive ratio in the original system.

We synthesize the competitive controller in a linearized Boeing 747 flight control system; in this system, our competitive controller obtains competitive ratio 1.77. In other words, it is guaranteed to incur at most 77% more cost than the clairvoyant offline optimal controller, irrespective of how the input disturbance is generated. Numerical experiments show that the competitive controller exhibits "best-of-both-worlds" behavior, often beating standard $H_2$ and $H_\infty$ controllers on best-case and average-case input disturbances while maintaining a bounded loss in performance even in the worst case. We also extend our competitive control framework to nonlinear systems using Model Predictive Control (MPC). Experiments in a nonlinear system show that the competitive controller consistently outperforms standard $H_2$ and $H_\infty$ controllers across a wide variety of input disturbances, often by a large margin.

Our results can be viewed as injecting adaptivity and learning into traditional robust control; instead of designing controllers which blindly minimize worst-case cost irrespective of the disturbance sequence they encounter, we show how to extend $H_\infty$ control to obtain controllers which dynamically adapt to the disturbance sequence by minimizing competitive ratio.

I-B Related work

Integrating ideas from machine learning into control has attracted much recent attention across several distinct settings. In the "non-stochastic control" setting proposed in [14], the online controller seeks to minimize regret against the class of disturbance-action policies in the face of adversarially generated disturbances. A sublinear regret bound was given in [14]; this was subsequently improved in [1] and in [5]. These works focus on minimizing regret against a fixed controller from some parametric class of control policies (policy regret); a parallel line of work studies the problem of designing an online controller which minimizes regret against a time-varying comparator class (dynamic regret). Dynamic regret is closely related to the competitive ratio we consider in this paper, except that it is the difference between the costs of the online and offline controllers, rather than their ratio. The problem of designing controllers with optimal dynamic regret was studied in the finite-horizon, time-varying setting in [8], in the infinite-horizon LTI setting in [17], and in the measurement-feedback setting in [9]. Gradient-based algorithms with low dynamic regret against the class of disturbance-action policies were obtained in [12, 19].

In this paper, we design controllers through the lens of competitive analysis, i.e. we seek to design online algorithms which compete against a clairvoyant offline algorithm. This idea has a rich history in theoretical computer science and we refer to [2] for an overview. In [11], Goel and Wierman showed that competitive ratio guarantees in a narrow class of linear-quadratic (LQ) systems could be obtained using the Online Balanced Descent (OBD) framework proposed in [3]. A series of papers [10, 18] extended this reduction; a similar reduction was explored in [6] in the context of multi-timescale control. We emphasize that all prior work failed to obtain a controller with optimal competitive ratio and relied on nonstandard structural assumptions about the dynamics; for example, [18] assumes that the disturbance affects the control input rather than the state. This paper is the first to obtain controllers with optimal competitive ratio in general LQ systems, in both finite-horizon and infinite-horizon settings.

II Preliminaries

In this paper we consider the design of competitive controllers in the context of linear-quadratic (LQ) control. This problem is generally studied in two distinct settings: finite-horizon control in time-varying systems and infinite-horizon control in linear time-invariant (LTI) systems. We briefly review each in turn:

Finite-horizon Control. In this setting, the dynamics are given by the linear evolution equation

$$x_{t+1} = A_t x_t + B_t u_t + w_t. \tag{1}$$

Here $x_t$ is a state variable we seek to regulate, $u_t$ is a control variable which we can dynamically adjust to influence the evolution of the system, and $w_t$ is an external disturbance. We focus on control over a finite horizon $T$ and often stack the states, controls, and disturbances into vectors $x$, $u$, $w$. We assume for notational convenience the initial condition $x_0 = 0$, though it is trivial to extend our results to arbitrary initialization. We formulate control as an online optimization problem, where the goal is to select the control actions $u_t$ so as to minimize the quadratic cost

$$\sum_{t} \left( x_t^* Q_t x_t + u_t^* R_t u_t \right), \tag{2}$$

where $Q_t \succeq 0$ and $R_t \succ 0$ for all $t$. We assume that the dynamics and costs are known, so the only uncertainty in the evolution of the system comes from the external disturbance $w$. For notational convenience, we assume that the system is parameterized such that $R_t = I$ for all $t$; we emphasize that this imposes no real restriction, since whenever $R_t \succ 0$ we can always rescale so that $R_t = I$. More precisely, we can define $u_t' = R_t^{1/2} u_t$ and $B_t' = B_t R_t^{-1/2}$; with this reparameterization, the evolution equation (1) becomes

$$x_{t+1} = A_t x_t + B_t' u_t' + w_t,$$

while the state costs appearing in (2) remain unchanged and the control costs are all equal to the identity. This choice of parametrization greatly simplifies notation and is common in the control literature, see e.g. [13].

Infinite-horizon Control. In this setting, the dynamics are given by the time-invariant linear evolution equation

$$x_{t+1} = A x_t + B u_t + w_t,$$

where $x_t \in \mathbb{R}^n$, $u_t \in \mathbb{R}^m$, and $w_t \in \mathbb{R}^n$. We focus on control over a doubly-infinite horizon and often use the notation $x = \{x_t\}_{t \in \mathbb{Z}}$, $u = \{u_t\}_{t \in \mathbb{Z}}$, $w = \{w_t\}_{t \in \mathbb{Z}}$. We define the energy of a disturbance $w$ to be

$$\|w\|^2 = \sum_{t = -\infty}^{\infty} \|w_t\|^2.$$

As in the finite-horizon setting, we formulate control as an optimization problem, where the goal is to select the control actions $u_t$ so as to minimize the quadratic cost

$$\sum_{t = -\infty}^{\infty} \left( x_t^* Q x_t + u_t^* R u_t \right), \tag{3}$$

where $Q \succeq 0$ and $R \succ 0$; as in the finite-horizon setting, we assume without loss of generality that the system is parameterized so that $R = I$. We assume $A$, $B$, and $Q$ are known in advance, so the only uncertainty in the evolution of the system comes from the external disturbance $w$.

We distinguish between several different kinds of information patterns that may be available to a controller. We say a controller is causal if in each timestep it is able to observe all previous disturbances up to and including the current timestep, i.e. $u_t = \pi_t(w_0, \ldots, w_t)$ for some function $\pi_t$. Similarly, a controller is strictly causal if in each timestep it is able to observe all previous disturbances up to but not including the current timestep, i.e. $u_t = \pi_t(w_0, \ldots, w_{t-1})$. We often use the term online to describe causal or strictly causal controllers. A controller is noncausal if it is not causal; in particular, the clairvoyant offline optimal controller (sometimes called the noncausal controller) selects the control actions in each timestep with access to the full disturbance sequence so as to minimize the cost (2), in the finite-horizon setting, or (3), in the infinite-horizon setting.
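
In the finite-horizon setting these information patterns have a concrete matrix interpretation, which the numerical sketches below rely on (a sketch of ours, not from the paper): a linear causal controller is a block lower-triangular map from disturbances to controls, and a strictly causal controller additionally has zero diagonal blocks.

```python
import numpy as np

def causal_mask(T, m, n, strict=False):
    """Sparsity pattern of a linear (strictly) causal controller u = K w
    over horizon T: block (t, k) of K may be nonzero only for k <= t
    (causal) or k < t (strictly causal), so K is block lower triangular."""
    M = np.zeros((T * m, T * n))
    for t in range(T):
        k_max = t if strict else t + 1
        M[t * m:(t + 1) * m, :k_max * n] = 1.0
    return M
```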

As is standard in the input-output approach to control, we encode controllers as linear transfer operators mapping the disturbances to the quadratic cost we wish to minimize. Define $s_t = Q_t^{1/2} x_t$. With this notation, the quadratic costs (2) and (3) can be written in a very simple form:

$$\|s\|^2 + \|u\|^2.$$

The dynamics (1) are captured by the relation

$$s = Fu + Gw,$$

where $F$ and $G$ are strictly causal operators encoding $(A_t, B_t, Q_t)$ in the finite-horizon setting and $(A, B, Q)$ in the infinite-horizon setting. We refer the reader to [13] for more background on transfer operators and the input-output approach to control.
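
To make the operator picture concrete, the following sketch (ours, not from the paper; it assumes the dynamics (1) with $x_0 = 0$ and $R_t = I$, and the helper name is hypothetical) assembles $F$ and $G$ as block lower-triangular matrices and checks $s = Fu + Gw$ against a direct simulation.

```python
import numpy as np

def build_operators(A, B, Qsqrt):
    """Assemble F, G with s = F u + G w for the dynamics
    x_{t+1} = A_t x_t + B_t u_t + w_t, x_0 = 0, and s_t = Q_t^{1/2} x_t.
    A, B, Qsqrt are lists of length T; s stacks s_1..s_T, u stacks
    u_0..u_{T-1}, w stacks w_0..w_{T-1}. Both operators are strictly
    causal in time: s_t depends only on inputs at times before t."""
    T, n, m = len(A), A[0].shape[0], B[0].shape[1]
    F = np.zeros((T * n, T * m))
    G = np.zeros((T * n, T * n))
    for t in range(1, T + 1):              # block row t-1 carries s_t
        Phi = np.eye(n)                    # state-transition product
        for k in range(t - 1, -1, -1):     # contribution of time-k inputs
            rows = slice((t - 1) * n, t * n)
            F[rows, k * m:(k + 1) * m] = Qsqrt[t - 1] @ Phi @ B[k]
            G[rows, k * n:(k + 1) * n] = Qsqrt[t - 1] @ Phi
            Phi = Phi @ A[k]
    return F, G

# consistency check against a direct simulation of the dynamics
rng = np.random.default_rng(0)
T, n, m = 5, 3, 2
A = [rng.standard_normal((n, n)) for _ in range(T)]
B = [rng.standard_normal((n, m)) for _ in range(T)]
Qsqrt = [np.eye(n) for _ in range(T)]
F, G = build_operators(A, B, Qsqrt)
u, w = rng.standard_normal(T * m), rng.standard_normal(T * n)
x, s = np.zeros(n), []
for t in range(T):
    x = A[t] @ x + B[t] @ u[t * m:(t + 1) * m] + w[t * n:(t + 1) * n]
    s.append(Qsqrt[t] @ x)
assert np.allclose(F @ u + G @ w, np.concatenate(s))
```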

II-A Competitive Control

The central focus of this paper is designing a controller with optimal competitive ratio:

Problem 1 (Competitive control).

Find an online controller which minimizes the competitive ratio

$$\sup_{w} \frac{\mathrm{ALG}(w)}{\mathrm{OPT}(w)},$$

where $\mathrm{ALG}(w)$ is the cost incurred by the online controller in response to the disturbance $w$ and $\mathrm{OPT}(w)$ is the cost incurred by the clairvoyant offline optimal controller.

This problem can be studied in both the finite-horizon setting and the infinite-horizon setting; in the infinite-horizon setting we assume $w$ has bounded energy. The offline optimal controller has a well-known description at the level of transfer operators (Theorem 11.2.1 in [13]):

$$u = -(I + F^* F)^{-1} F^* G w. \tag{4}$$

Similarly, the offline optimal cost is

$$\mathrm{OPT}(w) = w^* G^* (I + F F^*)^{-1} G w. \tag{5}$$

We note that a state-space description of the offline optimal controller was recently obtained in [7].
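
Continuing the sketch above, the operator formulas (4) and (5) translate directly into code; the helper name is ours, and the final assertion checks the optimal cost against direct evaluation of the objective.

```python
import numpy as np

def offline_optimal(F, G, w):
    """Clairvoyant offline controls and cost via (4)-(5):
    u_opt = -(I + F^T F)^{-1} F^T G w and
    OPT(w) = w^T G^T (I + F F^T)^{-1} G w."""
    k, n = F.shape[1], F.shape[0]
    u_opt = -np.linalg.solve(np.eye(k) + F.T @ F, F.T @ (G @ w))
    opt = w @ (G.T @ np.linalg.solve(np.eye(n) + F @ F.T, G @ w))
    return u_opt, opt

# the optimal cost matches ||F u + G w||^2 + ||u||^2 at the optimizer
# (using F, G, w from the previous sketch)
u_opt, opt = offline_optimal(F, G, w)
assert np.isclose(np.sum((F @ u_opt + G @ w) ** 2) + np.sum(u_opt ** 2), opt)
```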

We call the controller with the smallest possible competitive ratio the competitive controller. Instead of minimizing the competitive ratio directly, we solve the following relaxation:

Problem 2 (Suboptimal competitive control).

Given $\gamma > 0$, find an online controller such that

$$\mathrm{ALG}(w) \le \gamma \cdot \mathrm{OPT}(w)$$

for all disturbances $w$, or determine whether no such controller exists.

We call such a controller the competitive controller at level $\gamma$. It is clear that if we can solve this suboptimal problem then we can easily recover the competitive controller via bisection on $\gamma$.
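
The bisection itself is routine; a minimal sketch, where the feasibility oracle `exists_controller` is hypothetical and stands in for checking whether a controller at level $\gamma$ exists (e.g. via the Riccati conditions of the theorems below):

```python
def smallest_level(exists_controller, lo=1.0, hi=2.0, tol=1e-3):
    """Bisect on gamma, assuming exists_controller(gamma) -> bool is
    monotone: feasibility at gamma implies feasibility at every larger
    level. Returns (approximately) the smallest feasible gamma; the
    competitive ratio is never below 1, so lo starts there."""
    while not exists_controller(hi):      # grow the bracket until feasible
        lo, hi = hi, 2.0 * hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (lo, mid) if exists_controller(mid) else (mid, hi)
    return hi

# example with a toy oracle whose true threshold is 1.77
assert abs(smallest_level(lambda g: g >= 1.77) - 1.77) < 1e-3
```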

II-B Robust control

Our results rely heavily on techniques from robust control. In particular, we show that the problem of obtaining the competitive controller can be reduced to an $H_\infty$ control problem:

Problem 3 ($H_\infty$-optimal control).

Find an online controller that minimizes

$$\sup_{w} \frac{\mathrm{ALG}(w)}{\|w\|^2},$$

where $\mathrm{ALG}(w)$ is the cost incurred by the online controller in response to the disturbance $w$.

This problem can be studied in both the finite-horizon setting and the infinite-horizon setting; in the infinite-horizon setting we assume $w$ has bounded energy. The $H_\infty$-optimal control problem has the natural interpretation of minimizing the worst-case gain from the energy in the disturbance to the cost incurred by the controller. In general, it is not known how to derive a closed form for the $H_\infty$-optimal controller, so it is common to consider a relaxation instead:
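
For intuition, in the finite-horizon operator picture the worst-case gain of any fixed linear controller has a closed form, since the cost is a quadratic form in $w$; a sketch continuing the code above (the linear controller $K$ is a hypothetical stand-in):

```python
import numpy as np

def worst_case_gain(F, G, K):
    """sup_w cost(w) / ||w||^2 for the linear controller u = K w.
    Since cost(w) = ||(F K + G) w||^2 + ||K w||^2 is a quadratic form
    in w, the gain is the largest eigenvalue of the underlying matrix."""
    H = F @ K + G
    return np.linalg.eigvalsh(H.T @ H + K.T @ K)[-1]

# e.g. the zero controller u = 0 has worst-case gain lambda_max(G^T G)
K0 = np.zeros((F.shape[1], F.shape[0]))
assert np.isclose(worst_case_gain(F, G, K0), np.linalg.eigvalsh(G.T @ G)[-1])
```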

Problem 4 (Suboptimal $H_\infty$ control at level $\gamma$).

Given $\gamma > 0$, find an online controller such that

$$\mathrm{ALG}(w) \le \gamma \|w\|^2$$

for all disturbances $w$, or determine whether no such controller exists.

We call such a controller the $H_\infty$ controller at level $\gamma$. It is clear that if we can solve this suboptimal problem then we can easily recover the $H_\infty$-optimal controller via bisection on $\gamma$. The finite-horizon $H_\infty$ controller at level $\gamma$ has a well-known state-space description:

Theorem 1 (Theorems 9.5.1 and 9.5.2 in [13]).

Given $\gamma > 0$, a causal finite-horizon $H_\infty$ controller at level $\gamma$ exists if and only if

for all $t$, where we define

and $P_t$ is the solution of the backwards-time Riccati recurrence

where we initialize , and we define

In this case, one possible causal finite-horizon $H_\infty$ controller at level $\gamma$ is given by

A strictly causal finite-horizon $H_\infty$ controller at level $\gamma$ exists if and only if

for all $t$. In this case, one possible strictly causal finite-horizon $H_\infty$ controller at level $\gamma$ is given by

The infinite-horizon $H_\infty$ controller at level $\gamma$ also has a well-known state-space description:

Theorem 2 (Theorem 13.3.3 in [13]).

Suppose $(A, B)$ is stabilizable and $(A, Q^{1/2})$ is observable on the unit circle. A causal $H_\infty$ controller at level $\gamma$ exists if and only if there exists a solution $P$ to the Riccati equation

with

such that

  1. is stable;

  2. and have the same inertia;

  3. .

In this case, the infinite-horizon $H_\infty$ controller at level $\gamma$ has the form

where . A strictly causal $H_\infty$ controller at level $\gamma$ exists if and only if conditions 1 and 3 hold, and additionally

and

In this case, one possible strictly causal $H_\infty$ controller at level $\gamma$ is given by

III The Competitive Controller

In this section we present our main results: a computationally efficient state-space description of the competitive controller, i.e. the online controller with the smallest possible competitive ratio, in both the finite-horizon setting and the infinite-horizon setting. In both settings, the key technique we employ is a reduction from the competitive control problem (Problem 1) to an $H_\infty$ control problem (Problem 3). To perform this reduction, we construct a synthetic dynamical system whose dimension is twice that of the original system. We also construct a new synthetic disturbance $\hat{w}$ which can be computed online as the disturbance $w$ is observed. The $H_\infty$ controller in our synthetic system, when fed the synthetic disturbance $\hat{w}$, selects the control actions which minimize competitive ratio in the original system. As is standard in $H_\infty$ control, we first synthesize the suboptimal controller at level $\gamma$; by the nature of our construction, this controller is guaranteed to have competitive ratio at most $\gamma$ in the original system. We can then obtain the $H_\infty$-optimal controller in the synthetic system (and hence the competitive controller in the original system) by minimizing $\gamma$ subject to the constraints outlined in Theorems 1 and 2.

Recall that

$$\mathrm{OPT}(w) = w^* G^* (I + F F^*)^{-1} G w.$$

It follows that Problem 2 can be expressed as finding an online controller such that

$$\|s\|^2 + \|u\|^2 \le \gamma \cdot w^* G^* (I + F F^*)^{-1} G w \tag{6}$$

for all disturbances $w$, or determining whether no such controller exists. Let $\Delta$ be the unique causal operator such that $\Delta \Delta^* = I + F F^*$. Then condition (6) can be rewritten as an $H_\infty$ condition:

$$\|s\|^2 + \|u\|^2 \le \gamma \|\hat{w}\|^2,$$

where we define $\hat{w} = \Delta^{-1} G w$. The dynamical system $s = Fu + Gw$, which is driven by the disturbance $w$, can be transformed into a system driven by $\hat{w}$:

$$s = Fu + \Delta \hat{w}. \tag{7}$$

We have shown that the problem of finding a competitive controller at level $\gamma$ in the system $s = Fu + Gw$ is equivalent to finding an $H_\infty$ controller at level $\gamma$ in the system (7); the key is to obtain the factorization

$$I + F F^* = \Delta \Delta^*.$$

In the finite-horizon setting, we obtain this factorization using state-space models and the whitening property of the Kalman filter; in the infinite-horizon setting we first pass to the frequency domain and employ algebraic techniques to factor $I + F F^*$, and then reconstruct the controller in time domain from its frequency-domain model.
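
In the finite-horizon setting this factorization has a simple concrete analogue, since block lower-triangular matrices are exactly the causal operators: the Cholesky factor of $I + FF^\top$ plays the role of $\Delta$. A sketch continuing the earlier code (the helper name is ours):

```python
import numpy as np

def synthetic_disturbance(F, G, w):
    """Finite-horizon analogue of I + F F^* = Delta Delta^*: the Cholesky
    factor of I + F F^T is lower triangular, i.e. causal. Returns Delta
    and w_hat = Delta^{-1} G w, whose energy equals OPT(w)."""
    n = F.shape[0]
    Delta = np.linalg.cholesky(np.eye(n) + F @ F.T)   # lower triangular
    w_hat = np.linalg.solve(Delta, G @ w)
    return Delta, w_hat

# sanity check: ||w_hat||^2 recovers the offline optimal cost from (5)
Delta, w_hat = synthetic_disturbance(F, G, w)
assert np.isclose(np.sum(w_hat ** 2), offline_optimal(F, G, w)[1])
```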

III-A Finite-horizon competitive control

We first consider finite-horizon control in linear time-varying systems as described in Section II. We prove:

Theorem 3 (Finite-horizon competitive control).

A causal finite-horizon controller with competitive ratio bounded above by $\gamma$ exists if and only if

(8)

for all $t$, where we define

we define $P_t$ to be the solution of the backwards-time Riccati recursion

(9)

where we initialize and define

and are defined in (13). In this case, a causal controller with competitive ratio bounded above by $\gamma$ is given by

where the dynamics of the synthetic state are

(10)

and we initialize . The synthetic disturbance $\hat{w}_t$ can be computed using the recursion

where we initialize . A strictly causal finite-horizon controller with competitive ratio bounded above by $\gamma$ exists if and only if

for all $t$. In this case, a strictly causal controller with competitive ratio bounded above by $\gamma$ is given by

We make a few observations. First, comparing with Theorem 1, we see that the competitive controller at level $\gamma$ has a similar structure to the $H_\infty$ controller at level $\gamma$; indeed, the competitive controller is just the $H_\infty$ controller in the system (10). Second, we emphasize that the synthetic disturbance $\hat{w}_t$ appearing in (10) is a strictly causal function of $w$; in particular, $\hat{w}_t$ is a linear combination of the disturbances $w_0, \ldots, w_{t-1}$. This is crucial, since it means that we can construct $\hat{w}_t$ online, using only the observations available up to time $t$. Third, we note that the dimension of the control input in the synthetic system (10) is the same as the dimension of the control input in the original system (1); this allows us to use the synthetic controller to steer the original system. Lastly, since the competitive controller is simply the standard $H_\infty$ controller in a system of dimension $2n$, it is clear that the computational resources required to implement the competitive controller are identical to those required to implement the $H_\infty$ controller, up to a constant factor.

The proof of Theorem 3 is presented in the appendix.

III-B Infinite-horizon competitive control

We next consider infinite-horizon control in linear time-invariant systems as described in Section II. We prove:

Theorem 4 (Infinite-horizon competitive control).

Suppose $(A, B)$ is stabilizable and $(A, Q^{1/2})$ is detectable. A causal infinite-horizon controller with competitive ratio bounded above by $\gamma$ exists if and only if there exists a solution $P$ to the Riccati equation

(11)

with

and defined in (17), such that

  1. is stable;

  2. and have the same inertia;

  3. .

In this case, a causal infinite-horizon controller at level $\gamma$ is given by

where and the dynamics of are

and we initialize . The synthetic disturbance $\hat{w}_t$ can be computed using the recursion

where we initialize . A strictly causal infinite-horizon controller with competitive ratio bounded above by $\gamma$ exists if and only if conditions 1 and 3 hold, and additionally

and

In this case, a strictly causal controller with competitive ratio bounded above by $\gamma$ is given by

We note that the infinite-horizon controller described in Theorem 4 is identical to the finite-horizon controller described in Theorem 3, except that the Riccati recursion (9) is replaced by the Riccati equation (11), and all the matrices appearing in the controller are time-invariant; this is consistent with our intuition that the infinite-horizon controller is the finite-horizon controller in steady state, in the asymptotic limit as the time horizon tends to infinity. It is clear that an infinite-horizon competitive controller with competitive ratio bounded by $\gamma$ is stabilizing (whenever such a controller exists), because its cost is always at most a factor of $\gamma$ more than the offline optimal cost, and the offline optimal controller is stabilizing.

The proof of Theorem 4 is presented in the appendix.

IV Numerical Experiments

We benchmark the causal infinite-horizon competitive controller against the $H_2$-optimal, $H_\infty$-optimal, and offline optimal controllers in both a linear system and a nonlinear system.

IV-A Boeing 747 Flight Control

We consider the longitudinal flight control system of a Boeing 747 with linearized dynamics. Assuming level flight at 40,000ft at a speed of 774ft/sec and a discretization interval of 1 second, the dynamics are given by

where

The state consists of kinematic variables such as velocity and orientation, and the control inputs are thrust and elevator angle; we refer to [15] for more information. We assume the initial condition $x_0 = 0$.

We synthesize the infinite-horizon competitive controller using Theorem 4 and find that the smallest choice of $\gamma$ satisfying the constraints (8) is $\gamma = 1.77$, so the competitive ratio of the competitive controller is 1.77. In other words, the cost incurred by our competitive controller is guaranteed to always be within 77% of the cost incurred by the clairvoyant offline optimal controller, no matter the input disturbance. We emphasize that our competitive controller is guaranteed to obtain the smallest possible competitive ratio among all online controllers; it follows that no online controller can achieve a competitive ratio less than 1.77 in this system.

In Figure 1 we plot the magnitude of the transfer operator from the disturbance to the cost at various frequencies $\omega$; this measures how much energy is transferred from the input disturbance to the control cost at the frequency $\omega$. The $H_\infty$ controller is designed to be robust to disturbances at all frequencies and hence has the lowest peak. Both the competitive controller and the $H_2$-optimal controller closely track the offline optimal controller. In Figure 2 we plot the competitive ratio of the various controllers across frequencies. We see that the competitive ratio of the $H_2$ controller can be as high as 43.3 at certain frequencies, while the competitive ratio of the $H_\infty$-optimal controller is 2.8 at every frequency. We note that the competitive ratio of the competitive controller is the smallest at 1.77, as expected.

Fig. 1: Frequency response in the Boeing 747 flight control system.
Fig. 2: Competitive ratio of LQ controllers in a Boeing 747 flight control system across various noise frequencies.

We next compare the performance of the competitive controller and the $H_2$-optimal, $H_\infty$-optimal, and clairvoyant offline optimal controllers across several input disturbances which capture average-case, best-case, and worst-case scenarios for the competitive controller. In Figure 3 we plot the controllers' performance when the driving disturbance is white Gaussian noise; unsurprisingly, the $H_2$-optimal controller incurs the lowest cost. The competitive controller almost matches the performance of the $H_2$ controller, despite not being designed specifically for stochastic disturbances. We next calculate the best-case and worst-case DC disturbances by computing the eigenvectors corresponding to the smallest and largest eigenvalues of $T_K^*(e^{j\omega}) T_K(e^{j\omega})$ at $\omega = 0$, where $T_K$ is the transfer operator associated to the competitive controller. In Figure 4, we plot the controllers' performance when the noise is taken to be the best-case DC component. The competitive controller exactly matches the performance of the clairvoyant noncausal controller, outperforming the $H_\infty$-optimal controller and greatly outperforming the $H_2$-optimal controller. We next plot the controllers' performance when the noise is taken to be the worst-case DC component in Figure 5. The competitive controller incurs the highest cost; this is unsurprising, since the noise is chosen specifically to penalize the competitive controller. We note that the ratio of the competitive controller's cumulative cost to that of the offline optimal controller slowly approaches 1.77, as predicted by our competitive ratio bound. Lastly, in Figure 6, we plot the controllers' performance when the noise is a mixture of white and worst-case DC components; we see that the competitive controller almost matches the performance of the $H_2$ controller. Together, these plots highlight the best-of-both-worlds behavior of the competitive controller: in best-case or average-case scenarios it matches or outperforms standard $H_2$ and $H_\infty$ controllers, while in the worst case it is never worse than the offline optimal cost by more than a factor of 1.77.

Fig. 3: Relative performance of LQ controllers in a Boeing 747 flight control system driven by white noise. The competitive controller almost matches the performance of the $H_2$ controller, without being tuned for stochastic noise.

Fig. 4: Relative performance of LQ controllers in a Boeing 747 flight control system driven by best-case DC noise. The competitive controller exactly matches the performance of the offline optimal controller; the $H_2$ controller incurs substantially more cost.
Fig. 5: Relative performance of LQ controllers in a Boeing 747 flight control system driven by worst-case DC noise. The competitive controller incurs the most cost, but its cost is guaranteed to be at most 77% more than the cost incurred by the offline optimal controller.
Fig. 6: Relative performance of LQ controllers in a Boeing 747 flight control system driven by noise which is a mixture of white and worst-case DC components. The competitive controller almost matches the $H_2$ controller and outperforms the $H_\infty$ controller.

IV-B Inverted Pendulum

We also benchmark our competitive controller in a nonlinear inverted pendulum system. This system has two scalar states, $\theta$ and $\dot{\theta}$, representing angular position and angular velocity, respectively, and a single scalar control input $u$. The state evolves according to the nonlinear evolution equation

$$\ddot{\theta} = \frac{g}{\ell} \sin\theta + \frac{1}{m \ell^2} (u + w),$$

where $w$ is an external disturbance, and $m$, $\ell$, and $g$ are physical parameters describing the system. Although these dynamics are nonlinear, we can benchmark the competitive controller against the $H_2$-optimal, $H_\infty$-optimal, and clairvoyant offline optimal controllers using Model Predictive Control (MPC). In the MPC framework, we iteratively linearize the model dynamics around the current state, compute the optimal control signal in the linearized system, and then update the state in the original nonlinear system using this control signal. In our experiments we initialize $\theta$ and $\dot{\theta}$ to zero. We assume that units are scaled so that all physical parameters are 1, and we discretize the dynamics and sample them at fixed intervals.
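
A minimal sketch of this MPC loop for the pendulum (ours, not the paper's code; the discretization, lookahead, and cost weights are assumptions). We use a plain LQR inner solver as a stand-in for the controller synthesized in the linearized system; in the paper's benchmarks the competitive, $H_2$, or $H_\infty$ controller would be used at that step.

```python
import numpy as np

def linearize(theta, delta=0.01):
    """Jacobians of an assumed discretization of the pendulum (all
    physical parameters scaled to 1, step size delta):
        theta' = theta + delta * omega
        omega' = omega + delta * (sin(theta) + u + w)
    """
    A = np.array([[1.0, delta],
                  [delta * np.cos(theta), 1.0]])
    B = np.array([[0.0], [delta]])
    return A, B

def lqr_gain(A, B, Q, R, lookahead=50):
    """First-step feedback gain from a backwards Riccati recursion.
    A plain LQR solver is used as a stand-in here; in the paper's
    benchmarks the competitive, H2, or Hinf controller would be
    synthesized for the linearized system at this step."""
    P = Q.copy()
    for _ in range(lookahead):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

def mpc_rollout(w_seq, delta=0.01):
    """The MPC loop described above: linearize around the current
    state, compute a control in the linearized system, then step the
    true nonlinear dynamics with that control."""
    Q, R = np.eye(2), np.eye(1)
    x, cost = np.zeros(2), 0.0
    for w in w_seq:
        A, B = linearize(x[0], delta)
        u = (-(lqr_gain(A, B, Q, R) @ x)).item()
        cost += x @ Q @ x + u * u
        theta, omega = x
        x = np.array([theta + delta * omega,
                      omega + delta * (np.sin(theta) + u + w)])
    return cost

# example: a low-frequency sinusoidal disturbance, as in Figure 10
print(mpc_rollout(np.sin(0.01 * np.arange(1000))))
```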

Fig. 7: Relative performance of LQ controllers in an inverted pendulum system driven by zero-mean Gaussian noise.
Fig. 8: Relative performance of LQ controllers in an inverted pendulum system driven by Gaussian noise whose mean varies sinusoidally over time.
Fig. 9: Relative performance of LQ controllers in an inverted pendulum system driven by a high-frequency sinusoidal disturbance.
Fig. 10: Relative performance of LQ controllers in an inverted pendulum system driven by a low-frequency sinusoidal disturbance.
Fig. 11: Relative performance of LQ controllers in an inverted pendulum system driven by step-function noise.

In Figure 7, we plot the relative performance of the various controllers when the noise is drawn i.i.d. from a standard Gaussian distribution in each timestep. Surprisingly, the competitive controller significantly outperforms the $H_2$-optimal controller, which is tuned for i.i.d. zero-mean noise; this may be because the competitive controller is better able to adapt to nonlinear dynamics. The cost incurred by the $H_\infty$-optimal controller is orders of magnitude larger than that of the other controllers and is not shown. In Figure 8, the noise is drawn from a Gaussian distribution whose variance is fixed but whose mean varies sinusoidally over time. The $H_2$-optimal controller incurs roughly three times the cost of the competitive controller, while the competitive controller closely tracks the performance of the offline optimal controller. As before, the cost incurred by the $H_\infty$-optimal controller is orders of magnitude larger than that of the other controllers and is not shown. In Figures 9 and 10 we compare the competitive controller to the $H_2$-optimal and offline optimal controllers with both high-frequency and low-frequency sinusoidal disturbances, with no Gaussian component. In both plots the competitive controller easily beats the $H_2$ controller and nearly matches the performance of the offline optimal controller.

Lastly, in Figure 11 we plot the controllers' performance when the noise is generated by a step function: for the first 500 timesteps the input disturbance is equal to 1, and for the next 500 timesteps it switches to a different constant value. The sudden transition at $t = 500$ presents a challenge for controllers which adapt online, since the new input disturbances are completely different from those observed previously. We see that the competitive controller closely tracks the offline optimal controller and easily outperforms the $H_2$-optimal and $H_\infty$-optimal controllers.

V Conclusion

We introduce a new class of controllers, competitive controllers, which dynamically adapt to the input disturbance so as to track the performance of the clairvoyant offline optimal controller as closely as possible. The key idea is to extend classical $H_\infty$ control, which seeks to design online controllers so as to minimize the ratio of their control cost to the energy in the disturbance, to instead minimize competitive ratio. We derive the competitive controller in both finite-horizon, time-varying systems and infinite-horizon, time-invariant systems. In both settings, the key idea is to construct a synthetic system and a synthetic driving disturbance such that the $H_\infty$-optimal controller in the synthetic system selects the control actions which minimize competitive ratio in the original system. The main technical hurdle in our construction is the factorization of certain algebraic expressions involving the transfer operator associated to the offline optimal controller. In the finite-horizon setting, we perform this factorization in time domain using the whitening property of the Kalman filter, whereas in the infinite-horizon setting we perform the factorization in frequency domain and then reconstruct the controller in time domain.

We benchmark our competitive controller in a linearized Boeing 747 flight control system and show that it exhibits remarkable "best-of-both-worlds" behavior, often beating standard $H_2$ and $H_\infty$ controllers on best-case and average-case input disturbances while maintaining a bounded loss in performance even in the worst case. We also extend our competitive control framework to nonlinear systems using Model Predictive Control (MPC). Numerical experiments in a nonlinear system show that the competitive controller consistently outperforms standard $H_2$ and $H_\infty$ controllers across a wide variety of input disturbances, often by a large margin. This may be because the competitive controller, which is designed to adapt to arbitrary disturbance sequences, is better able to adapt to changing system dynamics; we plan to investigate this phenomenon more thoroughly in future work.

In this paper, we focus on designing online controllers which compete against clairvoyant offline controllers; it is natural to extend the idea of competitive control to other classes of comparator controllers. For example, it would be interesting to design distributed controllers which make decisions using only local information while competing against centralized controllers with a more global view. We anticipate that such distributed controllers could prove useful in a variety of networked control problems arising in congestion control, distributed resource allocation and smart grid.

VI Appendix

VI-A Proof of Theorem 3

Proof.

A state-space model for $F$ is given by

Given this state-space model, we wish to obtain the factorization $I + F F^* = \Delta \Delta^*$, where $\Delta$ is causal. We interpret $I + F F^*$ as the covariance matrix of an appropriately defined random variable and use the Kalman filter to obtain a state-space model for $\Delta$. Suppose that $u$ and $v$ are zero-mean random variables such that $\mathbb{E}[u u^*] = I$, $\mathbb{E}[v v^*] = I$, and $\mathbb{E}[u v^*] = 0$. Define $z = F u + v$; notice that $\mathbb{E}[z z^*] = I + F F^*$. As is well known in the signal processing community, the Kalman filter can be used to construct a causal matrix $\Delta$ such that $z = \Delta e$, where $e$ is a zero-mean random variable such that $\mathbb{E}[e e^*] = I$; this is the so-called "whitening" property of the Kalman filter. Notice that since $z = \Delta e$, we have $\mathbb{E}[z z^*] = \Delta \Delta^*$; on the other hand, $\mathbb{E}[z z^*] = I + F F^*$, so $\Delta \Delta^* = I + F F^*$. Therefore $I + F F^* = \Delta \Delta^*$, as desired.
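
A quick numerical check of the whitening argument (a sketch under the finite-horizon identification of causal operators with lower-triangular matrices): with $\mathrm{cov}(u) = \mathrm{cov}(v) = I$, the vector $z = Fu + v$ has covariance $I + FF^\top$, and the lower-triangular Cholesky factor $\Delta$ whitens it.

```python
import numpy as np

rng = np.random.default_rng(1)
N, d = 200_000, 6
F_ = np.tril(rng.standard_normal((d, d)), k=-1)    # a strictly causal F
Delta = np.linalg.cholesky(np.eye(d) + F_ @ F_.T)  # causal factor
z = rng.standard_normal((N, d)) @ F_.T + rng.standard_normal((N, d))
e = np.linalg.solve(Delta, z.T)                    # e = Delta^{-1} z
emp_cov = e @ e.T / N
assert np.allclose(emp_cov, np.eye(d), atol=0.05)  # cov(e) is the identity
```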

Using the Kalman filter as described in Theorem 9.2.1 in [16], we obtain a state-space model for $\Delta$:

(12)

where we define

(13)

and is defined recursively as

where we initialize .

Now that we have state-space models for $F$ and $\Delta$, we can form a state-space model for the overall system (7). Letting the state be the concatenation of the states of these two models, we see that a state-space model for this system is

This system can be rewritten as

(14)

where we define

and we initialize . Recall that our goal is to find a controller in the synthetic system (14) whose cost is at most $\gamma \|\hat{w}\|^2$ for all disturbances $\hat{w}$, or to determine whether no such controller exists; such a controller has competitive ratio at most $\gamma$ in the original system (1). Theorem 1 gives necessary and sufficient conditions for the existence of such a controller, along with an explicit state-space description of the controller, if it exists.

We emphasize that the driving disturbance in the synthetic system (14) is not $w$, but rather the synthetic disturbance $\hat{w} = \Delta^{-1} G w$. Notice that $\Delta^{-1} G$ is strictly causal, since $\Delta^{-1}$ is causal and $G$ is strictly causal. Exchanging inputs and outputs in (12), we see that a state-space model for $\Delta^{-1}$ is

A state-space model for $G$ is

Equating the output of $G$ with the input of $\Delta^{-1}$, we see that a state-space model for $\Delta^{-1} G$ is

Setting the two internal states equal and simplifying, we see that a minimal representation for $\Delta^{-1} G$ is

We reiterate that $\hat{w}_t$ is a strictly causal function of $w$; in particular, $\hat{w}_t$ depends only on $w_0, \ldots, w_{t-1}$. ∎

VI-B Proof of Theorem 4

Proof.

Taking the $z$-transform of the linear evolution equations

we obtain

Letting $F(z)$ and $G(z)$ be the transfer operators mapping $u$ and $w$ to $s$, respectively, we see that

and

Our goal is to obtain a canonical factorization

With this factorization, we can easily recover the optimal infinite-horizon competitive controller; it is simply the $H_\infty$-optimal infinite-horizon controller in the system whose dynamics in the frequency domain are

(15)

where the synthetic disturbance is

(16)

Before we factor $I + F F^*$, we state a key identity which plays a pivotal role in the factorization: for all Hermitian matrices, we have

where we define

This identity is easily verified via direct calculation.

We expand $I + F F^*$ as