# Algorithms for Optimal Control with Fixed-Rate Feedback

We consider a discrete-time linear quadratic Gaussian networked control setting where the (full information) observer and controller are separated by a fixed-rate noiseless channel. The minimal rate required to stabilize such a system has been well studied. However, for a given fixed rate, how to quantize the states so as to optimize performance is an open question of great theoretical and practical significance. We concentrate on minimizing the control cost for first-order scalar systems. To that end, we use the Lloyd–Max algorithm and leverage properties of logarithmically-concave functions and sequential Bayesian filtering to construct the optimal quantizer that greedily minimizes the cost at every time instant. By connecting the globally optimal scheme to the problem of scalar successive refinement, we argue that its gain over the proposed greedy algorithm is negligible. This is significant since the globally optimal scheme is often computationally intractable. All the results are proven for the more general case of disturbances with logarithmically-concave distributions and rate-limited time-varying noiseless channels. We further extend the framework to event-triggered control by allowing information to be conveyed via an additional "silent symbol", i.e., by not transmitting any bits; by constraining the minimal probability of silence, we attain a tradeoff between the transmission rate and the control cost for rates below one bit per sample.


## I Introduction

The demand for new and improved control techniques over unreliable communication links is constantly growing, due to the rise of emerging opportunities in the Internet of Things realm, as well as due to new and surprising applications in Biology and Neuroscience.

One of the most widely studied networked control system (NCS) setups is that of control over packetized communication channels [1, 2, 3, 4, 5]. This setup can be further divided into two regimes: fixed-rate feedback, where exactly $R$ bits can be noiselessly conveyed from the observer/encoder to the controller/decoder [6, 7] (see Fig. 1), and variable-rate feedback, where $R$ bits are available on average and the observer/encoder can decide how many bits to allocate at each time instant [8].

For each of these scenarios, additional information can be conveyed through event-triggering, by allowing the encoder to remain silent, i.e., not to send any information; see, e.g., [9, 10, 11, 12] and the references therein.

Although much effort has been put into determining the conditions for the stabilizability of such systems, less has been done to determine the optimal attainable control costs, which are of great importance in practice, with several notable exceptions [13, 14, 15].

Other effects that are encountered in practice when using packet-based protocols are those of packet erasures (or packet drops) and delayed packet arrivals. Consequently, much attention has been devoted to studying the impact these effects have on the performance of networked systems in an idealized setup where the quantization rate is infinite [16, 3, 2].

A noteworthy effort to treat the case of finite-rate packets with packet drops was made by Minero et al. [17]. To that end, they considered an even more general case where a time-varying rate “budget” (see Fig. 1) is provided at every time step, and is determined and revealed just before transmission; a packet erasure corresponds to a zero-rate budget, implying that this scenario encompasses the packet-erasure setting.

In this work, we construct algorithms for the setting of time-varying feedback rate budget, presented in Sec. II, along with its important special case of fixed-rate feedback.

However, in contrast to the works of Minero et al. [17] and Yüksel [7], which concentrated on the conditions for system stabilizability using adaptive uniform and logarithmic quantizers, respectively, we attempt to optimize the control cost. (Note that it is impossible to stabilize an unstable system using fixed-rate static quantization if the distributions of the disturbances or the initial state have unbounded supports [4, Sec. III-A].)

To that end, we concentrate our attention in Sec. III on the class of disturbances that have logarithmically-concave (log-concave) probability density functions (PDFs) (the Gaussian PDF being an important special case), for which the Lloyd–Max algorithm [18, Ch. 6] is known to converge to the optimal quantizer [19, 20, 21] (assuming contiguous cells; see Rem. 7). Lloyd–Max quantization at every step was proposed previously by Bao et al. [22] and by Nakahira [23], albeit without any optimality claims. We use it at every step and prove (à la sequential Bayesian filtering [24]) that the resulting system state, which is composed of the scaled sums of quantization errors of the previous steps and the new disturbances, continues to have a log-concave PDF. This leads us to an optimal greedy algorithm.

To support rates below one bit per sample, we extend the algorithm to the event-triggered control scenario in Sec. IV. By adding another cell that corresponds to “silence” and constraining the probability of this cell to a minimal value, we are able to control the average rate of the scheme (which is equal to the sum of the probabilities of the remaining cells).

To tackle the more challenging task of designing a globally optimal quantizer, we recast the problem as that of designing an optimal quantizer for the problem of sequential coding of correlated sources [25] (see also [26, 27] and references therein).

An extreme variant of this problem is provided by that of linear quadratic regulator (LQR) control, in which the only randomness in the system is in the initial state (which is again assumed to have a log-concave PDF). We show in Sec. V that this problem is equivalent to that of successive refinement [28], which can be regarded as a special case of sequential coding of correlated sources.

Surprisingly, for the latter, a computationally tractable variant of the Lloyd–Max algorithm exists [29] that is known to achieve globally optimal performance for log-concave functions [30] (again assuming contiguous cells; see Rem. 7). Furthermore, using the classical Bennett approximated quantization law [18, Ch. 6.3], [31], we argue, in Sec. V-C, that in the limit of high rates, the greedy algorithm is in fact optimal.

Although greedy optimization was demonstrated to be suboptimal [32] (outside of the high-rate regime), simulations for the LQR case show that the gain of the globally optimal algorithm over the optimal greedy one is modest even at low rates (for which the gain is expected to be the largest). This, in turn, suggests that the optimal greedy algorithm will remain close in performance to the optimum for the more general case where the state is driven by i.i.d. log-concave disturbances, which includes linear quadratic Gaussian (LQG) control.

We provide numerical performance evaluations in Sec. VI, and conclude the paper in Sec. VII.

## II Problem Setup

In this work, we consider the control–communication setup depicted in Fig. 1. We use a discrete-time model spanning the time interval for , where for , such that . The plant is a discrete-time linear scalar stochastic system

$$X_{t+1} = a X_t + W_t + U_t, \qquad t \in [0:T-1], \tag{1}$$

where are the system state, disturbance and control action at time , respectively. We consider two setups for the disturbance sequence :

• Independent and identically distributed (i.i.d.): are i.i.d. according to a known log-concave PDF .

• LQR: is distributed according to a known log-concave PDF ; for all .

We further denote the variance of the disturbance by $\sigma_W^2$ and assume, w.l.o.g., that the disturbance has zero mean.

###### Definition 1 (Log-concave function; see [33]).

A function is said to be log-concave if its logarithm is concave:

$$\log f(\lambda x + (1-\lambda) y) \ge \lambda \log f(x) + (1-\lambda) \log f(y), \tag{2}$$

for all and ; we use the extended definition that allows to assign zero values, i.e., is an extended real-value function that can take the value .

###### Remark 1.

The Gaussian PDF is a log-concave function over and constitutes an important special case.
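Def. 1 can be probed numerically. The following is a minimal sketch, assuming the PDF is sampled on a uniform grid and is strictly positive there; the function name `is_log_concave` and the tolerance are illustrative choices, not part of the original text. On a uniform grid, the inequality (2) with $\lambda = 1/2$ becomes a non-positive second difference of $\log f$, which is a necessary (grid-level) check rather than a proof.

```python
import numpy as np

def is_log_concave(pdf_values, tol=1e-9):
    """Grid check of (2): second differences of log f must be non-positive.

    Assumes pdf_values are strictly positive samples on a uniform grid.
    """
    logf = np.log(pdf_values)
    second_diff = logf[:-2] - 2.0 * logf[1:-1] + logf[2:]
    return bool(np.all(second_diff <= tol))

x = np.linspace(-6.0, 6.0, 2001)
# Standard Gaussian PDF (log-concave, cf. Rem. 1).
gaussian = np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)
# Equal-weight mixture of two well-separated Gaussians (not log-concave).
mixture = 0.5 * (np.exp(-0.5 * (x - 3.0) ** 2)
                 + np.exp(-0.5 * (x + 3.0) ** 2)) / np.sqrt(2.0 * np.pi)

print("Gaussian log-concave on grid:", is_log_concave(gaussian))
print("Mixture log-concave on grid:", is_log_concave(mixture))
```

The mixture fails the check near the origin, where $\log f$ is locally convex between the two modes.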

We assume the observer has perfect access to at time . However, in contrast to classical control settings, the observer is not co-located with the controller and communicates with it instead via a noiseless channel of data rate . That is, at each time , the observer, which also takes the role of the encoder , can perfectly convey a message (or “index”) of bits, , of the past states, to the controller:

$$\ell_t = \mathcal{E}_t\left(X^t\right), \tag{3}$$

where we denote by $X^t$ the sequence of states up to time $t$ and use the convention that for . We further set .

The controller at time , which also takes the role of the decoder , recovers the observed codeword and uses it to generate the control action

$$U_t = \mathcal{D}_t\left(\ell^t\right). \tag{4}$$

The exact value of is revealed to the encoder prior to the computation of and is inferred by the decoder upon receiving . The statistics of impact system performance but do not affect the greedy optimality guarantees of the proposed algorithm of Sec. III.

###### Remark 2 (Packet-erasure channel).

A packet-erasure can be modeled by . Hence, the time-varying data rate model subsumes the packet-erasure scenario [17].

Our goal is to minimize the following average-stage linear quadratic (LQ) cost upon reaching the time horizon :

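$$\bar{J}_T \triangleq \frac{1}{T}\,\mathrm{E}\left[q_T X_T^2 + \sum_{t=1}^{T-1}\left(q_t X_t^2 + r_t U_t^2\right)\right] \tag{5a}$$

$$\phantom{\bar{J}_T} = \frac{1}{T}\sum_{t=1}^{T} J_t, \tag{5b}$$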

where are the instantaneous costs

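$$J_t \triangleq \mathrm{E}\left[q_t X_t^2 + r_t U_t^2\right], \qquad t \in [1:T-1], \tag{6a}$$

$$J_T \triangleq \mathrm{E}\left[q_T X_T^2\right]. \tag{6b}$$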

The weights and penalize the state deviation and actuation effort, respectively.

## III Optimal Greedy Control

In this section we consider the i.i.d. disturbance setting. We recall the Lloyd–Max algorithm and its optimality guarantees in Sec. III-A, which are subsequently used in Sec. III-B to construct a greedy optimal control policy.

### III-A Quantizer Design

###### Definition 2 (Scalar quantizer).

A scalar quantizer $Q$ of rate $R$ is described by an encoder $\mathcal{E}_Q$ and a decoder $\mathcal{D}_Q$. We define the quantization operation as the composition of the encoding and decoding operations: $Q = \mathcal{D}_Q \circ \mathcal{E}_Q$. (The encoder and decoder that give rise to the same $Q$ are unique up to a permutation of the labeling of the index $\ell$.) The reproduction points are assumed to be ordered, without loss of generality:

$$c[0] < c[1] < \cdots < c[2^R - 1]. \tag{7}$$

(If some of the inequalities are not strict, then the quantizer can be reduced to a lower-rate quantizer.)

We denote by the collection of all points that are mapped to index (equivalently to the reproduction point ):

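$$I[\ell] \triangleq \left\{x \,\middle|\, x \in \mathbb{R},\ \mathcal{E}_Q(x) = \ell\right\} \tag{8}$$

$$\phantom{I[\ell]} = \left\{x \,\middle|\, x \in \mathbb{R},\ Q(x) = c[\ell]\right\}. \tag{9}$$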

We shall concentrate on the class of regular quantizers, defined next.

###### Definition 3 (Regular quantizer).

A scalar quantizer is regular if every cell , is a contiguous interval that contains its reproduction point :

$$c[\ell] \in I[\ell] = \big[p[\ell], p[\ell+1]\big), \qquad \ell \in [0:2^R-1], \tag{10}$$

where is the set of partition levels—the boundaries of the cells. Hence, a regular scalar quantizer can be represented by the input partition-level set and the reproduction-point set . We further take and to be the left-most and right-most values of the support of the source’s PDF.

The cost we wish to minimize is the mean squared error distortion between the source with a given PDF and its quantization :

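$$D \triangleq \mathrm{E}\left[\left(W - Q(W)\right)^2\right] \tag{11a}$$

$$\phantom{D} = \sum_{\ell=0}^{2^R-1} \int_{p[\ell]}^{p[\ell+1]} \left(w - c[\ell]\right)^2 f_W(w)\,dw. \tag{11b}$$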

Denote by the minimal achievable distortion ; the optimal quantizer is the one that achieves .

###### Remark 3.

We shall concentrate on log-concave PDFs , which are therefore continuous [33]. Hence, the inclusion or exclusion of the boundary points in each cell does not affect the distortion of the quantizer, meaning that ties at the boundary points can be broken systematically.

###### Remark 4.

If the input PDF has an infinite/semi-infinite support, then the leftmost and/or rightmost intervals of the quantizer are open ( and/or take infinite values).

The optimal quantizer satisfies the following necessary conditions [18, Ch. 6.2].

###### Proposition 1 (Centroid condition).

For a fixed partition-level set (fixed encoder), the reproduction-point set (decoder) that minimizes the distortion  (11) is

$$c[\ell] = \mathrm{E}\left[W \,\middle|\, p[\ell] \le W < p[\ell+1]\right], \qquad \ell \in [0:2^R-1]. \tag{12}$$

###### Proposition 2 (Nearest neighbor condition).

For a fixed reproduction-point set (fixed decoder), the partition-level set (encoder) that minimizes the distortion (11) is

$$p[\ell] = \frac{c[\ell-1] + c[\ell]}{2}, \qquad \ell \in [1:2^R-1], \tag{13}$$

where the leftmost/rightmost boundary points are equal to the smallest/largest values of the support of .

The optimal quantizer must simultaneously satisfy both (12) and (13); iterating between these two necessary conditions gives rise to the Lloyd–Max algorithm.

###### Algorithm 1 (Lloyd–Max quantization).

Initial step. Pick an initial partition-level set .

Iterative step. Repeat the two steps

1. Fix and set as in (12),

2. Fix and set as in (13),

alternately, until the decrease in the distortion per iteration goes below a desired accuracy threshold.

Props. 1 and 2 imply that the distortion cannot increase from one iteration to the next; since the distortion is bounded from below by zero, the Lloyd–Max algorithm is guaranteed to converge to a local optimum.
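The alternation between (12) and (13) can be sketched on a discretized PDF. This is an illustrative implementation under stated assumptions, not the authors' code: the PDF is sampled on a fine uniform grid, the initial partition is taken at equal-probability quantiles, and a fixed number of iterations replaces the accuracy threshold of Alg. 1. The function name `lloyd_max` and all parameters are hypothetical.

```python
import numpy as np

def lloyd_max(x, pdf, rate, iters=200):
    """Lloyd-Max iteration on a uniform grid x with PDF samples pdf.

    Returns the inner partition levels, the reproduction points and the
    mean squared error distortion of the resulting 2**rate-level quantizer.
    """
    n_cells = 2 ** rate
    w = pdf / pdf.sum()                      # probability mass per grid point
    cdf = np.cumsum(w)
    # Assumed initialization: equal-probability quantiles.
    levels = np.interp(np.arange(1, n_cells) / n_cells, cdf, x)

    def centroids(levels):
        idx = np.searchsorted(levels, x, side="right")   # cell of each point
        mass = np.bincount(idx, weights=w, minlength=n_cells)
        c = np.bincount(idx, weights=w * x, minlength=n_cells) / mass
        return idx, c

    for _ in range(iters):
        _, c = centroids(levels)             # centroid condition (12)
        levels = 0.5 * (c[:-1] + c[1:])      # nearest neighbor condition (13)

    idx, c = centroids(levels)
    distortion = float(np.sum(w * (x - c[idx]) ** 2))
    return levels, c, distortion

x = np.linspace(-8.0, 8.0, 4001)
pdf = np.exp(-0.5 * x**2)                    # unnormalized standard Gaussian
levels1, c1, d1 = lloyd_max(x, pdf, rate=1)
levels2, c2, d2 = lloyd_max(x, pdf, rate=2)
print("1 bit :", c1, d1)
print("2 bits:", c2, d2)
```

For a standard Gaussian source, the one-bit result should be close to the analytic values $c = \pm\sqrt{2/\pi}$ and $D = 1 - 2/\pi$, up to the grid resolution.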

Unfortunately, multiple local optima may exist in general (e.g., Gaussian mixtures with well separated components), rendering the algorithm sensitive to the initial choice .

Nonetheless, sufficient conditions for the existence of a unique global optimum were established in [19, 20, 21]. These guarantee that the algorithm converges to the global optimum for any initial choice of . An important class of PDFs that satisfy these conditions is that of log-concave PDFs.

###### Theorem 1 ( [19, 20, 21]).

Let the source PDF be log-concave. Then, the Lloyd–Max algorithm converges to a unique solution that minimizes the mean squared error distortion (11).

### III-B Controller Design

We now describe the optimal greedy control policy. To that end, we make use of the following lemma that extends the separation principle of estimation and control to networked control.

###### Lemma 1 (Control–estimation separation [34], [13]).

Consider the general cost problem (5) with independent disturbance elements of variances . Then, the optimal controller has the structure

$$U_t = -k_t \hat{X}_t, \tag{14}$$

where

$$k_t = \frac{s_{t+1}}{s_{t+1} + r_t}\, a \tag{15}$$

is the optimal LQR control gain and , and satisfies the dynamic Riccati backward recursion [35]:

$$s_t = q_t + \frac{s_{t+1} r_t}{s_{t+1} + r_t}\, a^2, \tag{16}$$

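with and . Moreover, this controller achieves the following cost (recall that and for the definition of , as no transmission or control action is performed at time ):

$$\bar{J}_T = \frac{1}{T}\sum_{t=1}^{T}\left(s_t \sigma^2_{W_t} + g_t\,\mathrm{E}\left[\left(X_t - \hat{X}_t\right)^2\right]\right), \tag{17}$$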

with

$$g_t = s_{t+1} a^2 - s_t + q_t. \tag{18}$$

###### Remark 5.

Lem. 1 holds true for any memoryless channel, with , where is the channel output at time .
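The quantities in Lem. 1 can be computed by a short backward pass. The sketch below assumes the terminal condition $s_{T+1} = 0$ (so that $s_T = q_T$ by (16)); this terminal value is an assumption made for the example, as are the function name `riccati` and the array layout in which index 0 is unused.

```python
import numpy as np

def riccati(a, q, r, T):
    """Backward recursion (15), (16), (18) for t = T, ..., 1.

    q[t], r[t] are the weights at time t (index 0 unused).
    Assumes s_{T+1} = 0, which is a choice made for this sketch.
    """
    s = np.zeros(T + 2)
    k = np.zeros(T + 1)
    g = np.zeros(T + 1)
    for t in range(T, 0, -1):
        ratio = s[t + 1] / (s[t + 1] + r[t])
        k[t] = ratio * a                        # control gain (15)
        s[t] = q[t] + ratio * r[t] * a**2       # Riccati recursion (16)
        g[t] = s[t + 1] * a**2 - s[t] + q[t]    # estimation-error weight (18)
    return s, k, g

# Hypothetical parameters, for illustration only.
a, T = 1.2, 5
q = np.ones(T + 1)
r = np.ones(T + 1)
s, k, g = riccati(a, q, r, T)
print("s:", s[1:T + 1])
print("k:", k[1:])
print("g:", g[1:])
```

Substituting (16) into (18) gives $g_t = a^2 s_{t+1}^2 / (s_{t+1} + r_t) \ge 0$, which serves as a sanity check on the output.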

The optimal greedy algorithm minimizes the estimation distortion at time , without regard to its effect on future distortions. To that end, at time , the encoder and the decoder calculate the PDF of conditioned on , via sequential Bayesian filtering [24], and apply the Lloyd–Max quantizer to this PDF. We refer to and to as the prior and posterior PDFs, respectively.

###### Algorithm 2 (Optimal greedy control).

Initialization. Both the encoder and the decoder set

1. as in Lem. 1, for the given , , and .

2. .

3. The prior PDF: .

Observer/Encoder. At time :

1. Observes the current state .

2. Runs the Lloyd–Max algorithm (Alg. 1) with respect to the prior PDF to obtain the quantizer of rate ; denote its partition and reproduction sets by and , respectively, and the cell corresponding to —by .

3. Quantizes the system state [recall Def. 2]:

$$l_t = \mathcal{E}_{Q_t}(x_t) =: \mathcal{E}_t(x_t), \tag{19a}$$

$$\hat{x}_t = Q_t(x_t) = \mathcal{D}_{Q_t}(l_t), \tag{19b}$$

where is the overall action of the observer/encoder at time as defined in (3).

4. Transmits the quantization index .

5. Calculates the posterior PDF :

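$$f_{X_t|\ell^t}\left(x_t \middle| l^t\right) = \begin{cases} f_{X_t|\ell^{t-1}}\left(x_t \middle| l^{t-1}\right)/\gamma, & x_t \in I_t[l_t], \\ 0, & \text{otherwise}, \end{cases} \tag{20}$$

where (using here the regularity assumption)

$$\gamma \triangleq \int_{p_t[l_t]}^{p_t[l_t+1]} f_{X_t|\ell^{t-1}}\left(\alpha \middle| l^{t-1}\right) d\alpha$$

is a normalization factor.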

6. Determines the prior PDF of time using (1) and the control action (4) :

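$$f_{X_{t+1}|\ell^t}\left(x_{t+1} \middle| l^t\right) = \frac{1}{|a|}\, f_{X_t|\ell^t}\left(\frac{x_{t+1} - u_t}{a} \,\middle|\, l^t\right) * f_W(x_{t+1}), \tag{21}$$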

where ‘’ denotes the convolution operation, and the two convolved terms correspond to the PDFs of the quantization error and the disturbance .

Controller/Decoder. At time :

1. Runs the Lloyd–Max algorithm (Alg. 1) with respect to the prior PDF as in Step 2 of the observer/encoder protocol.

2. Receives the quantization index transmitted by the observer/encoder.

3. Reconstructs the quantized value: .

4. Generates the control actuation

$$u_t = -k_t \hat{x}_t := \mathcal{D}_t(\hat{x}_t). \tag{22}$$

5. Calculates the posterior PDF and the next prior PDF as in Steps 5 and 6 of the observer/encoder protocol.
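The filtering steps (20) and (21) of Alg. 2 can be sketched on a grid as follows. This is an illustration under stated assumptions rather than the authors' implementation: PDFs are sampled on a symmetric uniform grid with an odd number of points (so that `np.convolve(..., mode="same")` stays aligned with the grid), the observed cell and all numerical values are hypothetical, and the quantizer design of Step 2 is skipped.

```python
import numpy as np

def posterior(x, prior, lo, hi):
    """Truncate the prior to the observed cell [lo, hi) and renormalize, cf. (20)."""
    h = x[1] - x[0]
    post = np.where((x >= lo) & (x < hi), prior, 0.0)
    return post / (post.sum() * h)           # division by the factor gamma

def next_prior(x, post, a, u, f_w):
    """PDF of a*X_t + u_t, convolved with the disturbance PDF, cf. (21)."""
    h = x[1] - x[0]
    # Change of variables: density of a*X + u evaluated on the grid.
    shifted = np.interp((x - u) / a, x, post, left=0.0, right=0.0) / abs(a)
    return np.convolve(shifted, f_w, mode="same") * h

# Hypothetical numbers, for illustration only.
x = np.linspace(-12.0, 12.0, 2401)           # symmetric grid, odd length
h = x[1] - x[0]
f_w = np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)
a, k = 1.2, 0.6

post = posterior(x, f_w, 0.0, np.inf)        # encoder observed x_t in [0, inf)
x_hat = np.sum(x * post) * h                 # centroid of the observed cell
u = -k * x_hat                               # control action, cf. (22)
prior_next = next_prior(x, post, a, u, f_w)

mass = prior_next.sum() * h
mean = np.sum(x * prior_next) * h
print("mass of next prior:", mass)
print("mean of next prior:", mean, "vs a*x_hat + u =", a * x_hat + u)
```

Since the disturbance has zero mean, the mean of the next prior should match $a\hat{x}_t + u_t$ up to discretization error, and its total mass should remain close to one.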

###### Theorem 2.

Let be a log-concave PDF (recall Def. 1). Then, Alg. 2 is the optimal greedy control policy.

The following is an immediate consequence of the log-concavity of the Gaussian PDF.

###### Corollary 1.

Let be a Gaussian PDF. Then, Alg. 2 is the optimal greedy control policy.

Recall that the Lloyd–Max Algorithm converges to the global minimum for log-concave PDFs. Consequently, in order to prove Thm. 2, it suffices to show that all the prior PDFs are log-concave. This, in turn, relies on the following log-concavity properties.

###### Assertion 1 (Log-concave function properties [33]).

Let and be log-concave functions over . Then, the following are also log-concave functions:

• Affinity: for any constants .

• Truncation:  for any interval , possibly (semi-)infinite.

• Convolution: .

Now we are ready to prove Thm. 2.

###### Proof:

We use mathematical induction to show that both of the following conditions hold for any time :

1. The prior PDF is log-concave in for any realization .

2. Given the past policies and of (19a) and (22) of Alg. 2, it minimizes the instantaneous cost (6) at time .

Basic step (). From the initial condition , the optimal control action for is , and hence . Since is log-concave from the model assumption, also has a log-concave PDF, yielding Cond. (i). Consequently, the quantizer generated by the Lloyd–Max algorithm and the controller minimizes the instantaneous cost , yielding Cond. (ii).

Inductive step. Assuming Conds. (i)-(ii) hold at time , we show below that they also hold at time . By the induction hypothesis, is log-concave. Consequently, by Thm. 1, the Lloyd–Max Algorithm generates the quantizer that minimizes the cost . This leads to Cond. (ii).

It only remains to show that Cond. (i) holds. Since log-concavity is preserved under truncation and affinity, and is log-concave by the induction hypothesis, the posterior PDF of (20) is also log-concave for any realization of ; this, along with the log-concavity of and the log-concavity preservation under affine transformations and convolution, guarantees the log-concavity of the next prior (21), , and completes the proof. ∎

## IV Event-Triggered Control

In this section, we extend the greedy algorithm to the event-triggered control scenario. Under this scenario, the encoder may either send a packet of a fixed predetermined rate or avoid transmission altogether. Avoiding transmission helps alleviate network congestion by conveying information “by silence”.

We concentrate on the case of packets of a single bit, as in this regime the advantage of the algorithm is most pronounced and the exposition of the algorithm is the simplest. The two cells corresponding to the single-bit packet along with the silence symbol form a three-level algorithm. We add a constraint on the minimal probability of the silent symbol; clearly, the average transmission rate is equal to in this case. To minimize the average transmission rate, the silence symbol needs to be assigned to the cell with the maximal probability:

$$\max_{\ell = 0, 1, 2} \int_{p[\ell]}^{p[\ell+1]} f_W(w)\,dw \ge \delta, \tag{23}$$

where the cell-index that achieves the maximum in (23) corresponds to the silent cell; we denote this index by .
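The selection of the silent cell and the resulting average rate can be illustrated numerically. The sketch below assumes a standard Gaussian PDF and a given pair of inner partition levels; the function names and the example partition are hypothetical. With single-bit packets, the average rate in bits per sample equals the probability of not being silent, i.e., the sum of the probabilities of the two remaining cells.

```python
import math

def gaussian_cdf(z):
    """CDF of the standard Gaussian distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def silent_cell(inner_levels):
    """Pick the most probable cell as the silent one, cf. (23).

    inner_levels: the two finite partition levels of a three-cell quantizer
    for a standard Gaussian source (an assumption of this sketch).
    Returns the silent index, the cell probabilities and the average rate.
    """
    edges = [-math.inf] + list(inner_levels) + [math.inf]
    probs = [gaussian_cdf(edges[i + 1]) - gaussian_cdf(edges[i])
             for i in range(len(edges) - 1)]
    silent = max(range(len(probs)), key=lambda i: probs[i])
    avg_rate = 1.0 - probs[silent]           # one bit whenever not silent
    return silent, probs, avg_rate

silent, probs, avg_rate = silent_cell([-0.5, 0.5])
print("silent cell:", silent)
print("cell probabilities:", probs)
print("average rate [bits/sample]:", avg_rate)
```

A larger required silence probability $\delta$ forces a wider silent cell and hence a lower average rate, at the price of a larger distortion.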

Hence, the standard Lloyd–Max quantizer of Alg. 1 in each time step should be replaced by the following algorithm, which first checks whether standard three-level Lloyd–Max quantization satisfies the constraint (23) and, if not, runs the algorithm with the constraint (23) imposed on a different cell each time, and chooses the one that achieves minimal average distortion. With the constraint imposed on a particular cell, the algorithm iterates between two steps: choosing the optimum for a fixed and choosing the optimum for a fixed . The first step is the same as in the standard Lloyd–Max quantizer. For the second step, the Karush–Kuhn–Tucker (KKT) conditions are employed [36, Ch. 5].

###### Algorithm 3 (Min. cell-probability constrained quantization).

Unconstrained algorithm. Apply Alg. 1. If the constraint (23) is satisfied for the resulting quantizer, use this quantization law. Else, set and to the smallest and largest values of the support of , and run the following.

1. .

1. Set such that

$$\int_{p[0]}^{p[1]} f_W(w)\,dw = \delta. \tag{24}$$

2. Compute as in (12).

3. Run Alg. 1 for the remaining two cells (with remain fixed), to determine and .

4. Denote the resulting overall quantizer and distortion by and , respectively.

2. .

Initial step. Pick an initial partition-level set .

Iterative step. Repeat the following steps

1. Fix and set as in (12),

2. Fix and set as in (13),

3. If does not satisfy the constraint (23), set , in accordance with the KKT conditions, as the solution of

$$\delta = \int_{p[1]}^{p[2]} f_W(w)\,dw, \tag{25a}$$

$$p[2] = \frac{c[0] - c[1]}{c[2] - c[1]}\, p[1] + \frac{c[2]^2 - c[0]^2}{2\left(c[2] - c[1]\right)}. \tag{25b}$$

4. If no solution to (25b) exists, replace (25b) with the choice that gives the smaller distortion out of and ,

until the decrease in the distortion per iteration is below a desired accuracy threshold. Denote the resulting quantizer and distortion by and , respectively.

3. .

1. Set such that

$$\int_{p[2]}^{p[3]} f_W(w)\,dw = \delta. \tag{26}$$

2. Compute as in (12).

3. Run Alg. 1 for the remaining two cells (with remain fixed), to determine and .

4. Denote the resulting overall quantizer and distortion by and , respectively.

4. Set the quantizer to , where .

Replacing the Lloyd–Max quantizer of Alg. 1 with the constrained variant of Alg. 3 gives rise to the following event-triggered variant of Alg. 2.

###### Algorithm 4 (Greedy event-triggered control).

Initialization. Both the encoder and the decoder

1. Run steps 1–3 of the initialization of Alg. 2.

2. Set . (Recall that we assume .)

Observer/Encoder. At time :

1. Observes .

2. Runs Alg. 3 with respect to the prior PDF and the maximal probability constraint to obtain the quantizer ; denote its partition and reproduction sets by and , respectively, the index of the silent cell—by , and the cell corresponding to —by .

3. Quantizes the system state as in Step 3 of the observer/encoder protocol of Alg. 2.

4. If , transmits the index ; otherwise, remains silent.

5. Calculates the posterior PDF and the next prior PDF as in Steps 5 and 6 of the observer/encoder protocol of Alg. 2, respectively.

Controller/Decoder. At time :

1. Runs Alg. 3 with respect to the prior PDF as in Step 2 of the observer/encoder protocol.

2. Receives the index : in case of silence, recovers .

3. Reconstructs the quantized value: .

4. Generates the control actuation .

5. Calculates the posterior PDF and the next prior PDF as in Steps 5 and 6 of the observer/encoder protocol of Alg. 2, respectively.

## V Globally Optimal LQR Control

In this section, we study the LQR control setting, namely, the case where has a log-concave PDF and for all . Clearly, this is equivalent to the case of a random initial condition and for all , and is therefore referred to as LQR control.

We construct a globally optimal control policy in Sec. V-B by connecting the problem to that of scalar successive refinement [29, 30], which is formulated and reviewed in Sec. V-A. The resulting quantizers are commonly referred to as multi-resolution scalar quantizers (MRSQs).

### V-A Successive Refinement

A -step MRSQ successively quantizes a single source sample with PDF using a series of quantizers of rates : At stage , bits are available for the re-quantization of the source , and are encoded into an index . , along with all previous indices , is then used for the construction of a refined description .

###### Definition 4 (MRSQ).

A -step MRSQ of rates is described by a series of encoders and a series of decoders , with and serving as the encoder and decoder at time , respectively. We define the quantization operation , at time step , as the composition of all the encodings until time with the decoding at time : .

This definition means that, although the overall effective rate of the quantizer at time is , only the last bits, corresponding to , are determined during time step . At the decoder, these bits are appended to the previously determined and received bits (corresponding to ), for the construction of a description of at time , .

###### Definition 5 (Regular MRSQ).

A -step MRSQ is regular if the quantizer at each step is regular and the partitions of subsequent stages are nested, as follows. For each time :

$$p_t\left[\ell \cdot 2^{R_t}\right] = p_{t-1}[\ell], \qquad \ell \in \left[0 : \left(\sum_{i=1}^{t-1} R_i\right) - 1\right], \tag{27}$$

where is the partition-level set of the quantizer at time .

###### Remark 6.

The relation in (27) implies that, given , the partitions of all the previous stages can be deduced.
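The nesting relation (27) amounts to subsampling the finer partition. A minimal sketch, assuming the partition-level set is stored as an array that includes both end points, and with a hypothetical two-stage example of one bit per stage:

```python
import numpy as np

def coarser_partition(p_fine, rate_last):
    """Recover p_{t-1} from p_t via (27): keep every 2**R_t-th level."""
    return p_fine[:: 2 ** rate_last]

# Hypothetical two-stage example with R_1 = R_2 = 1 (four cells at stage 2).
p2 = np.array([-np.inf, -1.0, 0.0, 1.0, np.inf])
p1 = coarser_partition(p2, rate_last=1)
print("stage-2 partition:", p2)
print("stage-1 partition:", p1)
```

Applying the same subsampling repeatedly recovers the partitions of all earlier stages from the last one, as noted in Rem. 6.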

###### Remark 7 (Optimality of regular MRSQs).

Counterexamples for both discrete and continuous PDFs have been devised, for which regular MRSQs are strictly suboptimal [37, 38]. However, none such are known for the case of log-concave input PDFs [39]. Furthermore, we shall see that such quantizers become optimal in the limit of high rates in Sec. V-C.

Our goal here is to design an MRSQ that minimizes the weighted time-average squared quantization error of an input with a given PDF and positive weights :

$$\bar{D} = \sum_{t=1}^{T} \tilde{g}_t D_t, \tag{28a}$$

$$D_t \triangleq \mathrm{E}\left[\left(W - \hat{W}_t\right)^2\right]. \tag{28b}$$

Unfortunately, greedy-optimal quantizers are not globally optimal in general [30, 32], since there might be a tension between optimizing and for . When such a tension does not exist, the source is said to be successively refinable [28], [40, Ch. 13.5.13].

We next present a Generalized Lloyd–Max Algorithm due to Brunk and Farvardin [29] for constructing MRSQs, which is in turn an adaptation of an algorithm for scalar multiple descriptions by Vaishampayan [41]. Similarly to the standard Lloyd–Max algorithm (Alg. 1), the generalized variant iterates between structuring the reproduction point sets given the partition (recall Rem. 6), and vice versa.

Furthermore, the centroid condition of Prop. 1 remains unaltered, as it does not have any direct effect on other stages, and is calculated separately for each stage. The partition of earlier stages, on the other hand, has a direct effect on the boundaries of newer stages, due to the nesting property (27). Consequently, the nearest neighbor condition of Prop. 2 is replaced by a weighted variant [29, 41].

###### Proposition 3 (Weighted nearest neighbor).

The optimal partition for a given sequence of reproduction-point sets is determined by the weighted nearest neighbor condition:

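$$p_T[\ell] = \max_{0 \le i \le 2^{\sum_{t=1}^{T} R_t} - 1 \,:\, \alpha_i < \alpha_\ell} \frac{\beta_\ell - \beta_i}{2\left(\alpha_\ell - \alpha_i\right)}, \tag{29a}$$

$$p_T[\ell+1] = \min_{0 \le i \le 2^{\sum_{t=1}^{T} R_t} - 1 \,:\, \alpha_i > \alpha_\ell} \frac{\beta_\ell - \beta_i}{2\left(\alpha_\ell - \alpha_i\right)}, \tag{29b}$$

for $0 \le \ell \le 2^{\sum_{t=1}^{T} R_t} - 1$, where $m_t[\ell]$ (29c)

$$\alpha_\ell \triangleq \sum_{t=1}^{T} \tilde{g}_t\, c_t\big[m_t[\ell]\big], \tag{29d}$$

$$\beta_\ell \triangleq \sum_{t=1}^{T} \tilde{g}_t\, c_t^2\big[m_t[\ell]\big]. \tag{29e}$$
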
###### Remark 8.

$\alpha_\ell$ and $\beta_\ell$ can be viewed as a weighted centroid and a weighted squared centroid, respectively. In these terms, for a single stage, the partition points in (29a) and (29b) reduce to the midpoints of adjacent centroids of the standard Lloyd–Max algorithm (13).
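The single-stage reduction can be verified directly. Assuming $T = 1$ and taking $m_1[\ell] = \ell$ (an assumption made for this check, since for a single stage each cell is its own parent), (29d) and (29e) give $\alpha_\ell = \tilde{g}_1 c[\ell]$ and $\beta_\ell = \tilde{g}_1 c^2[\ell]$, so that

$$\frac{\beta_\ell - \beta_i}{2\left(\alpha_\ell - \alpha_i\right)} = \frac{\tilde{g}_1\left(c^2[\ell] - c^2[i]\right)}{2\,\tilde{g}_1\left(c[\ell] - c[i]\right)} = \frac{c[\ell] + c[i]}{2}.$$

Since the reproduction points are ordered and $\tilde{g}_1 > 0$, the maximum in (29a) over $\alpha_i < \alpha_\ell$ is attained at $i = \ell - 1$, which yields $p[\ell] = \left(c[\ell-1] + c[\ell]\right)/2$, in agreement with (13). The weight $\tilde{g}_1$ cancels, as expected for a single stage.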

Similarly to the optimal one-stage quantizer of Sec. III-A, the optimal MRSQ has to satisfy both the centroid condition of Prop. 1 and the weighted nearest neighbor condition of Prop. 3, simultaneously. Furthermore, iterating between these conditions gives rise to the Generalized Lloyd–Max algorithm.

###### Algorithm 5 (Generalized Lloyd–Max).

Initial step. Pick an initial partition .

Iterative step. Repeat the two steps

1. Fix and evaluate as in (12),

2. Fix and evaluate as in (29),

alternately, until the decrease in the weighted distortion is below a desired accuracy threshold.

As in the standard Lloyd–Max algorithm, Alg. 5 may converge to different local minima for different initializations . And similarly, sufficient conditions can be derived for the existence of a unique local—and thus also global—minimum [30]. Log-concave PDFs satisfy these conditions, suggesting that Alg. 5 is globally optimal for such PDFs.

###### Theorem 3 ([30]).

Let the input PDF be log-concave and —a positive weight sequence. Then, the Generalized Lloyd–Max algorithm converges to a unique solution that minimizes the weighted mean square error distortion (28) with weights .

### V-B Controller Design

By Lem. 1, in order to construct a globally optimal control policy, we need to find a quantizer that minimizes

$$\sum_{t=1}^{T} g_t\, \mathrm{E}\left[\left(X_t - \hat{X}_t\right)^2\right]. \tag{30}$$

The following simple result connects this problem with that of designing an MRSQ that minimizes (28).

###### Lemma 2.

Let be the quantized description of the source sample at time , produced by the MRSQ that minimizes (28) with weights

$$\tilde{g}_t = a^{2(t-1)} g_t. \tag{31}$$

Then, the estimate that minimizes (30) is given by

$$\hat{X}_t = a \hat{X}_{t-1} + U_{t-1} + a^{t-1}\left(\hat{W}_t - \hat{W}_{t-1}\right), \tag{32}$$

with and given in (16).

###### Proof:

Recall that is given by the recursion

$$X_{t+1} = a X_t + U_t, \qquad t \in [1:T-1], \tag{33a}$$

$$X_1 = W_0. \tag{33b}$$

The corresponding explicit expression for in this case is

$$X_t = a^{t-1} W_0 + \sum_{i=0}^{t-1} a^{t-1-i} U_i. \tag{34}$$

This suggests, in turn, that the estimate of at time can be expressed as

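$$\hat{X}_t = \mathrm{E}\left[X_t \,\middle|\, \ell^t\right] \tag{35a}$$

$$\phantom{\hat{X}_t} = \mathrm{E}\left[a^{t-1} W_0 + \sum_{i=0}^{t-1} a^{t-1-i} U_i \,\middle|\, \ell^t\right] \tag{35b}$$

$$\phantom{\hat{X}_t} = a^{t-1}\left(\hat{W}_t + \sum_{i=0}^{t-1} a^{-i} U_i\right) \tag{35c}$$

$$\phantom{\hat{X}_t} = a^{t-1}\left(\hat{W}_t - \hat{W}_{t-1}\right) + U_{t-1} \tag{35d}$$

$$\phantom{\hat{X}_t = {}} + a \cdot a^{t-2}\left(\hat{W}_{t-1} + \sum_{i=0}^{t-2} a^{-i} U_i\right) \tag{35e}$$

$$\phantom{\hat{X}_t} = a^{t-1}\left(\hat{W}_t - \hat{W}_{t-1}\right) + U_{t-1} + a \hat{X}_{t-1}, \tag{35f}$$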

which proves the relation in (32), where (35a) follows from the definition of , (35b) holds due to (34), (35c) follows from the definition of and the action being a function of , and (35f) holds by substituting the relation established in (35c) for , namely,

$$\hat{X}_{t-1} = a^{t-2}\left(\hat{W}_{t-1} + \sum_{i=0}^{t-2} a^{-i} U_i\right). \tag{36}$$

Subtracting (36) from (33a) with the appropriate index adjustments concludes the proof. ∎
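The equivalence between the recursion (32) and the closed form (35c) is purely algebraic, so it can be checked with arbitrary sequences. The sketch below assumes the initial values $\hat{X}_0 = 0$, $\hat{W}_0 = 0$ and $U_0 = 0$ (assumptions made for this check), and draws the remaining descriptions and control actions at random; in the actual scheme they would be produced by the MRSQ and the controller.

```python
import numpy as np

rng = np.random.default_rng(0)
a, T = 1.3, 6

# w_hat[t] for t = 0..T, with w_hat[0] = 0 assumed.
w_hat = np.concatenate(([0.0], rng.normal(size=T)))
# u[t] for t = 0..T-1, with u[0] = 0 assumed (no control action at time 0).
u = np.concatenate(([0.0], rng.normal(size=T - 1)))

def closed_form(t):
    """Estimate of X_t according to (35c)."""
    return a ** (t - 1) * (w_hat[t] + sum(a ** (-i) * u[i] for i in range(t)))

# Recursion (32), started from x_hat[0] = 0 (assumed).
x_hat = np.zeros(T + 1)
for t in range(1, T + 1):
    x_hat[t] = (a * x_hat[t - 1] + u[t - 1]
                + a ** (t - 1) * (w_hat[t] - w_hat[t - 1]))

direct = np.array([closed_form(t) for t in range(1, T + 1)])
print("recursion  :", x_hat[1:])
print("closed form:", direct)
```

The two printed sequences should agree up to floating-point error.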

We are now ready to present the globally optimal control policy for the LQR problem.

###### Algorithm 6 (Globally optimal LQR control).

Initialization. Both the encoder and the decoder:

1. Construct as in Lem. 1 for the given , , and .

2. Set as in (31).

3. Construct the -step MRSQ sequence using Alg. 5 for the source and weights .

Observer/Encoder. Observes . At time :

1. Generates the quantizer index: .

2. Transmits .

Controller/Decoder. At each time :

1. Receives the index transmitted by the observer/encoder.

2. Generates the description: .

3. Generates as in (32).

4. Generates the control actuation .

Combining Lemmata 1 and 2 with Thm. 3 leads to the global optimality of Algorithm 6.

###### Theorem 4.

Let be a log-concave PDF. Then, Alg. 6 achieves the minimum possible average-stage LQ cost (5).

### V-C High-Rate Limit

We now consider the high-resolution case, viz. the case in which the rates are large (the exact notion of a large rate will become clear in the sequel). We start by treating the case of a single rate ().

We follow the exposition in [18, Ch. 6.3] of Bennett’s approximated quantization law for a single target rate (recall Sec. III-A).

For a large enough rate , and consequently small enough cell widths (except for maybe the extreme cells), the sum in (11b) can be approximated by a Riemann integral by defining a reproduction-point PDF

$$\nu(x) \triangleq \lim_{N \to \infty} \frac{N(x)}{N}, \tag{37}$$

where is the number of reproduction points , and is the approximate number of points in for a small . In this limit, the size of cell is approximated by

$$p[\ell+1] - p[\ell] \approx \frac{1}{N \nu\left(c[\ell]\right)}. \tag{38}$$

###### Theorem 5 (Bennett’s law).

The optimal reproduction-point PDF of a source with a log-concave PDF $f_W$ (this holds true for a much wider class of sources with smooth PDFs [31]), in the limit of large , is given by

$$\nu(x) \approx \frac{f_W^{1/3}(x)}{\int_{\xi \in \mathrm{Supp}\{f_W\}} f_W^{1/3}(\xi)\,d\xi}, \tag{39}$$

and achieves a distortion (11)

$$D \approx \frac{1}{12 \times 2^{2R}} \left[\int_{x \in \mathrm{Supp}\{f_W\}} f_W^{1/3}(x)\,dx\right]^3. \tag{40}$$

An immediate consequence of this theorem is that, in the limit of high rates (), the source is successively refinable, since the approximation of (38) and (39) is tightened by each subsequent additional rate [30, Thm. 5].
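For a standard Gaussian source, the integral in (40) can be evaluated in closed form, giving $D \approx \frac{\sqrt{3}\pi}{2}\, 2^{-2R}$. The sketch below checks this numerically with a Riemann sum; the grid limits and the function name `bennett_distortion` are illustrative choices.

```python
import numpy as np

# Standard Gaussian PDF on a fine uniform grid (assumed source for this check).
x = np.linspace(-30.0, 30.0, 300001)
h = x[1] - x[0]
f_w = np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

# Integral of the cube root of the PDF, as it appears in (39) and (40).
cube_root_integral = np.sum(np.cbrt(f_w)) * h

def bennett_distortion(rate):
    """High-rate distortion approximation (40) at the given rate in bits."""
    return cube_root_integral**3 / (12.0 * 2.0 ** (2 * rate))

coefficient = bennett_distortion(0)          # factor multiplying 2^(-2R)
print("numerical coefficient:", coefficient)
print("closed form sqrt(3)*pi/2:", np.sqrt(3.0) * np.pi / 2.0)
for rate in (4, 6, 8):
    print(rate, "bits:", bennett_distortion(rate))
```

Each additional bit reduces the approximate distortion by a factor of four, which is the behavior underlying the successive refinability argument in the high-rate limit.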

###### Corollary 2 (Successive refinability).

A source with a log-concave PDF is approximately successively refinable in the limit of large (and consequently large ), with a reproduction-point PDF as in (39), and distortions with given as in (40) with rate .

###### Remark 9.

Bennett’s law holds true for a wider class of PDFs and distortions; see Sec. VII-D for further discussion.

## VI Numerical Calculations

### VI-A Greedy LQG Control

We now evaluate the instantaneous costs (6) of Alg. 2 for a standard Gaussian i.i.d. disturbance sequence , , , and . These costs are depicted in Fig. 2 along with for all admissible transmit sequences . We compare them to the following upper and lower bounds, also depicted in Fig. 2, which are valid for the less restrictive case of variable-rate feedback [18, Ch. 9.9], where the average rate across time is constrained by .

###### Proposition 4 ([14, 15, 26, 27]).

Consider the setting of a variable-rate subject to an expected-rate constraint , i.i.d. Gaussian disturbances of variance , and . Then, the instantaneous cost is bounded as , with and

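$$J^{\mathrm{LB}}_{t+1} = a^2 J^{\mathrm{LB}}_t\, 2^{-2R} + \sigma_W^2, \tag{41}$$

$$J^{\mathrm{UB}}_{t+1} = \frac{2\pi e}{12}\, a^2 J^{\mathrm{UB}}_t\, 2^{-2R} + \sigma_W^2$$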