# On General Lattice Quantization Noise

The problem of constructing lattices such that their quantization noise approaches a desired distribution is studied. It is shown that, asymptotically in the dimension, lattice quantization noise can approach a broad family of distribution functions with independent and identically distributed components.

## I Introduction

Lattices play a key role in digital communication and specifically in quantization theory. In high-resolution quantization theory, it is common to assume (see, e.g., [5] and references therein) that the quantization error of a lattice quantizer is uniformly distributed over the basic cell of the lattice. This assumption can be made completely accurate at any resolution by means of subtractive dithered quantization, where a random vector (dither) which is uniformly distributed over the lattice cell is added prior to quantization and then subtracted from the quantizer output (see [6]). Following [10] we thus use the term lattice quantization noise (LQN) to refer to a random vector uniformly distributed over a basic cell of the lattice. For a given lattice however there are many possible partitions into cells. Each such partition will result in a different LQN with different statistical properties (see [7]).

In many cases of interest, the criterion for quantization is that of minimum mean-square error (MSE). That is, a lattice is deemed good if the MSE of the quantization noise (resulting from nearest neighbor encoding) is minimal for a given lattice density (or cell volume). When the criterion is MSE, the basic region associated with nearest neighbor encoding (in the Euclidean sense) is called the Voronoi region. The statistical properties of the LQN of lattices which are good in this sense have been thoroughly investigated by Zamir and Feder [10]. Indeed, it was shown in [10] that there exist sequences of lattices which are asymptotically optimal in an MSE sense. That is, for such sequences, the normalized second moment of the Voronoi region of the lattice goes to $\frac{1}{2\pi e}$ as the dimension goes to infinity and the distribution of the quantization noise (over a Voronoi region) approaches (in the Kullback-Leibler divergence sense) that of an i.i.d. white Gaussian noise.

In certain cases, the criterion for quantization may be different from MSE. For instance, one may be interested in some other metric, e.g., a power norm of a different order. More generally, one may ask whether one can construct a lattice and associate with it a lattice partition such that the corresponding LQN approaches any i.i.d. distribution. The interest of the authors in this question arose when general LQN was needed in the context of designing a lattice precoding scheme for the binary dirty paper problem [4].

The results of [10] were derived using previously known results [9] on the existence of lattices that are good for the classical problem of covering. This approach unfortunately does not lend itself to extending the results to more general distributions. On the other hand, typicality arguments and rate distortion theory suggest that a random code drawn uniformly over a large region should have the desired properties. Since linear codes and lattices have proved to be able to attain the performance of a (uniform) random code in many problems in information theory, it is natural to suspect that the same would hold for the problem at hand.

Indeed, one may use random coding (or averaging arguments) to obtain existence results for lattices, an approach dating back at least as far as Hlawka's proof of the Minkowski-Hlawka theorem; see [8] for a historical account and further details. In [8], Loeliger defined an ensemble of lattices based on Construction A (see [1]) which is very amenable to analysis. Loeliger then used averaging and typicality arguments to obtain channel coding theorems for lattice codes.

In [3] the Loeliger ensemble (with a careful choice of parameters) was used to establish the existence of lattices that are simultaneously good under various different notions. It was further noted in [3] that the results can be extended to show that there exist lattices that are good for quantization under any power-norm distortion measure. In this work we extend these results to show that under quite general conditions, LQN can approach general i.i.d. distributions. In the proof we use the same ensemble of lattices as in [8] and [3]. However, the proof technique diverges from that of [3] in that it relies on typicality as in [8] rather than on geometric arguments to form the lattice cells. In this sense the present work is dual to Loeliger's work [8], using typicality arguments to obtain results for source coding (rather than for channel coding as in [8]).

The paper is organized as follows. Section II provides a very brief introduction to lattices and lattice quantization noise and states the main result of the paper for both the discrete and continuous cases. Section III describes the ensemble of lattices to be used and defines the typicality-based lattice partition. Sections IV and V provide the proof of the existence of lattices whose quantization noise approaches a desired distribution. Finally, the results are demonstrated in Section VI by simulation, finding lattices and partitions with quite arbitrary LQN.

## II Preliminaries and statement of main result

### II-A Lattices and Lattice Quantization Noise

We begin by recalling a few basic notions pertaining to lattices. An $n$-dimensional lattice $\Lambda$ is an infinite discrete subgroup of the Euclidean space $\mathbb{R}^n$. Thus, if $\lambda_1$ and $\lambda_2$ are in $\Lambda$, then their sum and difference are also in $\Lambda$. An $n$-dimensional lattice may be defined by an $n \times n$ generating matrix $\tilde{G}$ (whose choice is not unique) such that

$$\Lambda = \{ y : y = x \cdot \tilde{G} \ \text{for some}\ x \in \mathbb{Z}^n \}.$$

We may associate with a lattice $\Lambda$ a lattice partition, partitioning $\mathbb{R}^n$ into disjoint cells. We denote by $\mathcal{V}$ the fundamental cell associated with the lattice point $\lambda = 0$. We further associate with every lattice point $\lambda \in \Lambda$ the cell $\mathcal{V}_\lambda = \lambda + \mathcal{V}$. There are many possible choices for $\mathcal{V}$. For $\mathcal{V}$ to be valid however, we require that every point $y \in \mathbb{R}^n$ can be uniquely written as $y = \lambda + r$ where $\lambda \in \Lambda$ is a lattice point and $r \in \mathcal{V}$ is the "remainder". We may thus write $\mathbb{R}^n = \Lambda + \mathcal{V}$. The lattice partition is therefore fully determined by the specification of the fundamental region $\mathcal{V}$. The volume of a fundamental region is the same for any valid partition and we denote it by $V(\Lambda)$ or simply by $V$.

We note that when the partition is such that a point $y$ is mapped to the nearest lattice point in $\Lambda$ in the Euclidean sense, we obtain the usual Voronoi partition. An example of lattices and lattice partitions is given in Figure 1.

We further associate with a lattice $\Lambda$ and a chosen partition a lattice quantizer $Q_{\mathcal{V}}(\cdot)$. For any $y \in \mathbb{R}^n$, since it may be written as $y = \lambda + r$ in one and only one way, we define $Q_{\mathcal{V}}(y) = \lambda$ to be the quantization of $y$. We further define a modulo operation by

$$y \bmod \Lambda \triangleq y - Q_{\mathcal{V}}(y) = r.$$

For any input vector $y$, we may view the remainder $r$ as the "quantization noise" associated with $y$. A random vector $\mathbf{U}$ uniformly distributed over $\mathcal{V}$ is referred to as (random) LQN.
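The decomposition into a lattice point and a remainder can be illustrated with a minimal Python sketch. It is not taken from the paper: the 2-D generating matrix `G` is a hypothetical example, the cell is the half-open fundamental parallelepiped (one valid partition among many, not the Voronoi partition), and the function names are illustrative.

```python
import math

# Hypothetical 2x2 generating matrix (rows are basis vectors); an assumption
# made only for illustration.
G = [[2.0, 0.0],
     [1.0, 3.0]]

def coeffs(y, G):
    """Solve y = c . G for the real coefficient row vector c (2-D only)."""
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    c0 = (y[0] * G[1][1] - y[1] * G[1][0]) / det
    c1 = (y[1] * G[0][0] - y[0] * G[0][1]) / det
    return c0, c1

def quantize(y, G):
    """Quantizer for the parallelepiped partition: floor the coefficients."""
    c0, c1 = coeffs(y, G)
    x0, x1 = math.floor(c0), math.floor(c1)
    return (x0 * G[0][0] + x1 * G[1][0], x0 * G[0][1] + x1 * G[1][1])

def mod_lattice(y, G):
    """Remainder r = y - Q(y); it always lies in the fundamental cell."""
    q = quantize(y, G)
    return (y[0] - q[0], y[1] - q[1])

if __name__ == "__main__":
    y = (5.5, 4.0)
    print(quantize(y, G), mod_lattice(y, G))
```

Shifting the input by any lattice point leaves the remainder unchanged, which is the property that makes the modulo operation well defined for a valid partition.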

### II-B Statement of main result

Let $\mathbf{W} = (W_1, \ldots, W_n)$ be an i.i.d. random vector with marginal probability density function (PDF) denoted by $f_W(w)$. Then we would like to find a sequence of lattices and corresponding partitions such that the associated LQN approaches an i.i.d. distribution with marginal PDF $f_W(w)$.

###### Definition 1

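A random variable $W$ with a continuous PDF $f_W(w)$ will be called permissible if $f_W(w)$ is bounded from below by a positive number over a closed interval $\mathcal{A}$, and is zero outside of $\mathcal{A}$, i.e.,

$$f_W(w) : \mathcal{A} \longrightarrow [a_{\min}, a_{\max}] \tag{1}$$

where $a_{\min} > 0$ is some arbitrarily small fixed value, and $a_{\max}$ is some arbitrarily large fixed value.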

###### Theorem 1

Let $W$ be a permissible noise with PDF $f_W(w)$. Let $\mathbf{W}$ be drawn i.i.d. $\sim f_W(w)$. Then, there exists a sequence of lattices and associated partitions such that the resulting lattice quantization noise $\mathbf{U}$ satisfies

$$\limsup_{n \to \infty} \frac{1}{n} D(\mathbf{U} \| \mathbf{W}) = \xi \tag{2}$$

where $D(\cdot \| \cdot)$ is the Kullback-Leibler divergence and $\xi > 0$ can be taken arbitrarily small.

This theorem also results in the following corollary that shows convergence of the marginal PDFs to the desired one (in an average sense).

###### Corollary 1

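Let $W$ be a permissible noise with PDF $f_W(w)$. Let $\mathbf{W}$ be drawn i.i.d. $\sim f_W(w)$. Then there exists a sequence of lattices and associated partitions such that the resulting lattice quantization noise $\mathbf{U} = (U_1, \ldots, U_n)$ satisfies

$$\limsup_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} D(U_i \| W) = \xi \tag{3}$$

where $\xi > 0$ can be taken arbitrarily small.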

The key to proving this theorem lies in proving a similar claim for the discrete case, which we state next. The proof for the continuous case follows from the discrete case by standard arguments, dividing the interval $\mathcal{A}$ into sufficiently small subintervals, and is given in Section V.

### II-C Discrete case

We now restrict our attention to the discrete space $\mathbb{Z}^n$. We begin by recalling some basic definitions which are analogous to those provided above for the continuous setting.

A fundamental cell $\mathcal{V}$ associated with the lattice $\Lambda \subseteq \mathbb{Z}^n$ is a finite set such that any point $y \in \mathbb{Z}^n$ can be written in one and only one way as $y = \lambda + r$ where $\lambda \in \Lambda$ and $r \in \mathcal{V}$. We further define the corresponding quantizer and LQN as before.

Let $p$ be a prime number. The following notation will be used in the paper to denote componentwise modulo operations:

• For any scalar random variable $X$, $X^* = X \bmod p$.

• For any vector random variable $\mathbf{X}$, $\mathbf{X}^*$ is the result of reducing each component of $\mathbf{X}$ modulo $p$.

The discrete counterparts of Definition 1, Theorem 1, and Corollary 1 are:

###### Definition 2

A random variable $W$ with a discrete probability function $P_W(w)$ will be called $p$-permissible if $W$ takes values in $\mathbb{Z}_p$ and $P_W(w)$ is a strictly positive probability function, i.e.,

$$P_W(w) : \mathbb{Z}_p \longrightarrow [a_{\min}, 1),$$

where (with abuse of notation) $a_{\min} > 0$ is some arbitrarily small fixed value.

###### Theorem 2

Let $W$ be a $p$-permissible noise with a PDF $P_W(w)$ where $p$ is a prime number. Let $\mathbf{W}$ be drawn i.i.d. $\sim P_W(w)$. Then, there exists a sequence of integer-valued lattices and associated partitions such that the resulting lattice quantization noise, $\mathbf{U}$, satisfies

$$\lim_{n \to \infty} \frac{1}{n} D(\mathbf{U}^* \| \mathbf{W}) = 0. \tag{4}$$

###### Corollary 2

Let $W$ be a $p$-permissible noise with a PDF $P_W(w)$ where $p$ is a prime number. Let $\mathbf{W}$ be drawn i.i.d. $\sim P_W(w)$. Then there exists a sequence of lattices and associated lattice partitions such that the resulting lattice quantization noise, $\mathbf{U}$, satisfies

$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} D(U_i^* \| W) = 0. \tag{5}$$

The proof of Theorem 2 is based on defining an appropriate ensemble of lattices as in [3], and then defining quantization cells based on a typicality “metric”. The proof of Corollary 2 is given in Appendix C.

## III Ensemble of Lattices and Lattice Partition

We make use of the Loeliger ensemble of lattices [8] based on Construction A (see [1]). Let $k$, $n$, $p$ be integers such that $k \le n$ and let $p$ be a prime number. Let $G$ be a $k \times n$ generating matrix with elements in $\mathbb{Z}_p$. Then a lattice may be obtained from $G$ using Construction A as depicted in Figure 2. The construction consists of the following steps:

• Define the codebook $\mathcal{C} = \{ x = w \cdot G : w \in \mathbb{Z}_p^k \}$, where all the operations are modulo-$p$. Thus $\mathcal{C} \subseteq \mathbb{Z}_p^n$. The rate of the code (in bits per sample; all logarithms in this paper are taken to base 2), $R$, is defined by

$$R = \frac{\log p^k}{n} \tag{6}$$

• Replicate $\mathcal{C}$ over $\mathbb{Z}^n$ to form the lattice $\Lambda = \mathcal{C} + p\mathbb{Z}^n$. It is easy to show that $\Lambda$ is indeed a lattice, see, e.g., [1].

The random ensemble of lattices is generated by drawing each entry of the generating matrix $G$ according to a uniform i.i.d. distribution over $\mathbb{Z}_p$, resulting in a random codebook $\mathcal{C} = \{X_1, \ldots, X_M\}$ with $M = p^k$, and applying the steps described above. We note that with this notation (numbering), some of the codewords may be identical (when $G$ is not full rank). That is of no consequence to the analysis.
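The two steps of Construction A can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code: the function names are hypothetical, and the codebook is enumerated exhaustively, which is feasible only for small $p$ and $k$.

```python
import itertools
import math
import random

def random_generator(k, n, p, rng):
    """Draw a k x n generating matrix with i.i.d. uniform entries in Z_p."""
    return [[rng.randrange(p) for _ in range(n)] for _ in range(k)]

def construction_a_codebook(G, p):
    """Set of codewords w.G mod p for all w in Z_p^k (duplicates collapse)."""
    k, n = len(G), len(G[0])
    code = set()
    for w in itertools.product(range(p), repeat=k):
        x = tuple(sum(w[i] * G[i][j] for i in range(k)) % p for j in range(n))
        code.add(x)
    return code

def in_lattice(y, code, p):
    """An integer vector y is in C + pZ^n iff (y mod p) is a codeword."""
    return tuple(c % p for c in y) in code

def code_rate(k, n, p):
    """Rate in bits per sample, R = log2(p^k) / n."""
    return k * math.log2(p) / n

if __name__ == "__main__":
    G = random_generator(1, 2, 5, random.Random(0))
    code = construction_a_codebook(G, 5)
    print(sorted(code), code_rate(1, 2, 5))
```

Because membership depends only on the residue of a point modulo $p$, the lattice never has to be stored explicitly; the finite codebook suffices.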

We note that Construction A results in a lattice that is periodic with respect to the lattice $p\mathbb{Z}^n$, as can be seen in Figure 2. Thus, when analyzing the properties of the lattice, one may restrict attention to the basic cube (the region highlighted in Figure 2), i.e., to the region $\mathbb{Z}_p^n$. In particular, for any lattice $\Lambda$ and fundamental region $\mathcal{V}$, we define the "folded" fundamental region $\mathcal{V}^* = \mathcal{V} \bmod p$ as depicted in Figure 3. It is easy to see that $\mathcal{V}^*$ plays the same role with respect to the code $\mathcal{C}$ as $\mathcal{V}$ does with respect to $\Lambda$. That is, for each $y \in \mathbb{Z}_p^n$ one may write $y = (x + r) \bmod p$ where $x \in \mathcal{C}$ and $r \in \mathcal{V}^*$. Note also that in the same manner that specifying $\mathcal{V}$ induces the folded region $\mathcal{V}^*$, the converse is also true: specifying the region $\mathcal{V}^*$ with respect to the code $\mathcal{C}$ naturally induces the region $\mathcal{V}$ with respect to the lattice. Further, let $y$ be a vector in $\mathbb{Z}^n$. We observe that $[y \bmod \Lambda]^*$ (where the first modulo operation is performed with respect to $\mathcal{V}$) is equal to $y^* \bmod \mathcal{C}$ (where the modulo operation is performed with respect to $\mathcal{V}^*$). That is, the order of the modulo operations can be exchanged. We conclude that for a lattice obtained by Construction A, defining $\mathcal{V}$ is equivalent to defining $\mathcal{V}^*$. We may therefore focus on the latter task.

We now introduce the technique by which we partition $\mathbb{Z}_p^n$ for a given lattice and target PDF $P_W(w)$. We rely on typicality (see, e.g., [2]) and use the following notation:

• $A_\epsilon^{(n)}(W)$: The set of vectors in $\mathbb{Z}_p^n$ that are $\epsilon$-typical to $W$.

• $A_{\epsilon,W}^{(n)}(X,Y)$: The set of all pairs of vectors in $\mathbb{Z}_p^n$ such that their difference modulo $p$ is $\epsilon$-typical to the PDF of $W$.

We go over all points $y \in \mathbb{Z}_p^n$. For any $y$ not yet associated to a cell we associate it according to the following two possibilities:

1. There is no $x \in \mathcal{C}$ such that $(x, y) \in A_{\epsilon,W}^{(n)}(X,Y)$: We arbitrarily associate $y$ to the basic cell $\mathcal{V}^*$.

2. There exists at least one codeword $x \in \mathcal{C}$ such that $(x, y) \in A_{\epsilon,W}^{(n)}(X,Y)$: We choose one such codeword and add $(y - x) \bmod p$ to $\mathcal{V}^*$.

For any such vector, we also associate all the "coset members" to their respective cells. Thus, in each step we first associate the vector to the basic cell $\mathcal{V}^*$ and then map the coset members. We then apply the procedure again until all vectors have been associated. This procedure is depicted in Figure 3.
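The partition procedure above can be sketched in Python for very small parameters. The sketch makes several assumptions that are not in the paper: the codebook is a linear code given as a set of tuples, weak typicality is tested through the empirical log-likelihood, unmatched points are placed in the basic cell as they are, and all function names are hypothetical.

```python
import itertools
import math

def entropy(P):
    """Entropy in bits of a probability vector P over Z_p."""
    return -sum(q * math.log2(q) for q in P if q > 0)

def is_typical(r, P, eps):
    """Weak typicality: |-(1/n) log2 P(r) - H(W)| <= eps."""
    n = len(r)
    return abs(-sum(math.log2(P[s]) for s in r) / n - entropy(P)) <= eps

def build_folded_cell(code, P, p, n, eps):
    """Return (cell, matched): one representative per coset of the code,
    and the subset of representatives that passed the typicality test."""
    cell, matched, assigned = [], set(), set()
    for y in itertools.product(range(p), repeat=n):
        if y in assigned:
            continue
        r = None
        for x in sorted(code):                      # case 2: look for a match
            d = tuple((a - b) % p for a, b in zip(y, x))
            if is_typical(d, P, eps):
                r = d
                break
        if r is None:                               # case 1: no match
            r = y
        else:
            matched.add(r)
        cell.append(r)
        for x in code:                              # mark all coset members
            assigned.add(tuple((a + b) % p for a, b in zip(r, x)))
    return cell, matched

if __name__ == "__main__":
    code = {(0, 0), (1, 2), (2, 1)}                 # toy linear code over Z_3
    cell, matched = build_folded_cell(code, [0.8, 0.1, 0.1], 3, 2, 1.0)
    print(cell, matched)
```

Since the code is a group under addition modulo $p$, its cosets partition $\mathbb{Z}_p^n$, so the returned cell contains exactly one point per coset regardless of how many points were matched.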

## IV Proof of Theorem 2

We now show that indeed we obtain an ensemble of lattices and associated partitions such that the LQN has the desired properties, for almost all members of the ensemble. The main steps are as follows. Lemma 1 shows that for any vector $y$ in $\mathbb{Z}_p^n$ the probability of finding a codeword such that their difference (modulo $p$) is typical to $W$ goes to one as the dimension goes to infinity, with a proper choice for the rate of the codebooks. We conclude that there exists a specific sequence of codebooks that can match almost every point of $\mathbb{Z}_p^n$ and then show that such a sequence yields the desired LQN. In the sequel we form the partitioning letting $\epsilon$ decrease with dimension as:

$$\epsilon = \frac{1}{n}. \tag{7}$$

Any other vanishing function of $n$ would be appropriate.

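### IV-A Almost all points are matchable

For every code $\mathcal{C}$, let $\mathcal{C}' = (\mathcal{C} + D) \bmod p$ be a randomly shifted version of the codebook (i.e., $\mathcal{C}'$ is a random coset code) where $D$ is a random vector uniformly distributed over $\mathbb{Z}_p^n$. We denote the codewords of $\mathcal{C}'$ by $X'_i$, $i = 1, \ldots, M$. Thus, $X'_i = (X_i + D) \bmod p$.

For a given $y \in \mathbb{Z}_p^n$, we call the event in which there is no codeword in $\mathcal{C}'$ such that its difference from $y$ modulo $p$ is typical to $W$ a "bad event". Defining the indicator random variable

$$\zeta(y) = \begin{cases} 0, & \nexists\, X'_i \in \mathcal{C}' \ \text{s.t.}\ (X'_i, y) \in A_{\epsilon,W}^{(n)}(X,Y) \\ 1, & \text{o.w.,} \end{cases}$$

a bad event amounts to the event $\zeta(y) = 0$. The next lemma shows that with a proper choice of code rate, such "bad events" are rare.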

###### Lemma 1

Let $\mathcal{C}$ be a linear codebook of rate $R$ drawn from the random ensemble defined above and let $\mathcal{C}'$ be the induced random coset code. Let $y \in \mathbb{Z}_p^n$ be any given vector. Then, for a rate satisfying

$$R \ge \log p - H(W) + 2\epsilon \tag{8}$$

we have

$$\lim_{n \to \infty} \Pr(\zeta(y) = 0) = 0, \tag{9}$$

where the probability is averaged over all codebooks and over all shifting random vectors, and $\epsilon$ is as defined in (7).

The proof is given in Appendix A.

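Denote by $N_Y$ the number of vectors $y \in \mathbb{Z}_p^n$ that can be matched. By linearity of expectation, and since $\Pr(\zeta(y) = 1)$ does not depend on $y$, we note that

$$E[N_Y] = \sum_{y \in \mathbb{Z}_p^n} \Pr(\zeta(y) = 1) = |\mathbb{Z}_p^n| \cdot \Pr(\zeta(y) = 1),$$

where the expectation is over all codebooks and all shifting random vectors. By Lemma 1, taking $n$ to infinity we get

$$\lim_{n \to \infty} \frac{E[N_Y]}{|\mathbb{Z}_p^n|} = \lim_{n \to \infty} \Pr(\zeta(y) = 1) = 1.$$

Note that this result applies also to the original (non-shifted) ensemble (and to any other constant-shifted ensemble) due to symmetry. Thus, for any fixed shift vector, the same limit holds when the expectation is taken only over the lattice ensemble.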

An immediate consequence is that there exists a specific sequence of codebooks for which

$$\lim_{n \to \infty} \frac{N_Y}{|\mathbb{Z}_p^n|} = 1. \tag{10}$$

We focus our attention on such a sequence and consider the corresponding sequence of fundamental regions. Let us denote the set of matchable sequences in $\mathcal{V}^*$ by $\mathcal{V}^*_g$ and the non-matchable by $\mathcal{V}^*_b$. From the symmetrical construction of the cells and from (10) it follows that:

$$\lim_{n \to \infty} \frac{|\mathcal{V}^*_b|}{|\mathcal{V}^*|} = 0. \tag{11}$$

Thus almost all points in $\mathcal{V}^*$ are typical to $W$.

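### IV-B Convergence in divergence

We now show that the resulting LQN reduced modulo $p$, $\mathbf{U}^*$, asymptotically approaches the desired distribution. The construction suggested above creates $p^k$ cells, each with $p^{n-k}$ elements. Thus, $\mathbf{U}^*$ assumes one of $p^{n-k}$ values with equal probability. We now relate the entropy of $W$ to the volume of a cell. We take the rate of the code to satisfy (8) with equality, i.e.,

$$R = \log p - H(W) + 2\epsilon, \tag{12}$$

where $\epsilon$ is defined in (7). We thus have

$$\frac{1}{n} \log p^k = \log p - H(W) + 2\epsilon,$$

or equivalently

$$2^{n(H(W) - 2\epsilon)} = p^{n-k}.$$

We observe that

• For each $y \in \mathcal{V}^*$,

$$P_{\mathbf{U}^*}(\mathbf{U}^* = y) = 2^{-n(H(W) - 2\epsilon)}. \tag{13}$$

• For $y \in \mathcal{V}^*_g$,

$$P_{\mathbf{W}}(\mathbf{W} = y) \ge 2^{-n(H(W) + \epsilon)}, \tag{14}$$

by the definition of (weak) typicality (i.e., Definition (3.6) in [2]).

• For $y \in \mathcal{V}^*_b$, $P_{\mathbf{W}}(\mathbf{W} = y) \ge a_{\min}^n$. Defining

$$\alpha \triangleq -\log_2 a_{\min} - H(W), \tag{15}$$

it follows that

$$\Pr(\mathbf{W} = y) \ge 2^{-n(H(W) + \alpha)} \tag{16}$$

for any $y \in \mathcal{V}^*_b$.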

We thus have,

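$$\begin{aligned}
D(\mathbf{U}^* \| \mathbf{W}) &= \sum_{y \in \mathcal{V}^*} P_{\mathbf{U}^*}(\mathbf{U}^* = y) \log\left( \frac{P_{\mathbf{U}^*}(\mathbf{U}^* = y)}{P_{\mathbf{W}}(\mathbf{W} = y)} \right) \\
&\le \sum_{y \in \mathcal{V}^*_g} 2^{-n(H(W) - 2\epsilon)} \log\left( \frac{2^{-n(H(W) - 2\epsilon)}}{2^{-n(H(W) + \epsilon)}} \right) + \sum_{y \in \mathcal{V}^*_b} 2^{-n(H(W) - 2\epsilon)} \log\left( \frac{2^{-n(H(W) - 2\epsilon)}}{2^{-n(H(W) + \alpha)}} \right) \\
&= \left( 2^{n(H(W) - 2\epsilon)} - |\mathcal{V}^*_b| \right) \cdot 2^{-n(H(W) - 2\epsilon)} \cdot n(2\epsilon + \epsilon) + |\mathcal{V}^*_b| \cdot 2^{-n(H(W) - 2\epsilon)} \cdot n(\alpha + 2\epsilon) \\
&= |\mathcal{V}^*_b| \cdot 2^{-n(H(W) - 2\epsilon)} \cdot n(\alpha - \epsilon) + 3\epsilon n.
\end{aligned}$$

Dividing both sides by $n$, and recalling that $|\mathcal{V}^*| = 2^{n(H(W) - 2\epsilon)}$, we get

$$\frac{1}{n} D(\mathbf{U}^* \| \mathbf{W}) \le 3\epsilon + \frac{|\mathcal{V}^*_b|}{|\mathcal{V}^*|} \cdot (\alpha - \epsilon) \tag{17}$$

$$= \epsilon^*, \tag{18}$$

where

$$\epsilon^* = 3\epsilon + \frac{|\mathcal{V}^*_b|}{|\mathcal{V}^*|} \cdot (\alpha - \epsilon). \tag{19}$$

From (11) it follows that the second term of the r.h.s. of (19) vanishes as $n$ goes to infinity. In addition, it is clear (by its definition) that $\epsilon$ vanishes as well as $n$ goes to infinity and hence the same applies to $\epsilon^*$. This completes the proof of Theorem 2.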
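For small examples, the normalized divergence that is bounded in (17) can also be evaluated exactly. The following Python sketch is an illustration only: it assumes the folded cell is given as a list of tuples over $\mathbb{Z}_p$ and that the target probabilities are strictly positive, and the function name is hypothetical.

```python
import math

def lqn_divergence_per_dim(cell, P):
    """(1/n) * D(U* || W) in bits, where U* is uniform over `cell`
    and W is i.i.d. with strictly positive marginal probabilities P."""
    n = len(cell[0])
    log_u = -math.log2(len(cell))        # log-probability of each cell point
    total = 0.0
    for r in cell:
        log_pw = sum(math.log2(P[s]) for s in r)
        total += (log_u - log_pw) / len(cell)
    return total / n

if __name__ == "__main__":
    # Toy cell over Z_3 with n = 2 and a hypothetical target distribution.
    print(lqn_divergence_per_dim([(0, 0), (0, 1), (0, 2)], [0.8, 0.1, 0.1]))
```

When the cell is all of $\mathbb{Z}_p^n$ and the target is uniform, the function returns zero, which serves as a quick sanity check.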

## V Tying it all together

We turn to proving Theorem 1. We show how we may generate any desired continuous distribution, subject to mild regularity conditions, by building on the results derived for the discrete case. Let us denote the desired permissible continuous LQN by $W^c$ and its PDF by $f_{W^c}(w)$. First, divide $\mathcal{A}$ into small enough $\Delta$-size intervals, where $\Delta$ will be a constant value such that

$$\frac{|\mathcal{A}|}{\Delta} = \frac{2A}{\Delta} = p \tag{20}$$

is a prime number. The larger the value of $p$, the more refined the approximation for $f_{W^c}$ will be. We then use the following steps to construct the lattice and lattice partition:

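• Define the folded random variable $W^{c,*}$ by

$$W^{c,*} = W^c \bmod [0, 2A]. \tag{21}$$

• Define the quantized random variable $W$ by

$$W = y \quad \text{if} \quad y\Delta \le W^{c,*} < (y+1)\Delta, \tag{22}$$

where $y$ takes values in $\mathbb{Z}_p$. Denote the PDF of $W$ by $P_W(w)$.

• Generate the sequence of lattices and lattice partitions, $(\Lambda, \mathcal{V})$, as described in the previous section such that the associated LQN approaches an i.i.d. distribution with marginal PDF $P_W(w)$.

• Scale the lattice by $\Delta$, i.e.,

$$\Lambda^c = \Delta \cdot \Lambda. \tag{23}$$

• Define the continuous fundamental region, $\mathcal{V}^c$, by

$$\mathcal{V}^c = \Delta \cdot \mathcal{V} + I^n_\Delta, \tag{24}$$

where

$$I^n_\Delta = [0, \Delta)^n. \tag{25}$$

That is, we scale $\mathcal{V}$ by a factor of $\Delta$ and add a $\Delta$-size cube to each element.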
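The folding and quantization steps (21)-(22) can be sketched numerically. The Python snippet below is an illustration under stated assumptions, not the authors' procedure: the support is taken to be $[-A, A)$ so that folding moves the negative half to $[A, 2A)$, bin probabilities are computed by a midpoint rule, and the function name and the `samples_per_bin` parameter are hypothetical.

```python
def discretize_pdf(f, A, p, samples_per_bin=200):
    """Bin probabilities P_W(y), y = 0..p-1, of the folded and quantized
    variable, for a PDF f assumed to be supported on [-A, A)."""
    delta = 2.0 * A / p                       # bin width, so that 2A / delta = p
    P = []
    for y in range(p):
        acc = 0.0
        for j in range(samples_per_bin):
            # midpoint of the j-th sub-interval in folded coordinates [0, 2A)
            t = (y + (j + 0.5) / samples_per_bin) * delta
            w = t if t < A else t - 2.0 * A   # unfold back to [-A, A)
            acc += f(w)
        P.append(acc * delta / samples_per_bin)
    total = sum(P)
    return [q / total for q in P]             # renormalize numerical error

if __name__ == "__main__":
    # Hypothetical permissible PDF on [-1, 1): f(w) = 0.25 + 0.5 |w|.
    print(discretize_pdf(lambda w: 0.25 + 0.5 * abs(w), 1.0, 5))
```

The resulting probability vector plays the role of $P_W$ in the discrete construction; a larger prime $p$ gives narrower bins and a finer approximation of the continuous PDF.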

These steps are exemplified in Figures 4 and 5. Figure 4 depicts a discrete lattice where the codewords are designated with 'x'. The cell of some codeword is designated with dots. Figure 5 depicts the equivalent continuous lattice. Here the codewords are designated with 'x' while the cells are drawn with a solid line.

Let $\mathbf{U}^c$ be the resulting LQN. In Appendix B we show that indeed the construction yields an LQN whose PDF can approximate $f_{W^c}$ to any desired degree in the Kullback-Leibler divergence sense, thus completing the proof of Theorem 1.

## VI Simulation Results

In this section we demonstrate the theoretical results via simulation. Since for complexity reasons we are limited to using only small dimensions, we replaced the typicality criterion with the maximum likelihood criterion. In addition, instead of choosing a single lattice at random, we generated a number of codebooks at random and kept the best one under this criterion.
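A maximum likelihood assignment of this kind can be sketched as follows. This is a minimal illustration, assuming a codebook given as a set of tuples over $\mathbb{Z}_p$ and a strictly positive target distribution; the function name and the tie-breaking rule (first codeword in sorted order) are assumptions rather than details taken from the simulation.

```python
import math

def ml_remainder(y, code, P, p):
    """Among all codewords x, return the remainder (y - x) mod p
    with the highest likelihood under the i.i.d. target distribution P."""
    best, best_ll = None, -math.inf
    for x in sorted(code):
        d = tuple((a - b) % p for a, b in zip(y, x))
        ll = sum(math.log(P[s]) for s in d)   # log-likelihood of the remainder
        if ll > best_ll:
            best, best_ll = d, ll
    return best

if __name__ == "__main__":
    code = {(0, 0), (1, 2), (2, 1)}           # toy linear code over Z_3
    print(ml_remainder((1, 2), code, [0.8, 0.1, 0.1], 3))
```

Unlike the typicality test, this rule always returns a remainder, which is convenient in small dimensions where few vectors are typical.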

We considered the following cases:

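• Case 1, with $p = 37$:

$$f_{W_1}(w) = \begin{cases} 0.999/6, & w \in \{0, 1, 2, 34, 35, 36\} \\ 0.001/31, & \text{o.w.} \end{cases}$$

• Case 2, with $p = 37$:

$$f_{W_2}(w) = \begin{cases} 0.1427, & w \in \{0, 1, 2, 35, 36\} \\ 0.0951, & w \in \{3, 34\} \\ 0.0476, & w \in \{4, 33\} \\ 0.001/28, & \text{o.w.} \end{cases}$$

• Case 3, with $p = 7$:

$$f_{W_3}(w) = \begin{cases} 0.6, & w = 1 \\ 0.15, & w = 4 \\ 0.05, & w \in \{0, 2, 3, 5, 6\} \end{cases}$$

• Case 4: the PDF depicted in Figure 10.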

Figure 6 shows the lattice partition that was obtained for the “step like” distribution . The codewords are designated with ‘x‘. The cell of some codeword is designated with dots. As expected, the lattice cell has a shape of a square and the divergence from the desired distribution is .

Figure 7 shows the lattice partition obtained for $f_{W_2}$.

Figures 8 and 9 refer to the distribution of the third case. Figure 8 depicts the relative entropy corresponding to each value of the rate (corresponding to ). The optimal value was obtained for the rate that was the closest to as should be expected.

Figure 9 depicts the marginal distribution of each element of the LQN vector that was obtained using the optimal rate, which is in good agreement with $f_{W_3}$.

Finally, Figure 10 depicts the desired PDF . The simulation was run for dimension and with (for this is the optimal value for ). Figure 11 depicts the marginal distribution of the obtained LQN which is in good agreement with the desired one.

## VII Summary

It was demonstrated that subject to mild regularity conditions, lattice quantization noise may approach (asymptotically in the dimension) quite general distributions, with a proper choice of lattice and partitioning.

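## Appendix A Proof of Lemma 1

We note that by standard arguments the codewords of $\mathcal{C}'$ are pairwise independent and uniformly distributed over $\mathbb{Z}_p^n$. Define the indicator random variables

$$\gamma_i(y) = \begin{cases} 1, & (X'_i, y) \in A_{\epsilon,W}^{(n)}(X,Y) \\ 0, & \text{o.w.} \end{cases}$$

and note that they are also pairwise independent. We have

$$E[\gamma_i(y)] = b, \qquad \sigma^2[\gamma_i(y)] = b(1 - b)$$

and

$$E[\gamma_i(y)\gamma_j(y)] = \begin{cases} \sigma^2 + b^2, & i = j \\ b^2, & i \ne j \end{cases}$$

where

$$b = \Pr\left( (X'_i, y) \in A_{\epsilon,W}^{(n)}(X,Y) \right).$$

Note that $b$ is independent of the vector $y$. Since $y$ plays no role in the analysis, we thus omit it from the notation and use $\gamma_i$ below.

We note that $X'_i$ is drawn uniformly over $\mathbb{Z}_p^n$. Therefore, the difference $(y - X'_i) \bmod p$ is also distributed uniformly over $\mathbb{Z}_p^n$. Thus,

$$b = \frac{|A_\epsilon^{(n)}(W)|}{|\mathbb{Z}_p^n|} \ge \frac{(1 - \epsilon)\, 2^{n(H(W) - \epsilon)}}{p^n} = (1 - \epsilon)\, 2^{-n(\log p - H(W) + \epsilon)},$$

where the inequality follows from the definition of typicality (i.e., by Theorem 3.1.2 in [2]).

We denote by $M = 2^{nR}$ the total number of codewords in $\mathcal{C}'$ and bound the probability of a "bad event" (no match) by

$$\begin{aligned}
\Pr(\zeta(y) = 0) &= \Pr\left( \sum_{i=1}^{M} \gamma_i = 0 \right) \\
&= \Pr\left( \frac{1}{M} \sum_{i=1}^{M} \gamma_i - b = -b \right) \\
&\le \Pr\left( \left| \frac{1}{M} \sum_{i=1}^{M} \gamma_i - b \right| \ge b \right) \\
&\le \frac{1}{b^2} E\left[ \left( \frac{1}{M} \sum_{i=1}^{M} \gamma_i - b \right)^2 \right] \\
&= \frac{1}{b^2} \left( \frac{1}{M^2} \sum_{i=1}^{M} \sum_{j=1}^{M} E[\gamma_i \gamma_j] - b^2 \right) \\
&= \frac{1}{b^2 M^2} \sum_{i=1}^{M} \sum_{j=1}^{M} \left( E[\gamma_i \gamma_j] - b^2 \right) \\
&= \frac{1}{b^2 M^2} \left[ M \left( E[\gamma_1^2] - b^2 \right) + (M^2 - M) \left( E[\gamma_1 \gamma_2] - b^2 \right) \right] \\
&= \frac{M \sigma^2}{b^2 M^2} = \frac{b(1 - b)}{b^2 M} = \frac{1 - b}{bM} \\
&\le \frac{1}{bM} \le \frac{1}{(1 - \epsilon)\, 2^{nR}\, 2^{-n(\log p - H(W) + \epsilon)}} \\
&= (1 - \epsilon)^{-1}\, 2^{-n[R - (\log p - H(W) + \epsilon)]}
\end{aligned} \tag{26}$$

where the fourth transition is due to Chebyshev's inequality using the fact that $E\left[ \frac{1}{M} \sum_{i=1}^{M} \gamma_i \right] = b$. Thus, for any rate $R > \log p - H(W) + \epsilon$, $\Pr(\zeta(y) = 0) \to 0$ as $n$ goes to infinity.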

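## Appendix B Proof of Continuous Case

In this Appendix we show that the construction of lattices and lattice partitions described in Section V allows us to approach the desired distribution as closely as desired.

We note that each $t \in [0, 2A)^n$ can be uniquely written as $t = \Delta y + l$ where $y \in \mathbb{Z}_p^n$ and $l \in I^n_\Delta$. For every $t$ we denote the unique $y$ associated with it by $y_t$.

Consider the random vector $\mathbf{W}$ defined in (22). That is, each component $W_i$ is the quantization of $W_i^{c,*}$. We further note that

$$\begin{aligned}
f_{\mathbf{W}^{c,*}}(t) &= \Pr\left( \mathbf{W}^{c,*} \in \Delta y_t + I^n_\Delta \right) \cdot f_{\mathbf{W}^{c,*}}\left( t \mid \mathbf{W}^{c,*} \in \Delta y_t + I^n_\Delta \right) \\
&= \Pr(\mathbf{W} = y_t) \cdot f_{\mathbf{W}^{c,*}}(t \mid \mathbf{W} = y_t).
\end{aligned} \tag{27}$$

Let $\mathcal{V}^{c,*}$ be the "folded" continuous fundamental region and let $\mathbf{U}^{c,*}$ be the "folded" LQN. Note that $\mathcal{V}^{c,*} = \Delta \cdot \mathcal{V}^* + I^n_\Delta$. Define the quantized random variable $U$ by

$$U = y \quad \text{if} \quad y\Delta \le U^{c,*} < (y+1)\Delta, \tag{28}$$

where $y$ takes values in $\mathbb{Z}_p$, and consider the random vector $\mathbf{U}$ where each component $U_i$ is the quantization of $U_i^{c,*}$. Note that

$$f_{\mathbf{U}^{c,*}}(t) = \Pr(\mathbf{U} = y_t) \cdot f_{\mathbf{U}^{c,*}}(t \mid \mathbf{U} = y_t). \tag{29}$$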

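Define $\eta$ by:

$$\begin{aligned}
\eta^{-1} &= \min_{t, y} f_{W^{c,*}}(t \mid W = y) \\
&= \min_{t, y} \frac{f_{W^{c,*}}(t)}{\int_{y\Delta}^{(y+1)\Delta} f_{W^{c,*}}(s)\, ds} \\
&\ge \min_{y} \frac{1}{\Delta} \cdot \frac{\min_{t \in J(t,\Delta)} f_{W^{c,*}}(t)}{\max_{t \in J(t,\Delta)} f_{W^{c,*}}(t)} \\
&= \frac{1}{\Delta} \cdot r
\end{aligned} \tag{31}$$

where

$$y \in \{0, \ldots, p-1\} \tag{32}$$

and where

$$J(t, \Delta) = [y\Delta, (y+1)\Delta) \tag{33}$$

and

$$r = \min_{y} \frac{\min_{t \in J(t,\Delta)} f_{W^{c,*}}(t)}{\max_{t \in J(t,\Delta)} f_{W^{c,*}}(t)}.$$

Here the minimum over $t$ in the first two lines is taken over the bin $J(t, \Delta)$ associated with $y$, and the inequality holds because the integral over a bin is at most $\Delta$ times the maximum of the PDF over that bin.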

We observe that

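• For each $t \in \mathcal{V}^{c,*}$,

$$f_{\mathbf{U}^{c,*}}(t) = \Pr(\mathbf{U} = y_t) \cdot f_{\mathbf{U}^{c,*}}(t \mid \mathbf{U} = y_t) = 2^{-n(H(W) - 2\epsilon)} \cdot \Delta^{-n} = 2^{-n(H(W) - 2\epsilon + \log \Delta)}$$

where $\epsilon$ is defined in (7) and where we used (13) and the fact that $\mathbf{U}^{c,*}$ given $\mathbf{U} = y_t$ is uniformly distributed over $\Delta y_t + I^n_\Delta$.

• For each $t$ such that $y_t \in \mathcal{V}^*_g$,

$$f_{\mathbf{W}^{c,*}}(t) = \Pr(\mathbf{W} = y_t) \cdot f_{\mathbf{W}^{c,*}}(t \mid \mathbf{W} = y_t) \ge 2^{-n(H(W) + \epsilon)} \cdot \eta^{-n} = 2^{-n(H(W) + \epsilon + \log \eta)}$$

where $\epsilon$ is defined in (7) and where we used (14) and (31) to get the inequality.

• For each $t$ such that $y_t \in \mathcal{V}^*_b$,

$$f_{\mathbf{W}^{c,*}}(t) = \Pr(\mathbf{W} = y_t) \cdot f_{\mathbf{W}^{c,*}}(t \mid \mathbf{W} = y_t) \ge 2^{-n(H(W) + \alpha)} \cdot \eta^{-n} = 2^{-n(H(W) + \alpha + \log \eta)}$$

where $\alpha$ is defined in (15) and where we used (16) and (31).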

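Using the sequence of lattices proposed in Theorem 2 we obtain,

$$\begin{aligned}
D(\mathbf{U}^c \| \mathbf{W}^c) &= \int_{\mathcal{V}^c} f_{\mathbf{U}^c}(t) \log \frac{f_{\mathbf{U}^c}(t)}{f_{\mathbf{W}^c}(t)}\, dt \\
&= \int_{\mathcal{V}^{c,*}} f_{\mathbf{U}^{c,*}}(t) \log \frac{f_{\mathbf{U}^{c,*}}(t)}{f_{\mathbf{W}^{c,*}}(t)}\, dt \\
&= \sum_{y \in \mathcal{V}^*} \int_{l \in I^n_\Delta} f_{\mathbf{U}^{c,*}}(\Delta y + l) \log \frac{f_{\mathbf{U}^{c,*}}(\Delta y + l)}{f_{\mathbf{W}^{c,*}}(\Delta y + l)}\, dl \\
&\le \sum_{y \in \mathcal{V}^*_g} \int_{l \in I^n_\Delta} 2^{-n(H(W) - 2\epsilon + \log \Delta)} \log \frac{2^{-n(H(W) - 2\epsilon + \log \Delta)}}{2^{-n(H(W) + \epsilon + \log \eta)}}\, dl \\
&\quad + \sum_{y \in \mathcal{V}^*_b} \int_{l \in I^n_\Delta} 2^{-n(H(W) - 2\epsilon + \log \Delta)} \log \frac{2^{-n(H(W) - 2\epsilon + \log \Delta)}}{2^{-n(H(W) + \alpha + \log \eta)}}\, dl \\
&= \left( 2^{n(H(W) - 2\epsilon)} - |\mathcal{V}^*_b| \right) 2^{-n(H(W) - 2\epsilon)} \cdot n(2\epsilon - \log \Delta + \epsilon + \log \eta) \\
&\quad + |\mathcal{V}^*_b| \cdot 2^{-n(H(W) - 2\epsilon)} \cdot n(2\epsilon - \log \Delta + \alpha + \log \eta) \\
&= n\left( 3\epsilon + \log \frac{\eta}{\Delta} \right) + |\mathcal{V}^*_b| \cdot 2^{-n(H(W) - 2\epsilon)} \cdot n(\alpha - \epsilon).
\end{aligned} \tag{35}$$

Dividing both sides by $n$, we get

$$\frac{1}{n} D(\mathbf{U}^c \| \mathbf{W}^c) \le \epsilon^* + \log \frac{\eta}{\Delta}, \tag{36}$$

where $\epsilon^*$ is defined in (19). Therefore,

$$\lim_{n \to \infty} \frac{1}{n} D(\mathbf{U}^c \| \mathbf{W}^c) \le \log \frac{\eta}{\Delta} \le \log \frac{\Delta}{\Delta r} = -\log(r). \tag{37}$$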

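It remains to choose $\Delta$ such that $r$ is close enough to $1$. Note that since $W^c$ is permissible, its PDF is continuous and limited to the closed interval $\mathcal{A}$. Therefore, $f_{W^c}$ is uniformly continuous, i.e., for any $\theta > 0$ there exists $\delta > 0$ such that for any $t_1, t_2$ satisfying $|t_1 - t_2| < \delta$, it follows that $|f_{W^c}(t_1) - f_{W^c}(t_2)| < \theta$. Therefore, $r$ can be lower bounded by

$$r \ge \frac{a_{\min}}{a_{\min} + \theta} \tag{38}$$

where $a_{\min}$ is defined in (1). If we choose

$$\theta = a_{\min}(2^{\xi} - 1) \tag{39}$$

and set $\Delta < \delta$, then $-\log(r) \le \xi$ and by (37),

$$\limsup_{n \to \infty} \frac{1}{n} D(\mathbf{U}^c \| \mathbf{W}^c) = \xi. \tag{40}$$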

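## Appendix C Proof of Corollary

We prove Corollary 2 for the discrete case.

We first use the chain rule for relative entropy (see Theorem 2.5.3 in [2]):

$$\begin{aligned}
D(\mathbf{U} \| \mathbf{W}) &= D(P_{\mathbf{U}}(x) \| P_{\mathbf{W}}(x)) \\
&= \sum_{i=1}^{n} D\left( P_{\mathbf{U}}(x_i \mid x_1^{i-1}) \,\|\, P_{\mathbf{W}}(x_i \mid x_1^{i-1}) \right)
\end{aligned} \tag{41}$$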

Using Theorem 2.7.2 in [2] we get

 D(PU(xi|xi−1