Remote Source Coding under Gaussian Noise: Dueling Roles of Power and Entropy Power

05/16/2018 ∙ by Krishnan Eswaran, et al. ∙ EPFL

The distributed remote source coding (so-called CEO) problem is studied in the case where the underlying source has finite differential entropy and the observation noise is Gaussian. The main result is a new lower bound for the sum-rate-distortion function under arbitrary distortion measures. When specialized to the case of mean-squared error, it is shown that the bound exactly mirrors a corresponding upper bound, except that the upper bound has the source power (variance) whereas the lower bound has the source entropy power. Bounds exhibiting this pleasing duality of power and entropy power have been well known for direct and centralized source coding since Shannon's work.


I Introduction

In the CEO problem, there is an underlying source and encoders [1, 2, 3]. Each encoder gets a noisy observation of the underlying source. The encoders provide rate-limited descriptions of their noisy observations to a central decoder. The central decoder produces an approximation of the underlying source to the highest possible fidelity. This work studies the special case where the observation noise is additive Gaussian and independent between different encoders. When the underlying source is also Gaussian and the fidelity criterion is the mean-squared error, this problem is referred to as the quadratic Gaussian CEO problem and is well studied in the literature [4, 5, 6, 7, 8]. In the work presented here, we still consider additive Gaussian observation noises, but we allow the underlying source to be any continuous distribution constrained to having a finite differential entropy. We refer to this as the AWGN CEO problem. The contributions of the work are the following:

  • A new general lower bound is presented for the AWGN CEO problem with an arbitrary underlying source, not necessarily Gaussian, and subject to an arbitrary distortion measure. (Theorems 1 and 2.)

  • When specialized to the case of the mean-squared error distortion measure, the new lower bound is shown to closely match a known upper bound. In fact, both bounds assume the same shape, except that the lower bound has the entropy power whereas the upper bound has the source power (variance). This parallels the well-known Shannon lower bound for the standard rate-distortion function under mean-squared error. (Corollaries 1 and 2.)

  • The strength of the new bounds is that they reflect the correct behavior as a function of the number of agents. This fact is leveraged and illustrated in two follow-up results. The first characterizes the rate loss in the CEO problem, i.e., the rate penalty of distributed versus centralized encoding (Theorem 3). The second pertains to a network joint source-channel coding problem (more specifically, a simple model of a sensor network), given in Theorem 4.

The underpinnings of the new bounds leverage and extend work by Oohama [6], by Wagner and Anantharam [9, 10] and by Courtade [11].

We also note that there is a wealth of work about further versions of the CEO problem. Strategies are explored in [12]. The case of so-called log-loss is addressed in [13]. There is also an interesting connection between the CEO problem and the problem of so-called “nomadic” communication and oblivious relaying, where one strategy is for intermediate nodes to compress their received signals [14, 15].

Notation

All logarithms in this paper are natural. Random variables will be denoted by upper case letters, and random vectors by boldface upper case letters. For every subset of indices, we will use the corresponding subscript to denote the sub-vector of those components whose indices lie in that subset; moreover, the complement of a set within the full index set is denoted in the usual way. Given a random variable with a density, its variance, its differential entropy, and its entropy power are denoted as usual, the latter being given by

(1)

and we recall that for Gaussian random variables the entropy power equals the variance. Finally, we will use the usual three-term notation to denote Markov chains, i.e., the statement that the first and the third variable are conditionally independent given the second.
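For concreteness, we recall the standard definition invoked in Equation (1); the symbols N(X), h(X), and σ_X² used below are our own shorthand rather than notation fixed by the text above:

    N(X) = \frac{1}{2\pi e}\, e^{2 h(X)},
    \qquad\text{and for } X \sim \mathcal{N}(\mu, \sigma_X^2):\quad
    h(X) = \tfrac{1}{2}\log\bigl(2\pi e\, \sigma_X^2\bigr)
    \;\Longrightarrow\; N(X) = \sigma_X^2 .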

II CEO Problem Statement

II-A The CEO Problem

The CEO problem is a standard problem in multi-terminal information theory. For completeness, we include a brief formal problem statement here. An underlying source is modeled as a string (of a given blocklength) of independent and identically distributed (i.i.d.) continuous random variables, following the terminology in [16, p. 243]. Throughout this study, we assume that the corresponding entropy power is non-zero and finite. The source is observed by the encoding terminals through a broadcast channel. The observation sequences are separately encoded with the goal of finding an estimate of the source subject to a distortion constraint.







Fig. 1: The multi-agent AWGN CEO problem. The underlying source is arbitrary (not necessarily Gaussian), with a given variance (power) and entropy power. The observation noises are independent and Gaussian.

A code for the CEO problem consists of

  • encoders, where each encoder assigns an index to each of its observed sequences, and

  • a decoder that assigns an estimate to each index tuple

A rate-distortion tuple is said to be achievable if there exists a sequence of codes with

(2)

where is a (single-letter) distortion measure (see [16, p. 304]). In much of the present paper, we restrict attention to the case of the mean-squared error distortion measure, i.e.,

(3)

The rate-distortion region for the CEO problem is the closure of the set of all tuples that are achievable. In the present study, we are mostly interested in the minimum sum-rate, i.e., the quantity defined as

(4)
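In the shorthand assumed here (the symbols R_sum(D) and R_k are ours, with M encoders), the quantity defined in Equation (4) is the usual minimum sum-rate:

    R_{\mathrm{sum}}(D) \;=\; \min\Bigl\{ \textstyle\sum_{k=1}^{M} R_k \;:\; (R_1, \ldots, R_M, D) \text{ is achievable} \Bigr\}.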

II-B The special case of a single encoder

In the special case of a single encoder, the CEO problem is referred to as the remote source coding problem. This problem dates back to Dobrushin and Tsybakov [17] as well as, for the case of additive noise, to Wolf and Ziv [18]. We will use a dedicated notation for the corresponding rate-distortion function in this case. Here, it is well known that (see e.g. [19, Sec. 3.5])

(5)
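For orientation, the characterization referenced in Equation (5) is the standard remote rate-distortion form; with assumed symbols (X the source, Y its noisy observation, X̂ the reconstruction, d the distortion measure), it reads

    R^{\mathrm{remote}}(D) \;=\; \min_{p(\hat{x} \mid y)\,:\; \mathbb{E}\, d(X, \hat{X}) \le D} I(Y; \hat{X}).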

II-C The AWGN CEO Problem

In much of the present study, we are concerned with the case where the source observation process is given by a Gaussian broadcast channel. In that case, each observation is the source plus an independent zero-mean Gaussian noise term. We will refer to this as the AWGN CEO problem, illustrated in Figure 1. Moreover, when the distortion measure of interest is the mean-squared error, we mimic standard terminology and refer to this as the quadratic AWGN CEO problem.

For the AWGN CEO problem, it will be convenient to use the following shorthand. For any subset the sufficient statistic for given can be expressed as

(6)
(7)

where

(8)

is a zero-mean Gaussian random variable whose variance is expressed in terms of the harmonic mean of the noise variances in the set, that is,

(9)

In the special case where the subset comprises all the agents, we will use the notation

(10)
(11)

where denotes the harmonic mean of all the noise variances and

(12)

respectively. Hence, is a zero-mean Gaussian random variable of variance
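To make the shorthand concrete, the following small numerical sketch (in Python; the function names and the specific numbers are ours, and the snippet is illustrative rather than code from the paper) verifies the standard fact behind Equations (6)-(9): the precision-weighted average of the noisy observations in a subset behaves like the source plus a single Gaussian noise whose variance is the harmonic mean of the subset's noise variances divided by the subset size.

    import numpy as np

    def harmonic_mean(variances):
        """Harmonic mean of the noise variances in a subset (the quantity in Equation (9))."""
        v = np.asarray(variances, dtype=float)
        return len(v) / np.sum(1.0 / v)

    def sufficient_statistic(y, variances):
        """Precision-weighted average of noisy observations Y_k = X + W_k, W_k ~ N(0, variances[k])."""
        y = np.asarray(y, dtype=float)
        w = 1.0 / np.asarray(variances, dtype=float)          # precisions
        w = w.reshape((-1,) + (1,) * (y.ndim - 1))            # align precisions with the agent axis
        return np.sum(w * y, axis=0) / np.sum(w)

    # Numerical check: the statistic equals X plus a Gaussian noise whose variance
    # is harmonic_mean(variances) / (number of agents).
    rng = np.random.default_rng(0)
    variances = np.array([1.0, 2.0, 4.0])
    n = 200_000
    x = rng.laplace(size=n)                                   # a non-Gaussian source, for illustration
    y = x + rng.normal(size=(3, n)) * np.sqrt(variances)[:, None]
    t = sufficient_statistic(y, variances)
    print(np.var(t - x), harmonic_mean(variances) / len(variances))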

III The Shannon Lower Bound and Its Extensions

The Shannon lower bound concerns the rate-distortion function for an arbitrary (not necessarily Gaussian) source subject to mean-squared error distortion. It states that

(13)

At the same time, a maximum entropy argument provides an upper bound to the same rate-distortion function:

(14)

These results date back to [20] (see also [19, Eqns. (4.3.32) and (4.3.42)] or [16, p. 338]). Part of their appeal is the pleasing duality between the source power and its entropy power. This also directly implies their tightness in the case where the underlying source is Gaussian, since power and entropy power are equal in that case. As a side note, tangential to the discussion presented here, we point out that the (generalized) Shannon lower bound is not generally tight for Gaussian vector sources, see e.g. [21].
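Written side by side in an assumed notation (R_X(D) for the rate-distortion function of the source X under mean-squared error, σ_X² for its variance, N(X) for its entropy power), the two classical bounds referenced as (13) and (14) make the power/entropy-power duality explicit:

    \tfrac{1}{2}\log\frac{N(X)}{D} \;\le\; R_X(D) \;\le\; \tfrac{1}{2}\log\frac{\sigma_X^2}{D},
    \qquad 0 < D \le \sigma_X^2,

and the two sides coincide exactly when X is Gaussian, since then N(X) = σ_X².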

One can extend this result rather directly to the case of the remote rate-distortion function, i.e., the CEO problem with a single encoder as defined above in Section II-B. Specifically, the remote rate-distortion function subject to mean-squared error satisfies the bounds (see Appendix A)

(15)

for where For the special case of additive source observation noise, that is, where and are independent, one can obtain a more explicit pair of bounds by observing that (see Appendix A)

(16)

Combining Inequalities (15) and (16), we obtain the slightly weakened lower bound, for

(17)

and the upper bound, for

(18)

A second type of lower bound of a similar flavor can be derived from entropy power inequalities (EPI). For these bounds to work, we restrict attention to the case of the AWGN CEO problem as defined above, i.e., the scenario where the underlying source is observed under independent zero-mean Gaussian noise. Again, we denote the noisy source observation as before. Moreover, let us consider an arbitrary distortion measure, along with the (regular) rate-distortion function of the source subject to that distortion measure. Then, a lower bound to the remote rate-distortion function subject to that arbitrary distortion measure is (see Appendix A)

(19)

for satisfying

Moreover, if the following inequality can be satisfied

(20)

where the added noise term is an independent zero-mean Gaussian random variable, and the minimum is over all real-valued, measurable functions, then an upper bound is (see Appendix A)

(21)

When we restrict attention to the case of mean-squared error distortion, we can obtain the following more explicit form for the lower bound, for

(22)

and for the upper bound, for

(23)

Proofs of Inequalities (22)-(23) are provided in Appendix A. It is tempting to compare the lower bounds in Inequalities (17) and (22), but there does not appear to be a simple relationship.

IV Main Results

IV-A General Lower Bound

Our main result is the following lower bound:

Theorem 1.

For the multi-agent AWGN CEO problem with an arbitrary continuous underlying source constrained to having finite differential entropy, subject to an arbitrary distortion measure, if a rate-distortion tuple is achievable, then there must exist non-negative real numbers such that for every (strict) subset we have

(24)

and for the full set we have where and are defined in Equations (7) and (9), respectively, and denotes the (regular) rate-distortion function of the source with respect to the distortion measure

The proof of this theorem is given in Appendix B-A.

Remark 1.

Note that the argument inside the logarithm in Equation (24) is suitably lower bounded for all non-negative choices of the parameters, as explained in the proof in Appendix B-A, making the expression well-defined.

In the next corollary, we specialize Theorem 1 to the case of the mean-squared error distortion measure, a case for which we have a closely matching upper bound.

Corollary 1.

For the multi-agent AWGN CEO problem with an arbitrary continuous underlying source constrained to having finite differential entropy, subject to the mean-squared error distortion measure, if a rate-distortion tuple is achievable, then there must exist non-negative real numbers such that for every (strict) subset we have

(25)

and for the full set we have where and are defined in Equations (7) and (9), respectively.

For achievability, if there exist non-negative real numbers such that for every (strict) subset we have

(26)

and for the full set we have then we have that

For the proof of this corollary, we note that Inequality (25) follows directly by combining Theorem 1 with the Shannon lower bound, Inequality (13). The proof of the achievability part, Inequality (26), follows from the work of Oohama [5, 6]. We briefly comment on this in Appendix B-C.

Comparing Inequalities (25) and (26), we observe a pleasing duality of the source power and its entropy power: to go from the lower bound to the upper bound, it suffices to replace all entropy powers by the corresponding power (variance) of the same random variable. This fact directly implies tightness for the case where the underlying source is Gaussian, which of course is well known [6]. The bounds also imply that for fixed source entropy power, the Gaussian is a best-case source, and for fixed source power (variance), it is a worst-case source.
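The best-case/worst-case statement in the last sentence rests on the maximum-entropy property of the Gaussian distribution; in the assumed notation,

    N(X) \;\le\; \sigma_X^2, \qquad \text{with equality if and only if } X \text{ is Gaussian,}

so for a fixed variance the entropy power is largest (and the lower bound closest to the upper bound) exactly when the source is Gaussian.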

The same kind of duality can be observed in the Shannon lower bound in Inequalities (13)-(14). It also appears in the extensions given in Inequality (15), in Inequalities (17)-(18), and again in Inequalities (22)-(23).

IV-B Sum-rate Lower Bound For Equal Noise Variances

From Theorem 1, we can obtain the following more explicit bound on the sum rate in the case when all observation noise variances are equal:

Theorem 2.

For the multi-agent AWGN CEO problem with an arbitrary continuous underlying source constrained to having finite differential entropy, with equal observation noise variances, and subject to an arbitrary distortion measure, the sum-rate distortion function is lower bounded by

(27)

for satisfying where is defined in Equation (11), and denotes the (regular) rate-distortion function of the source with respect to the distortion measure

The proof of this theorem is given in Appendix B-B.

When we further specialize to the case of the mean-squared error distortion measure, then our lower bound takes the same shape as a well-known achievable coding strategy, except that the lower bound has entropy powers where the upper bound has powers (variances). Specifically, we have the following result:

Corollary 2.

For the multi-agent AWGN CEO problem with an arbitrary continuous underlying source constrained to having finite differential entropy, with equal observation noise variances, and subject to mean-squared error distortion, the CEO sum-rate distortion function is lower bounded by

(28)

for Moreover, in this case, the CEO sum-rate distortion function is upper bounded by

(29)

for where is defined in Equation (11).

The proof of this corollary is given in Appendices B-B and B-C.

Remark 2.

We point out that the expression could be stated more compactly, but we prefer to leave it in the shape given in the above corollary in order to emphasize the duality of the upper and the lower bounds.

To illustrate the power of the presented bounds in a formal way, we will restrict attention to the class of source distributions characterized through the quantity

(30)

where the auxiliary random variable is a zero-mean unit-variance Gaussian, independent of the source. Note that in the special case where the source itself is Gaussian, this quantity takes a particularly simple value. For starters, let us suppose that the distortion is a constant, independent of the number of agents. In this case, it can be verified that both Equations (28) and (29) tend to constants as the number of agents becomes large, and we have (see Appendix B-D)

(31)

Note that the right-hand side can also be expressed as a Kullback-Leibler divergence between the source density and a Gaussian probability density function with the same mean and variance. This illustrates how the gap between the upper and the lower bound narrows as the source gets closer to a Gaussian distribution (an explicit identity is given after this paragraph). Arguably a more interesting regime in the CEO problem is when the distortion decreases as a function of the number of agents: the more observations we have, the lower a distortion we should ask for. A natural scaling is to require the distortion to decay inversely proportionally to the number of agents. Specifically, let us consider a distortion that scales in this way, with a proportionality constant that does not depend on the number of agents. Then, it is immediately clear that both the upper and the lower bound in Corollary 2 increase linearly with the number of agents. But how does their gap behave? This is a slightly more subtle question. We can show that, for any number of agents, the difference between Equations (28) and (29) is upper bounded by (see Appendix B-D)

(32)

which does not depend on the number of agents. Hence, when interpreted as a function of the number of agents, the bounds of Corollary 2 capture the behavior rather tightly.
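The Kullback-Leibler expression mentioned after Equation (31) rests on a short, generic identity, which we record here for convenience: for any random variable X with density f_X, mean μ, variance σ_X², and entropy power N(X), and with g the Gaussian density of the same mean and variance (all symbols here are ours),

    D(f_X \,\|\, g)
    = -h(X) - \int f_X(x) \log g(x)\, dx
    = -h(X) + \tfrac{1}{2}\log\bigl(2\pi e\, \sigma_X^2\bigr)
    = \tfrac{1}{2}\log\frac{\sigma_X^2}{N(X)},

which vanishes precisely when the source is Gaussian, consistent with the narrowing gap described above.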

IV-C Rate Loss for the quadratic AWGN CEO problem

In this section, we restrict attention to the case of the quadratic AWGN CEO problem. The rate loss is the difference between the coding rate needed in the distributed coding scenario of Figure 1 and the coding rate that would be required if the encoders could fully cooperate. If the encoders fully cooperate, the resulting problem is precisely a remote rate-distortion problem as defined in Section II-B, where the source is observed in zero-mean Gaussian noise. This follows directly from the observation that the statistic defined in Equation (11) is a sufficient statistic for the underlying source given all the noisy observations. As before, we denote the remote rate-distortion function accordingly, and the rate loss is the corresponding difference. It is known that the rate loss is maximal when the underlying source is Gaussian [22, Proposition 4.3]. For example, in the case where the distortion is required to decrease inversely proportionally to the number of agents, the rate loss increases linearly as a function of the number of agents and is thus very substantial. If the source is not Gaussian, might we end up with a much more benign rate loss? Restricting again to sources of non-zero entropy power in the class defined via Equation (30), we can show that the answer to this question is no. This follows directly from the bounds established in this paper. Specifically, we have the following statement:

Theorem 3.

For the multi-agent AWGN CEO problem with an arbitrary continuous underlying source constrained to having finite differential entropy, with equal observation noise variances, and subject to mean-squared error distortion, letting the distortion be parameterized as

(33)

where satisfies

(34)

and where is defined in Equation (11), the rate loss of distributed coding versus centralized coding is at least

(35)

where, in addition, we note that

The proof is given in Appendix C. Note that the rate loss has to be non-negative, hence our formula can be slightly improved by only keeping the positive part. We prefer not to clutter our notation with this since it becomes immaterial as soon as the number of agents gets large.

While the bound of Theorem 3 is valid for all choices of the parameters, it is arguably most interesting when interpreted as a function of the number of agents. When the parameter in Equation (33) is a constant independent of the number of agents, and thus the distortion decreases inversely proportionally to the number of agents, it is immediately clear that the rate loss increases linearly with the number of agents.

V Joint Source-Channel Coding







Fig. 2: A network joint source–channel coding problem inspired by the CEO problem. The underlying source is arbitrary (not necessarily Gaussian), with a given variance (power) and entropy power. Each encoder can produce a codeword of bounded average power, which is then transmitted over a standard symmetric additive white Gaussian noise multiple-access channel.

One important application of the new bound presented here is to network joint source-channel coding.

V-A Problem Statement

The “sensor” network considered in this section is illustrated in Figure 2. The underlying source and the source observation process are exactly as in the AWGN CEO problem defined above, and we will only consider the simple symmetric case where all observation noise variances are equal. Additionally, in the present section, we restrict attention to the class of source distributions introduced via Equation (30).

With reference to Figure 2, each encoder can apply an arbitrary sequence of real-valued coding functions to its observation sequence so as to generate a sequence of channel inputs,

(36)

The only constraint is that the functions be chosen to ensure that

(37)

for For the channel outputs are given by

(38)

where the channel noise is an i.i.d. sequence of zero-mean Gaussian random variables. Upon observing the channel output sequence, the decoder (or fusion center) must produce an estimate sequence. A power-distortion pair is said to be achievable if there exists a sequence of sets of encoding and decoding mappings (indexed by the blocklength) with

(39)

The power-distortion region for this network joint source-channel coding problem is the closure of the set of all achievable power-distortion pairs.

V-B Main Result

The main result of this section is an assessment of the performance of digital communication strategies for the communication problem illustrated in Figure 2. To put this in context, it is important to recall the so-called source-channel separation theorem due to Shannon, see e.g. [16, Sec. 7.13]. For stationary ergodic point-to-point communication, this theorem establishes that it is without fundamental loss of optimality to compress the source to an index (that is, a bit stream) and then to communicate this index in a reliable fashion across the channel using capacity-approaching codes. Such strategies are commonly known as digital communication and are the underpinnings of most of the existing communication systems.

It is well-known that source-channel separation is suboptimal in network communication settings, see e.g. [16, p. 592]. This suboptimality can be very substantial. Specifically, for the example scenario as in Figure 2, but where the underlying source is Gaussian, it was shown in [23, Sec. 5.4.6] that the suboptimality manifests itself as an exponential gap in scaling behavior when viewed as a function of the number of nodes in the network. (In fact, for this special case, the optimal performance was characterized precisely in [24].) Could this gap be less dramatic for sources that are not Gaussian? The new bounds established in the present paper allow us to answer this question in the negative. Specifically, we have the following result:

Theorem 4.

For the joint source-channel network considered in this section, if each encoder first compresses its noisy source observations into an index using the optimal CEO source code, and this index is then communicated reliably over the multiple-access channel, the resulting power-distortion region must satisfy

(40)

By contrast, there exists an (analog) communication strategy that incurs a distortion of

(41)

A proof of this theorem is given in Appendix D.

The insight of Theorem 4 lies in the comparison of Inequality (40) with Equation (41), namely, the dependence of the attainable distortion on the number of agents. As one can see, for digital architectures, characterized by Inequality (40), the distortion decreases inversely proportionally to the logarithm of the number of agents. By contrast, from Equation (41), there is a scheme for which the decrease is inversely proportional to the number of agents itself. This represents an exponential gap in the scaling-law behavior. In other words, in order to attain a certain fixed desired distortion level, the number of agents needed in a digital architecture is exponentially larger than the corresponding number for a simple analog scheme. Hence, the bounds presented here imply that the exponential suboptimality of digital coding strategies observed in [24, Thm. 1 versus Thm. 2] continues to hold for a large class of underlying sources with non-zero entropy power.
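To make the scaling comparison tangible, the following small simulation (in Python) is entirely our own illustrative sketch under simple assumptions (Gaussian unit-variance source, equal observation-noise variances, unit channel-noise variance), and not the scheme analyzed in Appendix D. It implements the simplest analog strategy: each encoder scales its noisy observation to meet the power constraint and transmits it uncoded over the multiple-access channel, and the decoder forms a linear MMSE estimate from the channel output. The measured distortion decays roughly like the inverse of the number of agents, in line with the behavior of Equation (41).

    import numpy as np

    def analog_forwarding_mse(M, P=1.0, source_var=1.0, obs_var=1.0, channel_var=1.0,
                              n=100_000, seed=0):
        """Empirical MSE of uncoded (amplify-and-forward) transmission with M agents."""
        rng = np.random.default_rng(seed)
        x = rng.standard_normal(n) * np.sqrt(source_var)      # source samples
        # Sum of the M noisy observations; the M observation noises are drawn in one
        # shot, since their sum is Gaussian with variance M * obs_var.
        obs_sum = M * x + rng.standard_normal(n) * np.sqrt(M * obs_var)
        a = np.sqrt(P / (source_var + obs_var))               # per-encoder scaling meets the power constraint
        z = a * obs_sum + rng.standard_normal(n) * np.sqrt(channel_var)   # MAC output
        # Linear MMSE coefficient, estimated empirically here; in practice it follows
        # from the known second-order statistics.
        coeff = np.mean(x * z) / np.mean(z * z)
        return np.mean((x - coeff * z) ** 2)

    for M in (1, 10, 100, 1000):
        print(M, analog_forwarding_mse(M))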

Acknowledgements

The authors acknowledge helpful discussions with Aaron Wagner, Vinod Prabhakaran, Paolo Minero, Anand Sarwate, and Bobak Nazer, who were all part of the Wireless Foundations Center at the University of California, Berkeley, when this work was started. They also thank Michèle Wigger for comments on the manuscript.

Appendix A Proofs for Section III

A-A Proofs of Inequalities (15) and (16)

For Inequality (15), we start by considering the (remote) distortion-rate function, that is, the dual version of the minimization problem in Equation (5), which can be expressed as

by the properties of the conditional expectation. We thus obtain

(42)

where for the last step, the data processing inequality implies that and hence, the second minimum cannot evaluate to something larger than the first. Since is a deterministic function of we have that for the minimizing in the second minimum, it holds that Hence, the two minima are equal. Conversely, we can thus write

(43)

where the rate-distortion function (under mean-squared error) is that of the conditional expectation of the source given the observation. The claimed lower and upper bounds now follow from Inequalities (13) and (14), applied to that random variable.
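The argument just given is the standard reduction of the remote problem to a direct one. In an assumed notation, writing X̃ = E[X | Y] for the conditional-mean estimate and mmse = E[(X − X̃)²] for its error, it can be summarized as

    R^{\mathrm{remote}}(D) \;=\; R_{\tilde{X}}\bigl(D - \mathrm{mmse}\bigr),
    \qquad D > \mathrm{mmse},

so that the bounds (13) and (14), applied to X̃, yield the two sides of Inequality (15).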

For Inequality (16), the upper bound is simply the distortion incurred by the best linear estimator. For the lower bound, observe that since, by assumption, the source can be recovered to within the stated distortion from the observation, we must have

(44)

Under mean-squared error distortion, we know from Inequality (13) that Combining this with the above, we obtain

(45)

First, let us restrict to the case where In this case, we can further conclude that

(46)

Observing that we can rewrite this as

(47)

which is exactly the claimed bound. Conversely, suppose that By the entropy power inequality, we have that meaning that the left-hand side of Inequality (16) evaluates to something no larger than Since we assumed that the claimed lower bound applies in this case, too.

A-B Proofs of Inequalities (19)-(23)

Lower Bounds

Recall that here, we are assuming that the observation noise is Gaussian. Then, the lower bound in Inequality (19) can be established e.g. as a consequence of [11, Thm.1], as follows.

(48)
(49)

where the inequality is due to [11, Thm. 1] and the fact that, by construction, the required Markov chain holds. Next, we observe that, by definition, the denominator stays non-negative as long as the distortion is suitably bounded. For such values we thus have

(50)
(51)

Finally, since for all values of we have we obtain

(52)

For the lower bound in Inequality (22), it suffices to lower bound in Inequality (19) using Inequality (13).

Upper Bounds

For the upper bound in Inequality (21), let us consider where is Gaussian. Now, let us suppose that can be chosen in such a way that

(53)

Then, from the definition of the remote rate-distortion function (Equation (5)), we find

(54)
(55)
(56)
(57)
(58)

where the second inequality is a standard maximum-entropy argument. To bring out the similarity to the corresponding lower bound, we reparameterize accordingly. For the upper bound in Inequality (23), we now observe that, under mean-squared error distortion, as long as the distortion is not too large, we may choose

(59)

or, equivalently, To see that this is a valid choice satisfying the restriction of Equation (20), it suffices to observe that

(60)

and thus the restriction of Equation (20) is satisfied. Finally, in the remaining case, the upper bound in Inequality (23) evaluates to zero, which is trivially a correct bound, too.

Appendix B Proofs for Section IV

B-A Proof of Theorem 1

The starting point for our lower bound is an outer bound introduced by Wagner and Anantharam [9, 10]. To state this bound, we write the vector of noisy observations as usual and collect the elements whose indices lie in a given subset of the index set into a vector

(61)

and likewise, we introduce the auxiliary random vector and again collect the elements whose indices lie in a given subset of the index set into a vector

(62)

Then, the following statement applies.

Theorem 5.

For each agent, denote the rate of its description as usual. There must exist a set of random variables such that, for all subsets,

(63)

where is the set of sets of random variables satisfying and

is independent of ,

for all ,

, and

the conditional distribution of given and is discrete for each .

For a proof of this theorem, see [9, p. 109] or [10, Theorem 1, Appendix D, and start of the proof of Proposition 6]. Strictly speaking, in that proof, both the source and the observation noises are assumed to be Gaussian, but all arguments continue to hold for sources of finite differential entropy observed in Gaussian noise.

From this theorem, the following corollary will be of specific interest to our development:

Corollary 3.

There must exist a set of random variables such that for all subsets

(64)
Proof.

Condition in Theorem 5 implies that Moreover, observe that

(65)

and since we have

To establish our lower bound, we start by considering the following lemma. This is a generalization of the lemma proved by Oohama [6] to the case of non-Gaussian sources.

Lemma 6.

Let and . Then

(66)
Proof.

Since the corresponding independence holds whenever the relevant condition in Theorem 5 is satisfied, the Markov chain is preserved when we condition on any realization of the auxiliary random variables. Therefore, we can again use Theorem 1 of [11] to infer