Robust Distributed Compression of Symmetrically Correlated Gaussian Sources

07/18/2018 ∙ by Yizhong Wang, et al. ∙ 0

Consider a lossy compression system with ℓ distributed encoders and a centralized decoder. Each encoder compresses its observed source and forwards the compressed data to the decoder for joint reconstruction of the target signals under the mean squared error distortion constraint. It is assumed that the observed sources can be expressed as the sum of the target signals and the corruptive noises, which are generated independently from two symmetric multivariate Gaussian distributions. Depending on the parameters of such distributions, the rate-distortion limit of this system is characterized either completely or at least for sufficiently low distortions. The results are further extended to the robust distributed compression setting, where the outputs of a subset of encoders may also be used to produce a non-trivial reconstruction of the corresponding target signals. In particular, we obtain in the high-resolution regime a precise characterization of the minimum achievable reconstruction distortion based on the outputs of k+1 or more encoders when every k out of all ℓ encoders are operated collectively in the same mode that is greedy in the sense of minimizing the distortion incurred by the reconstruction of the corresponding k target signals with respect to the average rate of these k encoders.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 1

page 2

page 3

page 4

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

I Introduction

Consider a wireless sensor network where potentially noise-corrupted signals are collected and forwarded to a fusion center for further processing. Due to the communication constraints, it is often necessary to reduce the amount of the transmitted data by local pre-processing at each sensor. Though the multiterminal source coding theory, which aims to provide a systematic guideline for the implementation of such pre-processing, is far from being complete, significant progress has been made over the past few decades, starting from the seminal work by Slepian and Wolf on the lossless case [1] to the more recent results on the quadratic Gaussian case [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17]. Arguably the greatest insight offered by this theory is that one can capitalize on the statistical dependency among the data at different sites to improve the compression efficiency even when such data need to be compressed in a purely distributed fashion. However, this performance improvement comes at a price: the compressed data from different sites might not be separably decodable, instead they need to be gathered at a central decoder for joint decompression. As a consequence, losing a portion of distributedly compressed data may render the remaining portion completely useless. Indeed, such situations are often encountered in practice. For example, in the aforementioned wireless sensor network, it could happen that the fusion center fails to gather the complete set of compressed data needed for performing joint decompression due to unexpected sensor malfunctions or undesirable channel conditions. A natural question thus arises whether a system can harness the benefits of distributed compression without jeopardizing its functionality in adverse scenarios. Intuitively, there exists a tension between compression efficiency and system robustness. A good distributed compression system should strike a balance between these two factors. The theory intended to characterize the fundamental tradeoff between compression efficiency and system robustness for the centralized setting is known as multiple description coding, which has been extensively studied [18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36]. In contrast, its distributed counterpart is far less developed, and the relevant literature is rather scarce [37, 38, 39].

In the present work we consider a lossy compression system with distributed encoders and a centralized decoder. Each encoder compresses its observed source and forwards the compressed data to the decoder. Given the data from an arbitrary subset of encoders, the decoder is required to reconstruct the corresponding target signals within a prescribed mean squared error distortion threshold (dependent on the cardinality of that subset). It is assumed that the observed sources can be expressed as the sum of the target signals and the corruptive noises, which are generated independently from two (possibly different) symmetric111This symmetry assumption is not essential for our analysis. It is adopted mainly for the purpose of making the rate-distortion expressions as explicit as possible. multivariate Gaussian distributions. This setting is similar to that of the robust Gaussian CEO problem studied in [37, 38]. However, there are two major differences: the robust Gaussian CEO problem imposes the restrictions that 1) the target signal is a scalar process, and 2) the noises across different encoders are independent. Though these restrictions could be justified in certain scenarios, they were introduced largely due to the technical reliance on Oohama’s bounding technique for the scalar Gaussian CEO problem [3, 6]

. In this paper we shall tackle the more difficult case where the target signals jointly form a vector process by adapting recently developed analytical methods in Gaussian multiterminal source coding theory

[10, 13, 14, 15] to the robust compression setting. Moreover, we show that the theoretical difficulty caused by correlated noises can be circumvented through a fictitious signal-noise decomposition of the observed sources such that the resulting noises are independent across encoders. In fact, it will become clear that this decomposition can be useful even for analyzing those distributed compression systems with independent noises. Our main results are summarized below.

  1. For the case where the decoder is only required to reconstruct the target signals based on the outputs of all encoders, the rate-distortion limit is characterized either completely or partially, depending on the parameters of signal and noise distributions,

  2. For the case where the outputs of a subset of encoders may also be used to produce a non-trivial reconstruction of the corresponding target signals, the minimum achievable reconstruction distortion based on the outputs of or more encoders is characterized either completely or partially, depending on the parameters of signal and noise distributions, when every out of all encoders are operated collectively in the same mode that is greedy in the sense of minimizing the distortion incurred by the reconstruction of the corresponding target signals with respect to the average rate of these encoders.

The rest of this paper is organized as follows. We state the problem definitions and the main results in Section II. The proof is presented in Section III. We conclude the paper in Section IV.

Notation: The expectation operator, the transpose operator, the trace operator, and the determinant operator are denoted by , , , and , respectively. A -dimensional all-one row vector is written as . We use to represent a diagonal matrix with diagonal entries , and use as an abbreviation of . For a set with elements , means . The cardinality of a set is denoted by . Throughout this paper, the base of the logarithm function is .

Ii Problem Definitions and Main Results

Let the target signals and the corruptive noises be two mutually independent -dimensional () zero-mean Gaussian random vectors, and the observed sources be their sum (i.e., ). Their respective covariance matrices are given by

and satisfy . Moreover, we construct an i.i.d. process

such that the joint distribution of

, , and is the same as that of , , and for .

By the eigenvalue decomposition, every

(real) matrix

can be written as

(1)

where

is an arbitrary (real) unitary matrix with the first column being

, and

For , let , , and denote the leading principal submatrices of , , and , respectively; in view of (1), we have

where

with

Note that , , and are positive semidefinite (and consequently are well-defined covariance matrices) if and only if , , , , , and . Furthermore, we assume that since otherwise the target signals are not random. It follows by this assumption that , , and .

Definition 1

Given , a rate-distortion tuple is said to be achievable if, for any , there exist encoding functions , , such that

(2)
(3)

where . The set of all such achievable is denoted by .

Remark 1

Due to the symmetry of the underlying distributions, it can be shown via a timesharing argument that is not affected if we replace (2) with either of the following constraints

and/or replace (3) with either of the following constraints

Remark 2

We show in Appendix A that, for ,

where

It is clear that , , for any . Moreover, if for some , then the corresponding distortion constraint is redundant. Henceforth we shall focus on the case , .

Definition 2

For , let

In order to state our main results, we introduce the following quantities. For any and , let

where is the unique positive number satisfying

(4)

Our first result is a partial characterization of .

Theorem 1

For ,

if either of the following conditions is satisfied:

  1. and

    (5)

    where

    (6)
  2. and

    (7)

    where

    (8)
Remark 3
  1. Consider the case . When , the inequality (5) always holds, and is characterized for all . When , the equation has two real roots in the interval :

    Therefore, the inequality (5) holds if

    (9)

    It is easy to verify that (9) is satisfied when (which implies ) or (which implies ). When , is a strictly decreasing function of , converging to as and to as ; hence, it suffices to analyze the following four scenarios.

    1. : is satisfied for all .

    2. and : is satisfied for all sufficiently close to .

    3. and : is satisfied for all sufficiently close to while is satisfied for all sufficiently close to .

    4. and : This can happen only when .

    In view of the above discussion, under the condition , is characterized at least for all sufficiently close to unless and (note that implies ).

  2. Consider the case . When , the inequality (7) always holds, and is characterized for all . When , the equation has two real roots in the interval :

    Therefore, the inequality (7) holds if

    (10)

    It is easy to verify that (10) is satisfied when (which implies ) or (which implies ). When , is a strictly decreasing function of , converging to as and to as ; hence, it suffices to analyze the following four scenarios.

    1. : is satisfied for all .

    2. and : is satisfied for all sufficiently close to .

    3. and : is satisfied for all sufficiently close to while is satisfied for all sufficiently close to .

    4. and : This can happen only when .

    In view of the above discussion, under the condition , is characterized at least for all sufficiently close to unless and (note that implies ).

Theorem 1 is a special case of the following more general result.

Theorem 2
  1. For ,

  2. For with ,

    if either of the following conditions is satisfied:

    1. and

      (11)

      where is defined in (6) with replaced by .

    2. and

      (12)

      where is defined in (8) with replaced by .

  3. For and with and , we have

    if either of the following conditions is satisfied:

    1. Condition i).

    2. , , and

      (13)
      (14)

      where

Proof:

See Section III.

Remark 4
  1. The argument in Remark 3 can be leveraged to prove that, for the case , the inequality (11) holds at least for all sufficiently close to unless (which can happen only when ) and (note that implies ); similarly, for the case , the inequality (12) holds at least for all sufficiently close to unless and (note that implies ).

  2. For the case , the condition can be potentially violated (i.e., ) only when .

  3. Consider the case and . If , then the inequality (13) holds at least for sufficiently close to ; if , which implies , then the inequality (13) always holds. The inequality (14) holds at least for sufficiently close to unless and .

Iii Proof of Theorem 2

The following lemma can be obtained by adapting the classical result by Berger [40] and Tung [41] to the current setting.

Lemma 1

For any auxiliary random vector jointly distributed with such that

form a Markov chain,

, and any such that

where