Distributed Lossy Source Coding Using Real-Number Codes

11/02/2011
by   Mojtaba Vaezi, et al.
McGill University

We show how real-number codes can be used to compress correlated sources, and establish a new framework for lossy distributed source coding, in which we quantize compressed sources instead of compressing quantized sources. This change in the order of the binning and quantization blocks makes it possible to model correlation between continuous-valued sources more realistically and to correct quantization error when the sources are completely correlated. The encoding and decoding procedures are described in detail for discrete Fourier transform (DFT) codes. In the mean-squared error sense, the reconstructed signal is seen to be better than that of the conventional approach.

I Introduction

This work was supported by Hydro-Québec, the Natural Sciences and Engineering Research Council of Canada and McGill University in the framework of the NSERC/Hydro-Québec/McGill Industrial Research Chair in Interactive Information Infrastructure for the Power Grid.

Distributed source coding (DSC) deals with the compression of correlated sources that do not communicate with each other [1]. Lossless DSC (Slepian-Wolf coding) has been realized by different binary channel codes, including LDPC [2] and turbo codes [3]. The Wyner-Ziv coding problem [4] deals with lossy data compression with side information at the decoder, under a fidelity criterion. The current approach to DSC of a continuous-valued source is to first convert it to a discrete-valued source using quantization, and then to apply Slepian-Wolf coding in the binary field. Similarly, a practical Wyner-Ziv encoder is realized by cascading a quantizer and a Slepian-Wolf encoder [5, 6]. In other words, the quantized source is compressed. There are, hence, source coding (or quantization) loss and channel coding (or binning) loss. This approach is based on the assumption that there is still correlation remaining in the quantized version of correlated sources.

In this paper, we establish a new framework for Wyner-Ziv coding. We propose to first compress the continuous-valued source and then quantize it, as opposed to the conventional approach. The compression is thus in the real field, aiming at representing the source with fewer samples.

To do compression, we generate either syndrome or parity samples of the input sequence using a real-number channel code, similar to what is done to compress a binary sequence of data using binary channel codes. Then, we quantize these syndrome or parity samples and transmit them. There are still coding (binning) and quantization losses; however, since coding is performed before quantization, error correction is in the real field and quantization error can be corrected when the two sources are completely correlated over a block of code. A second and more important advantage of this approach is that the correlation channel model can be more realistic, as it captures the correlation between continuous-valued sources rather than quantized sources. In the conventional approach, it is implicitly assumed that quantization of correlated signals results in correlated sequences in the discrete domain, which is not necessarily correct due to the nonlinearity of the quantization operation. In addition, most previous works assume that this correlation, in the binary field, can be modeled by a binary symmetric channel (BSC) with a known crossover probability. To avoid the loss due to inaccuracy of the correlation model, we exploit correlation between continuous-valued sources before quantization.

Specifically, we use real BCH-DFT codes [7] for compression in the real field. Owing to the DFT codes, the loss due to quantization can be decreased by a factor of $k/n$ for an $(n,k)$ DFT code [8], [9]. Additionally, if the two sources are perfectly correlated over one codevector, the reconstruction loss vanishes. This is achieved by modeling the correlation between the two sources in the continuous domain. Finally, the proposed scheme seems more suitable for low-delay communication because, using short DFT codes, a reconstruction error lower than the quantization error is achievable.

The rest of this paper is organized as follows. In Section II, we motivate and introduce a new framework for lossy DSC. In Section III, we briefly review encoding and decoding in real DFT codes. Then in Section IV, we present the DFT encoder and decoder for the proposed system, both in the syndrome and parity approaches. These two approaches are also compared in this section. Section V discusses the simulation results. Section VI provides our concluding remarks.

II Proposed System and Motivations

Fig. 1: The Wyner-Ziv coding using real-number codes. (Diagram blocks: Slepian-Wolf encoder, Slepian-Wolf decoder.)

We introduce the use of real-number codes in lossy compression of correlated signals. Specifically, we use DFT codes [7], a class of real Bose-Chaudhuri-Hocquenghem (BCH) codes, to perform compression. Similar to error correction in finite fields, the basic idea of error-correcting codes in the real field is to insert redundancy into a message vector of $k$ samples to convert it to a codevector of $n$ samples ($n > k$) [7]. Unlike in finite fields, however, the insertion of redundancy in the real field is performed before quantization and entropy coding. The insertion of soft redundancy in real-number codes has advantages over hard redundancy in the binary field. By using soft redundancy, one can go beyond quantization error, and thus reconstruct continuous-valued signals more accurately. This makes real-number codes more suitable than binary codes for lossy distributed source coding.

The proposed system is depicted in Fig. 1. Although it consists of the same blocks as the existing practical Wyner-Ziv coding schemes [5, 6], the order of these blocks is changed here. That is, we perform Slepian-Wolf coding before quantization. This change in the order of the DSC and quantization blocks brings some advantages, as described in the following.

  • Realistic correlation model: In the existing framework for lossy DSC, correlation between two sources is modeled after quantization, i.e., in the binary domain. More precisely, correlation between quantized sources is usually modeled as a BSC, mostly with known crossover probability. Admittedly though, due to nonlinearity of quantization operation, correlation between the quantized signals is not known accurately even if it is known in the continuous domain. This motivates investigating a method that exploits correlation between continuous-valued sources to perform DSC.

  • Alleviating quantization error: In lossy data compression with side information at the decoder, soft redundancy, added by DFT codes, can be used to correct both quantization errors and (correlation) channel errors. The loss due to quantization error thus can be recovered, at least partly if not wholly. More precisely, if the two sources are exactly the same over a codevector, quantization error can be corrected completely. That is, perfect reconstruction is achieved over corresponding samples. The loss due to quantization error is decreased even if correlation is not perfect, i.e., when (correlation) channel errors exist.

  • Low-delay communication: If communication is subject to low-delay constraints, we cannot use turbo or LDPC codes, as their performance is not satisfactory for short code lengths. Whether a low-delay requirement exists depends on the specific application. However, even in applications where low-delay transmission is not imperative, it is sometimes useful to consider low-dimensional systems for their low computational complexity.

III Encoding and Decoding with BCH-DFT Codes

Real BCH-DFT codes, a subset of complex BCH codes [7], are linear block codes over the real field. Any BCH-DFT code satisfies two properties. First, as a DFT code, its parity-check matrix is defined based on the DFT matrix. Second, similar to other BCH codes, the spectrum of any codevector is zero in a block of $d-1$ cyclically adjacent components, where $d$ is the designed distance of that code [10]. A real BCH-DFT code, in addition, has a generator matrix with real entries, as described below.

III-A Encoding

An $(n,k)$ real BCH-DFT code is defined by its generator and parity-check matrices. The generator matrix is given by

$$G = \sqrt{\frac{n}{k}}\, W_n^H \Sigma W_k, \tag{1}$$

in which $W_k$ and $W_n^H$ respectively are the DFT and IDFT matrices of size $k$ and $n$, and $\Sigma$ is an $n \times k$ matrix with $n-k$ zero rows [11, 12, 13, 14]. Particularly, for odd $k$, $\Sigma$ has exactly $k$ nonzero elements given as $\Sigma_{00} = 1$, $\Sigma_{i,i} = \Sigma_{n-i,k-i} = 1$, $i = 1, \ldots, \frac{k-1}{2}$ [11], [12]. This guarantees the spectrum of any codeword to have $n-k$ consecutive zeros, which is required for any BCH code [10]. The parity-check matrix $H$, on the other hand, is constructed by using the $n-k$ columns of $W_n^H$ corresponding to the $n-k$ zero rows of $\Sigma$. Therefore, due to the unitary property of $W_n^H$, $HG = 0$.
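The construction above can be checked numerically. The following is a minimal sketch, assuming unitary DFT matrices, the form $G = \sqrt{n/k}\, W_n^H \Sigma W_k$, and a toy $(7,5)$ code chosen only for illustration; the helper names (`dft_matrix`, `matmul`, `conj_transpose`, `dft_code`) are hypothetical. Here $H$ is stored as the rows of $W_n$ at the zero rows of $\Sigma$, i.e., the conjugate transpose of the matching columns of $W_n^H$.

```python
import cmath
import math

def dft_matrix(m):
    """Unitary DFT matrix of size m (rows indexed by frequency)."""
    return [[cmath.exp(-2j * math.pi * a * b / m) / math.sqrt(m)
             for b in range(m)] for a in range(m)]

def matmul(A, B):
    """Plain-Python product of two matrices stored as lists of rows."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def conj_transpose(A):
    """Conjugate (Hermitian) transpose."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def dft_code(n, k):
    """Return (G, H) for an (n, k) DFT code with odd n and k (assumed form)."""
    half = (k - 1) // 2
    sigma = [[0.0] * k for _ in range(n)]
    sigma[0][0] = 1.0
    for i in range(1, half + 1):
        sigma[i][i] = 1.0
        sigma[n - i][k - i] = 1.0
    Wn, Wk = dft_matrix(n), dft_matrix(k)
    scale = math.sqrt(n / k)
    G = [[scale * g for g in row]
         for row in matmul(matmul(conj_transpose(Wn), sigma), Wk)]
    zero_rows = range(half + 1, n - half)   # the n - k zero rows of Sigma
    H = [Wn[f] for f in zero_rows]          # matching rows of W_n
    return G, H

G, H = dft_code(7, 5)
max_imag = max(abs(g.imag) for row in G for g in row)      # G should be real
max_hg = max(abs(v) for row in matmul(H, G) for v in row)  # H G should vanish
print(len(G), len(G[0]), len(H), len(H[0]))
print(max_imag < 1e-12, max_hg < 1e-12)
```

The two checks at the end test the properties claimed in the text: the entries of $G$ are real up to rounding, and $HG = 0$.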

In the rest of this paper, we use the term DFT code in lieu of real BCH-DFT code. Besides, we only consider odd numbers for $k$ and $n$; thus, the error correction capability of the code is $t = \lfloor \frac{n-k}{2} \rfloor = \frac{n-k}{2}$.

III-B Decoding

For decoding, we use the extension of the well-known Peterson-Gorenstein-Zierler (PGZ) algorithm to the real field [10]. This algorithm, aimed at detecting, localizing, and estimating errors, works based on the syndrome of error. We summarize the main steps of this algorithm, adapted for a DFT code of length $n$, in the following.

  1. Compute the vector of syndrome samples.

  2. Determine the number of errors $\nu$ by constructing a syndrome matrix and finding its rank.

  3. Find the coefficients $\Lambda_1, \ldots, \Lambda_\nu$ of the error-locating polynomial $\Lambda(x) = \prod_{i=1}^{\nu} (1 - x X_i)$, whose roots are the inverses of the error locations.

  4. Find the zeros $X_1^{-1}, \ldots, X_\nu^{-1}$ of $\Lambda(x)$; the errors are then in locations $i_1, \ldots, i_\nu$, where $X_1 = \alpha^{i_1}, \ldots, X_\nu = \alpha^{i_\nu}$ and $\alpha = e^{-j 2\pi / n}$.

  5. Finally, determine the error magnitudes by solving a set of linear equations whose constant coefficients are powers of $X_i$.
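The five steps can be sketched for the simplest case of a single error ($t = 1$), where the syndrome matrix is $1 \times 1$ and the linear systems reduce to scalar divisions. This is a minimal sketch without quantization, not the general algorithm; the $(7,5)$ code, the convention that the zero frequencies start at $(k+1)/2$, the $1/\sqrt{n}$ scaling of the syndrome, and the function names are assumptions made for illustration.

```python
import cmath
import math

n, k = 7, 5                       # assumed toy code with t = (n - k) / 2 = 1
f0 = (k + 1) // 2                 # first zero frequency of the codeword spectrum
alpha = cmath.exp(-2j * math.pi / n)

def syndrome(r):
    """Step 1: s_m = sum_j H[m][j] r[j], with H holding rows f0, f0+1 of W_n."""
    return [sum(alpha ** ((f0 + m) * j) * r[j] for j in range(n)) / math.sqrt(n)
            for m in range(n - k)]

def pgz_single_error(r, tol=1e-9):
    """Steps 2-5 for at most one error; returns (location, magnitude) or None."""
    s = syndrome(r)
    if abs(s[0]) < tol:           # step 2: the 1x1 syndrome matrix [s_0] has rank 0
        return None
    lam1 = -s[1] / s[0]           # step 3: Lambda(x) = 1 + lam1 * x
    X = -lam1                     # step 4: the root of Lambda is 1/X, X = alpha**i
    i = round(-cmath.phase(X) * n / (2 * math.pi)) % n
    e = (s[0] * math.sqrt(n) / X ** f0).real   # step 5: s_0 = e * X**f0 / sqrt(n)
    return i, e

# A real vector whose spectrum is zero at frequencies f0 and f0 + 1 (a codeword).
c = [1.0 + 0.5 * math.cos(2 * math.pi * j / n) - 0.3 * math.sin(4 * math.pi * j / n)
     for j in range(n)]
r = list(c)
r[4] += 0.8                       # one error of magnitude 0.8 at location 4
loc, mag = pgz_single_error(r)
r[loc] -= mag                     # subtract the estimated error
print(loc, round(mag, 6))
```

For more than one error, steps 2, 3, and 5 require a rank estimate and the solution of small linear systems, which this sketch deliberately leaves out.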

As mentioned, the PGZ algorithm works based on the syndrome of error, which is the syndrome of the received codevector, neglecting quantization. Let $r = c + e$ be the received vector, then

$$s = Hr = H(c + e) = He, \tag{2}$$

where $s = [s_1, s_2, \ldots, s_{d-1}]^T$ is a complex vector of length $n-k$. In practice, however, the received vector is distorted by quantization ($\hat{r} = \hat{c} + e$, with $\hat{c} = c + q$) and its syndrome is no longer equal to the syndrome of error because

$$\tilde{s} = H\hat{r} = H(c + q + e) = s_q + s_e, \tag{3}$$

where $s_q = Hq$ and $s_e = He$. While the “exact” value of errors is determined neglecting quantization, the decoding becomes an estimation problem in the presence of quantization. Then, it is imperative to modify the PGZ algorithm to detect errors reliably [10, 11, 12, 13]. Error detection, localization, and also estimation can be largely improved using least squares methods [14].

III-C Performance Compared to Binary Codes

DFT codes by construction are capable of decreasing quantization error. When there is no error, an $(n,k)$ DFT code brings the mean-squared error (MSE) below the level of quantization error by a factor of $k/n$ [9, 8]. This is also shown to be valid for channel errors, as long as the channel can be modeled by additive noise. To appreciate this, one can consider the generator matrix of a DFT code as a tight frame [9]; it is known that frames are resilient to any additive noise, and tight frames reduce the MSE $k/n$ times [15]. Hence, DFT codes can result in an MSE even better than the quantization error level, whereas the best possible MSE in a binary code is obviously lower-bounded by the quantization error level.
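The tight-frame argument can be illustrated with a small Monte Carlo experiment. This is a minimal sketch under several assumptions: a toy $(7,5)$ code, a generator matrix of the form $G = \sqrt{n/k}\, W_n^H \Sigma W_k$ written in its real closed form, Gaussian message samples, a uniform quantizer with step 0.05, and no channel errors. Reconstruction uses the least-squares inverse $(k/n) G^T$, which holds because $G^T G = (n/k) I$ for this frame.

```python
import math
import random

n, k = 7, 5                                   # assumed toy code
half = (k - 1) // 2
# Real closed form of entry (a, b) of sqrt(n/k) W_n^H Sigma W_k.
G = [[sum(math.cos(2 * math.pi * f * (a / n - b / k))
          for f in range(-half, half + 1)) / k
      for b in range(k)] for a in range(n)]

def quantize(v, step):
    """Uniform quantizer applied sample by sample."""
    return [step * round(c / step) for c in v]

random.seed(0)
step, trials = 0.05, 4000
err_q = err_x = 0.0
for _ in range(trials):
    x = [random.gauss(0.0, 1.0) for _ in range(k)]
    c = [sum(G[a][b] * x[b] for b in range(k)) for a in range(n)]
    c_hat = quantize(c, step)
    # Least-squares reconstruction of the message from the quantized codevector.
    x_hat = [(k / n) * sum(G[a][b] * c_hat[a] for a in range(n)) for b in range(k)]
    err_q += sum((ch - ci) ** 2 for ch, ci in zip(c_hat, c)) / n
    err_x += sum((xh - xi) ** 2 for xh, xi in zip(x_hat, x)) / k
ratio = err_x / err_q                         # reconstruction MSE / quantization MSE
print(round(ratio, 2), round(k / n, 2))
```

The printed ratio is an empirical estimate and should land near $k/n$ only to the extent that the quantization noise behaves like uncorrelated additive noise, which is the modeling assumption behind the frame result.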

IV Wyner-Ziv Coding Using DFT Codes

The concept of lossy DSC and Wyner-Ziv coding in the real field was described in Section II. In this section, we use DFT codes, as a specific means, to do Wyner-Ziv coding in the real field. This is accomplished by using DFT codes for binning, and transmitting compressed signal, in the form of either syndrome or parity samples.

Let $x$ be a sequence of i.i.d. random variables $x_1 x_2 \ldots x_n$, and $y$ be a noisy version of $x$ such that $y_i = x_i + e_i$, where $e_i$ is continuous, i.i.d., and independent of $x_i$. Since $e$ is continuous, this model precisely captures any variation of $x$, so it can model correlation between $x$ and $y$ accurately. For example, the Gaussian, Gaussian-Bernoulli-Gaussian, and Gaussian-Erasure correlation channels can be modeled using this model [16]. These correlation models are practically important in video coders that exploit Wyner-Ziv concepts, e.g., when the decoder builds side information via extrapolation of previously decoded frames or interpolation of key frames [16]. In this paper, the virtual correlation channel is assumed to be a Bernoulli-Gaussian channel, inserting at most $t$ random errors in each codeword; thus, $e$ is a sparse vector.

IV-A Syndrome Approach

IV-A1 Encoding

Given $H$, to compress an arbitrary sequence of data samples $x$, we multiply it with $H$ to find the corresponding syndrome samples $s_x = Hx$. The syndrome is then quantized ($\hat{s}_x = s_x + q$) and transmitted over a noiseless digital communication system, as shown in Fig. 2. Note that $s_x$ and $\hat{s}_x$ are both complex vectors of length $n-k$.

IV-A2 Decoding

The decoder estimates the input sequence from the received syndrome and the side information $y$. To this end, it needs to evaluate the syndrome of channel (correlation) errors. This can be simply done by subtracting the received syndrome from the syndrome of the side information, $s_y = Hy$. Then, neglecting quantization, we obtain

$$s_e = s_y - s_x, \tag{4}$$

and $s_e$ can be used to precisely estimate the error vector, as described in Section III-B. In practice, however, the decoder knows $\hat{s}_x = s_x + q$ rather than $s_x$. Therefore, only a distorted syndrome of error is available, i.e.,

$$\tilde{s}_e = s_y - \hat{s}_x = s_e - q. \tag{5}$$

Hence, using the PGZ algorithm, error correction is accomplished based on (5). Note that, having computed the syndrome of error, the decoding algorithm in DSC using DFT codes is exactly the same as that in the channel coding problem. This is different from DSC techniques in the binary field, which usually require a slight modification of the corresponding channel coding algorithm to customize it for DSC.
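The syndrome-based encoder and decoder can be sketched end to end for a single correlation error. This is a minimal illustration, assuming a toy $(7,5)$ code, a parity-check matrix built from rows $(k+1)/2, \ldots$ of the unitary DFT matrix, a uniform quantizer with step 0.01 applied separately to real and imaginary parts, and the single-error shortcut of the PGZ steps; the sample values are made up.

```python
import cmath
import math

n, k = 7, 5                       # assumed toy code, corrects one error
f0 = (k + 1) // 2
alpha = cmath.exp(-2j * math.pi / n)

def syndrome(v):
    """H v, where row m of H is row f0 + m of the unitary DFT matrix W_n."""
    return [sum(alpha ** ((f0 + m) * j) * v[j] for j in range(n)) / math.sqrt(n)
            for m in range(n - k)]

def quantize(s, step):
    """Quantize real and imaginary parts separately with a uniform quantizer."""
    return [complex(step * round(z.real / step), step * round(z.imag / step))
            for z in s]

# Encoder: syndrome of an arbitrary real source block, then quantization.
x = [0.3, -1.2, 0.7, 1.5, -0.4, 0.9, -0.8]
s_x_hat = quantize(syndrome(x), step=0.01)

# Correlation channel: side information with one error at position 2.
y = list(x)
y[2] += 1.0

# Decoder: distorted error syndrome s_y - s_x_hat, then localization and estimation.
s_e = [sy - sx for sy, sx in zip(syndrome(y), s_x_hat)]
X = s_e[1] / s_e[0]
loc = round(-cmath.phase(X) * n / (2 * math.pi)) % n
mag = (s_e[0] * math.sqrt(n) / alpha ** (f0 * loc)).real
x_hat = list(y)
x_hat[loc] -= mag
print(loc, abs(mag - 1.0) < 0.05)
```

Because the error is large compared with the quantization step, the location is recovered exactly and the magnitude only approximately; the residual in `x_hat` reflects the quantization term in the distorted syndrome.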

Fig. 2: The Wyner-Ziv coding using DFT codes: Syndrome approach.

IV-B Parity Approach

Syndrome-based Wyner-Ziv coding is straightforward but not very efficient because, in a real DFT code, syndrome samples are complex numbers. This means that to transmit each sample we need to send two real numbers, one for the real part and one for the imaginary part. Thus, the compression ratio, using an $(n,k)$ DFT code, is $\frac{n}{2(n-k)}$ whereas it is $\frac{n}{n-k}$ for a similar binary code. This also imposes a constraint on the rate of the code, i.e., $n > 2(n-k)$ or $\frac{k}{n} > \frac{1}{2}$, since otherwise there is no compression. In the sequel, we explore the parity-based approach to Wyner-Ziv coding.

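IV-B1 Encoding

To compress $x$, the encoder generates the corresponding parity sequence $p$ with $n-k$ samples. The parity is then quantized and transmitted, as shown in Fig. 3, instead of transmitting the input data. The first step in the parity-based system is to find the systematic generator matrix, as $G$ in (1) is not in systematic form. Let $H$ be partitioned as $H = [H_1 \mid H_2]$, where $H_1$ is a matrix of size $(n-k) \times k$, and $H_2$ is a square matrix of size $n-k$. Since $H_2$ is a Vandermonde matrix, $H_2^{-1}$ exists and we can write

$$H_{\mathrm{sys}} = H_2^{-1} H = [P \mid I_{n-k}], \tag{6}$$

in which $P = H_2^{-1} H_1$ is an $(n-k) \times k$ matrix, and $I_{n-k}$ is an identity matrix of size $n-k$.

The systematic generator matrix corresponding to $H_{\mathrm{sys}}$ is given by

$$G_{\mathrm{sys}} = \begin{bmatrix} I_k \\ -P \end{bmatrix}. \tag{7}$$

Clearly, $H_{\mathrm{sys}} G_{\mathrm{sys}} = 0$. It is also easy to check that

$$H G_{\mathrm{sys}} = H_2 H_{\mathrm{sys}} G_{\mathrm{sys}} = 0. \tag{8}$$

Therefore, we do not need to calculate $H_{\mathrm{sys}}$, and the same parity-check matrix $H$ can be used for decoding in the parity approach.

An even easier way to come up with the systematic generator matrix is to partition $G$ as $G = \begin{bmatrix} G_1 \\ G_2 \end{bmatrix}$, where $G_1$ is a square matrix of size $k$. Then, from $HG = 0$ and the fact that $H_2$ is invertible, one can see $G_2 = -H_2^{-1} H_1 G_1 = -P G_1$; thus, we have

$$G = \begin{bmatrix} G_1 \\ -P G_1 \end{bmatrix} = \begin{bmatrix} I_k \\ -P \end{bmatrix} G_1 = G_{\mathrm{sys}} G_1. \tag{9}$$

Note that $G_1$ is invertible because, using (1), any $k \times k$ submatrix of $G$ can be represented as the product of a Vandermonde matrix and the DFT matrix $W_k$. This is also proven using a different approach in [9], where it is shown that any subframe of $G$ is a frame and its rank is equal to $k$. Hence, since $G_1$ is invertible, the systematic generator matrix is given by

$$G_{\mathrm{sys}} = G G_1^{-1}. \tag{10}$$

Again, $H G_{\mathrm{sys}} = 0$ because $HG = 0$. Therefore, the same parity-check matrix $H$ can be used for decoding in the parity approach. It is also easy to see that $G_{\mathrm{sys}}$ is a real matrix. The question that remains to be answered is whether $G_{\mathrm{sys}}$ corresponds to a BCH code. To generate a BCH code, $G_{\mathrm{sys}}$ must have $n-k$ consecutive zeros in the transform domain. $W_n G_{\mathrm{sys}} = (W_n G) G_1^{-1}$, the Fourier transform of this matrix, satisfies this condition because $W_n G$, the Fourier transform of the original matrix, satisfies it.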
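The second construction, $G_{\mathrm{sys}} = G G_1^{-1}$, is easy to verify numerically. The sketch below assumes a toy $(7,5)$ code, the real closed form of $G = \sqrt{n/k}\, W_n^H \Sigma W_k$, and $H$ taken as the rows of the unitary DFT matrix at the zero frequencies; `inverse` and `matmul` are small hypothetical helpers written for this example only.

```python
import cmath
import math

n, k = 7, 5                                   # assumed toy code
half = (k - 1) // 2
G = [[sum(math.cos(2 * math.pi * f * (a / n - b / k))
          for f in range(-half, half + 1)) / k
      for b in range(k)] for a in range(n)]
H = [[cmath.exp(-2j * math.pi * f * j / n) / math.sqrt(n) for j in range(n)]
     for f in range(half + 1, n - half)]

def inverse(A):
    """Gauss-Jordan inverse with partial pivoting for a small real matrix."""
    m = len(A)
    M = [list(row) + [float(i == j) for j in range(m)] for i, row in enumerate(A)]
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(m):
            if r != col:
                fct = M[r][col]
                M[r] = [vr - fct * vc for vr, vc in zip(M[r], M[col])]
    return [row[m:] for row in M]

def matmul(A, B):
    """Plain-Python product of two matrices stored as lists of rows."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

G1_inv = inverse(G[:k])                       # G_1 is the top k x k block of G
G_sys = matmul(G, G1_inv)                     # G_sys = G G_1^{-1}
top_err = max(abs(G_sys[i][j] - (i == j)) for i in range(k) for j in range(k))
hg_err = max(abs(v) for row in matmul(H, G_sys) for v in row)
print(top_err < 1e-9, hg_err < 1e-9)
```

The top $k \times k$ block of `G_sys` is the identity, so the remaining $n-k$ rows generate the parity samples, and the check on `hg_err` confirms that the original $H$ still annihilates the systematic code.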

Note that, since parity samples, unlike syndrome samples, are real numbers, using an $(n,k)$ DFT code a compression ratio of $\frac{k}{n-k}$ is achieved. Obviously, a compression ratio of $\frac{n}{n-k}$ is achievable if we use a $(2n-k, n)$ DFT code.

Fig. 3: The Wyner-Ziv coding using DFT codes: Parity approach.

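IV-B2 Decoding

A parity decoder estimates the input sequence from the received parity and the side information $y$. Similar to the syndrome approach, at the decoder, we need to find the syndrome of channel (correlation) errors. To do this, we append the parity to the side information and form a vector of length $n$ whose syndrome, neglecting quantization, is equal to the syndrome of error. That is,

$$z = \begin{bmatrix} y \\ p \end{bmatrix} = \begin{bmatrix} x \\ p \end{bmatrix} + \begin{bmatrix} e \\ 0 \end{bmatrix} = G_{\mathrm{sys}}\, x + e', \tag{11}$$

hence,

$$s_z = Hz = He' = s_{e'}. \tag{12}$$

Similarly, when quantization is involved ($\hat{p} = p + q$), we get

$$\tilde{z} = \begin{bmatrix} y \\ \hat{p} \end{bmatrix} = G_{\mathrm{sys}}\, x + e' + q', \tag{13}$$

and

$$s_{\tilde{z}} = H\tilde{z} = s_{e'} + s_{q'}, \tag{14}$$

in which $e' = [e^T \; 0^T]^T$, $q' = [0^T \; q^T]^T$, $s_{e'} = He'$, and $s_{q'} = Hq'$. Therefore, we obtain a distorted version of the error syndrome. In both cases, the rest of the algorithm, which is based on the syndrome of error, is similar to that in the channel coding problem using DFT codes.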
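A small end-to-end sketch of the parity approach follows. It assumes a toy $(7,5)$ code, so that $n-k = 2$ and the parity can be obtained by solving $H_2 p = -H_1 x$ with Cramer's rule instead of forming $G_{\mathrm{sys}}$; it also assumes a uniform quantizer with step 0.01, one correlation error, and the single-error shortcut of the PGZ steps. The function names and sample values are illustrative only.

```python
import cmath
import math

n, k = 7, 5                       # assumed toy code, corrects one error
f0 = (k + 1) // 2
alpha = cmath.exp(-2j * math.pi / n)
H = [[alpha ** ((f0 + m) * j) / math.sqrt(n) for j in range(n)]
     for m in range(n - k)]

def syndrome(v):
    """Multiply the parity-check matrix H by a length-n vector."""
    return [sum(h * vj for h, vj in zip(row, v)) for row in H]

def parity(x):
    """Solve H_2 p = -H_1 x by Cramer's rule (valid here because n - k = 2)."""
    b = [-sum(H[m][j] * x[j] for j in range(k)) for m in range(2)]
    (a11, a12), (a21, a22) = H[0][k:], H[1][k:]
    det = a11 * a22 - a12 * a21
    p = [(b[0] * a22 - a12 * b[1]) / det, (a11 * b[1] - a21 * b[0]) / det]
    return [v.real for v in p]    # imaginary parts vanish up to rounding

x = [0.3, -1.2, 0.7, 1.5, -0.4]
step = 0.01
p_hat = [step * round(v / step) for v in parity(x)]   # quantized parity

y = list(x)
y[3] -= 0.9                       # side information with one error at position 3
z = y + p_hat                     # append the received parity to the side information
s = syndrome(z)                   # distorted error syndrome
X = s[1] / s[0]
loc = round(-cmath.phase(X) * n / (2 * math.pi)) % n
mag = (s[0] * math.sqrt(n) / alpha ** (f0 * loc)).real
x_hat = list(y)
if loc < k:                       # only the first k samples carry data
    x_hat[loc] -= mag
print(loc, round(mag, 2))
```

As in the syndrome approach, the decoder never needs $G_{\mathrm{sys}}$ itself: the syndrome of the stacked vector is computed with the original $H$.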

IV-C Comparison Between the Two Approaches

As we saw earlier, using an $(n,k)$ code the compression ratio in the syndrome and parity approaches, respectively, is $\frac{n}{2(n-k)}$ and $\frac{k}{n-k}$. Hence, the parity approach is $\frac{2k}{n}$ times more efficient than the syndrome approach. Conversely, we can find two different codes that result in the same compression ratio, say $\frac{n}{n-k}$. We know that in the parity approach, a $(2n-k, n)$ code can be used for this matter. It is also easy to verify that, in the syndrome approach, a code with rate $\frac{n+k}{2n}$ results in the same compression. For odd $n$ and $k$, the $(n, \frac{n+k}{2})$ DFT code gives the desired compression ratio. Thus, for a given compression ratio, the parity approach implies a code with a smaller rate compared to the code required in the syndrome approach.
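The arithmetic behind this comparison can be checked with exact fractions. The sketch below assumes the compression ratio is defined as the number of real input samples divided by the number of real numbers transmitted; the function names and the example parameters are hypothetical.

```python
from fractions import Fraction

def syndrome_ratio(n, k):
    """n real inputs -> n - k complex syndrome samples = 2(n - k) real numbers."""
    return Fraction(n, 2 * (n - k))

def parity_ratio(n, k):
    """k real inputs -> n - k real parity samples."""
    return Fraction(k, n - k)

n, k = 7, 5
gain = parity_ratio(n, k) / syndrome_ratio(n, k)
print(syndrome_ratio(n, k), parity_ratio(n, k), gain == Fraction(2 * k, n))

# Two codes reaching the same ratio n / (n - k), here with n = 9 and k = 5:
# a (2n - k, n) code in the parity approach and an (n, (n + k) / 2) code
# in the syndrome approach.
n, k = 9, 5
same = parity_ratio(2 * n - k, n) == syndrome_ratio(n, (n + k) // 2) == Fraction(n, n - k)
print(same)
```

With these definitions, the parity code $(13, 9)$ has rate $9/13$ while the syndrome code $(9, 7)$ has rate $7/9$, which illustrates the lower rate required by the parity approach for the same compression.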

V Simulation Results

We evaluate the performance of the proposed systems using a Gauss-Markov source with zero mean, unit variance, and correlation coefficient 0.9, which determines the effective range of the input sequences. The source sequences are binned using a DFT code. The compressed vector, either syndrome or parity, is then quantized with a 6-bit uniform quantizer and transmitted over a noiseless communication medium. The correlation channel randomly inserts one error, generated by a Gaussian distribution. The decoder localizes and decodes errors. We compare the MSE between transmitted and reconstructed codevectors to measure end-to-end distortion. In all simulations, we use 20,000 input frames for each channel-error-to-quantization-noise ratio (CEQNR). We vary the CEQNR and plot the resulting MSE. The results are presented in Fig. 4 and compared against the quantization error level in the existing lossy DSC methods.

It can be observed that the MSE in the syndrome approach is lower than the quantization error except for a small range of CEQNR. Similarly, in the parity approach, the MSE is less than the quantization error for a wide range of CEQNR. Note that in lossy DSC using binary codes, the MSE can be equal to the quantization error only if the probability of error is zero. The performance of both algorithms improves when the CEQNR is very high. This improvement is due to better error localization, since the higher the CEQNR the better the error localization, as shown in Fig. 5 and [11]. At very low CEQNRs, although error localization is poor, the MSE is still very low because, compared to the quantization error, the errors are so small that the algorithm may localize and correct some of the quantization errors instead. Additionally, the reconstruction error is always reduced by a factor of $k/n$ in an $(n,k)$ DFT code.

In terms of compression, the parity approach is $\frac{2k}{n}$ times more efficient than the syndrome approach, as discussed earlier in Section IV-C. Not surprisingly though, the performance of the parity approach is not as good as that of the syndrome approach, because it contains fewer redundant samples. On top of that, in this simulation, a fraction $1/k$ of the samples is corrupted in the parity approach while this figure is $1/n$ for the syndrome approach. The parity approach, however, suffers from the fact that the dynamic range of parity samples, generated by (10), could be much higher than that of syndrome samples as $n$ increases. This implies more precision bits to achieve the same accuracy. Finally, it is worth mentioning that when data and side information are the same over a block of code, the reconstruction error becomes zero in both approaches.

Fig. 4: Reconstruction error in the syndrome and parity approaches, using a DFT code in the systems of Figs. 2 and 3. For both schemes, the virtual correlation channel inserts one error at each channel-error-to-quantization-noise ratio.

VI Conclusions

We have introduced a new framework for distributed lossy source coding in general, and Wyner-Ziv coding specifically. The idea is to do binning before quantizing the continuous-valued signal, as opposed to the conventional approach where binning is done after quantization. By doing binning in the real field, the virtual correlation channel can be modeled more accurately, and quantization error can be corrected when there is no correlation channel error. In the new paradigm, Wyner-Ziv coding is realized by cascading a Slepian-Wolf encoder with a quantizer. We employ real BCH-DFT codes to do Slepian-Wolf coding in the real field. At the decoder, by introducing both syndrome-based and parity-based systems, we adapt the PGZ decoding algorithm accordingly. From simulation results, we conclude that our systems, specifically with short codes, can improve the reconstruction error, so that they may become viable in real-world scenarios where low-delay communication is required.

Fig. 5: Relative frequency of correct localization of correlation channel error in the syndrome and parity approaches, using a DFT code.

References

  • [1] D. Slepian and J. K. Wolf, “Noiseless coding of correlated information sources,” IEEE Transactions on Information Theory, vol. IT-19, pp. 471–480, July 1973.
  • [2] A. D. Liveris, Z. Xiong, and C. N. Georghiades, “Compression of binary sources with side information at the decoder using LDPC codes,” IEEE Communications Letters, vol. 6, pp. 440–442, Oct. 2002.
  • [3] A. Aaron and B. Girod, “Compression with side information using turbo codes,” in Proc. IEEE Data Compression Conference, pp. 252–261, 2002.
  • [4] A. D. Wyner and J. Ziv, “The rate-distortion function for source coding with side information at the decoder,” IEEE Transactions on Information Theory, vol. 22, pp. 1–10, Jan. 1976.
  • [5] B. Girod, A. M. Aaron, S. Rane, and D. Rebollo-Monedero, “Distributed video coding,” Proceedings of the IEEE, vol. 93, pp. 71–83, Jan. 2005.
  • [6] Z. Xiong, A. D. Liveris, and S. Cheng, “Distributed source coding for sensor networks,” IEEE Signal Processing Magazine, vol. 21, pp. 80–94, Sept. 2004.
  • [7] T. Marshall Jr., “Coding of real-number sequences for error correction: A digital signal processing problem,” IEEE Journal on Selected Areas in Communications, vol. 2, pp. 381–392, March 1984.
  • [8] V. K. Goyal, J. Kovačević, and J. A. Kelner, “Quantized frame expansions with erasures,” Applied and Computational Harmonic Analysis, vol. 10, no. 3, pp. 203–233, 2001.
  • [9] G. Rath and C. Guillemot, “Frame-theoretic analysis of DFT codes with erasures,” IEEE Transactions on Signal Processing, vol. 52, pp. 447–460, Feb. 2004.
  • [10] R. E. Blahut, Algebraic Codes for Data Transmission. New York: Cambridge Univ. Press, 2003.
  • [11] G. Rath and C. Guillemot, “Subspace-based error and erasure correction with DFT codes for wireless channels,” IEEE Transactions on Signal Processing, vol. 52, pp. 3241–3252, Nov. 2004.
  • [12] G. Takos and C. N. Hadjicostis, “Determination of the number of errors in DFT codes subject to low-level quantization noise,” IEEE Transactions on Signal Processing, vol. 56, pp. 1043–1054, March 2008.
  • [13] A. Gabay, M. Kieffer, and P. Duhamel, “Joint source-channel coding using real BCH codes for robust image transmission,” IEEE Transactions on Image Processing, vol. 16, pp. 1568–1583, June 2007.
  • [14] M. Vaezi and F. Labeau, “Least squares solution for error correction on the real field using quantized DFT codes,” to appear in EUSIPCO 2012.
  • [15] J. Kovačević and A. Chebira, An Introduction to Frames. Now Publishers, 2008.
  • [16] F. Bassi, M. Kieffer, and C. Weidmann, “Source coding with intermittent and degraded side information at the decoder,” in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2941–2944, 2008.