Lossless Source Coding in the Point-to-Point, Multiple Access, and Random Access Scenarios

02/09/2019 · by Shuqing Chen et al.

This paper treats point-to-point, multiple access and random access lossless source coding in the finite-blocklength regime. A random coding technique is developed, and its power in analyzing the third-order-optimal coding performance is demonstrated in all three scenarios. Results include a third-order-optimal characterization of the Slepian-Wolf rate region and a proof showing that for dependent sources, the independent encoders used by Slepian-Wolf codes can achieve the same third-order-optimal performance as a single joint encoder. The concept of random access source coding, which generalizes the multiple access scenario to allow for a number of participating encoders that is unknown a priori to both the encoders and the decoder, is introduced. Contributions include a new definition of the probabilistic model for a random access-discrete multiple source (RA-DMS), a general random access source coding scheme that employs a rateless code with sporadic feedback, and an analysis for the proposed scheme demonstrating via a random coding argument that there exists a deterministic code of the proposed structure that simultaneously achieves the third-order-optimal performance of Slepian-Wolf codes for all possible numbers of encoders. While the analysis for random access codes focuses on a class of permutation-invariant RA-DMSs, it also extends to more general sources.


I Introduction

This paper studies the fundamental limits of fixed-length lossless source coding in three scenarios:

  1. Point-to-point: A single source is compressed and decompressed.

  2. Multiple access: A fixed set of sources is separately compressed and jointly decompressed.

  3. Random access: An arbitrary subset of a set of possible sources is separately compressed and jointly decompressed.

The information-theoretic limit in these three operational scenarios is the set of code sizes or rates at which a desired level of reconstruction error is achievable. Shannon's theory [1] analyzes this fundamental limit by taking an arbitrarily long encoding blocklength with a vanishing error probability. Since most real-world applications are delay- and computation-sensitive, it is of practical interest to analyze finite-blocklength fundamental limits. Following [2, 3, 4, 5], we allow a non-vanishing error probability and study refined asymptotics of the achievable rates in the encoding blocklength $n$.

In point-to-point almost-lossless source coding, non-asymptotic bounds and asymptotic expansions of the minimum achievable rate have been given in [6], [2], [7], [8], [9], [4]. In particular, Kontoyiannis and Verdú [4] have given a third-order-optimal characterization of the minimum achievable rate $R^*(n,\epsilon)$ at blocklength $n$ and target error probability $\epsilon$ by analyzing the optimal code. For a finite-alphabet stationary memoryless source with single-letter distribution $P_X$, entropy $H(X)$, and varentropy $V(X) > 0$,

$R^*(n,\epsilon) = H(X) + \sqrt{\frac{V(X)}{n}}\,Q^{-1}(\epsilon) - \frac{\log n}{2n} + O\!\left(\frac{1}{n}\right),$ (1)

with any higher-order term bounded by $O(1/n)$; here $Q$ denotes the complementary Gaussian distribution function and $Q^{-1}$ its inverse.
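For concreteness, the normal approximation implied by (1) is straightforward to evaluate numerically. The sketch below, in Python with SciPy, computes $H(X) + \sqrt{V(X)/n}\,Q^{-1}(\epsilon) - \frac{\log n}{2n}$ for a Bernoulli source; the parameter values are illustrative, not taken from the paper.

```python
# Normal approximation from (1): H(X) + sqrt(V(X)/n) Q^{-1}(eps) - log(n)/(2n)
# for a Bernoulli(p) source, logs in nats. Requires SciPy; p, eps illustrative.
import numpy as np
from scipy.stats import norm

def rate_normal_approx(n, eps, p):
    pmf = np.array([p, 1.0 - p])
    info = -np.log(pmf)                    # information values i(x)
    H = float(pmf @ info)                  # entropy H(X)
    V = float(pmf @ (info - H) ** 2)       # varentropy V(X)
    return H + np.sqrt(V / n) * norm.isf(eps) - np.log(n) / (2 * n)

for n in (100, 1000, 10000):
    print(n, rate_normal_approx(n, eps=0.1, p=0.11))
```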

In multiple access lossless source coding, also known as the Slepian-Wolf (SW) setting [10], the object of interest is the set of achievable rate tuples, known as the rate region. The first-order-optimal rate region for general sources was studied in [11, 7]; the results in [11, 7] reduce to Slepian and Wolf's result in [10] for a stationary memoryless multiple source. The best prior asymptotic expansion of the SW rate region in terms of the encoding blocklength for a stationary memoryless multiple source is the second-order-optimal rate region, established independently in [12] and [13]. The characterization by Tan and Kosut [12] describes the rate region in a vector form that parallels the first two terms in (1). In this case, a quantity known as the entropy dispersion matrix plays a role similar to $V(X)$. Their result suggests that the third-order term is bounded from above by $+O\!\left(\frac{\log n}{n}\right)$ and from below by $-O\!\left(\frac{\log n}{n}\right)$.

In the setting of point-to-point almost-lossless source coding, our contribution is to provide a precise characterization of the performance of random coding in terms of tight non-asymptotic bounds as well as their asymptotic expansions. By deriving the exact performance of random coding with the best possible threshold decoder, we conclude that random coding with threshold decoding cannot achieve the $-\frac{\log n}{2n}$ third-order term in (1), and thus is strictly suboptimal. We show that random coding with maximum likelihood decoding, however, achieves the first three terms in (1). We do this by deriving and carefully analyzing a source coding counterpart of the random-coding union (RCU) bound in channel coding [3, Th. 16]. The fact that our asymptotic expansion is achieved by a random code rather than the optimal code from [4] has a number of important implications. First, it demonstrates that there is no loss (up to the third-order term) due to random coding, which implies the existence of a large number of codes that have near-optimal performance. In particular, our RCU bound for source coding holds when restricted to linear compressors, implying that there are linear codes with near-optimal performance. Second, it enables generalization of this technique to source coding scenarios where the optimal code is not known; this is crucial since knowledge of the optimal code in the case of point-to-point almost-lossless source coding is quite exceptional.

While finding optimal SW codes is intractable in general, our derivation of the source coding RCU bound generalizes to multiple access scenarios. The resulting achievability bound and a converse result from [7, Lemma 7.2.2] together yield a third-order-optimal characterization of the SW rate region for a stationary memoryless multiple source (Theorem 9), revealing a third-order term of $-\frac{\log n}{2n}\,\mathbf{1}$. This tightens the third-order bound from [12], whose coefficient grows linearly with the alphabet size and exponentially with the number of encoders. Our third-order-optimal characterization implies that for dependent sources, the SW code's independent encoders suffer no loss up to the third-order performance relative to joint encoding with a point-to-point code.
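The sum-rate side of this claim is easy to make concrete: treating the pair $(X, Y)$ as a single super-source and applying (1) gives the joint-encoder benchmark that, by the result above, SW's independent encoders match up to the third-order term. A minimal sketch, with an illustrative joint pmf and logs in nats:

```python
# Sum-rate benchmark from (1) applied to the super-source (X, Y). The SW
# result above says independent encoders match this up to the third order.
import numpy as np
from scipy.stats import norm

p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])                   # illustrative joint pmf

i_xy = -np.log(p_xy)                            # joint information values
H = float((p_xy * i_xy).sum())                  # H(X, Y)
V = float((p_xy * (i_xy - H) ** 2).sum())       # V(X, Y)

def sum_rate_benchmark(n, eps):
    # H(X,Y) + sqrt(V(X,Y)/n) Q^{-1}(eps) - log(n)/(2n), as in (1)
    return H + np.sqrt(V / n) * norm.isf(eps) - np.log(n) / (2 * n)

print(sum_rate_benchmark(n=1000, eps=0.01))
```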

The prior information theory literature studies multiple access source coding for scenarios where the number and identity of encoders are fixed and known. However, in applications like sensor networks, the Internet of Things, and random access communication, the number of transmitters communicating with a given access point may be unknown or time-varying. The information theory of random access channel coding is investigated in papers such as [14, 15, 16]. Here, we introduce the notion of random access source coding, which extends multiple access source coding to the scenario where the number of encoders is unknown a priori.

To begin our study, we first establish a probabilistic model for the object being compressed in random access source coding. We call that object a random access-discrete multiple source (RA-DMS), which consists of all possible collections of sources to be compressed. We then develop a robust coding scheme to achieve reliable compression of an arbitrary subset of the sources despite a lack of a priori knowledge of that subset. Since the SW rate region varies with the source set being compressed, each encoder must vary its coding rate accordingly. Considering that the encoders do not know that set, we achieve the desired rate using a rateless code, which accommodates variable decoding times. The encoders transmit their codewords symbol-by-symbol until they are informed to stop, which is realized by using sporadic feedback from the decoder. The decoder selects a decoding time from a set of predetermined potential decoding times based on which encoders it sees in the network. Single-bit feedback from the decoder at each potential decoding time informs all encoders of when the decoder is able to decode. Thus, unlike commonly considered rateless codes that allow arbitrary decoding times [17, 18, 19, 20], our coding scheme only allows a fixed set of decoding times, thereby requiring only sporadic feedback in operation.
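The following toy simulation sketches the control flow of such a scheme. It is a hypothetical rendering for intuition only, not the paper's construction: the code symbols, the set of potential decoding times, and the decoder's decodability test are all stand-ins.

```python
# Toy sketch of the rateless, sporadic-feedback scheme described above.
# Hypothetical stand-ins, not the paper's construction: each active encoder
# emits one code symbol per time step; at each predetermined potential
# decoding time the decoder broadcasts one bit telling all encoders whether
# it has decoded, and everyone stops on the first positive bit.
def run_rateless_round(encoders, decoding_times, can_decode):
    transcript = {i: [] for i in range(len(encoders))}
    for t in range(1, max(decoding_times) + 1):
        for i, enc in enumerate(encoders):
            transcript[i].append(enc(t))        # symbol-by-symbol transmission
        if t in decoding_times and can_decode(transcript):
            return t, transcript                # single-bit "stop" feedback
    return max(decoding_times), transcript      # decode at the final time

# Illustrative use: two active encoders, potential decoding times, and a
# placeholder decodability test.
times = [40, 70, 100]
stop, _ = run_rateless_round([lambda t: 0, lambda t: 1],
                             times, lambda tr: len(tr[0]) >= 70)
print("decoded at time", stop)
```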

In the asymptotic analysis of our proposed coding scheme, we focus on the class of stationary memoryless permutation-invariant RA-DMSs. In this case, we are able to reduce the design complexity by employing identical encoding for all encoders. We demonstrate (Theorem 20 in Section V below) that there exists a single deterministic code that simultaneously achieves, for all possible numbers of active encoders, the optimal performance (up to the third order) of the SW code. Since traditional random coding arguments are not sufficient to show the existence of a single deterministic code that meets every constraint in a collection of constraints, prior code designs for multiple-constraint scenarios employ shared randomness (see, for example, [19]). Inspired by Tchamkerten and Telatar's work in [18], we here propose an alternative to that approach, deriving a refined random coding argument (Lemma 21 in Section V-E) that can be used to demonstrate the existence of a single deterministic code; this technique can be applied more broadly.

Except where noted, the source coding results presented in this paper do not require finite source alphabets but only countable ones.

The organization of this paper is as follows. Section II defines notation. Sections III, IV, and V are devoted to (point-to-point) almost-lossless source coding, multiple access (Slepian-Wolf) source coding, and random access source coding, respectively. The contents of these three sections are organized in parallel:

  1. Section III-A gives definitions for almost-lossless source coding, Section III-B provides background, and Section III-C presents our new achievability random coding bounds and the asymptotic analysis.

  2. Section IV-A gives definitions for Slepian-Wolf source coding, Section IV-B provides background, and Section IV-C presents our third-order-optimal characterization of the Slepian-Wolf rate region.

  3. In Section V-A, we define the random access-discrete multiple source and describe our random access coding scheme. Prior work related to random access source coding is discussed in Section V-B. In Sections V-C, V-D, and V-E, we analyze the proposed coding scheme and give both the achievability and the converse characterizations of its finite-blocklength performance on the class of permutation-invariant RA-DMSs. Extensions of our coding strategy to broader classes of RA-DMSs are discussed in Section V-F.

We give the concluding remarks in Section VI, with proofs of auxiliary results given in the appendices.

II Notation

We use uppercase letters (e.g., $X$) to denote random variables, lowercase letters (e.g., $x$) to denote realizations of the random variables, calligraphic uppercase letters (e.g., $\mathcal{A}$) to denote subsets of a sample space (events) or index sets, and script uppercase letters (e.g., $\mathscr{S}$) to denote subsets of a Euclidean space. Vectors are denoted by boldface letters (e.g., $\mathbf{x}$), with the all-ones vector denoted by $\mathbf{1}$. For any sequence $(x_i)$ and any ordered index set $\mathcal{I}$, $x_{\mathcal{I}} := (x_i)_{i \in \mathcal{I}}$. Matrices are denoted by bold uppercase letters (e.g., $\mathbf{V}$), and the $(i,j)$-th element of a matrix $\mathbf{V}$ is denoted by $V_{ij}$. Relations "$\le$" and "$\ge$" between two vectors of the same dimension correspond to elementwise inequalities. For a vector $\mathbf{u}$ and a set $\mathscr{S}$, $\mathscr{S} + \mathbf{u}$ denotes the set formed by moving every point in $\mathscr{S}$ by the displacement specified by $\mathbf{u}$ (the Minkowski sum of $\mathscr{S}$ and $\{\mathbf{u}\}$). For any positive integers $i \le j$, $[j]$ denotes the set $\{1, \dots, j\}$ and $[i:j]$ denotes $\{i, \dots, j\}$; if $i > j$, then $[i:j] = \emptyset$. All $\log$'s and $\exp$'s, if not specified, employ an arbitrary common base.

For two functions $f$ and $g$, $f(n) = O(g(n))$ if there exist constants $c > 0$ and $n_0$ such that $|f(n)| \le c\,g(n)$ for all $n \ge n_0$. For a $d$-dimensional function $\mathbf{f}$, $\mathbf{f}(n) = O(g(n))\,\mathbf{1}$ for some function $g$ if $f_i(n) = O(g(n))$ for all $i \in [d]$.

The standard Gaussian cumulative distribution function is denoted by

$\Phi(x) := \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-t^2/2}\,\mathrm{d}t.$ (2)

Function $Q(x) := 1 - \Phi(x)$ denotes the standard Gaussian complementary cumulative distribution function, and $Q^{-1}$ denotes the inverse function of $Q$. The standard Gaussian probability density function is denoted by

$\phi(x) := \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}.$ (3)

For a distribution $P_{XY}$ on a countable alphabet $\mathcal{X} \times \mathcal{Y}$, the information (entropy density) and conditional information are defined as

$\imath(x) := \log\frac{1}{P_X(x)},$ (4)
$\imath(x|y) := \log\frac{1}{P_{X|Y}(x|y)},$ (5)

for any $(x, y)$ with $P_X(x) > 0$, $P_Y(y) > 0$, and $P_{X|Y}(x|y) > 0$. The corresponding (conditional) entropy, varentropy, and third absolute centered moment are denoted by, respectively,

$H(X) := \mathbb{E}\left[\imath(X)\right],$ (6)
$H(X|Y) := \mathbb{E}\left[\imath(X|Y)\right],$ (7)
$V(X) := \mathrm{Var}\left[\imath(X)\right],$ (8)
$V(X|Y) := \mathrm{Var}\left[\imath(X|Y)\right],$ (9)
$T(X) := \mathbb{E}\left[|\imath(X) - H(X)|^3\right],$ (10)
$T(X|Y) := \mathbb{E}\left[|\imath(X|Y) - H(X|Y)|^3\right].$ (11)
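As a quick reference implementation, the following sketch computes the unconditional quantities of (6), (8), and (10) from a pmf (logs in nats; the pmf is illustrative):

```python
# Single-letter quantities H(X), V(X), T(X) from a pmf, per (4), (6), (8), (10).
import numpy as np

def information_moments(pmf):
    pmf = np.asarray(pmf, dtype=float)
    info = -np.log(pmf)                       # information (entropy density), (4)
    H = float(pmf @ info)                     # entropy, (6)
    V = float(pmf @ (info - H) ** 2)          # varentropy, (8)
    T = float(pmf @ np.abs(info - H) ** 3)    # third absolute centered moment, (10)
    return H, V, T

print(information_moments([0.5, 0.25, 0.125, 0.125]))
```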

III Random Coding in Almost-Lossless Source Coding

III-A Definitions

In point-to-point almost-lossless data compression, a discrete random variable $X$ defined on a finite or countably infinite alphabet $\mathcal{X}$ is encoded into a message taken from the set of codewords $[M]$. A decoder subsequently reconstructs the source symbol from the compressed description. Due to the limitation on the code size $M$, an almost-lossless source code is often associated with a non-zero error probability. The following definitions formalize almost-lossless source codes and their fundamental limits.

Definition 1 (Almost-lossless source code).

An $(M, \epsilon)$ code for a random variable $X$ with discrete alphabet $\mathcal{X}$ comprises an encoding function $\mathsf{f} : \mathcal{X} \to [M]$ and a decoding function $\mathsf{g} : [M] \to \mathcal{X}$ such that the error probability satisfies $\mathbb{P}\left[\mathsf{g}(\mathsf{f}(X)) \neq X\right] \le \epsilon$.
The minimum achievable code size $M^*(\epsilon)$ compatible with error probability $\epsilon$ is defined by

$M^*(\epsilon) := \min\left\{M : \text{there exists an } (M, \epsilon) \text{ code}\right\}.$ (12)

The generality of the setting in Definition 1 allows one to particularize any result derived for that setting to more specialized scenarios, such as the block code described in the next definition.

Definition 2 (Block almost-lossless source code).

An $(M, \epsilon)$ almost-lossless source code for a random vector $X^n$ defined on $\mathcal{X}^n$, the $n$-fold Cartesian product of the set $\mathcal{X}$, is called an $(n, M, \epsilon)$ code.
The minimum code size and rate achievable at blocklength $n$ and error probability $\epsilon$ are defined by, respectively,

$M^*(n, \epsilon) := \min\left\{M : \text{there exists an } (n, M, \epsilon) \text{ code}\right\}$ (13)

and

$R^*(n, \epsilon) := \frac{1}{n}\log M^*(n, \epsilon).$ (14)

Almost-lossless block codes were previously defined in, for example, [7, Chapter 1].

A discrete information source is a sequence of discrete random variables $X_1, X_2, \dots$, which is specified by the transition probability kernels $P_{X_i|X^{i-1}}$ for each $i \ge 1$. Many classes of sources, including sources with memory and non-stationary sources, conform to the setting of Definition 2. In our asymptotic analysis, we focus on the class of stationary memoryless sources, where $P_{X^n} = P_X \times \cdots \times P_X$ for all $n$, i.e., $X_1, X_2, \dots$ are i.i.d. with common distribution $P_X$.

III-B Background

Shannon's source coding theorem [1] gives a fundamental limit on the asymptotically achievable performance of $(n, M, \epsilon)$ codes for a stationary memoryless source with single-letter distribution $P_X$:

$\lim_{\epsilon \to 0}\,\lim_{n \to \infty} R^*(n, \epsilon) = H(X).$ (15)

In the finite-blocklength regime, which is of more practical interest, Kontoyiannis and Verdú [4] gave a lower and an upper bound on $\log M^*(n, \epsilon)$ that coincide in the first three terms. They also demonstrated an $O(1)$ gap in the fourth-order term. Recall that $V(X)$ and $T(X)$ denote the second and third absolute centered moments of $\imath(X)$ (see (8), (10)).

Theorem 1 (Kontoyiannis and Verdú [4]).

Consider a stationary memoryless source with a finite alphabet $\mathcal{X}$ and single-letter distribution $P_X$ that satisfies $V(X) > 0$. The following bounds¹ hold:

(achievability) for all $n \ge 1$ and all $0 < \epsilon \le 1/2$,²

$\log M^*(n, \epsilon) \le nH(X) + \sqrt{nV(X)}\,Q^{-1}(\epsilon) - \frac{1}{2}\log n + O(1);$ (16)

(converse) for all $n \ge 1$ and all $0 < \epsilon \le 1/2$ such that

$nH(X) + \sqrt{nV(X)}\,Q^{-1}(\epsilon) - \frac{1}{2}\log n \ge 0,$ (17)

$\log M^*(n, \epsilon) \ge nH(X) + \sqrt{nV(X)}\,Q^{-1}(\epsilon) - \frac{1}{2}\log n + O(1).$ (18)

¹These bounds are stated in [4] in a base-2 logarithmic scale, but they hold for any base. The base of the logarithm determines the information unit.

²According to [4], the achievability bound holds for any $0 < \epsilon < 1$. Notice, however, that it only becomes meaningful when the right-hand side of (16) is nonnegative.
Remark 1.

Although Theorem 1 was stated in [4] only for $0 < \epsilon \le 1/2$ and a finite source alphabet, the proof in [4] shows that, provided that $T(X)$ is finite, for all $0 < \epsilon < 1$ and any countable source alphabet, the bounds in (16) and (18) still hold with the same first three terms, with the bounds on the fourth-order terms replaced by $O(1)$ terms (with dependency on $P_X$).

Remark 2.

When $V(X) = 0$, the source is non-redundant; that is, it has a finite alphabet and a uniform distribution. In this case, $\imath(x) = H(X) = \log|\mathcal{X}|$ for every $x \in \mathcal{X}$. The optimal code simply maps a fraction $1 - \epsilon$ of the possible source outcomes to unique codewords. So the minimum achievable code size satisfies

$M^*(n, \epsilon) = \left\lceil (1 - \epsilon)\,|\mathcal{X}|^n \right\rceil.$ (19)

It follows immediately from (19) that

$\log M^*(n, \epsilon) = n\log|\mathcal{X}| + \log(1 - \epsilon) + O\!\left(|\mathcal{X}|^{-n}\right).$ (20)

The characterization of $\log M^*(n, \epsilon)$ in (20) agrees with (16) in its first- and second-order terms (since $V(X) = 0$) but lacks a $-\frac{1}{2}\log n$ third-order term.

Remark 3.

Although its dependency on $P_X$ is not explicitly noted, $M^*(n, \epsilon)$ is indeed a function of $P_X$, $n$, and $\epsilon$. The characterization of $\log M^*(n, \epsilon)$ in (20) might lead one to suspect that $\log M^*(n, \epsilon)$ has a discontinuity at $P_X$ equal to the uniform distribution on $\mathcal{X}$ due to the missing $-\frac{1}{2}\log n$ third-order term (which otherwise appears in both (16) and (18)). However, this conclusion is flawed because the upper bound in (16) blows up for any finite $n$ as $V(X) \to 0$. Indeed, the Berry-Esseen type bounds are loose for small $V(X)$. See Figure 1. The discontinuity appears in the bounds on $\log M^*(n, \epsilon)$, but there is no discontinuity in $\log M^*(n, \epsilon)$ itself. (Note that unlike most non-asymptotic limits in information theory, $M^*(n, \epsilon)$ in almost-lossless source coding is directly computable.) The right way to interpret the results in Theorem 1 is to see that for any $V(X) > 0$, there exists some $n_0$ such that for all $n \ge n_0$, $\log M^*(n, \epsilon)$ behaves like $-\frac{1}{2}\log n$ in the third-order term. For a smaller $V(X)$, the minimum $n_0$ needed becomes larger.

Fig. 1: Evaluations of the bounds in (16) and (18) and of the optimum $\log M^*(n, \epsilon)$ versus $n$ for a Bernoulli-$p$ source at a fixed target error probability $\epsilon$; panels (a) and (b) show two parameter settings.
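To make Remark 3's computability claim concrete: for a Bernoulli-$p$ source with $p < 1/2$, all strings of a given Hamming weight are equiprobable, so the minimum code size can be computed exactly by greedily covering source strings in decreasing order of probability (the optimal-code structure discussed below), accumulating whole weight classes. A sketch, with illustrative parameters:

```python
# Exact M*(n, eps) for a Bernoulli(p)^n source, p < 1/2: take whole Hamming-
# weight classes in decreasing order of per-string probability until the
# uncovered probability drops to eps.
from math import comb, ceil

def M_star(n, eps, p):
    M, err = 0, 1.0                               # err = uncovered probability
    for w in range(n + 1):                        # weights in decreasing prob.
        per_string = p ** w * (1.0 - p) ** (n - w)
        count = comb(n, w)
        if err - count * per_string <= eps:       # target reached in this class
            need = ceil((err - eps) / per_string) # strings still needed
            return M + need
        M += count
        err -= count * per_string
    return M

print(M_star(n=100, eps=0.1, p=0.11))
```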

Kontoyiannis and Verdú [4] obtained the bounds in Theorem 1, which coincide up to the third order, by analyzing the optimal code. That code encodes a cardinality-$M$ subset of $\mathcal{X}$ that has the largest probability. The decoder declares an error whenever a symbol outside this optimum set is produced by the source. With a few notable exceptions (e.g., a few scenarios of (almost) lossless data compression examined in [4], [5]), characterizing the optimal code is elusive in most communication scenarios of interest. Thus, the random coding argument, first proposed by Shannon [1], has become a popular and powerful technique for deriving achievability results. Here we review the existing achievability bounds for almost-lossless compression based on random coding.

Theorem 2 (e.g. [21], [9, Th. 9.4]).

For any $\tau > 0$, there exists an $(M, \epsilon)$ code for a discrete random variable $X$ such that

$\epsilon \le \mathbb{P}\left[\imath(X) > \log M - \tau\right] + \exp(-\tau).$ (21)

The bound in Theorem 2 is obtained by assigning source realizations to codewords independently and uniformly at random. The decoder uses a threshold decoding rule that decodes to $x$ if and only if $x$ is the unique source realization that (i) is compatible with the observed codeword under the given (random) code design, and (ii) has information below the threshold $\log M - \tau$. Particularizing (21) to a stationary memoryless source with single-letter distribution $P_X$ satisfying $V(X) > 0$ and $T(X) < \infty$, choosing $M$ and $\tau$ optimally, and applying the Berry-Esseen inequality, one obtains an asymptotic expansion of the bound: for all $n \ge 1$,

$\log M^*(n, \epsilon) \le nH(X) + \sqrt{nV(X)}\,Q^{-1}(\epsilon) + \frac{1}{2}\log n + O(1).$ (22)

This (optimal) application of Theorem 2 yields an excessive $+\frac{1}{2}\log n$ in the third-order term of the asymptotic expansion, while the optimum is $-\frac{1}{2}\log n$ (Theorem 1).

The key question is whether the penalty in the third-order term exhibited in (22) is due to random coding or due to the choice of the decoding rule. The optimum decoding rule is maximum likelihood. Previously, Kontoyiannis and Verdú [4, Th. 8] gave an exact expression for the performance of random coding under i.i.d. uniform random codeword generation and maximum likelihood decoding. However, their result, which is derived for general sources, is not easy to analyze. In Theorem 4 of Section III-C below, we derive a new random-coding bound based on maximum likelihood decoding, and we demonstrate that random coding is capable of achieving third-order optimality for a stationary memoryless source.

III-C Main Results: New Achievability Bounds Based on Random Coding

In this section, we present two new achievability bounds for almost-lossless source coding of general sources. The first, called the dependence testing (DT) bound, parallels the DT bound in channel coding [3, Th. 17]. The second, called the random-coding union (RCU) bound, parallels the RCU bound in channel coding [3, Th. 16].

The DT bound tightens the prior bound based on threshold decoding presented in Theorem 2.

Theorem 3 (DT bound).

There exists an $(M, \epsilon)$ code for a discrete random variable $X$ such that

$\epsilon \le \mathbb{E}\left[\exp\left\{-\left[\log M - \imath(X)\right]^+\right\}\right],$ (23)

where $[a]^+ := \max\{a, 0\}$.
Proof.

Note the following identity, which holds for arbitrary $a, b \in \mathbb{R}$ [3, Eq. (68)]:

$\exp\left\{-[a - b]^+\right\} = \mathbb{1}\left\{a \le b\right\} + \exp\left\{-(a - b)\right\}\mathbb{1}\left\{a > b\right\}.$ (24)

If we take $a = \log M$ and $b = \imath(X)$, and take the expectation of both sides of (24) with respect to $X$, we obtain the following equivalence:

$\mathbb{E}\left[\exp\left\{-\left[\log M - \imath(X)\right]^+\right\}\right] = \mathbb{P}\left[\imath(X) \ge \log M\right] + \frac{1}{M}\,U\left[\imath(X) < \log M\right],$ (25)

where $\mathbb{P}$ denotes a probability with respect to $P_X$, and $U$ denotes a mass with respect to the counting measure defined on $\mathcal{X}$, which assigns unit weight to each $x \in \mathcal{X}$; that is, $U[\imath(X) < \log M] = |\{x \in \mathcal{X} : \imath(x) < \log M\}|$. Thus, it is sufficient to show that there exists an $(M, \epsilon)$ code whose error probability is bounded by the right-hand side of (25).

We appeal to the following auxiliary result: there exists an $(M, \epsilon)$ code for a discrete random variable $X$ such that for all $\gamma > 0$,

$\epsilon \le \mathbb{P}\left[\imath(X) \ge \log\gamma\right] + \frac{1}{M}\,U\left[\imath(X) < \log\gamma\right].$ (26)

Taking $\gamma = M$ in (26) yields (25). So all that remains is to show (26).

The proof of (26) is based on random coding. Fix $\gamma > 0$. We draw the encoder outputs $\mathsf{f}(x)$ i.i.d. uniformly at random from $[M]$ for each $x \in \mathcal{X}$. We adopt a threshold decoder:

$\mathsf{g}(m) := \begin{cases} x & \text{if } x \text{ is the unique symbol with } \mathsf{f}(x) = m \text{ and } \imath(x) < \log\gamma, \\ \text{error} & \text{otherwise.} \end{cases}$ (27)

The averaged error over this random code construction is bounded by the probability of the union of the two error events

$\mathcal{E}_1 := \left\{\imath(X) \ge \log\gamma\right\},$ (28)
$\mathcal{E}_2 := \left\{\exists\, \bar x \ne X : \mathsf{f}(\bar x) = \mathsf{f}(X),\ \imath(\bar x) < \log\gamma\right\}.$ (29)

By the random coding argument and the union bound, there exists an $(M, \epsilon)$ code such that

$\epsilon \le \mathbb{P}\left[\mathcal{E}_1\right] + \mathbb{P}\left[\mathcal{E}_2\right],$ (30)

where

$\mathbb{P}\left[\mathcal{E}_2\right] = \mathbb{E}\left[\mathbb{P}\left[\exists\, \bar x \ne X : \mathsf{f}(\bar x) = \mathsf{f}(X),\ \imath(\bar x) < \log\gamma \,\middle|\, X\right]\right]$ (31)

$= \mathbb{E}\left[\mathbb{P}\left[\bigcup_{\bar x \ne X:\ \imath(\bar x) < \log\gamma} \left\{\mathsf{f}(\bar x) = \mathsf{f}(X)\right\} \,\middle|\, X\right]\right]$ (32)

$\le \mathbb{E}\left[\sum_{\bar x \ne X:\ \imath(\bar x) < \log\gamma} \mathbb{P}\left[\mathsf{f}(\bar x) = \mathsf{f}(X) \,\middle|\, X\right]\right]$ (33)

$= \mathbb{E}\left[\sum_{\bar x \ne X:\ \imath(\bar x) < \log\gamma} \frac{1}{M}\right]$ (34)

$\le \frac{1}{M}\,U\left[\imath(X) < \log\gamma\right],$ (35)

where (33) applies the union bound to all $\bar x \ne X$ with $\imath(\bar x) < \log\gamma$, and (34) holds since the encoder outputs are drawn i.i.d. uniformly at random and independently of $X$. ∎

The inequality in (26) bounds the random coding performance of a threshold decoder with threshold $\log\gamma$. Paralleling the observation made in [3] in the context of channel coding, we notice that the right-hand side of (26) is equal to $\frac{M+1}{M}$ times the measure of the error event in a Bayesian binary hypothesis test between $P_X$ with a priori probability $\frac{M}{M+1}$ and $U$ with a priori probability $\frac{1}{M+1}$. See [22, Remark 5] for an observation that the Neyman-Pearson lemma generalizes to $\sigma$-finite measures such as our $U$ here. Thus, this measure of error is minimized by the test that compares the log likelihood ratio between $P_X$ and $U$ (namely $-\imath(x)$) to the log ratio of the two a priori probabilities (namely $-\log M$), i.e., the test that decides $P_X$ if and only if $\imath(x) \le \log M$.

Therefore, taking $\gamma = M$ minimizes the right-hand side of (26). Hence Theorem 3 gives the tightest possible bound for random coding with threshold decoding. Particularizing Theorem 3 to a stationary memoryless source with a single-letter distribution $P_X$ satisfying $V(X) > 0$ and $T(X) < \infty$, and invoking the Berry-Esseen theorem, we obtain an asymptotic expansion of the bound: for all $n \ge 1$,

$\log M^*(n, \epsilon) \le nH(X) + \sqrt{nV(X)}\,Q^{-1}(\epsilon) + O(1).$ (36)

Unfortunately, (36) is also third-order-suboptimal: the $-\frac{1}{2}\log n$ term of (1) is missing. Thus, threshold-based decoding in random coding is not sufficient to achieve the best performance in the third-order term.
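By (24)-(25), the right-hand side of (23) equals $\sum_x \min\{P_X(x), 1/M\}$, which is tractable for a Bernoulli-$p$ product source by grouping strings of equal Hamming weight. A sketch, with illustrative parameters:

```python
# Exact evaluation of the DT bound (23) for a Bernoulli(p)^n source, using
# sum_x min{P(x), 1/M} and grouping the 2^n strings by Hamming weight.
from math import comb

def dt_bound(n, log2_M, p):
    M = 2.0 ** log2_M
    eps = 0.0
    for w in range(n + 1):
        per_string = p ** w * (1.0 - p) ** (n - w)
        eps += comb(n, w) * min(per_string, 1.0 / M)
    return eps

print(dt_bound(n=100, log2_M=60, p=0.11))
```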

Next, we present the RCU bound. Unlike the random-coding bounds in Theorems 2 and 3, which employ threshold decoding, the RCU bound yields an asymptotic achievability result for stationary memoryless sources that is tight up to the third-order term. Therefore, the loss in the third-order term in both (22) and (36) is due to the suboptimal decoder rather than the random encoder design.

Theorem 4 (RCU bound).

There exists an $(M, \epsilon)$ code for a discrete random variable $X$ such that

$\epsilon \le \mathbb{E}\left[\min\left\{1, \frac{|\mathcal{B}(X)|}{M}\right\}\right],$ (37)

where $\mathcal{B}(x) := \left\{\bar x \in \mathcal{X} : \bar x \ne x,\ P_X(\bar x) \ge P_X(x)\right\}$ for all $x \in \mathcal{X}$.

Proof.

We begin our random code design by drawing the encoder output $\mathsf{f}(x)$ i.i.d. uniformly at random from $[M]$ for each $x \in \mathcal{X}$. For decoding, we use the maximum likelihood decoder

$\mathsf{g}(m) := \arg\max_{x\,:\,\mathsf{f}(x) = m} P_X(x).$ (38)

When there is more than one source symbol that has the maximal probability mass, the decoder chooses among them equiprobably at random.

The error probability averaged over this random code construction is bounded by the probability of the event

$\mathcal{E} := \left\{\exists\, \bar x \in \mathcal{B}(X) : \mathsf{f}(\bar x) = \mathsf{f}(X)\right\}.$ (39)

To prove the existence of an $(M, \epsilon)$ code satisfying (37) using the random coding argument, we show that $\mathbb{P}[\mathcal{E}]$, where the probability measure is generated by both $X$ and the random encoding map $\mathsf{f}$, is bounded from above by the right-hand side of (37):

$\mathbb{P}\left[\mathcal{E}\right] = \mathbb{P}\left[\bigcup_{\bar x \in \mathcal{B}(X)} \left\{\mathsf{f}(\bar x) = \mathsf{f}(X)\right\}\right]$ (40)

$= \mathbb{E}\left[\mathbb{P}\left[\bigcup_{\bar x \in \mathcal{B}(X)} \left\{\mathsf{f}(\bar x) = \mathsf{f}(X)\right\} \,\middle|\, X\right]\right]$ (41)

$\le \mathbb{E}\left[\min\left\{1, \sum_{\bar x \in \mathcal{B}(X)} \mathbb{P}\left[\mathsf{f}(\bar x) = \mathsf{f}(X) \,\middle|\, X\right]\right\}\right]$ (42)

$= \mathbb{E}\left[\min\left\{1, \frac{|\mathcal{B}(X)|}{M}\right\}\right]$ (43)

$= \sum_{x \in \mathcal{X}} P_X(x)\min\left\{1, \frac{|\mathcal{B}(x)|}{M}\right\},$ (44)

where (41) holds by the law of iterated expectation, (42) bounds the probability by the minimum of the union bound and 1, (43) holds because the encoder outputs are drawn i.i.d. uniformly at random and independently of $X$, and (44) rewrites (43) in terms of the distribution $P_X$. The proof is now complete, with (44) equal to the right-hand side of (37). ∎
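For a stationary memoryless Bernoulli-$p$ source with $p < 1/2$, $|\mathcal{B}(x^n)|$ depends only on the Hamming weight of $x^n$, so (37) can be evaluated exactly by summing over weight classes. A sketch, with illustrative parameters:

```python
# Exact evaluation of the RCU bound (37) for a Bernoulli(p)^n source, p < 1/2:
# strings at least as probable as x are exactly those of weight <= weight(x)
# (minus x itself), so the expectation reduces to a sum over weight classes.
from math import comb

def rcu_bound(n, log2_M, p):
    M = 2.0 ** log2_M
    eps, cum = 0.0, 0            # cum = number of strings of weight < w
    for w in range(n + 1):
        count = comb(n, w)
        per_string = p ** w * (1.0 - p) ** (n - w)
        B = cum + count - 1      # |B(x)| for any x of weight w
        eps += count * per_string * min(1.0, B / M)
        cum += count
    return eps

print(rcu_bound(n=100, log2_M=60, p=0.11))
```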

Remark 4.

Applying the argument employed in the proof of [9, Th. 9.5] to the above analysis, we can obtain the same RCU bound by randomizing over only linear encoding maps. Thus, there is no loss in performance when restricting to linear compressors.

The RCU bound in Theorem 4 provides a new proof of the asymptotic achievability result in Theorem 1. While the original proof analyzes the optimal code, our proof relies on a randomly designed encoder, showing that optimal code design is not necessary to achieve third-order-optimal performance. This observation is useful in scenarios such as multiple access source coding, where the optimal code is hard to find, as discussed in Section IV below.

Our asymptotic analysis in Theorem 5 relies on the following assumptions. Consider a stationary memoryless source with single-letter distribution $P_X$. We assume

  (a.1) $V(X) > 0$,

  (a.2) $T(X) < \infty$.

Denote

$B := \frac{C_0\, T(X)}{V(X)^{3/2}}, \qquad C := 2\left(\frac{\log 2}{\sqrt{2\pi}} + 2B\sqrt{V(X)}\right),$ (45)

where $C_0$ is the absolute constant in the Berry-Esseen inequality for i.i.d. random variables (see Theorem 6).

Theorem 5 (Third-order-optimal achievability via random coding).

Consider a stationary memoryless source satisfying the conditions (a.1) and (a.2). For all $0 < \epsilon < 1$ and all $n \ge 1$,

$\log M^*(n, \epsilon) \le nH(X) + \sqrt{nV(X)}\,Q^{-1}(\epsilon) - \frac{1}{2}\log n + \theta(n),$ (46)

where $\theta(n) = O(1)$.

The remainder term $\theta(n)$ can be characterized more precisely as follows: for all $0 < \epsilon \le \frac{1}{2}$ and $n \ge \frac{4(B+1)^2}{\epsilon^2}$,

$\theta(n) \le \frac{(B+1)\sqrt{V(X)}}{\phi\left(Q^{-1}(\epsilon/2)\right)} + \log\frac{C^2}{V(X)};$ (47)

for all $\frac{1}{2} < \epsilon < 1$ and $n \ge \frac{(B+1)^2}{(\epsilon - 1/2)^2}$,

$\theta(n) \le \frac{(B+1)\sqrt{V(X)}}{\phi\left(Q^{-1}(\epsilon)\right)} + \log\frac{C^2}{V(X)},$ (48)

where $B$ and $C$ are defined in (45).

Before we show our proof of the asymptotic bound in Theorem 5, we state two auxiliary results that turn out to be very useful in our analysis.

The first result is the classical Berry-Esseen inequality (e.g. [23, Chapter XVI.5], [24]). We state it here for i.i.d. random variables with the best known absolute constant given in [24].

Theorem 6 (Berry-Esseen Inequality).

Consider a sequence of i.i.d. random variables $Z_1, \dots, Z_n$ with a common distribution such that $\mu := \mathbb{E}[Z_1]$, $V := \mathrm{Var}[Z_1] > 0$, and $T := \mathbb{E}\left[|Z_1 - \mu|^3\right] < \infty$. Then for any real $t$ and any $n \ge 1$,

$\left|\,\mathbb{P}\left[\sum_{i=1}^n Z_i > n\mu + t\sqrt{nV}\right] - Q(t)\,\right| \le \frac{B}{\sqrt{n}},$ (49)

where $B := \frac{C_0\, T}{V^{3/2}}$ and $C_0$ is the absolute constant given in [24].

We refer to $B$ as the Berry-Esseen constant for the i.i.d. random variables $Z_1, \dots, Z_n$.

The second result is [3, Lemma 47], developed by Polyanskiy et al. The bound originally given in [3, Lemma 47] only requires independence of the random variables. One can sharpen it for i.i.d. random variables by appealing to the Berry-Esseen inequality above with the i.i.d. constant $C_0$. We state the modified version of the lemma below, which allows for a better numerical comparison between Theorem 5 and Theorem 1.

Lemma 7 (Modified from [3, Lemma 47]).

Let $Z_1, \dots, Z_n$ be i.i.d. random variables with a common distribution such that $V := \mathrm{Var}[Z_1] > 0$ and $T := \mathbb{E}\left[|Z_1 - \mathbb{E}[Z_1]|^3\right] < \infty$. Then for any $A \in \mathbb{R}$,

$\mathbb{E}\left[\exp\left\{-\sum_{i=1}^n Z_i\right\}\mathbb{1}\left\{\sum_{i=1}^n Z_i \ge A\right\}\right] \le 2\left(\frac{\log 2}{\sqrt{2\pi}} + 2B\sqrt{V}\right)\frac{\exp(-A)}{\sqrt{nV}},$ (50)

where $B$ is the Berry-Esseen constant of Theorem 6 for $Z_1, \dots, Z_n$.
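A quick Monte-Carlo sanity check of (50) is below, for i.i.d. Gaussian $Z_i$ and base-$e$ logarithms. The value $C_0 = 0.4748$ is an assumed constant from the i.i.d. Berry-Esseen literature, and all parameters are illustrative.

```python
# Monte-Carlo sanity check of Lemma 7 / (50) for i.i.d. Gaussian Z_i, base-e
# logs. C0 = 0.4748 is an assumed i.i.d. Berry-Esseen constant.
import numpy as np

rng = np.random.default_rng(0)
n, mu, sigma, A = 200, 0.1, 1.0, 25.0
S = rng.normal(mu, sigma, size=(100_000, n)).sum(axis=1)
lhs = np.mean(np.exp(-S) * (S >= A))            # E[exp(-sum Z) 1{sum Z >= A}]

V = sigma ** 2
T = 2.0 * np.sqrt(2.0 / np.pi) * sigma ** 3     # E|Z - mu|^3 for a Gaussian
B = 0.4748 * T / V ** 1.5                       # Berry-Esseen constant
rhs = 2 * (np.log(2) / np.sqrt(2 * np.pi) + 2 * B * np.sqrt(V)) \
      * np.exp(-A) / np.sqrt(n * V)
print(lhs, rhs, bool(lhs <= rhs))
```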
Proof of Theorem 5.

We analyze the random-coding bound in Theorem 4. Denote for brevity

$\imath(x^n) := \sum_{i=1}^n \imath(x_i), \qquad x^n \in \mathcal{X}^n.$ (51)

Each of $\imath(X^n)$ and $\imath(\bar X^n)$ is a sum of $n$ i.i.d. random variables. Substituting in Theorem 4, we note that there exists an $(n, M, \epsilon')$ code such that

$\epsilon' \le \mathbb{E}\left[\min\left\{1, \frac{|\mathcal{B}(X^n)|}{M}\right\}\right],$ (52)

where $\mathcal{B}(x^n) := \{\bar x^n \ne x^n : P_{X^n}(\bar x^n) \ge P_{X^n}(x^n)\}$. Let

$B = \frac{C_0\, T(X)}{V(X)^{3/2}}$ (53)

denote the Berry-Esseen constant (see Theorem 6) for the i.i.d. random variables $\imath(X_1), \dots, \imath(X_n)$. We invoke Lemma 7 with $Z_i = -\imath(\bar X_i)$, where $\bar X^n$ is distributed according to $P_{X^n}$, and $A = -\imath(x^n)$ to conclude that for every $x^n$,

$|\mathcal{B}(x^n)| \le \mathbb{E}\left[\exp\left\{\imath(\bar X^n)\right\}\mathbb{1}\left\{\imath(\bar X^n) \le \imath(x^n)\right\}\right]$ (54)

$\le \frac{C\exp\left\{\imath(x^n)\right\}}{\sqrt{nV(X)}},$ (55)

where

$C = 2\left(\frac{\log 2}{\sqrt{2\pi}} + 2B\sqrt{V(X)}\right)$ (56)

is a positive finite constant by the assumptions (a.1) and (a.2). Using (55), we bound (52) as

$\epsilon' \le \mathbb{E}\left[\min\left\{1, \frac{C\exp\{\imath(X^n)\}}{M\sqrt{nV(X)}}\right\}\right]$ (57)

$\le \mathbb{P}\left[\imath(X^n) > \log\gamma\right] + \frac{C}{M\sqrt{nV(X)}}\,\mathbb{E}\left[\exp\{\imath(X^n)\}\,\mathbb{1}\left\{\imath(X^n) \le \log\gamma\right\}\right]$ (58)

$\le \mathbb{P}\left[\imath(X^n) > \log\gamma\right] + \frac{C^2\,\gamma}{M\,nV(X)},$ (59)

where (57) plugs (55) into (52), (58) separates the cases $\imath(X^n) > \log\gamma$ and $\imath(X^n) \le \log\gamma$, and (59) applies Lemma 7 again to the second term in (58).

We now choose

$\log\gamma := nH(X) + \sqrt{nV(X)}\,Q^{-1}(\epsilon_n), \qquad \log M := \log\gamma - \frac{1}{2}\log n + \log\frac{C^2}{V(X)},$ (60)

where $\epsilon_n := \epsilon - \frac{B+1}{\sqrt{n}}$. By the Berry-Esseen inequality (Theorem 6) applied to (59), this choice of $(\gamma, M)$ gives $\epsilon' \le \epsilon_n + \frac{B}{\sqrt{n}} + \frac{1}{\sqrt{n}} = \epsilon$, and hence an achievability bound on $\log M^*(n, \epsilon)$:

$\log M^*(n, \epsilon) \le nH(X) + \sqrt{nV(X)}\,Q^{-1}(\epsilon_n) - \frac{1}{2}\log n + \log\frac{C^2}{V(X)}.$ (61)

Specifically, we have

$Q^{-1}(\epsilon_n) = Q^{-1}\!\left(\epsilon - \frac{B+1}{\sqrt{n}}\right)$ (62)

$\le Q^{-1}(\epsilon) + \frac{B+1}{\sqrt{n}}\left|\left(Q^{-1}\right)'(\xi)\right|$ (63)

$= Q^{-1}(\epsilon) + \frac{B+1}{\sqrt{n}}\cdot\frac{1}{\phi\left(Q^{-1}(\xi)\right)}$ (64)

for some $\xi \in [\epsilon_n, \epsilon]$, where (63) holds by a first-order Taylor bound, and (64) holds by the inverse function theorem.

If $0 < \epsilon \le \frac{1}{2}$ and $n \ge \frac{4(B+1)^2}{\epsilon^2}$, we have $\epsilon_n \ge \frac{\epsilon}{2}$ and $\frac{1}{\phi(Q^{-1}(u))}$ is decreasing in $u$ on $\left(0, \frac{1}{2}\right]$. We can further bound the right-hand side of (64) and conclude that

$Q^{-1}(\epsilon_n) \le Q^{-1}(\epsilon) + \frac{B+1}{\sqrt{n}}\cdot\frac{1}{\phi\left(Q^{-1}(\epsilon/2)\right)}.$ (65)

If $\frac{1}{2} < \epsilon < 1$ and $n \ge \frac{(B+1)^2}{(\epsilon - 1/2)^2}$, we have $\epsilon_n \ge \frac{1}{2}$ and $\frac{1}{\phi(Q^{-1}(u))}$ is increasing in $u$ on $\left[\frac{1}{2}, 1\right)$. We conclude that

$Q^{-1}(\epsilon_n) \le Q^{-1}(\epsilon) + \frac{B+1}{\sqrt{n}}\cdot\frac{1}{\phi\left(Q^{-1}(\epsilon)\right)}.$ (66)

By plugging (65) and (66) into (61), we obtain (47) and (48), respectively. ∎
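Putting the pieces together, the sketch below evaluates the explicit achievability bound of this proof, i.e., (46) with the remainder bound (47) for the $\epsilon \le 1/2$ branch, with $B$ and $C$ as in (45). The value $C_0 = 0.4748$ is an assumed Berry-Esseen constant, and the pmf, $n$, and $\epsilon$ are illustrative.

```python
# Explicit achievability bound (46)-(47), eps <= 1/2 branch, logs in nats.
# B and C follow (45)/(56); C0 = 0.4748 is an assumed Berry-Esseen constant.
import numpy as np
from scipy.stats import norm

C0 = 0.4748

def third_order_achievability(n, eps, pmf):
    pmf = np.asarray(pmf, dtype=float)
    info = -np.log(pmf)
    H = float(pmf @ info)
    V = float(pmf @ (info - H) ** 2)
    T = float(pmf @ np.abs(info - H) ** 3)
    B = C0 * T / V ** 1.5                                          # as in (45)
    C = 2 * (np.log(2) / np.sqrt(2 * np.pi) + 2 * B * np.sqrt(V))  # as in (56)
    assert eps <= 0.5 and n >= 4 * (B + 1) ** 2 / eps ** 2, "outside (47) regime"
    theta = (B + 1) * np.sqrt(V) / norm.pdf(norm.isf(eps / 2)) \
            + np.log(C ** 2 / V)                                   # bound (47)
    return n * H + np.sqrt(n * V) * norm.isf(eps) - 0.5 * np.log(n) + theta

print(third_order_achievability(n=2000, eps=0.1, pmf=[0.11, 0.89]))
```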

IV Multiple Access Source Coding

The discussion that follows focuses on multiple access source coding with two encoders. While this choice is expedient for the sake of notational brevity, all of the results discussed here generalize to scenarios with more than two encoders, as briefly noted in Remark 6 below.

IV-A Definitions

In multiple access source coding, also known as Slepian-Wolf (SW) source coding [10], a pair of random variables $(X, Y)$ with finite or countably infinite alphabets $\mathcal{X}$ and $\mathcal{Y}$ is compressed separately. Each encoder observes only one of the random variables and independently maps it to one of the codewords in $[M_1]$ or $[M_2]$, respectively; a single decoder subsequently decodes the pair of codewords it receives to reconstruct $(X, Y)$ jointly. As in Section III-A, we first present the definition of a SW code for an abstract random object, and then particularize it to the case where the random object observed by the encoders lives in an alphabet endowed with a Cartesian product structure.

Definition 3 (SW code).

An $(M_1, M_2, \epsilon)$ SW code for a pair of random variables $(X, Y)$ with finite or countably infinite alphabets $\mathcal{X}$ and $\mathcal{Y}$ comprises two separate encoding functions $\mathsf{f}_1 : \mathcal{X} \to [M_1]$ and $\mathsf{f}_2 : \mathcal{Y} \to [M_2]$, and a decoding function $\mathsf{g} : [M_1] \times [M_2] \to \mathcal{X} \times \mathcal{Y}$ such that the error probability satisfies $\mathbb{P}\left[\mathsf{g}(\mathsf{f}_1(X), \mathsf{f}_2(Y)) \ne (X, Y)\right] \le \epsilon$.
A pair of code sizes $(M_1, M_2)$ is $\epsilon$-achievable if there exists an $(M_1, M_2, \epsilon)$ SW code.

In the conventional block setting, the encoders individually observe $X^n$ and $Y^n$, drawn from a joint distribution $P_{X^nY^n}$ defined on $\mathcal{X}^n \times \mathcal{Y}^n$. The block SW code is defined as follows.

Definition 4 (Block SW code).

Let $\mathcal{X}^n$ and $\mathcal{Y}^n$ be the $n$-fold Cartesian products of the sets $\mathcal{X}$ and $\mathcal{Y}$, respectively. An $(M_1, M_2, \epsilon)$ SW code for a pair of random vectors $(X^n, Y^n)$ defined on $\mathcal{X}^n \times \mathcal{Y}^n$ is called an $(n, M_1, M_2, \epsilon)$ SW code.

The finite-blocklength rates associated with this code are defined by

$R_1 := \frac{1}{n}\log M_1, \qquad R_2 := \frac{1}{n}\log M_2.$ (67)
Definition 5 ($(n, \epsilon)$-rate region).

A rate pair $(R_1, R_2)$ is $(n, \epsilon)$-achievable if there exists an $(n, M_1, M_2, \epsilon)$ SW code with $\frac{1}{n}\log M_1 \le R_1$ and $\frac{1}{n}\log M_2 \le R_2$. The $(n, \epsilon)$-rate region is the set of all $(n, \epsilon)$-achievable rate pairs.