In joint source-channel coding, one seeks a necessary and sufficient condition such that a source sequence of length k can be reliably transmitted over a channel in n channel uses, in the sense that the excess-distortion probability for a given distortion level D vanishes. This condition is captured by the maximum attainable ratio of k to n, also known as the rate. For discrete memoryless systems, Shannon showed that this maximum attainable rate is C/R(D), where C is the capacity of the discrete memoryless channel (DMC) and R(D) is the rate-distortion function of the discrete memoryless source (DMS). Shannon showed that, surprisingly, a separation scheme is optimal in this first-order fundamental limit sense. That is, separately designing a reliable lossy data compression system (source code) and a data transmission system (channel code) is optimal. Often, for simplicity, one assumes that these codes are tailored to the source and channel statistics. In practice, however, codes that do not depend on the statistics of the source and channel are of paramount importance. Such codes form the central focus of the present work.
We are primarily inspired by two of Lapidoth’s seminal works [2, 3]. In [2], he showed that for a channel coding system, if the codebook is Gaussian and the decoder is constrained to be a nearest neighbor or minimum Euclidean distance decoder, then regardless of the statistics of the additive noise, the maximum coding rate one can attain is the Gaussian capacity function. This constitutes a robust communication system because the rate one attains is at least as large as if the noise were Gaussian, as long as the code is so designed. In [3], Lapidoth considered the rate-distortion counterpart of the same problem and showed that the minimum compression rate one can attain for an arbitrary source is the Gaussian rate-distortion function if one uses minimum Euclidean distance encoding and the codebook is Gaussian. Note that for both the source and channel coding systems, the codes are incognizant of the source and channel laws. These problems are also termed saddle-point problems because they characterize the extremal input distribution-noise pair (for channel coding) and the source-test channel pair (for source coding).
We extend these two works of Lapidoth [2, 3] in two distinct directions. First, we consider a joint source-channel coding (JSCC) setup. In our JSCC scheme, analogously to [2, 3], one is constrained to use two random Gaussian codebooks, one for the reproduced source sequences and one for the channel codewords. However, both the minimum Euclidean distance encoding and decoding schemes need to be judiciously modified to ensure that the best (highest) rates are attained. We describe these modifications in greater detail in Section I-A. We refer to the encoding and decoding schemes as the modified minimum distance and modified nearest neighbor schemes respectively. The joint scheme is termed the NN-JSCC scheme (NN stands for “nearest neighbor”). Second, instead of focusing solely on the first-order asymptotics (capacity and rate-distortion function), we examine the fundamental limits of such a mismatched decoding setup through a more refined lens. Specifically, we study the second-order and moderate deviations asymptotics of the problem. Our results recover the classical results by Lapidoth [2, 3] and more recent works on second-order asymptotics for the saddle-point problems for channel and source coding studied by Scarlett, Tan and Durisi [4] and the present authors [5].
I-A Main Contributions and Related Works
Our main contributions are summarized as follows:
We propose a JSCC architecture using Gaussian codebooks, with modified minimum distance encoding and decoding, to transmit an arbitrary memoryless source over an arbitrary additive memoryless channel. We argue in Section II-C that this architecture generalizes and unifies the works by Lapidoth [2, 3]. While the Gaussian codebooks are similar to those in [2, 3], our encoding and decoding schemes differ somewhat. To capture the JSCC nature of the problem, we draw inspiration from the works of Csiszár [6] and of Wang, Ingber and Kochman [7], who respectively established the error exponent and the second-order asymptotics for sending a DMS over a DMC. These authors employed the method of types and an unequal error protection (UEP) scheme (cf. Shkel, Tan and Draper). In our work, we introduce a natural partition of the source sequences into types; however, the notion of types has to be defined carefully since the source need not be discrete. We also regularize the nearest neighbor decoder so that an appropriate measure of the size of each type class is taken into account in the decoding strategy. Our architecture (shown in Figure 1) and the subsequent analyses allow us to show that the maximum attainable rate is the ratio between the Gaussian capacity and the Gaussian rate-distortion function.
Our main contribution, however, is the derivation of ensemble-tight second-order coding rates and moderate deviations constants for the architecture so described. By allowing a non-vanishing ensemble excess-distortion probability, we shed light on the backoff from the maximum attainable rate at finite blocklengths. This complements the results of Kostina and Verdú [9], who derived the dispersion of transmitting a Gaussian memoryless source (GMS) over an additive white Gaussian noise (AWGN) channel. We show that the mismatched dispersion for our NN-JSCC scheme is a linear combination of the mismatched dispersions in the channel coding saddle-point problem by Scarlett, Tan and Durisi [4] and the rate-distortion saddle-point problem by the present authors [5]. For these refined results, there are some intricacies pertaining to what one means by a Gaussian codebook. We consider spherical and i.i.d. Gaussian codebooks for both the source reproduction sequences and the channel codewords, and discuss some subtleties of the second-order results.
Finally, for both the second-order and moderate deviations asymptotic regimes, we show that the separate source-channel coding scheme obtained by combining the corresponding refined asymptotic results in [4] and [5] for the channel coding and rate-distortion saddle-point problems [2, 3] is strictly sub-optimal compared to the newly proposed NN-JSCC scheme. By combining Lapidoth’s results in [2, 3], however, it is easy to see that separation is first-order optimal.
I-B Organization of the Rest of the Paper
The rest of the paper is organized as follows. In Section II, we set up the notation, present our joint source-channel coding system and formulate our problems explicitly. In Section III, we present our main results and provide corresponding remarks. The proofs of each of the asymptotic results (second-order and moderate deviations) are provided in Sections IV and V respectively. Technical results that are not central to the main exposition are relegated to the Appendices.
II The Joint Source-Channel Coding Setup
II-A Notation
Random variables and their realizations are denoted in upper case (e.g., X) and lower case (e.g., x) respectively. All sets are denoted in calligraphic font (e.g., 𝒳). For any two natural numbers a and b, we use [a : b] to denote the set of all natural numbers between a and b (inclusive), and we let [a] := [1 : a]. All logarithms are natural logarithms. We use Q(t) := (1/√(2π)) ∫_t^∞ e^{−u²/2} du to denote the Gaussian complementary cumulative distribution function (cdf) and Q⁻¹ its inverse. Let X^n = (X₁, …, X_n) be a random vector of length n and x^n be a realization. We use ‖x^n‖ to denote the ℓ₂ norm of the vector x^n. Given two vectors x^n and y^n, the (normalized) quadratic distortion measure is defined as d(x^n, y^n) := ‖x^n − y^n‖²/n. For any random variable X, we use Λ_X(θ) to denote the cumulant generating function log E[exp(θX)]. For any two sequences {a_n} and {b_n}, we write a_n ≈ b_n to mean lim_{n→∞} a_n/b_n = 1. We use standard asymptotic notation such as O(·), o(·) and Θ(·).
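Several of the displayed definitions above were lost in extraction; the following minimal Python sketch records the standard forms of the Gaussian complementary cdf Q, its inverse, and the normalized quadratic distortion. The function names are ours.

```python
import math

def q_func(t: float) -> float:
    """Gaussian complementary cdf: Q(t) = P(N(0, 1) > t)."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))

def q_inv(p: float, lo: float = -40.0, hi: float = 40.0) -> float:
    """Inverse of Q via bisection; valid since Q is strictly decreasing."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if q_func(mid) > p:
            lo = mid  # Q(mid) too large => the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

def distortion(x, y):
    """Normalized quadratic distortion: squared Euclidean distance over length."""
    assert len(x) == len(y)
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)
```

For instance, q_func(0.0) returns 0.5 and distortion([1, 2], [1, 0]) returns 2.0.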
II-B System Model
Consider an arbitrary memoryless source whose probability mass function (PMF) or probability density function (PDF) has a finite and positive second moment. Next, consider an arbitrary additive noise random variable whose distribution (PMF or PDF) likewise has a finite and positive second moment.
We are interested in using a fixed code to transmit an arbitrary memoryless source to within distortion D over an additive channel, whose output is the sum of the channel input and the noise, the latter generated i.i.d. according to the noise distribution.
To describe our NN-JSCC scheme, we resort to a framework that is ubiquitous in joint source-channel coding, e.g., [6, 7]. We define a notion of power types for positive reals, analogous to the method of types for discrete alphabets. Let a positive parameter be given; it determines (half) the quantization range. Furthermore, let the number of source power type (or simply type) classes be
Define the lower limit for the power level to be
Given each type index, define the type quantization level and the power type class respectively as
Thus, in effect, we are partitioning all length-k source sequences into disjoint subsets depending on their powers (normalized squared norms). The upper limit for the power level is determined by the quantization range when k is large. We say that a sequence has type (or power type) i if it lies in the i-th power type class. Let the set of admissible type indices be a set of integers to be specified later. Finally, let
be a set of pairs in which the first coordinate denotes the type and the second coordinate denotes the index of the codeword in a sub-codebook corresponding to that type.
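The exact quantization levels in the displays above did not survive extraction, so the following sketch only illustrates the mechanics of power-type classification under an assumed uniform binning of the empirical power; the parameters p_lo (lower power limit), delta (bin width) and num_types are stand-ins for the elided quantities.

```python
def power_type(x, p_lo: float, delta: float, num_types: int):
    """Return the power-type index of sequence x, i.e. the bin that its
    empirical power ||x||^2 / k falls into, or None if the power lies
    outside the quantization range (an encoder error in Definition 1).
    The bin edges p_lo + i * delta are an assumed parametrization."""
    k = len(x)
    power = sum(v * v for v in x) / k
    if power < p_lo or power >= p_lo + num_types * delta:
        return None
    return int((power - p_lo) // delta) + 1
```

A sequence whose power falls outside the range maps to None, mirroring the case in which the encoder declares an error.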
Our NN-JSCC scheme is illustrated in Figure 1 and defined formally as follows.
Definition 1. A code for the NN-JSCC scheme consists of
A set of source codewords and a set of channel codewords for each admissible type index. The realizations of the source and channel codebooks for each type are known to both the encoder and the decoder.
An encoder which declares an error if the power type of the source sequence is not admissible, and otherwise uses the following modified minimum distance encoding rule: the source sequence is mapped to the channel codeword indexed by the pair (type, sub-index) whose source codeword minimizes the Euclidean distance to the source sequence over all source codewords in the sub-codebook of that type, i.e.,
A decoder which employs the modified nearest neighbor decoding rule; it declares that the reproduced source sequence is the corresponding source codeword if
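The displayed encoding and decoding rules (in particular their regularization terms) are elided above, so the sketch below shows only the unmodified minimum-Euclidean-distance core shared by both rules; the type-dependent regularization of the modified rules is omitted.

```python
def min_dist_index(target, codebook):
    """Index of the codeword in `codebook` closest to `target` in
    Euclidean distance (ties broken by the lowest index)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(target, c))
    return min(range(len(codebook)), key=lambda j: sq_dist(codebook[j]))
```

The encoder would apply this search within the sub-codebook of the source sequence's type; the decoder would apply it (suitably regularized) to the channel output and the channel codebook.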
Throughout the paper, we consider random Gaussian codebooks for both source and channel codebooks for part (i) of Definition 1. To be specific, we consider the following two types of Gaussian codebooks.
First, we consider spherical codebooks, where each source codeword (or channel codeword) is generated independently and uniformly over a sphere of the appropriate radius (for the channel codebook, the radius is determined by a positive power parameter), i.e.,
where δ(·) denotes the Dirac delta function, the normalization constant is the surface area of an n-dimensional sphere of the given radius, and Γ(·) is the Gamma function.
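As a concrete (assumed) parametrization, a spherical codeword of blocklength n and per-letter power P can be drawn by normalizing an i.i.d. Gaussian vector onto the sphere of radius sqrt(nP), while an i.i.d. Gaussian codeword is drawn componentwise:

```python
import math
import random

def spherical_codeword(n: int, power: float, rng: random.Random):
    """Uniform draw on the sphere of radius sqrt(n * power): normalize an
    i.i.d. Gaussian vector (uniformity follows from rotational invariance)."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(v * v for v in g))
    r = math.sqrt(n * power)
    return [r * v / norm for v in g]

def iid_codeword(n: int, power: float, rng: random.Random):
    """i.i.d. N(0, power) draw; its norm concentrates near sqrt(n * power)
    but does not lie exactly on the sphere."""
    return [rng.gauss(0.0, math.sqrt(power)) for _ in range(n)]
```

The contrast between the two draws, exact versus approximate power, is precisely the source of the second-order subtleties discussed later.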
For later use, we define the Gaussian capacity and rate-distortion functions as follows:
Furthermore, define the optimal bandwidth expansion ratio/factor
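The displayed formulas are elided above; in nats, the standard Gaussian capacity and rate-distortion functions, and hence the first-order optimal ratio, would read as in this sketch (assuming noise variance sigma², source variance var, and 0 < dist < var):

```python
import math

def gaussian_capacity(snr: float) -> float:
    """C(P) = 0.5 * ln(1 + P / sigma^2) in nats, with snr = P / sigma^2."""
    return 0.5 * math.log(1.0 + snr)

def gaussian_rdf(var: float, dist: float) -> float:
    """R(D) = 0.5 * ln(var / D) in nats for 0 < D < var, else 0."""
    return 0.5 * math.log(var / dist) if dist < var else 0.0

def bandwidth_ratio(snr: float, var: float, dist: float) -> float:
    """First-order optimal ratio of source symbols to channel uses."""
    return gaussian_capacity(snr) / gaussian_rdf(var, dist)
```

For example, at snr = e − 1 and var/dist = e, both C and R(D) equal 0.5 nats and the ratio is one source symbol per channel use.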
In other words, the proposed NN-JSCC scheme in Definition 1 consists of a concatenation of a source code and a channel code (cf. [9, Definition 8]). Specifically, the encoder can be regarded as the concatenation of a source encoder and a channel encoder. The source encoder selects the type index according to the source power type class and then selects the sub-index based on the modified minimum distance encoding rule. The channel encoder maps the output of the source encoder to the channel codeword with that pair of indices. The decoder can be regarded as the concatenation of a channel decoder, which adopts the modified nearest neighbor decoding rule to produce a pair of index estimates, and a source decoder, which declares the source reproduction sequence to be the source codeword with this pair of indices.
II-C Motivation for and Remarks on the System Model
Our motivation for considering the NN-JSCC architecture is, in part, to generalize and unify Lapidoth’s works [2, 3] and, in part, to obtain the best second-order coding rates for the JSCC problem. Similarly to [2, 3], ours is a mismatched coding scheme since neither the encoder nor the decoder is designed to be optimal with respect to the source and channel; rather, its design does not depend on the source and channel statistics. Hence, unless the source and channel are Gaussian, there is mismatch in the problem. Our NN-JSCC scheme is a UEP-inspired extension of the mismatched coding schemes in the rate-distortion and channel coding saddle-point problems to the JSCC setting. In fact, if one chooses the parameters so that there is only one type class (so that all admissible source sequences lie in a single class), our NN-JSCC scheme degenerates to a separate source-channel coding scheme. For this extreme case, choosing the parameters appropriately and combining the results in [2, 3], one concludes that the bandwidth expansion ratio (ratio of source symbols to channel uses) is achievable when the source codebook is a spherical codebook and the channel codebook is either a spherical or an i.i.d. Gaussian codebook. However, this naïve choice results in strictly suboptimal second-order and moderate deviations constants. For our second-order and moderate deviations results, we exploit the UEP framework of the coding scheme in Fig. 1 and choose the quantization range and the set of admissible types in a more refined fashion.
The complexity (and hence the practicality or impracticality) of our NN-JSCC coding scheme is almost the same as that of the schemes in [2, 3]. To wit, we note that both NN encoding and decoding require exponential-time searches over the source and channel codewords. Our scheme incurs an additional search for the index of the power type class that the source sequence lies in; see point (ii) of Definition 1. We design the quantization parameters such that the number of type classes is polynomial in the blocklength; the complexity of this search is thus negligible compared to the aforementioned exponential-time searches. Thus, the “practicality” of the proposed scheme is not too dissimilar to that of [2, 3].
Despite the fact that the coding scheme is relatively simple and its complexity is almost equal to that in Lapidoth’s works [2, 3], it remains robust in the sense that the bandwidth expansion ratio (which is optimal for the Gaussian version of the problem) is attained. However, this ratio is not necessarily optimal for the given arbitrary source and arbitrary additive channel. Nonetheless, the second-order terms can be shown to be ensemble-tight.
Based on the coding scheme in Definition 1, we see that the (ensemble) excess-distortion probability is
Note that the ensemble excess-distortion probability in (18) is averaged not only over the source and noise distributions, but also over the source and channel codebooks. This is similar to [2, 3] which allows us to obtain ensemble-tight results in the spirit of [11, 4, 5].
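To make the notion of an ensemble excess-distortion probability concrete, the following toy Monte Carlo sketch averages over fresh spherical codebooks, the source and the noise in each trial. It collapses the scheme to a single type class and omits the modified (regularized) rules; all parameter names are ours.

```python
import math
import random

def mc_excess_distortion(k, n, var_s, var_z, P, D, M, trials, seed=0):
    """Monte Carlo estimate of the ensemble excess-distortion probability
    for a toy single-type NN scheme: one spherical source codebook of M
    reproduction words and one spherical channel codebook of M codewords,
    both redrawn in every trial (this is the 'ensemble' average)."""
    rng = random.Random(seed)

    def sphere(dim, pw):
        g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        r = math.sqrt(dim * pw) / math.sqrt(sum(v * v for v in g))
        return [r * v for v in g]

    def nn(target, book):
        return min(range(len(book)),
                   key=lambda j: sum((a - b) ** 2
                                     for a, b in zip(target, book[j])))

    errors = 0
    for _ in range(trials):
        src_book = [sphere(k, var_s) for _ in range(M)]  # reproduction words
        ch_book = [sphere(n, P) for _ in range(M)]       # channel codewords
        x = [rng.gauss(0.0, math.sqrt(var_s)) for _ in range(k)]
        j = nn(x, src_book)                              # min-distance encoding
        y = [w + rng.gauss(0.0, math.sqrt(var_z)) for w in ch_book[j]]
        jhat = nn(y, ch_book)                            # NN decoding
        d = sum((a - b) ** 2 for a, b in zip(x, src_book[jhat])) / k
        if d > D:
            errors += 1
    return errors / trials
```

With a handful of codewords at blocklength 8, this is far from the asymptotic regime; the sketch only illustrates which quantities are being averaged over.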
For subsequent analyses, let k*_{s,s}(n, D, ε, P) be the maximal number of source symbols that can be transmitted over the additive noise channel in n channel uses so that the ensemble excess-distortion probability with respect to distortion level D is no larger than ε when a spherical codebook is used as both the source and channel codebook. In a similar manner, we can define k*_{s,i}(n, D, ε, P), k*_{i,s}(n, D, ε, P) and k*_{i,i}(n, D, ε, P).¹

¹Throughout the paper, when we use double subscripts consisting of elements of the set {s, i}, the first subscript denotes the nature of the source codebook (spherical or i.i.d.) and the second denotes the nature of the channel codebook.
Fix any ε ∈ (0, 1). The spherical-spherical second-order coding rate is defined as
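The displayed definition here did not survive extraction. Writing k*_{s,s}(n, D, ε, P) for the spherical-spherical maximal number of transmissible source symbols (notation ours), a form consistent with the surrounding development would be, as an assumption:

```latex
L_{\mathrm{s,s}}(\varepsilon) := \limsup_{n \to \infty} \frac{1}{\sqrt{n}}
  \left( k^{*}_{\mathrm{s,s}}(n, D, \varepsilon, P) - n \, \frac{C(P)}{R(D)} \right)
```

This measures the second-order backoff from the first-order limit n·C(P)/R(D).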
Similarly, we can define the spherical-i.i.d., i.i.d.-spherical and i.i.d.-i.i.d. second-order coding rates.
A sequence is said to be a moderate deviations sequence² if

²Our definition of a moderate deviations sequence in (20) differs from the standard one in, for example, [12, 13], in which one of the conditions is replaced by a less stringent one. We require the additional condition for technical reasons, but it is not restrictive, as all sequences of the appropriate polynomial form are, by definition, moderate deviations sequences.
Let the length of the source sequence be
The spherical-spherical moderate deviations constant is defined as
Similarly, we can define the spherical-i.i.d., i.i.d.-spherical and i.i.d.-i.i.d. moderate deviations constants.
III Main Results and Discussions
III-A Preliminaries
In this subsection, we present some preliminary definitions to be used in presenting our main results.
For any source sequence, note by spherical symmetry that the non-excess-distortion probability under a random spherical source codeword depends on the source sequence only through its norm. Thus, for any source sequence whose power lies in a given type class, we define
When a Gaussian codebook is used as the random source codebook, for each type index, we choose
We remark that this choice is universal for every type index because it depends only on the quantization level (see (5)), which is fixed a priori, and on the type of source codebook (see (10) and (12)); it does not depend on the source codebook realization.
To simplify the presentation of our main results, recalling the definition in (16), for any admissible parameters, define the joint source-channel mismatched dispersion functions as
III-B Second-Order Asymptotics
Let the quantization range be
Theorem 1. For any admissible parameters and any ε ∈ (0, 1), we have
First, given a channel codebook, regardless of the choice of the source codebook, the second-order coding rate remains the same. This is consistent with the result in [5], where the present authors showed that the dispersion for the rate-distortion problem using Gaussian codebooks and minimum Euclidean distance encoding remains the same regardless of the particular choice (spherical or i.i.d.) of the Gaussian codebook. Furthermore, given a source codebook, the second-order coding rates are different and depend on the choice of the channel codebook. This is consistent with the result in [4], where Scarlett, Tan and Durisi showed that the dispersion for nearest neighbor decoding over additive non-Gaussian noise channels depends on the particular choice of the channel codebook (spherical or i.i.d.). In particular, the authors of [4] showed that
Second, when we particularize our result to transmitting a GMS over an AWGN channel, the mismatched quantities reduce to their matched (Gaussian) counterparts. Hence, we recover the achievability part of [9, Theorem 19], where Kostina and Verdú provided the optimal second-order coding rate for transmitting a GMS over an AWGN channel using spherical source and channel codebooks. Our result in (31) shows that the same second-order coding rate can also be achieved when the source codebook is an i.i.d. Gaussian codebook.
Finally, when we use a separate source-channel coding scheme obtained by combining the models in [2, 3] with the results in [4, Theorem 1] and [5, Theorem 1], we obtain that the resulting second-order coding rate for any ε is bounded above as
III-C Moderate Deviations
Before presenting our results, we need the following assumptions on the source and channel parameters.
The cumulant generating functions of the relevant random variables are all finite in a neighborhood of the origin, where the auxiliary random variable is Gaussian with zero mean and unit variance and is independent of all other random variables.
First, similarly to the second-order asymptotics in Theorem 1, we observe that the dispersion plays an important role in the sub-exponential decay of the ensemble excess-distortion probability. Furthermore, the moderate deviations performance only depends on the choice of the channel codebook.
IV Proof of Second-Order Asymptotics (Theorem 1)
To establish Theorem 1, we need to prove the results for four combinations of source and channel codebooks where each codebook can either be a spherical or an i.i.d. Gaussian codebook. In Section IV-A, we present preliminary results. In Sections IV-B and IV-C, we present the achievability and converse proofs of Theorem 1 respectively.
IV-A Preliminaries
In this subsection, we present some preliminary results for subsequent analyses.
IV-A1 Analysis of Excess-Distortion Events
Recall our NN-JSCC scheme in Definition 1 and Figure 1. Given any source sequence, the encoder declares an error if its power type is not admissible, and otherwise maps it to the channel codeword whose index pair corresponds to the source codeword minimizing the Euclidean distance to the source sequence over all codewords in the sub-codebook of that type. Given the channel output, the channel decoder uses the modified nearest neighbor decoding rule (see (9)) to find the index pair estimate and declares the corresponding source codeword as the reproduced source sequence.
For our NN-JSCC scheme, an excess-distortion event occurs if and only if either the power type of the source sequence is not admissible, or the source sequence has some admissible type and one of the following events occurs:
The message pair is transmitted correctly and the distortion is greater than the distortion level, i.e.,
The first (type) index is transmitted incorrectly and the distortion is greater than the distortion level, i.e.,
The first index is transmitted correctly, the second index is transmitted incorrectly and the distortion is greater than the distortion level, i.e.,
In subsequent analyses for the achievability parts, we upper bound the ensemble excess-distortion probability as follows:
where (42) follows by (i) using the union bound, (ii) ignoring the requirement that the message pair is transmitted correctly in the first event, (iii) ignoring the excess-distortion requirement in the second and third events, and (iv) noting that
Note that in the sum in (42), the first two probabilities are with respect to the joint distribution of the source sequence and the source codebook, while the last probability is with respect to the distributions of the channel codebook and the noise.
IV-A2 Analysis of the Output of the Channel Decoder
First, we clarify the relationships among the random variables involved in our joint source-channel coding ensemble (see Definition 1). In particular, we specify the dependence of the channel output on other random variables such as the source sequence and the source codebook. The results in this subsection hold regardless of the choices of the source and channel codebooks.
For simplicity, let
and let , and be the corresponding realizations. Furthermore, for any and for , let
From (53), we conclude that the output of the channel decoder depends on the source sequence and the source codebook only through the type of the source sequence and the corresponding sub-codebook, i.e., for any admissible type and any source sequence of that type,
where the sub-codebook in question is the one indexed by the type of the source sequence. Note that the probability in (54) is with respect to the channel codebook.
Given any , the mismatched information density (see [4, Eqns. (28)-(29)]) is defined as
For any and any , let
For simplicity, given and for any , we let
In the following lemma, we present bounds on the error probability of the channel decoder conditioned on a source sequence (within a type class) and a subcodebook realization.
For any and any , given any and any subcodebook , we have
The proof of Lemma 3, inspired by and similar to that of Scarlett, Tan and Durisi [4], is available in Appendix -D. Note that Lemma 3 holds regardless of the choice of the channel codebook. We remark that the upper bound in (60) is an extension of the RCU bound in [14, Theorem 16] to the unequal message protection setting, and the lower bound in (61) serves as a counterpart of the RCU bound in the other direction.