Gallager Bound for MIMO Channels: Large-N Asymptotics

11/23/2017 · by Apostolos Karadimitrakis et al.

The use of multiple antenna arrays in transmission and reception has become an integral part of modern wireless communications. To quantify the performance of such systems, it is important to evaluate bounds on the error probability of realistic, finite-length codewords. In this paper, we analyze the standard Gallager error bound under both a maximum average power and a maximum instantaneous power constraint. Applying techniques from random matrix theory, we obtain analytic expressions for the error exponent when the length of the codeword increases to infinity at a fixed ratio with the antenna array dimensions. Analyzing its behavior at rates close to the ergodic rate, we find that the Gallager error bound becomes asymptotically close to an upper error bound obtained recently by Hoydis et al. (2015). We also obtain an expression for the Gallager exponent when the codeword spans several Rayleigh fading blocks, thereby accounting for channel variation during each transmission.




I Introduction

In recent years, wireless communications have experienced unprecedented growth in the number of users, data rates and throughput volumes. To meet this demand, MIMO techniques, at both the link and network level, promise to dramatically increase the information throughput. For fading channels, the standard metric to characterize the performance of the link is the outage capacity [1], which corresponds to the throughput at a fixed outage probability. However, the outage capacity corresponds to infinitely long codewords. To deal with the realistic case of finite-length codewords, Gallager [2] proposed a simple yet effective bound on the probability of error as a function of rate and codeword length. In its original version, as well as in more recent variations [3], this bound focused on single-antenna links. There have been a number of extensions of the Gallager bound. For example, in [4] Gallager's random coding error exponent was derived for MIMO Rayleigh block-fading channels; however, the expressions, while valid for all antenna sizes, are quite cumbersome to compute and analyze for any reasonably sized antenna array. In [5, 6] expressions for Gallager's exponent were derived for space-time-block-coding (STBC) MIMO channels with non-Rayleigh fading models. However, STBC reception effectively corresponds to a single-antenna link with increased diversity.

More recently, optimal bounds on the error probability for large but finite codewords have been established for single-link communications [7, 8]. These results are of a central-limit-theoretic nature, in that they are valid for large blocklengths with the rate converging to the ergodic rate at a fixed error probability. Similar results were obtained for MIMO systems in [9], where the number of antennas also goes to infinity at a fixed ratio with the blocklength. In contrast to the Gallager bound, this approach does not capture the tails of the error probability, i.e., when the rate deviation per antenna from the ergodic rate is finite.

In this paper we apply random matrix theory to evaluate the error probability exponent of the Gallager bound when the blocklength, the numbers of transmitting and receiving antennas, and the rate all become large at fixed ratios. Our large deviation result is valid for all normalized rates. When we evaluate the error exponent for small rate deviations, our results match the upper bound obtained in [9]. While the asymptotic limit of large antenna numbers is somewhat idealized, it is known from other works, e.g. [10], that even for moderate antenna numbers the asymptotic results become quite accurate. In addition, we explore the impact of fading in the channel by allowing the channel to take independent realizations within a codeword. Our approach, which maps the random matrix problem to a gas of Coulomb charges on a line, was first introduced in the context of statistical physics by Dyson [11] and was recently applied in [10] and [12] in the context of information theory and communications.

I-A Outline and Notations

In the next section we formulate the problem and present the main results. In Section III we discuss our findings in representative limits, while in Section IV we conclude.

We use upper case letters in bold font to denote matrices, with entries given accordingly. The superscript denotes the Hermitian transpose operation and represents the -dimensional identity matrix.

II Problem Formulation and Results

II-A Channel Model and Capacity

Let us consider a MIMO link with transmit and receive antennas and analyze the transmission of symbols. We assume a block fading channel, which remains constant over symbols and changes independently after each such coherence time [13]. Hence the coherence time is a parameter determined by the bandwidth of the system and the fading statistics of the channel. Therefore, the memoryless channel reads


for , where is the received signal matrix during the th block, is the channel matrix, whose entries are independent and identically distributed (i.i.d.), is the transmitted signal matrix and is the noise matrix with i.i.d. entries. For notational convenience we will denote , , etc. The transmitter has only statistical knowledge of the channel, while the receiver knows it perfectly, e.g., by using a pilot signal. The mutual information per channel use over the th block for a Gaussian input with i.i.d. entries is given by


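As a concrete finite-size illustration of this model, the ergodic mutual information can be estimated by Monte Carlo. The sketch below is a hedged Python example assuming the common normalization I(H) = log det(I + (snr/nt) H H^H) in nats, with i.i.d. CN(0,1) channel entries; the variable and function names are illustrative, not the paper's notation:

```python
import numpy as np

rng = np.random.default_rng(0)

def mimo_mutual_information(nt, nr, snr, rng):
    """One draw of I(H) = log det(I + (snr/nt) H H^H) in nats,
    with H having i.i.d. CN(0, 1) entries (assumed normalization)."""
    h = (rng.standard_normal((nr, nt))
         + 1j * rng.standard_normal((nr, nt))) / np.sqrt(2)
    # slogdet is numerically safer than log(det(...))
    _, logdet = np.linalg.slogdet(np.eye(nr) + (snr / nt) * (h @ h.conj().T))
    return logdet

# Monte Carlo estimate of the ergodic mutual information for a 4x4 link
samples = [mimo_mutual_information(4, 4, 10.0, rng) for _ in range(2000)]
print(np.mean(samples))
```

Averaging many such draws approximates the ergodic average discussed below; a single draw fluctuates around it by an O(1) amount, which is exactly the fluctuation scale studied later in the paper.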
The joint distribution of eigenvalues of



where is the normalization constant, is a weight function, which depends on the statistics of the channel matrix, and the exponent is an energy functional of the eigenvalues that will become useful later. For the case of complex Gaussian channels, which is the focus of this paper, the form of the weight function is . There are a number of other random matrix models for which the joint distribution of eigenvalues takes the same form with different realizations of the weight function, on which we will comment briefly later in the paper. The value of the mutual information per antenna converges weakly to a deterministic value in the large-antenna limit, given by the ergodic average of the mutual information [14] (Eqs. 105-106),




where . The empirical eigenvalue density of converges weakly to the well-known Marčenko-Pastur distribution [15] (Equation 1.12)


where are the endpoints of its support.
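The Marčenko-Pastur law admits a simple numerical check. The sketch below is a hedged Python example assuming the standard normalization in which the eigenvalues of (1/n) H H^H, with H of size p×n and aspect ratio c = p/n ≤ 1, follow the density with support endpoints (1 ∓ √c)²; it compares the empirical spectrum with the limiting density:

```python
import numpy as np

rng = np.random.default_rng(1)

def mp_density(x, c):
    """Marchenko-Pastur density for eigenvalues of (1/n) H H^H,
    H of size (p, n), c = p/n <= 1 (assumed normalization)."""
    a, b = (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    m = (x > a) & (x < b)
    out[m] = np.sqrt((b - x[m]) * (x[m] - a)) / (2 * np.pi * c * x[m])
    return out

p, n = 250, 500                      # aspect ratio c = 1/2
c = p / n
h = (rng.standard_normal((p, n)) + 1j * rng.standard_normal((p, n))) / np.sqrt(2)
eigs = np.linalg.eigvalsh(h @ h.conj().T / n)

grid = np.linspace(0.0, 4.0, 4001)
mass = mp_density(grid, c).sum() * (grid[1] - grid[0])  # total mass, should be ~1
print(eigs.mean(), mass)  # the empirical mean eigenvalue is ~1 as well
```

The square-root vanishing of the density at both endpoints is the same edge behavior noted for the constrained density in Remark 1 below.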

In the infinite codelength limit, the effect of the channel fading is captured through the optimal outage error probability [1] over the channel matrix. The exponent of the outage probability was analyzed in [10] when the number of antennas becomes large. There it was shown that, with the rate per antenna held fixed, the outage probability behaves as


where close to behaves as




The above quantity is the dispersion of the mutual information distribution in the infinite codelength limit and will hereafter be called the infinite codelength dispersion, in accordance with the names used for similar quantities in [8, 16, 9].

Ii-B Gallager Exponent for Power-Constrained Input Alphabets

On the other hand, for finite codelength

, one can estimate the error probability by using the so-called Gallager bound. Specifically, the error probability of transmission at a code rate of

for a given instantiation of , of a discrete memoryless channel without feedback and maximum likelihood (ML) decoding is bounded by (see Eq. (7.3.20) in [2])


where , is the distribution of the noise , while is the distribution of constrained to inputs such that only codewords with


are used. This constraint can be enforced as an inequality by following (Eq. (7.3.17)) in[2] to observe that


for any , where is the unconstrained input distribution assumed henceforth to be Gaussian and a normalization constant. Integrating over we obtain


after omitting the normalization term , which can be shown to be subleading in [2]. After averaging over and optimizing over the values of , we find that , the average error rate after jointly decoding the total message sent over blocks is bounded by




In the above, and is the per-antenna rate and is the SNR. We then define the Gallager exponent as


It should be stressed that while in single link transmission schemes the exponent of the probability of error scales with the blocklength , in MIMO systems it should be proportional to , which is the number of symbols transmitted. To be able to compare with the infinite codelength error exponent defined in the previous section, we have chosen to re-scale the error exponent in the same way (i.e. with ), adding a factor of in (II-B). We then take the limit , while at the same time keeping the ratios and fixed. The analytic evaluation of the error exponent in this limit is the main result of this paper and is summarized by the following theorem.

Theorem 1.

The limit of the error exponent exists and can be expressed as




and . The values of the parameters , and , as functions of , are the unique solutions of the following equations:


Having determined these parameters as functions of , is determined from as follows. Defining the function as


where the function the following known integral [17]


and setting we have


where indicates the inverse function of .

The proof of Theorem 1 can be found in Appendix A. 111 and are the endpoints of the support of and should not be confused with and .

Remark 1.

defined in (1) and appearing in (1) and (22) can be interpreted as a density of eigenvalues and exhibits a square root singularity at the limits of its support, just as the Marčenko – Pastur density [18]. From physical point of view, corresponds to the equilibrium charge density in the Coulomb gas picture, when the energy function is given by . From a practical point of view, it corresponds to the empirical distribution of observed eigenvalues of the realized channel matrices, which balance the occurrence probability of such channel matrices with the corresponding coding error probability, when operating at a given normalized rate , and .

Remark 2.

Setting in (12) corresponds to an unconstrained Gaussian input distribution. Hence, the corresponding solution of (1) will be the Gallager exponent for unconstrained Gaussian inputs, which is expected to be smaller.

Remark 3.

From the equations of the above theorem we immediately see that the -dependence of has the following form: , where we explicitly included the dependence of on and . This allows us to make all calculations for and in the end to re-scale and accordingly.

Corollary 1.

For the above expression for the error exponent can be calculated in closed form to read


where .

Corollary 2.

In the special case the lower limit of the support of becomes zero, i.e. . In this case (19) (which results from the continuity condition ) does not hold. However, we can obtain by setting , in equations (20), (22), (1). Then reads


Iii Analysis

Iii-a Dependence of on

In Fig. 1, we plot the Gallager error exponent for various values of . We see that increasing brings the error curve closer to the error exponent of the infinite codelength outage probability introduced in [10]. This convergence can be seen directly in (1)-(22). As , , so that and the solution converges to that of [10].

It is important to point out here that the assumption that the receiver knows the channel matrix necessitates the existence of some training overhead, which becomes significant when the number of channel uses becomes comparable to the number of transmit antennas . We do not take into account this issue here, assuming instead that the training takes place through some parallel channel. However, an effective way to incorporate training is to replace by , since it takes roughly channel uses to train the transmit antenna channels.

Fig. 1: The Gallager error exponent . As is increased, the curves for approach the outage probability exponent [10] (dashed). The small circles indicate the points where . For we also depict Gallager exponent for the average power constraint () and the Sphere Packing Bound error exponent (dot-dashed). Parameter values used are: , , .

Iii-B and Comparison with Sphere Packing Bound

The circles in Fig. 1 correspond to the values , below which the Gallager error exponent becomes linear in . This behavior is due to the fact that the value of the error exponent in (1) is the result of the maximization with respect to the parameter over the unit interval . For the maximum lies outaside this interval and hence remains fixed to unity. Hence the error exponent in (1) becomes linear in . Extending the -maximization interval to provides the so-called sphere-packing error exponent [2]. In Fig. 1 we include the sphere-packing exponent in the case of (dash-dot) for comparison. As expected, for rates above the value of indicated by a circle, the error exponent coincides with the Gallager random coding exponent, while for (corresponding to solutions with ) the sphere-packing exponent is higher.

Iii-C Region and Comparison with [9]

The region close to is interesting because the error exponent vanishes and hence the error probability is maximal. It is easy to see that , where is the solution of the equation in (22) for . From (22), we see that when , then . This implies that is a global minimum, since, taking advantage of the convexity of the supremum operation with respect to and , it can be shown that is a convex function of [12]. Therefore, close to , we can write


where . Let us define through


The left-hand-side of the above equation is easy to evaluate since . Hence, by differentiating and expressing its value at , we obtain


In the above, can be expressed as


where is the infinite codelength dispersion given in (9) and has the simple form


where is given by


where is the Marcenko-Pastur distribution given in (6) and its endpoints. It is worth pointing out that the last term in (32) is the correction due to the peak-power codeword constraint (11). We see that the Gallager error exponent , which is valid for all rates

takes a quadratic form akin to the exponent of a normal distribution for rates close to

. This is analogous to the case of infinite codelengths discussed in Section II-A. (30) is valid when , in order for the error exponent to be small. However, it is also implicitly assumed that , so that the term in the error probability exponent (see (16)) is the dominant one. Hence this is exactly the moderate deviations regime discussed for general single link systems in [16]. An important point that can be drawn from the form of (33) is that it depends only on the empirical distribution of eigenvalues, which in this case happens to be the Marcenko-Pastur distribution. Therefore, can be calculated for other channel models for which and are known.

In [9], the authors obtained bounds on the optimum average probability of error for MIMO systems when the normalized rate of the code approaches the ergodic rate such that in the limit that

become large with fixed ratios. In this limit, they show that the error probability is bounded between two Gaussian distributions with variances (or dispersions) given (in their notation) by

, which can be expressed as


and , respectively. Therefore, the Gallager random coding exponent with Gaussian input saturates the upper bound in the dispersion derived by [9].

Fig. 2: The dispersion at the Gaussian limit () using the asymptotic method and the method of induced ergodicity of [9]. The and curves are identical; , , .

Iii-D Impact of Fading

The case models the realistic situation where the channel varies during the transmission of the codeword. Specifically, the channel matrix changes ( times) during the codelength . It is assumed here that the receiver knows each channel realization, either using an additional pilot signal or by using part of the codeword as pilot (in which case will represent the data-transmitting part of the codeword). In Fig. 3 we can see the behavior of the error exponent for increasing values of . As , the number of independent fading blocks within a codeword increases, the error exponent also increases, signifying lower error probabilities. To understand the behavior for large , we prove the following result.

Theorem 2 ( Limit of ).

For fixed , the maximum over in the above equation is attained at the value


Defining the function


and setting we have


where indicates the inverse function of .

From the above theorem we conclude that for fast-fading, and therefore large values of it is the ergodic rate that determines the behavior of the error exponent. When we can once again expand in powers of to obtain


where is given in (32).

Fig. 3: The error exponent for the Gallager bound with power constraint at the limit of ; , , .The small circles indicate the points of behavior change ().

Iv Conclusion

In this paper we have applied random matrix theory to calculate an analytic expression of the Gallager bound for finite codelength for block fading channels with independent fading blocks within a codeword. This method is valid for arbitrary normalized rates , in the large limit. As expected, the error exponent increases with , resulting to a lower error probability. The limit is characterized by . Furthermore, when the normalized rate becomes close to the Gallager exponent becomes asymptotically equal to an upper bound of the fixed optimal error analysis derived recently in [9]. The methodology we have used can be generalized to other MIMO channels for which the joint eigenvalue is known. For example, we have recently applied it to obtain the Gallager exponent for fiber-optical MIMO channels [19]. Other cases for which the joint eigenvalue distribution of the effective channel is known and hence this methodology can be directly applied by using the appropriate weight function in (II-A), include the uplink MU-MIMO channel [20] and the Amplify-and-Forward channel [21]. It should be noted that more general Gaussian channels, which do not have a known joint eigenvalue distribution can be analyzed in similar ways using the replica method [14].

Appendix A Proof of Theorem 1

For we study the limit


where , are the eigenvalues of and is defined on , where is the space of probability measures on as


where the argument of is defined as . It is therefore important to show a number of properties of and . First, when , the function vanishes, i.e. , so that . Second, is continuous in , for which Berge’s Maximum theorem [22] can be invoked. Third, is convex in , which can be shown directly from its definition. Fourth, is quasi-concave in , . To show this we start by noting that, excluding the term in (41), is concave in both , . Hence, since is quasi-concave, so is . Therefore for all for which the integral , has a global maximum in , .

A-a Varadhan’s Lemma

We now wish to invoke Varadhan’s Lemma. To do so, we first provide the following definitions:

Definition 1.

[23] A rate function is a lower semicontinuous mapping , for which all level sets are closed. If, in addition, the level sets are compact, then is called a good rate function.

Definition 2.

[24, 23] The probability law satisfies the large deviation principle in the scale with rate function if, for all subsets of


where and are the interior and closure of , respectively.

We now note that is continuous. In addition, since for every , then for any


Furthermore, in [25] it was shown that , the probability law of the satisfies a large deviation principle with good rate function given by


As a result, Varadhan’s Lemma can be applied to (40) to show that the limit exists and is equal to


Furthermore, it is possible to show that is a convex function of . This follows directly from [24, 10] by observing the quadratic dependence of in . Therefore, since is convex in its infimum has a unique solution. Taking into account the definition of and its concave-convex properties discussed above, we may apply Sion’s theorem [26] to exchange the order in which the are applied. Therefore, in (45) can be expressed as


A-B Explicit Solution of optimum and Evaluation of

To solve the above optimization problem (46), we introduce the Lagrangian functions


Since is convex in and concave in , and , the saddle point is unique [27] and we obtain


Taking advantage of the convexity in , in order to find the infimum of we will take the functional derivative with respect to , which is defined as


where and is a test function. This can be re-written as




At the minimum, (51) must vanish identically for all , thus and it follows that


Next, we differentiate (52) with respect to to obtain


where PV denotes the principle value. Following [28], the solution of the last equation is given by