In recent years wireless communications have experienced unprecedented growth in the number of users, data rates and throughput volumes. To meet this demand, MIMO techniques, at both the link and network level, promise to dramatically increase the information throughput. For fading channels, the standard metric to characterize the performance of the link is the outage capacity, which corresponds to the throughput for a fixed outage probability. However, the outage capacity corresponds to infinitely long codewords. To deal with the realistic case of finite-length codewords, Gallager proposed a simple yet effective bound on the probability of error as a function of rate and codeword length. In its original version, as well as in more recent variations, this bound focused on single-antenna links. There have been a number of extensions of the Gallager bound. For example, Gallager’s random coding error exponent has been derived for MIMO Rayleigh block-fading channels; however, the expressions, while valid for all antenna sizes, are quite cumbersome to compute and analyze for any reasonably sized antenna array. In [5, 6] expressions for Gallager’s exponent were derived for space-time-block-coding (STBC) MIMO channels for non-Rayleigh fading models. However, STBC reception effectively corresponds to a single-antenna link with increased diversity.
More recently, optimal bounds on the error probability for large but finite codewords have been established for single-link communications [7, 8]. These results are of a central-limit-theoretic nature, in that they are valid for large blocklengths with the rate converging to the ergodic rate at a fixed error probability. Similar results were obtained for MIMO systems in , where the number of antennas also goes to infinity at a fixed ratio with . In contrast to the Gallager bound, this approach does not capture the tails of the error probability, i.e. when the rate deviation per antenna from the ergodic rate is finite.
In this paper we apply random matrix theory to evaluate the error probability exponent of the Gallager bound when the blocklength , the number of transmitting, and receiving antennas, and the rate all become large but at fixed ratios , , . Our large deviation result is valid for all normalized rates . When we evaluate the error exponent for small , our results match with the upper bound obtained by . While the asymptotic limit of large antenna numbers is somewhat idealized, it is known from other works, e.g.  that even for moderate antenna numbers the asymptotic results become quite accurate. In addition, we explore the impact of fading in the channel by allowing the channel to take independent realizations within a codeword of length . Our approach, which maps the random matrix problem to a gas of Coulomb charges on a line, was first introduced in the context of statistical physics by Dyson  and recently in  and  in the context of information theory and communications.
I-A Outline and Notations
In the next section we formulate the problem and present the main results. In Section III we discuss our findings in representative limits, while in Section IV we conclude.
We use upper case letters in bold font to denote matrices, , with entries given by . The superscript denotes the Hermitian transpose operation and represents the -dimensional identity matrix.
II Problem Formulation and Results
II-A Channel Model and Capacity
Let us consider a MIMO link with transmit and receive antennas and analyze the transmission of symbols. We assume a block fading channel, which remains constant over symbols and changes independently after each such coherence time . Hence is a parameter indicated by the bandwidth of the system and the fading statistics of the channel. Therefore, the memoryless channel reads
for , where is the received signal matrix during the th block, is the channel matrix, whose entries are independent and identically distributed (i.i.d.) , is the transmitted signal matrix and is the noise matrix with entries i.i.d. following . For notational convenience we will denote , , etc. The transmitter has only statistical knowledge of the channel, while the receiver knows it perfectly e.g., using a pilot signal. The mutual information per channel use over the th block for Gaussian input with i.i.d. entries following is given by
where is the normalization constant, is a weight function, which depends on the statistics of and the exponent is an energy functional of the eigenvalues that will become useful later. For the case of complex Gaussian channels, which is the focus of this paper, the form of the weight function is . There are a number of other random matrix models for which the joint distribution of eigenvalues takes the same form with different realizations of which we will briefly comment later in the paper. The value of the mutual information per antenna converges weakly to a deterministic value in the large limit, given by the ergodic average of the mutual information  (Eq. 105-106),
where . The empirical eigenvalue density of converges weakly to the well-known Marčenko-Pastur distribution  (Equation 1.12)
where are the endpoints of its support.
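As a rough numerical check of the two convergence statements above, the following Python sketch samples one channel matrix with i.i.d. unit-variance complex Gaussian entries and compares its eigenvalue statistics with the Marčenko-Pastur law. The normalization (eigenvalues of H H†/n_t, aspect ratio `y = n_r / n_t` at most one, an SNR parameter `snr` multiplying each eigenvalue, rates in nats) and all function names are assumptions made for illustration; they need not coincide with the conventions of this paper.

```python
import numpy as np

def marchenko_pastur_pdf(x, y):
    """Marchenko-Pastur density for aspect ratio 0 < y <= 1 and unit variance."""
    a, b = (1 - np.sqrt(y)) ** 2, (1 + np.sqrt(y)) ** 2
    x = np.asarray(x, dtype=float)
    pdf = np.zeros_like(x)
    inside = (x > a) & (x < b)
    xi = x[inside]
    pdf[inside] = np.sqrt((b - xi) * (xi - a)) / (2 * np.pi * y * xi)
    return pdf

def sample_eigenvalues(n_r, n_t, rng):
    """Eigenvalues of H H^dagger / n_t for an n_r x n_t matrix with CN(0, 1) entries."""
    shape = (n_r, n_t)
    h = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
    return np.linalg.eigvalsh(h @ h.conj().T / n_t)

rng = np.random.default_rng(0)
n_r, n_t, snr = 200, 400, 10.0   # assumed illustrative sizes
y = n_r / n_t
eig = sample_eigenvalues(n_r, n_t, rng)

# Integrate against the limiting density with a simple Riemann sum.
a, b = (1 - np.sqrt(y)) ** 2, (1 + np.sqrt(y)) ** 2
grid = np.linspace(a, b, 200001)
dx = grid[1] - grid[0]
pdf = marchenko_pastur_pdf(grid, y)

mass = np.sum(pdf) * dx                                 # should be close to 1
rate_limit = np.sum(np.log1p(snr * grid) * pdf) * dx    # deterministic limit
rate_empirical = np.mean(np.log1p(snr * eig))           # single realization
print(mass, rate_empirical, rate_limit)
```

Under these assumptions a single realization at moderate size should already give a per-eigenvalue rate close to the deterministic limit, which is the sense in which the asymptotic results are expected to be useful for finite arrays.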
In the infinite codelength limit, the effect of the channel fading is captured through the optimal outage error probability  over the channel matrix , given by (in the case of , ). The exponent of the outage probability was analyzed in  when the number of antennas becomes large. There it was shown that when with fixed, the outage probability behaves as
where close to behaves as
The above quantity is the dispersion of the mutual information distribution in the infinite codelength limit and will be called hereafter infinite codelength dispersion, in accordance with the names used for similar quantities in [8, 16, 9].
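For intuition about the infinite-codelength outage probability, a Monte Carlo estimate can be sketched for a small array. The snippet below takes outage to be the event that the mutual information log det(I + (snr/n_t) H H†) falls below a target rate; this normalization, the array size, and the function names are illustrative assumptions, and the estimate only resolves probabilities well above the inverse number of trials, so it cannot probe the far tails that the large deviation analysis addresses.

```python
import numpy as np

def mutual_information_samples(n_t, n_r, snr, trials, rng):
    """Draw `trials` values of log det(I + (snr / n_t) H H^dagger) in nats."""
    shape = (trials, n_r, n_t)
    h = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
    gram = h @ h.conj().transpose(0, 2, 1)
    _, logdet = np.linalg.slogdet(np.eye(n_r) + (snr / n_t) * gram)
    return logdet

def outage_probability(samples, rate):
    """Empirical probability that the mutual information is below `rate`."""
    return float(np.mean(samples < rate))

rng = np.random.default_rng(1)
samples = mutual_information_samples(n_t=4, n_r=4, snr=10.0, trials=20000, rng=rng)
ergodic = samples.mean()   # empirical ergodic mutual information

for fraction in (0.7, 0.85, 1.0):
    print(fraction, outage_probability(samples, fraction * ergodic))
```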
II-B Gallager Exponent for Power-Constrained Input Alphabets
On the other hand, for finite codelength , one can estimate the error probability by using the so-called Gallager bound. Specifically, the error probability of transmission at a code rate of for a given instantiation of , of a discrete memoryless channel without feedback and maximum likelihood (ML) decoding is bounded by (see Eq. (7.3.20) in )
where , is the distribution of the noise , while is the distribution of constrained to inputs such that only codewords with
are used. This constraint can be enforced as an inequality by following (Eq. (7.3.17)) in to observe that
for any , where is the unconstrained input distribution assumed henceforth to be Gaussian and a normalization constant. Integrating over we obtain
after omitting the normalization term , which can be shown to be subleading in . After averaging over and optimizing over the values of , we find that , the average error rate after jointly decoding the total message sent over blocks is bounded by
In the above, and is the per-antenna rate and is the SNR. We then define the Gallager exponent as
It should be stressed that while in single link transmission schemes the exponent of the probability of error scales with the blocklength , in MIMO systems it should be proportional to , which is the number of symbols transmitted. To be able to compare with the infinite codelength error exponent defined in the previous section, we have chosen to re-scale the error exponent in the same way (i.e. with ), adding a factor of in (II-B). We then take the limit , while at the same time keeping the ratios and fixed. The analytic evaluation of the error exponent in this limit is the main result of this paper and is summarized by the following theorem.
The limit of the error exponent exists and can be expressed as
and . The values of the parameters , and , as functions of , are the unique solutions of the following equations:
Having determined these parameters as functions of , is determined from as follows. Defining the function as
where the function is the following known integral 
and setting we have
where indicates the inverse function of .
defined in (1) and appearing in (1) and (22) can be interpreted as a density of eigenvalues and exhibits a square root singularity at the limits of its support, just as the Marčenko-Pastur density. From a physical point of view, corresponds to the equilibrium charge density in the Coulomb gas picture, when the energy function is given by . From a practical point of view, it corresponds to the empirical distribution of observed eigenvalues of the realized channel matrices, which balance the occurrence probability of such channel matrices with the corresponding coding error probability, when operating at a given normalized rate , and .
From the equations of the above theorem we immediately see that the -dependence of has the following form: , where we explicitly included the dependence of on and . This allows us to make all calculations for and in the end to re-scale and accordingly.
For the above expression for the error exponent can be calculated in closed form to read
III-A Dependence of on
In Fig. 1, we plot the Gallager error exponent for various values of . We see that increasing brings the error curve closer to the error exponent of the infinite codelength outage probability introduced in . This convergence can be seen directly in (1)-(22). As , , so that and the solution converges to that of .
It is important to point out here that the assumption that the receiver knows the channel matrix necessitates the existence of some training overhead, which becomes significant when the number of channel uses becomes comparable to the number of transmit antennas . We do not take into account this issue here, assuming instead that the training takes place through some parallel channel. However, an effective way to incorporate training is to replace by , since it takes roughly channel uses to train the transmit antenna channels.
III-B and Comparison with Sphere Packing Bound
The circles in Fig. 1 correspond to the values , below which the Gallager error exponent becomes linear in . This behavior is due to the fact that the value of the error exponent in (1) is the result of the maximization with respect to the parameter over the unit interval . For the maximum lies outside this interval and hence remains fixed to unity. Hence the error exponent in (1) becomes linear in . Extending the -maximization interval to provides the so-called sphere-packing error exponent . In Fig. 1 we include the sphere-packing exponent in the case of (dash-dot) for comparison. As expected, for rates above the value of indicated by a circle, the error exponent coincides with the Gallager random coding exponent, while for (corresponding to solutions with ) the sphere-packing exponent is higher.
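The mechanism described above (a maximization restricted to the unit interval, a linear branch below a critical rate, and a larger exponent when the interval is extended) can be illustrated on a much simpler toy model. The sketch below is not the MIMO exponent of this paper: it assumes a scalar complex AWGN channel with Gaussian input, uses the textbook Gallager function `e0(rho) = rho * log(1 + snr / (1 + rho))` in nats without the power-constraint refinement, and replaces the exact maximization by a grid search. All names and parameter values are hypothetical.

```python
import numpy as np

def e0(rho, snr):
    """Gallager function of a scalar complex AWGN channel with Gaussian input (nats)."""
    return rho * np.log1p(snr / (1.0 + rho))

def exponent(rate, snr, rho_max, points_per_unit=2000):
    """Grid maximization of e0(rho) - rho * rate over 0 <= rho <= rho_max."""
    rho = np.linspace(0.0, rho_max, int(rho_max * points_per_unit) + 1)
    return float(np.max(e0(rho, snr) - rho * rate))

snr = 10.0
capacity = np.log1p(snr)
# Critical rate: slope of e0 at rho = 1 for this toy model.
r_crit = np.log1p(snr / 2.0) - snr / (2.0 * (2.0 + snr))

for rate in (0.5 * r_crit, r_crit, 0.5 * (r_crit + capacity), capacity):
    random_coding = exponent(rate, snr, rho_max=1.0)     # rho restricted to [0, 1]
    sphere_packing = exponent(rate, snr, rho_max=20.0)   # rho allowed beyond 1
    print(f"R={rate:.3f}  E_r={random_coding:.4f}  E_sp={sphere_packing:.4f}")
```

In this toy model the two exponents agree above the critical rate, while below it the restricted maximum sits at `rho = 1` and the exponent equals `e0(1) - R`, which is linear in the rate. The upper grid limit of 20 is an arbitrary truncation that suffices for the rates printed here.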
III-C Region and Comparison with
The region close to is interesting because the error exponent vanishes and hence the error probability is maximal. It is easy to see that , where is the solution of the equation in (22) for . From (22), we see that when , then . This implies that is a global minimum, since, taking advantage of the convexity of the supremum operation with respect to and , it can be shown that is a convex function of . Therefore, close to , we can write
where . Let us define through
The left-hand-side of the above equation is easy to evaluate since . Hence, by differentiating and expressing its value at , we obtain
In the above, can be expressed as
where is the infinite codelength dispersion given in (9) and has the simple form
where is given by
where is the Marčenko-Pastur distribution given in (6) and its endpoints. It is worth pointing out that the last term in (32) is the correction due to the peak-power codeword constraint (11). We see that the Gallager error exponent , which is valid for all rates , takes a quadratic form akin to the exponent of a normal distribution for rates close to . This is analogous to the case of infinite codelengths discussed in Section II-A. (30) is valid when , in order for the error exponent to be small. However, it is also implicitly assumed that , so that the term in the error probability exponent (see (16)) is the dominant one. Hence this is exactly the moderate deviations regime discussed for general single link systems in . An important point that can be drawn from the form of (33) is that it depends only on the empirical distribution of eigenvalues, which in this case happens to be the Marčenko-Pastur distribution. Therefore, can be calculated for other channel models for which and are known.
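The quadratic behavior near the vanishing point of the exponent can be checked numerically on a toy model. The sketch below again assumes a scalar complex AWGN channel with Gaussian input and the textbook function `e0(rho) = rho * log(1 + snr / (1 + rho))`, without the power-constraint correction discussed above. For this model the second derivative of `e0` at zero is `-2 * snr / (1 + snr)`, so the quantity `dispersion` in the code is specific to the toy model and is not the expression in (32) or (33).

```python
import numpy as np

def e0(rho, snr):
    """Gallager function of a scalar complex AWGN channel with Gaussian input (nats)."""
    return rho * np.log1p(snr / (1.0 + rho))

snr = 10.0
capacity = np.log1p(snr)
# -e0''(0) for this scalar toy model, playing the role of a dispersion (nats^2).
dispersion = 2.0 * snr / (1.0 + snr)

rho = np.linspace(0.0, 1.0, 200001)   # fine grid, since the optimum is near rho = 0
results = {}
for gap in (0.2, 0.05, 0.01):
    rate = capacity - gap
    exact = float(np.max(e0(rho, snr) - rho * rate))   # grid-maximized exponent
    quadratic = gap ** 2 / (2.0 * dispersion)           # second-order approximation
    results[gap] = (exact, quadratic)
    print(f"C-R={gap:.2f}  exact={exact:.3e}  quadratic={quadratic:.3e}")
```

The agreement is expected to improve as the gap to capacity shrinks, mirroring the statement that the quadratic form holds only for rates close to the point where the exponent vanishes.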
In , the authors obtained bounds on the optimum average probability of error for MIMO systems when the normalized rate of the code approaches the ergodic rate such that in the limit that, which can be expressed as
and , respectively. Therefore, the Gallager random coding exponent with Gaussian input saturates the upper bound in the dispersion derived by .
III-D Impact of Fading
The case models the realistic situation where the channel varies during the transmission of the codeword. Specifically, the channel matrix changes ( times) during the codelength . It is assumed here that the receiver knows each channel realization, either using an additional pilot signal or by using part of the codeword as pilot (in which case will represent the data-transmitting part of the codeword). In Fig. 3 we can see the behavior of the error exponent for increasing values of . As , the number of independent fading blocks within a codeword increases, the error exponent also increases, signifying lower error probabilities. To understand the behavior for large , we prove the following result.
Theorem 2 ( Limit of ).
For fixed , the maximum over in the above equation is attained at the value
Defining the function
and setting we have
where indicates the inverse function of .
From the above theorem we conclude that for fast-fading, and therefore large values of it is the ergodic rate that determines the behavior of the error exponent. When we can once again expand in powers of to obtain
where is given in (32).
In this paper we have applied random matrix theory to calculate an analytic expression of the Gallager bound for finite codelength for block fading channels with independent fading blocks within a codeword. This method is valid for arbitrary normalized rates , in the large limit. As expected, the error exponent increases with , resulting in a lower error probability. The limit is characterized by . Furthermore, when the normalized rate becomes close to the Gallager exponent becomes asymptotically equal to an upper bound of the fixed optimal error analysis derived recently in . The methodology we have used can be generalized to other MIMO channels for which the joint eigenvalue distribution is known. For example, we have recently applied it to obtain the Gallager exponent for fiber-optical MIMO channels . Other cases for which the joint eigenvalue distribution of the effective channel is known, and hence this methodology can be directly applied by using the appropriate weight function in (II-A), include the uplink MU-MIMO channel  and the Amplify-and-Forward channel . It should be noted that more general Gaussian channels, which do not have a known joint eigenvalue distribution, can be analyzed in similar ways using the replica method .
Appendix A Proof of Theorem 1
For we study the limit
where , are the eigenvalues of and is defined on , where is the space of probability measures on as
where the argument of is defined as . It is therefore important to show a number of properties of and . First, when , the function vanishes, i.e. , so that . Second, is continuous in , for which Berge’s Maximum theorem  can be invoked. Third, is convex in , which can be shown directly from its definition. Fourth, is quasi-concave in , . To show this we start by noting that, excluding the term in (41), is concave in both , . Hence, since is quasi-concave, so is . Therefore for all for which the integral , has a global maximum in , .
A-A Varadhan’s Lemma
We now wish to invoke Varadhan’s Lemma. To do so, we first provide the following definitions:
 A rate function is a lower semicontinuous mapping , for which all level sets are closed. If, in addition, the level sets are compact, then is called a good rate function.
We now note that is continuous. In addition, since for every , then for any
Furthermore, in  it was shown that , the probability law of the satisfies a large deviation principle with good rate function given by
As a result, Varadhan’s Lemma can be applied to (40) to show that the limit exists and is equal to
Furthermore, it is possible to show that is a convex function of . This follows directly from [24, 10] by observing the quadratic dependence of in . Therefore, since is convex in its infimum has a unique solution. Taking into account the definition of and its concave-convex properties discussed above, we may apply Sion’s theorem  to exchange the order in which the are applied. Therefore, in (45) can be expressed as
A-B Explicit Solution of optimum and Evaluation of
To solve the above optimization problem (46), we introduce the Lagrangian functions
Since is convex in and concave in , and , the saddle point is unique  and we obtain
Taking advantage of the convexity in , in order to find the infimum of we will take the functional derivative with respect to , which is defined as
where and is a test function. This can be re-written as
At the minimum, (51) must vanish identically for all , thus and it follows that
Next, we differentiate (52) with respect to to obtain
where PV denotes the principal value. Following , the solution of the last equation is given by