Due to the random nature of the wireless channel, it is notoriously difficult to design wireless communication systems for applications that require both very low latency and very high reliability. For example, applications in factory automation often require latencies of just a few milliseconds together with extremely high reliability with respect to this deadline [1, 2], which is difficult to achieve in a wireless channel that is subject to fading and noise. To increase the reliability of the system, one can equip the transmitter with multiple antennas, which is known as multiple-input single-output (MISO). When transmitting to a single receiver only, multiple transmit antennas increase the diversity of the system and reduce the variations of the signal strength in fading channels, leading to more reliable transmissions. When the transmitter has channel state information (CSI), it can use beamforming to send the signal in the direction of the user’s channel, resulting not only in a diversity gain but also in a power gain, which makes the system even more resilient against errors. On the other hand, a transmitter with multiple antennas can also serve several users at the same time, i.e., achieve a multiplexing gain. Serving multiple users at once means that each user can be scheduled more often, which can reduce the delay. A frequently used transmission strategy for the multiuser MISO downlink is zero-forcing beamforming (ZFBF), which ensures that the signal intended for each user does not create interference at the other (unintended) receivers. Nevertheless, increasing the number of simultaneously served users reduces the beamforming gain, i.e., reduces the data rates of the individual transmissions. The trade-off between the multiplexing gain and the beamforming gain has been studied with respect to the ergodic capacity.
However, the ergodic capacity does not accurately reflect the delay performance of the system. When there is a certain probability that the data rate is small, or when transmission errors occur, the transmitter must keep the data in a buffer so that it can be transmitted in subsequent time slots. This buffering causes a random delay. The queueing delay may sometimes grow until the deadline of the application is violated. For applications that require very high reliability, the communication system must be designed such that the probability of a deadline violation is minimized. For example, low beamforming gains should be avoided, as low beamforming gains increase the probability that the individual data rates are small. In , we studied the trade-off between the multiplexing and beamforming gains with respect to the queueing performance.
Fortunately, as the number of antennas grows, the transmitter can schedule a number of users that grows along with it, and thus benefit from a linear increase in both the multiplexing gain and the beamforming gain. As the beamforming gain increases linearly in the number of antennas, the system will experience only small variations (relative to the average) in the achievable transmission rate. This effect is known as channel hardening. In this case, we suspect that only the average of the transmission rate will determine the queueing performance: when the average transmission rate is higher than the incoming data rate at the transmit buffer, long queueing delays should be very unlikely (occur with almost zero probability); otherwise, long queueing delays have probability one.
In this context, it is of critical importance that the reliability of the physical layer transmissions is modeled accurately. Specifically, when the duration of each time slot is short, channel estimation causes a significant overhead, and the transmitter can only acquire an imperfect estimate of the channel state. First of all, this means that zero-forcing beamforming cannot eliminate the interference. Second, the transmitter does not know the actual signal-to-interference-and-noise ratio (SINR) of the channel, and therefore, outages can occur when the actual channel capacity is below the selected transmission rate. The transmitter must then find a careful balance between the outage probability and the rate with respect to the queueing delay. In addition to imperfect CSI, we note that the transmitter cannot achieve error-free transmissions at the channel capacity when the blocklength of the channel code is finite. All of these effects must be taken into account when considering systems for ultra low latency communications.
I-A Related Work
This paper builds on results from several research areas. On the physical layer, we consider the multiuser MIMO downlink with imperfect CSI, as well as finite blocklength coding. On top of these physical-layer aspects, we investigate the queueing delay of the data at the link layer.
I-A1 Multiuser-MIMO and Imperfect CSI
Linear ZFBF precoding in the multiuser MISO downlink has been studied by several authors. Although ZFBF is not capacity-achieving, Yoo and Goldsmith  showed that when the total number of users is much larger than the number of antennas , then ZFBF can achieve the same asymptotic performance as the capacity-achieving scheme based on dirty-paper coding (DPC) . These results hold only if the transmitter has channel state information (CSI) of all users. While the authors investigated in  also the impact of quantized channel state feedback, it would still be impossible to receive feedback from e.g. 100 or more users when the duration of each time slot is short. The random beamforming scheme by Sharif and Hassibi  reduces the overhead from collecting CSI by transmitting a training sequence along a set of randomly created beams. The users then send the index and the signal-to-interference-and-noise ratio (SINR) of the beam with the highest SINR. However, in this scheme, some of the users may not be scheduled for a long period because the scheduling decision is based on the random SINR. Furthermore, even though the overhead from collecting CSI is reduced, collecting feedback from many users may still be infeasible when considering scenarios with very low latency. In fact, Ravindran and Jindal  found that, given a fixed budget for the total overhead, it is better to collect accurate channel estimates from only a small number of users than to collect inaccurate CSI from many users, i.e., they found that accurate CSI is more important than multiuser diversity. When the transmitter has only CSI for the users that are scheduled, Zhang et al.  studied whether the transmitter should send data to a single user or to users. The same authors studied in  the more general case and also considered imperfect CSI. 
This is very close to our work, but the authors studied only the ergodic sum rate and assumed that an additional perfect feedback link provided the exact value of the channel capacity to the transmitter, such that outages did not occur. Similarly, the authors in  studied the ergodic capacity of a multiuser MISO system under imperfect CSI. In the ergodic case, rate adaptation is not necessary, and the performance loss is due to imperfect beamforming and due to additional noise terms at the receiver. In contrast, we want to study the queueing performance of a system where outages occur because the transmitter must adapt the rate to an imperfect estimate of the channel, and where the receiver must decode the signal in the same time slot (as opposed to decoding over an infinite time horizon).
I-A2 Finite Blocklength Coding
Some well-known results on channel coding at finite blocklength were derived by Polyanskiy et al. , who showed that the loss in the achievable data rate due to finite blocklength can be approximated by a simple second-order expression. Yang et al. extended these results to fading channels . These works generally assume non-Gaussian codebooks in Gaussian noise. Therefore, the results cannot be applied when the transmissions create mutual interference. Scarlett et al.  studied the performance of Gaussian codebooks under non-Gaussian interference. However, we are not aware of any results combining finite blocklength coding with multiuser MIMO.
I-A3 Delay Analysis
The queueing delay, which occurs due to transmission errors and low transmission rates, can for example be analyzed through the frameworks of stochastic network calculus [18, 19] or effective capacity . In our previous work , we applied stochastic network calculus to a single-antenna channel with imperfect CSI and finite length coding. With respect to MIMO systems, the effective capacity of the single-user MIMO channel was studied in several works [22, 23, 24]. The multi-user case was studied by Li et al. , but this work does not fit our assumptions because the authors assumed that the channels are non-fading.
In this work, we study the multiuser MISO downlink both under ideal assumptions and under more realistic assumptions with imperfect CSI and finite blocklength coding. Specifically, we make the following contributions:
For the ideal scenario with perfect CSI and long codewords, we use our previous results  to study the effect of channel hardening. We investigate how many antennas are necessary to achieve extremely high reliability with almost zero violations of the deadline.
For the realistic scenario with imperfect CSI, we derive two closed-form approximations, corresponding to lower and upper bounds on the conditional outage probability.
For the realistic scenario with imperfect CSI and finite blocklength of the channel code, we derive a closed-form approximation for the conditional error probability.
We verify by extensive Monte Carlo simulations that the derived expressions are lower or upper bounds on the conditional outage or error probability.
We show that the closed-form expressions can be used to find a rate adaptation function that minimizes the delay violation probability (based on network calculus).
Our numerical analysis shows that imperfect CSI leads to substantial losses in the average transmission rate. Furthermore, we find that the additional loss due to finite blocklength effects is only relevant when the CSI becomes nearly perfect. Despite the massive performance loss compared to the ideal case, a system that shows a strong channel hardening effect in the ideal scenario maintains this behavior qualitatively in the realistic case. In other words, multiuser MISO systems can achieve very high reliability even under non-ideal assumptions.
This paper is structured as follows: In Sec. II, we present the system model for the ideal scenario with perfect CSI and long codewords. In Sec. III, we present a short summary of the delay analysis through network calculus, and perform the delay analysis for the ideal scenario. In Sec. IV, we show how imperfect CSI and finite blocklength effects can be modeled and analyzed. Numerical evaluations are presented in Sec. V, before we finally conclude the paper in Sec. VI.
II System Model
We consider a system where data is sent from a transmitter with antennas to single-antenna users, with . We assume time-slotted transmissions. In each time slot, the transmitter can select only a subset of users, with the number of scheduled users denoted as . As we consider scenarios where the duration of each time slot is short, the overhead from collecting channel state information (CSI) for all users would be overwhelming. Thus, the transmitter cannot select the user set based on the instantaneous CSI. Instead, the transmitter selects in each time slot a set of users and then collects channel state information only for those users, similar to . In this section, we describe a basic model with ideal assumptions based on , i.e., we assume that the transmitter has perfect CSI for the scheduled users and that data can be transmitted at a rate equal to the channel capacity without errors. The more realistic scenario with imperfect CSI and finite blocklength of the channel code will be modeled and analyzed in Sec. IV.
First, we describe in Sec. II-A the data transmission from a physical layer perspective. In Sec. II-B, we discuss user scheduling. In Sec. II-C, the system is described from a queueing perspective. Finally, we present the problem statement in Sec. II-D.
II-A Physical Layer Model
The received signal at the scheduled users is the product of the channel matrix and the transmitted input signal, plus additive noise. For the channel matrix, we assume Rayleigh fading, i.e., all elements are independent and identically distributed (i.i.d.) with complex Gaussian distribution. Furthermore, we consider the quasi-static fading model where the channel remains constant for the duration of one time slot, which consists of a fixed number of channel uses, and changes to an independent realization in the next time slot (note that the set of scheduled users also changes). The input signal must satisfy a short-term power constraint for each realization of the channel. The noise has i.i.d. complex Gaussian components.
The transmitter encodes the data for the scheduled users into code symbols (one symbol per antenna). To obtain the input signal, the transmitter can encode the data of the users individually and apply a precoding strategy to the resulting symbols. We focus in this work on zero-forcing beamforming (ZFBF), which is a linear strategy that completely eliminates the interference of the signals at the other receivers. In this case, the input signal vector is obtained by applying the power allocation matrix and the precoding matrix to the vector of (independently) coded Gaussian symbols for the scheduled users [8, 5]. We require that the sum power is allocated equally to all users. When the transmitter perfectly knows the channel matrix, the ZFBF precoder is given as the Moore-Penrose pseudo-inverse of the channel matrix multiplied by a normalization matrix, which is chosen such that the columns of the precoder have unit 2-norm [8, 5]. The resulting effective channel gains are central chi-square distributed (scaled by a factor), with a number of degrees of freedom that depends on the number of antennas and the number of scheduled users. Their PDF is given in [8, Lemma 4]. These gains, and hence the achievable rates, change along with the channel from time slot to time slot.
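To make the zero-forcing construction concrete, the following NumPy sketch builds the precoder from the pseudo-inverse and checks that the interference vanishes under perfect CSI. The symbols `M` (antennas) and `L` (scheduled users), the function names, and the example sizes are assumptions made for this illustration; they are not taken from the text.

```python
import numpy as np

def zfbf_precoder(H):
    """Return the column-normalized zero-forcing precoder for channel H.

    H has shape (L, M): one row per scheduled user, one column per antenna.
    """
    W = np.linalg.pinv(H)                    # Moore-Penrose pseudo-inverse, shape (M, L)
    return W / np.linalg.norm(W, axis=0)     # scale every column to unit 2-norm

def rayleigh_channel(L, M, rng):
    """Draw an (L, M) matrix with i.i.d. CN(0, 1) entries."""
    return (rng.standard_normal((L, M)) + 1j * rng.standard_normal((L, M))) / np.sqrt(2)

rng = np.random.default_rng(0)
M, L = 8, 3                                  # illustrative sizes (assumed)
H = rayleigh_channel(L, M, rng)
W = zfbf_precoder(H)

G = H @ W                                    # G[j, k]: beam of user k as seen by user j
gains = np.abs(np.diag(G)) ** 2              # effective channel gains of the intended signals
leakage = np.abs(G - np.diag(np.diag(G))).max()
print("effective gains:", gains)
print("largest interference amplitude:", leakage)
```

For a full-rank channel, the product of the channel matrix and the normalized precoder is diagonal, so the printed leakage is at the level of floating-point rounding. The squared diagonal entries are the effective channel gains whose distribution is discussed above.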
II-B User Scheduling
In each time slot, the transmitter can schedule only a subset of the users. To make sure that each user is scheduled regularly, we consider superframes consisting of a fixed number of slots, and we require that each user is scheduled exactly once within a superframe. The average number of scheduled users per slot is then given as the total number of users divided by the superframe length. However, this ratio may not always be an integer. To simplify notation and discussion, our analysis considers only the case where it is an integer, i.e., where the transmitter schedules a constant number of users in each time slot. The non-integer case is discussed in Appendix A.
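The scheduling rule can be sketched in a few lines of Python. The function name, the random assignment of users to slots, and the example numbers are assumptions for illustration; the sketch covers only the integer case.

```python
import random

def superframe_schedule(num_users, superframe_len, rng):
    """Assign every user to exactly one slot of a superframe (integer case only)."""
    if num_users % superframe_len != 0:
        raise ValueError("this sketch covers only an integer number of users per slot")
    users_per_slot = num_users // superframe_len
    order = list(range(num_users))
    rng.shuffle(order)                       # random assignment (an assumption of this sketch)
    return [order[i * users_per_slot:(i + 1) * users_per_slot]
            for i in range(superframe_len)]

schedule = superframe_schedule(num_users=12, superframe_len=4, rng=random.Random(1))
for slot, users in enumerate(schedule):
    print(f"slot {slot}: users {sorted(users)}")
```

Every user appears in exactly one slot, so each user is served once per superframe while the number of users per slot stays constant.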
II-C Link Layer Model
In each time slot, a number of data bits intended for downlink transmission to each user arrives at the transmitter. The data is stored in a transmit buffer, with individual buffers (or queues) for each user. We assume that the arrival process is constant over time and equal for all users, i.e., the same constant number of bits arrives at the queue of each user in each time slot. In the first part of this work, we assume error-free transmissions, and the service offered by the wireless system to a user in a given time slot is equal to the transmission rate when the user is among the scheduled users, and zero when the user is not scheduled. When transmission errors occur with a certain probability, a scheduled user is served with the transmission rate only if the transmission succeeds, and with zero service otherwise. The departure process describes the amount of data that is transmitted to the receiver. Thus, the departures are limited both by the amount of data waiting in the buffer and by the offered service. The cumulative arrival, service, and departure processes are defined as
The delay is random. The reliability of a communication system with respect to the deadline of the application can be described by the probability that the random delay of the data for user exceeds the target delay at any time :
We note here that for the considered system, the delay violation probability cannot be analyzed through closed-form expressions. We thus follow our previous works [21, 5] and use stochastic network calculus [18, 19] to obtain analytical bounds on .
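Although the delay violation probability has no closed form, it can be estimated empirically for small examples. The following sketch simulates the queue of one user slot by slot. The convention that data arriving in a slot may already be served in that slot, as well as all names and numbers, are assumptions of this illustration rather than definitions from the text.

```python
import random
from itertools import accumulate

def simulate_delays(arrivals, services):
    """Return the delay W(t) for every t whose data departs within the trace."""
    backlog = 0.0
    departures = []
    for a, s in zip(arrivals, services):
        d = min(backlog + a, s)              # limited by buffered data and by offered service
        backlog += a - d
        departures.append(d)
    cum_a = [0.0] + list(accumulate(arrivals))      # cum_a[t] = arrivals in slots 0..t-1
    cum_d = [0.0] + list(accumulate(departures))    # cum_d[t] = departures in slots 0..t-1
    delays = []
    u = 0
    for t in range(len(cum_a)):
        u = max(u, t)
        # Smallest u >= t at which all data that arrived before t has departed.
        while u < len(cum_d) and cum_d[u] < cum_a[t] - 1e-12:
            u += 1
        if u == len(cum_d):
            break                            # remaining data never leaves within the trace
        delays.append(u - t)
    return delays

rng = random.Random(0)
num_slots, deadline = 10_000, 5              # illustrative values (assumed)
services = [rng.expovariate(1 / 1.5) for _ in range(num_slots)]   # mean 1.5 units per slot
delays = simulate_delays([1.0] * num_slots, services)
violation = sum(d > deadline for d in delays) / len(delays)
print("empirical delay violation frequency:", violation)
```

Such a simulation cannot resolve the very small violation probabilities targeted here without an impractical number of samples, which is one motivation for analytical bounds.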
II-D Problem Statement
In the first part of this work, we consider a system with perfect CSI and long blocklength of the channel code. In this case, error-free transmissions at the channel capacity can be achieved. In our previous work , we analyzed the optimal number of scheduled users such that the delay violation probability is minimized, i.e., the optimal trade-off between the multiplexing gain and the beamforming gain. In this work, we study the effect of channel hardening: when the number of antennas grows, the data rate of the wireless channel becomes nearly constant. Due to channel hardening, we expect that the system will become very reliable, i.e., that long queueing delays occur with very low (almost zero) probability when the average transmission rate is above the arrival rate. Naturally, when the average transmission rate is below the arrival rate, the queueing delay grows to infinity and the delay violation probability is one. We will investigate how many antennas are necessary to observe such a zero/one behavior in practice.
In the second part of this work, we consider the same question in a more realistic scenario, where the transmitter must first estimate the channel before the transmission starts, and where the blocklength of the channel code is finite. As we will discuss in Sec. IV, imperfect CSI and finite blocklength coding may have a significant impact on the system performance in the realistic case. Most importantly, scheduling a larger number of users will increase the interference and also result in a larger overhead for channel estimation. We therefore expect that the optimal number of scheduled users will decrease. However, it is not clear whether these effects will just lead to a change in the optimal value of and to a quantitative loss in the overall performance, or whether these effects lead to a qualitative change in the system performance. Specifically, we want to find out whether a realistic system maintains the zero/one behavior with respect to the delay distribution, i.e., whether the system still shows extremely high reliability whenever the average transmission rate is above the arrival rate.
III Analysis – Ideal Case
In this section, we follow  and outline the analytical approach to determine the delay violation probability. Specifically, in Sec. III-A, we present a summary of the delay analysis through stochastic network calculus in a transform domain . In Sec. III-B, we show how these results can be used when the users are only scheduled once per superframe. In Sec. III-C, we analytically obtain the stochastic network calculus bounds for the ideal case. We assume that all users are subject to the same channel characteristics and delay requirements, and we drop the user index to shorten the notation.
III-A Stochastic Network Calculus (SNC)
The delay in (8) is defined in terms of the arrival and departure processes. However, the distribution of the delay can be found directly from the statistics of the arrival and service processes. We follow  and describe these processes in the exponential domain, also referred to as the SNR domain. The arrival and service processes in the bit domain are converted to the SNR domain (denoted by calligraphic letters) as
We assume constant arrivals. Consider for now a service process that is independent and identically distributed (i.i.d.) between time slots. Then, an upper bound on the delay violation probability can be obtained in terms of the Mellin transforms of the SNR-domain arrival and service processes. The Mellin transform of a nonnegative random variable is defined as 
Under a stability condition on these transforms, the kernel converges:
For any positive value of the free parameter of the Mellin transforms, the kernel provides an upper bound on the delay violation probability [19, 26]. This holds also in steady state, i.e., in the limit of large time indices. The tightest upper bound can be found by searching over this parameter:
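As a numerical illustration of this optimization, the sketch below evaluates the steady-state kernel in the form commonly used in the stochastic network calculus literature: the Mellin transform of the service at 1 - s, raised to the power of the delay target, divided by one minus the product of the arrival and service transforms. This form, the symbol names, and the two-point service model are assumptions of this sketch and may differ in notation from the expressions referenced above. Arrivals and service are measured in the unit that is exponentiated in the SNR-domain conversion.

```python
import math

def delay_violation_bound(mellin_service, alpha, w, s_values):
    """Minimize the steady-state kernel over a grid of parameters s > 0.

    mellin_service(s) must return E[exp(-s * r)], i.e. the Mellin transform
    of the SNR-domain service exp(r), evaluated at 1 - s.
    """
    best = 1.0                                # a probability never exceeds one
    for s in s_values:
        m_s = mellin_service(s)
        m_a = math.exp(alpha * s)             # constant arrivals of alpha per slot
        if m_a * m_s < 1.0:                   # stability condition: the kernel converges
            best = min(best, m_s ** w / (1.0 - m_a * m_s))
    return best

def two_point_service(rate, error_prob):
    """Service equals `rate` with probability 1 - error_prob and zero otherwise."""
    return lambda s: error_prob + (1.0 - error_prob) * math.exp(-s * rate)

s_grid = [0.01 * k for k in range(1, 1001)]
service = two_point_service(rate=3.0, error_prob=1e-3)
for w in (2, 5, 10):
    print(w, delay_violation_bound(service, alpha=1.0, w=w, s_values=s_grid))
```

A simple grid search is used because the kernel is cheap to evaluate; when no grid point satisfies the stability condition, the function returns the trivial bound of one.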
III-B SNC and Scheduling
The delay analysis through stochastic network calculus as shown in Sec. III-A cannot be applied directly because the service is zero in the time slots where the user is not scheduled, i.e., the service process is not i.i.d. between time slots. However, stochastic network calculus can be applied on the superframe level. The service that a user receives in a superframe is i.i.d. between superframes, because each user is scheduled exactly once per superframe. The arrival process on the superframe level is given by the number of bits arriving per slot multiplied by the superframe length, and its Mellin transform in the SNR domain follows accordingly.
In case is an integer, it follows directly from (14) that:
When the condition holds, the kernel converges to
The bound depends on the Mellin transform of the service per superframe in the SNR-domain, which is connected to the bit-domain service process as . In the bit-domain, each user experiences a service of bits per superframe, where when a transmission error occurs and otherwise. Therefore:
In case is not an integer, some users (denoted as group 1) will be served times before the deadline, while others (group 2) will only be served times. For the sake of fairness, we assume that the users are assigned randomly to the slots. Then, the probability of being in the second group is , and . Thus, the overall bound on the delay violation probability is given by 
III-C Delay Analysis – Ideal Case
In the ideal case with perfect CSI and long codewords, the rate is equal to the channel capacity, and no errors occur. In this case, we can obtain the Mellin transform of the service process in closed form. The effective channel gain is a central chi-square variable, with the degrees of freedom outlined in Sec. II-A. Thus, we obtain the following result:
Given the transmit power and the number of scheduled users , the Mellin transform of the service process is given as:
where , , and is the upper incomplete Gamma function
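A closed-form transform of this kind can be cross-checked by Monte Carlo simulation. The sketch below assumes that the effective channel gain is gamma distributed with shape M - L + 1 (a scaled chi-square variable with 2(M - L + 1) degrees of freedom), that the power P is split equally over the L scheduled users, and that the rate is measured in nats over N channel uses, so that the SNR-domain service is (1 + SNR)^N. These symbols and the parameter values are assumptions for illustration.

```python
import random

def mellin_service_mc(s, M, L, P, N, num_samples, rng):
    """Monte Carlo estimate of E[(1 + SNR)^(-s * N)] under perfect CSI."""
    shape = M - L + 1                        # effective gain ~ Gamma(shape, scale 1), assumed
    total = 0.0
    for _ in range(num_samples):
        gain = rng.gammavariate(shape, 1.0)
        snr = (P / L) * gain                 # equal power split over L users
        total += (1.0 + snr) ** (-s * N)
    return total / num_samples

rng = random.Random(0)
for M in (4, 16):
    estimate = mellin_service_mc(s=1.0, M=M, L=2, P=10.0, N=1, num_samples=20000, rng=rng)
    print(f"M = {M}: estimated transform = {estimate:.4f}")
```

With more antennas and a fixed number of scheduled users, the gain concentrates around a larger mean, so the estimated transform shrinks and the resulting delay bound tightens.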
IV Analysis – Realistic Case
In the previous section, we analyzed the delay performance of the multiuser MISO downlink channel and provided closed-form expressions for the Mellin transform of the service provided to each user during a superframe. However, the analysis did not account for some of the effects that may severely deteriorate the performance of actual systems. In a real system, the transmitter first needs to acquire an estimate of the channel matrix before computing the beamforming matrix. Due to the channel estimation error, the ZFBF matrix is not perfectly matched to the actual channel, and thus, the interference cannot be completely eliminated. Furthermore, the transmitter must adapt the coding rate to the imperfect channel estimate. Outages will occur whenever the rate selected by the transmitter happens to be above the actual capacity. Moreover, when the blocklength of the channel code is small, one cannot achieve error-free transmissions at a rate equal to the channel capacity. Instead, the transmitter must choose rates below the channel capacity in order to achieve low (but still non-zero) error probabilities. All these effects have an impact on the optimal number of scheduled users. Specifically, there are now three additional reasons to schedule only a small number of users:
Channel estimation overhead: For each scheduled user, a dedicated training period is required. When many users are scheduled, this overhead severely reduces the number of symbols that remain for the data transmission. Finite blocklength effects may cause an additional performance loss when the number of remaining symbols becomes very small.
Interference: The signal for each scheduled user creates interference at the other users. Scheduling fewer users thus reduces the interference.
Backoff: In order to transmit reliably, the transmitter must often choose a rate below the estimated capacity. Reducing the number of scheduled users increases the individual channel capacities and thus reduces the relative impact of this backoff.
However, there is now also a major reason to increase the number of scheduled users:
Reliability: When more users are scheduled, the transmitter can schedule each user more often, which means that multiple retransmission opportunities are available for each user before the target deadline is reached. Thus, even though scheduling many users may reduce the reliability of the individual transmissions, it may massively enhance the overall reliability of the system with respect to the deadline.
Taking all these effects into account, our main problem remains the same: we want to determine the optimal number of scheduled users such that the overall reliability of the system with respect to the target deadline is maximized. To answer this question, one must also solve a secondary problem: one must determine the optimal rate adaptation function. When the transmitter chooses a rate that is too high, the corresponding error probability becomes too high. On the other hand, choosing very low rates may also lead to violations of the deadline, because then the transmitter cannot transmit all buffered data.
To answer these questions, we model in the following the effects of imperfect CSI and finite blocklength channel coding. Without loss of generality, we will consider only the signal at a single receiver. The derived quantities in this section correspond to this user, which will not be indicated by a subscript.
IV-A Imperfect CSI
We consider a time-division duplex (TDD) system, where the transmitter can estimate the channel from training sequences of length symbols sent by the users in the uplink. The training sequences must be mutually orthogonal, thus channel uses are required for the training of users. The SNR of the uplink channel is denoted as and known to the transmitter. By observing the training sequence, the transmitter can obtain the MMSE estimate of the channel towards user . According to , the actual channel vector is given in terms of the MMSE estimate as
with independent of , and
The transmitter then applies zero-forcing beamforming to create the beamforming matrix based on the estimated channel matrix . The received signal at user is given as:
The signal-to-interference-plus-noise ratio (SINR) is then:
where we have introduced shorthand notation for the power of the intended signal at the receiver and for the sum of the interference powers, and where we made use of the fact that the beamforming vectors of the other users are orthogonal to the estimated channel of the considered user. After the uplink training, the transmitter must send a downlink training sequence to each user, so that the receivers can also learn the channel and decode the signal. In this work, we assume that when , the estimation error at the receiver is negligible compared to the estimation error at the transmitter. (Footnote 1: The first reason for this assumption is that the transmitter can generally transmit at higher power than the users’ devices, which may be battery-powered. Second, a small estimation error at the transmitter may lead to significant interference and to outages, whereas a small estimation error at the receiver would correspond only to a small additional noise term in the decoding process. Third, the receiver can estimate the channel not only from the dedicated training sequence, but also from the codeword itself (joint estimation and decoding), further improving CSI at the receiver.)
For the moment, we ignore the effects of finite blocklength channel coding and assume that data can be successfully transmitted when the data rate is below the instantaneous capacity. Note that for a fixed slot length, only the symbols that are not used for training remain for the data transmission.
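The effect of the estimation error on zero-forcing can be illustrated numerically. In the sketch below, the precoder is computed from the estimate while the signal propagates through the true channel, modeled as the estimate plus an independent error. The error variance `sigma_e2`, the unit noise variance, the equal power split, and all names and sizes are assumptions chosen for this illustration.

```python
import numpy as np

def complex_gaussian(shape, variance, rng):
    """Draw i.i.d. circularly symmetric complex Gaussian entries."""
    scale = np.sqrt(variance / 2.0)
    return scale * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

def sinr_with_estimated_csi(H_hat, E, P):
    """Per-user SINR when ZFBF is computed from H_hat but the channel is H_hat + E."""
    L = H_hat.shape[0]
    W = np.linalg.pinv(H_hat)
    W = W / np.linalg.norm(W, axis=0)        # unit-norm beams
    G = np.abs((H_hat + E) @ W) ** 2         # G[k, j]: power of beam j at user k
    signal = (P / L) * np.diag(G)
    interference = (P / L) * (G.sum(axis=1) - np.diag(G))
    return signal / (1.0 + interference)     # unit noise variance assumed

rng = np.random.default_rng(0)
M, L, P, sigma_e2 = 8, 3, 10.0, 0.05         # illustrative values (assumed)
H_hat = complex_gaussian((L, M), 1.0 - sigma_e2, rng)   # channel estimate
E = complex_gaussian((L, M), sigma_e2, rng)             # estimation error, independent of H_hat
print("SINR with perfect CSI:  ", sinr_with_estimated_csi(H_hat, np.zeros_like(E), P))
print("SINR with imperfect CSI:", sinr_with_estimated_csi(H_hat, E, P))
```

The transmitter only knows the first of the two printed vectors, which is why it cannot match its rate exactly to the realized SINR.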
Given the estimated channel , the transmitter must also choose a certain data rate for the transmission. However, due to the channel estimation error , the transmitter does not know the exact value of the SINR in (28). Therefore, there is a chance that the channel will be in outage, i.e., that the transmission fails. The outage probability , conditioned on the channel estimate , is defined as
Unfortunately, analytic expressions for the conditional outage probability cannot be easily determined, as it depends on the joint distribution of the received signal power and the interference power. In order to find a solution, we consider the distributions of the signal power and the interference power separately. Conditioned on the channel estimate and the resulting estimated receive power, the power of the signal of interest has a non-central chi-square distribution with two degrees of freedom and PDF
The interference is the sum of several random variables, one for each of the other scheduled users, each being exponentially distributed with mean
The beamforming vectors are not mutually orthogonal, and thus the individual interference terms are correlated, so that one cannot determine the distribution of the sum interference, conditioned on the channel estimate or on the corresponding beamforming vectors, in closed form. As the interference is a sum of random variables, its variance is minimal when the individual interference terms are completely independent (all vectors are orthogonal), and maximal when all the interference terms are completely correlated (all vectors point in the same direction). Minimum variance of the interference generally minimizes the chance that the interference is very large, and thus minimizes the outage probability compared to the correlated case. Similarly, we conjecture that maximum variance (due to completely correlated interferers) maximizes the outage probability. Therefore, in the following two subsections, we use these two cases to obtain approximate lower and upper bounds on the outage probability. (Footnote 2: We acknowledge that considering only the variance of the interference is not sufficient to find rigorous bounds, which would need to be based on the actual joint distribution of the signal and interference powers, for which no analytical expressions are known. We will verify the bounds in Sec. V.)
IV-A1 Lower Bound – Uncorrelated Interference
When assuming that the vectors are mutually orthogonal, all the interference terms are uncorrelated. In this case, the sum interference is given as the sum of independent, exponentially distributed random variables with identical mean. Thus, the interference has a gamma distribution whose shape parameter equals the number of interferers and whose scale parameter equals the mean of a single interference term. Its cumulative distribution function is given as:
Defining , the conditional outage probability can then be approximated as
In order to find a closed-form solution for the second term, we extend the results in  to the multi-antenna scenario and derive a Gaussian approximation for :
We note that is Gaussian and distributed as , according to (31). The distribution of a circularly symmetric random variable is not affected by a phase shift , thus the second term, which is denoted as , is a real-valued Gaussian variable with variance
The third term has variance , which becomes insignificant compared to even for moderate training powers and training sequence lengths . Thus, the actual receive power can be closely approximated by . Using this Gaussian approximation for , and the finite series [28, Eq. (8.352.7)] for the incomplete gamma function at integer values of :
we find that the outage probability in case of uncorrelated interferers is approximated by:
We define , and we obtain after some algebra, which includes the binomial expansion :
When assuming that the interferers are mutually orthogonal, the conditional outage probability for a given channel estimate and a given rate can be approximated as
with , and
It can be seen directly that . Furthermore, we obtain . For values , integration by parts can be applied, resulting in:
Thus, the values of for can be obtained iteratively from (43).
IV-A2 Upper Bound – Correlated Interference
In the previous subsection, we considered the case where all vectors are orthogonal, and thus, the individual interference terms are independent. Conversely, we consider in this subsection the extreme case where the vectors for are identical, resulting in completely correlated interference. This assumption results in the maximum possible variance of the interference . If all are equal, then the sum interference is equal to times the first interference term , which is exponentially distributed with mean . Thus, is exponentially distributed with mean , which is equivalent to a gamma-distributed variable with shape and scale . Thus, the outage probability can be approximated by (41), which for simplifies to the following result:
When assuming completely correlated interference, the outage probability for a given channel estimate and a given rate can be approximated as
with and .
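The two limiting interference models can be compared by simulation. The sketch below draws the signal power as the squared magnitude of a known amplitude plus a complex Gaussian perturbation, and the interference either as a sum of independent exponential terms (gamma distributed) or as a single exponential term scaled by the number of interferers. The parameter names, the unit noise power, the rate in nats per channel use, and the numerical values are assumptions for illustration only.

```python
import math
import random

def outage_probability(rate, est_power, err_var, mean_interf, num_interf,
                       correlated, num_samples, rng):
    """Monte Carlo estimate of P[log(1 + S / (1 + I)) < rate]."""
    threshold = math.exp(rate) - 1.0          # SINR required for the chosen rate
    sigma = math.sqrt(err_var / 2.0)          # standard deviation per real dimension
    outages = 0
    for _ in range(num_samples):
        re = math.sqrt(est_power) + rng.gauss(0.0, sigma)
        im = rng.gauss(0.0, sigma)
        signal = re * re + im * im            # scaled non-central chi-square, two degrees of freedom
        if correlated:
            interf = num_interf * rng.expovariate(1.0 / mean_interf)
        else:
            interf = rng.gammavariate(num_interf, mean_interf)
        if signal < threshold * (1.0 + interf):
            outages += 1
    return outages / num_samples

params = dict(est_power=20.0, err_var=0.5, mean_interf=0.5, num_interf=4, num_samples=20000)
rate = math.log(5.0)                          # corresponds to a required SINR of 4
p_uncorr = outage_probability(rate, correlated=False, rng=random.Random(3), **params)
p_corr = outage_probability(rate, correlated=True, rng=random.Random(3), **params)
print("uncorrelated interference:", p_uncorr)
print("fully correlated interference:", p_corr)
```

Both models have the same mean interference, but the fully correlated model has a heavier tail and therefore yields the larger outage estimate for these parameters.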
IV-B Finite Blocklength Channel Coding
When the duration of each time slot is short, the blocklength of the channel code used for the transmission is rather small. This invalidates the assumption that error-free transmissions can be achieved at a rate equal to the channel capacity . Instead, results for finite blocklength channel coding must be used. For AWGN (additive white Gaussian noise) channels, a well-known result is given by Polyanskiy et al. [15, Thm. 54], who showed that given a maximum error probability , the achievable coding rate with complex channel uses at SNR is closely approximated by
However, this result for AWGN channels holds only if the interference is zero, i.e., when or when the transmitter has perfect CSI and applies ZFBF. Under imperfect CSI with , each receiver experiences interference from the signals intended for other users. In order to achieve the rate (45) for user , the transmitter would need to use a non-Gaussian codebook . The other users would then be subject to non-Gaussian interference, so that (45) would not hold for them.
We must therefore employ different results to model the effects of finite blocklength coding. Specifically, Scarlett et al. [17] considered the performance of Gaussian codebooks under non-Gaussian noise and nearest-neighbor decoding. The authors also considered the case where sender-receiver pairs transmit concurrently, using i.i.d. Gaussian codebooks, and the receiver experiences i.i.d. Gaussian interference from the transmitters . These results can be directly applied to our scenario because there is no difference between interference that originates from independent transmitters and interference that originates from a single transmitter superimposing independently coded signals. Therefore, when i.i.d. Gaussian codebooks are used, a second-order approximation for the achievable coding rate is given by [17, Eq. (24)]
When the transmitter picks an i.i.d. Gaussian codebook containing different messages, the decoding error probability at the receiver at a specific can thus be approximated as
We note that the choice of a Gaussian codebook depends only on its size, defined by the number of messages , and not on the exact value of . As a result, (49) holds even when the transmitter does not know the exact value of ahead of the transmission. Thus, when the transmitter knows only the estimated channel and chooses a coding rate , the overall error probability at the receiver can be approximated as [Footnote 4: This still requires that the receivers obtain perfect CSI from the downlink training symbols. However, for the single-antenna case, we showed that an expression similar to (50) is accurate even when the receiver has only imperfect CSI .]
where the expectation is taken over the distribution of , conditioned on the estimated channel matrix . We note that (45) and (47) are second-order approximations, i.e., as , the term becomes insignificant compared to the second term, which decays as . For the AWGN channel, it was shown that the approximation (45) can be accurate for blocklengths as small as . In our previous work , which considered a transmitter with only one antenna, we were able to compute a strict lower bound on the achievable coding rate, which showed that the approximation was very accurate for the considered parameters. However, we are not aware of any results that can be used to verify the accuracy of (47) and (50). Therefore, our results should not be interpreted as the performance that is actually achievable, but rather as approximations that can help guide the transmitter in the difficult task of selecting the coding rate and the optimal number of scheduled users .
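To give a feel for the size of the second-order penalty in the interference-free AWGN case, the following minimal sketch evaluates the normal approximation in its commonly stated form, with the higher-order logarithmic term dropped. The function name, parameter names and example values are assumptions made for illustration. The dispersion expression is the standard one for complex AWGN channels and is not copied from (45).

```python
import math
from statistics import NormalDist


def awgn_normal_approx_rate(snr: float, blocklength: int, error_prob: float) -> float:
    """Normal approximation of the achievable rate (bits per channel use)
    on a complex AWGN channel, neglecting the O(log(n)/n) term.

    rate ~= log2(1 + snr) - sqrt(V / n) * Qinv(error_prob),
    with V = (1 - 1 / (1 + snr)^2) * (log2 e)^2.
    """
    capacity = math.log2(1.0 + snr)
    dispersion = (1.0 - 1.0 / (1.0 + snr) ** 2) * math.log2(math.e) ** 2
    # Inverse Gaussian Q-function: Qinv(eps) = Phi^{-1}(1 - eps).
    q_inv = NormalDist().inv_cdf(1.0 - error_prob)
    return capacity - math.sqrt(dispersion / blocklength) * q_inv


if __name__ == "__main__":
    snr = 10.0  # linear scale, i.e. 10 dB (hypothetical operating point)
    for n in (100, 1000, 10000):
        rate = awgn_normal_approx_rate(snr, n, error_prob=1e-5)
        print(n, rate, math.log2(1.0 + snr) - rate)
```

In this approximation the gap to capacity shrinks proportionally to the inverse square root of the blocklength, so increasing the blocklength by a factor of 100 reduces the penalty by a factor of 10.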
The error probability in (50) can be obtained from Monte-Carlo simulations, just as in the case of infinite blocklength. However, in order to find an optimal rate adaptation function , the transmitter must be able to quickly determine the error probability for rate through a closed-form expression. Therefore, we apply several approximations to (50) in order to obtain a closed-form expression.
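A Monte-Carlo evaluation of this kind can be sketched as follows: the conditional error probability at finite blocklength is averaged over random draws of the signal-to-interference-and-noise ratio (SINR). This is a minimal sketch under stated assumptions, not the simulation setup of the paper. The SINR sampler is a toy model, all names are hypothetical, and the dispersion 2γ/(1+γ)·(log2 e)² is assumed here as the second-order term for i.i.d. Gaussian codebooks on a complex channel; it should be replaced by the expression from (47) where that is available.

```python
import math
import random
from statistics import NormalDist

LOG2E = math.log2(math.e)


def iid_gaussian_dispersion(sinr: float) -> float:
    """Assumed dispersion for i.i.d. Gaussian codebooks (complex channel)."""
    return 2.0 * sinr / (1.0 + sinr) * LOG2E ** 2


def conditional_error_prob(sinr, rate, blocklength,
                           dispersion=iid_gaussian_dispersion):
    """Gaussian-tail approximation of the error probability at a fixed SINR."""
    capacity = math.log2(1.0 + sinr)
    v = dispersion(sinr)
    if v == 0.0:
        return 0.0 if capacity > rate else 1.0
    # Q((C - R) / sqrt(V / n)) written as Phi((R - C) / sqrt(V / n)).
    return NormalDist().cdf((rate - capacity) / math.sqrt(v / blocklength))


def monte_carlo_error_prob(sample_sinr, rate, blocklength,
                           num_samples=20_000, seed=1):
    """Average the conditional error probability over random SINR draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        total += conditional_error_prob(sample_sinr(rng), rate, blocklength)
    return total / num_samples


def toy_sinr_sampler(rng):
    """Purely illustrative SINR model: gamma signal, exponential interference."""
    signal = 20.0 * rng.gammavariate(4.0, 1.0)
    interference = rng.expovariate(1.0 / 0.5)
    return signal / (1.0 + interference)


if __name__ == "__main__":
    print(monte_carlo_error_prob(toy_sinr_sampler, rate=3.0, blocklength=200))
```

Each call requires many thousands of SINR draws, which shows why such a simulation is too slow to be run for every candidate rate during rate adaptation.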
IV-B2 Closed-form Approximation
We apply the concept of random blocklength-equivalent capacity , which allows treating the effects from finite-blocklength coding in the same fashion as outages. We define the random blocklength-equivalent capacity as
with independent of . For a fixed value of , only is random, and the blocklength-equivalent outage probability is – by definition – equal to the error probability at finite blocklength given in (49). Furthermore, it can be easily verified that when both and are random, is exactly equal to the expression for given in (50). Using this concept, we follow the further steps in  and apply the first-order Taylor approximation 
to around in order to bring into the same domain as :
In (53), we applied the Taylor approximation. In (55), we applied the Gaussian approximation . In (56), we replaced and in the factor before with their respective expectations. This is reasonable because this factor corresponds only to the variance of the term with . Although even small values of or can cause an outage, the same small values lead only to a small change (relative to ) in the variance of the term with , which does not significantly affect the distribution of . Finally, in (57), we have defined as the sum of the Gaussian variable and the independent Gaussian variable , multiplied by a constant factor. The sum of two independent Gaussian random variables is Gaussian, and the variance of the sum is equal to the sum of the individual variances. Thus, is zero-mean Gaussian with variance
The error probability due to finite blocklength coding and imperfect CSI is given in (50), which is equal to the blocklength-equivalent outage probability . Using the approximation (57) for , we can follow the same steps as in Sec. IV-A to approximate , simply replacing by in (44):
When assuming that the interference is completely correlated, the error probability under imperfect CSI and finite blocklength coding for a given channel estimate and rate can be approximated as: