# Balance Queueing and Retransmission: Latency-Optimal Massive MIMO Design

One fundamental challenge in 5G URLLC is how to optimize massive MIMO communication systems to achieve both low latency and high reliability. A seemingly reasonable design is to choose the smallest possible target error rate, which achieves the highest link reliability and the minimum retransmission latency. However, the overall system latency is the sum of the latency due to queueing and the latency due to retransmissions; hence choosing the smallest target error rate does not always minimize the overall latency. In this paper, we minimize the overall latency by jointly designing the target error rate and transmission rate, which leads to a favorable tradeoff point between queueing latency and retransmission latency. This design problem can be formulated as a Markov decision process, whose complexity is prohibitively high for real-system deployments. Nonetheless, we develop a low-complexity closed-form policy named Large-arraY Reliability and Rate Control (LYRRC), which is latency-optimal in the large-array regime. In LYRRC, the target error rate is a function of the antenna number, arrival rate, and channel estimation error, and the optimal transmission rate is twice the arrival rate. Using over-the-air channel measurements, our evaluations suggest that LYRRC can satisfy both the latency and reliability requirements of 5G URLLC.


## I Introduction

Next-generation cellular systems, labeled as 5G, are targeting low latency and ultra-high reliability to support new forms of applications, e.g., mission-critical communications. One of the key technologies for 5G will be massive MIMO, where the base-stations will be equipped with tens to hundreds of antennas [1, 2, 3]. In this paper, we explore how to leverage the large number of spatial degrees of freedom to minimize latency while ensuring high reliability.

Current cellular system design follows a layered approach. The queueing latency is managed at MAC and higher layers, while the target error rate is managed separately by the physical layer to maximize the physical layer throughput. For example, the transmission rate is often chosen to meet a fixed target error rate (around %). This decoupled design is shown to be nearly throughput optimal [4] for single-antenna systems. However, such a decoupled design may not achieve low latency.

As 5G pushes to low latency (10-100 times lower latency than current systems) and ultra-high reliability, it is of paramount importance to control the latency and service unreliability caused by retransmissions. The 5G Ultra-Reliable Low-Latency Communication (URLLC) service has a stringent reliability requirement [5], i.e., the probability of successful packet delivery within a bounded number of transmission rounds must exceed a prescribed threshold. To satisfy this reliability requirement, the target error rate cannot exceed 3.16%. It might seem natural to choose the smallest possible target error rate, as a smaller target error rate results in higher link reliability and less retransmission latency. However, since the overall system latency is the sum of the latency due to queueing and the latency due to retransmissions, choosing the smallest target error rate does not always minimize the overall latency. In this paper, we achieve reliability-guaranteed latency minimization by jointly optimizing the target error rate and the transmission rate policy.

While it is widely known that the error rate reduces with a higher power or a lower transmission rate, the relationships between the error rate and latency are more complex. There is a tradeoff between retransmission latency and queueing latency, and both are impacted by the target error rate: On the one hand, the retransmission latency reduces as the error rate reduces. On the other hand, if the system is fixed to an extremely low error rate, fewer packets can be transmitted in each frame i.e., the transmission time to send the same amount of packets increases, and packets have to wait for a longer time in the queue. Therefore, under a given arrival rate, the queueing latency increases as the error rate and transmission rate reduce. The situation is further complicated by the fact that current mobile users can adapt their transmission powers, which makes the feasible (transmission rate, error rate) tuple time-varying. Fig. 1 depicts an example of the minimum latency achieved at different error rates with the optimal transmission rate policies (developed later in Section III). For the specific example in Fig. 1, an error rate (1%) smaller than the 5G URLLC reliability requirement (error rate of 3.16%) results in the minimum latency. It is clear that we need to find a sweet spot on the error rate that minimizes the overall latency by balancing the queueing latency with the retransmission latency.

Most of the existing work on massive MIMO focuses on the physical layer aspects, a layer at which latency optimization can not be addressed. Massive MIMO was shown to provide higher spectral efficiency [1, 6, 7, 8], wider coverage [9, 3] and easier network interference management [10, 7, 9] than traditional MIMO. This work differs from previous massive MIMO works in that we provide reliability guaranteed latency-optimal transmission control. There are also prior works that optimized the retransmission process, either for throughput maximization [4] or energy efficiency [11] maximization. In addition, cross-layer optimization [12, 13, 14, 15] is known to have the potential to achieve latency reduction. For a point-to-point system, past studies [16, 17, 18, 19] showed that using the queue-length information for transmission rate control can reduce queueing latency.

To the best of our knowledge, this paper is the first work to identify the tradeoff between retransmission latency and queueing latency. We further optimize the error rate and transmission rate policy to achieve the optimal tradeoff. The main contributions of our work are the following:

• We formulate the reliability constrained latency-minimization problem as the joint control design of the error rate and the transmission rate policy. We cast this optimization problem as a Markov decision process and solve it by value iteration.

• Because the Markov decision process formulation does not provide insights on the optimal control, we develop a deterministic control for large-array systems with constant arrival rate, which is an important 5G URLLC type of traffic (like the time-sensitive VoIP service [20]). The joint error rate and transmission rate policy is labeled as Large-arraY Reliability and Rate Control (LYRRC). LYRRC is a low-complexity, closed-form solution to the latency minimization problem in the large-array regime: the optimal transmission rate is $2\lambda$, where $\lambda$ is the packet arrival rate. The optimal error rate of the physical layer is a closed-form function of $F_\eta$, the CDF of the effective channel gain (defined later), the array size $M$, the number of users, and the utilization factor $\rho$, i.e., the traffic arrival load over link capacity. It also depends on the number of uplink pilots $\tau$, which captures the impact of interference from imperfect channel knowledge. Furthermore, we discover that the total latency is determined by the array size $M$, the number of pilots $\tau$, the number of served users, and $\rho$. In particular, we show that, in the regime analyzed in Section IV, the average waiting time diminishes to zero as the array size increases to infinity.

• To verify LYRRC’s performance and usefulness in the wild, we measure massive MIMO channels on the GHz with Rice Argos platform [2], which consists of a -antenna base-station and multiple mobile users. Based on the measurements, we find LYRRC with 5G self-contained frame [21, 22, 23] can simultaneously meet the ms latency and % reliability criterion. Under the same conditions, the best latency of fixed error rate control policies ( error rate) is more than ms. On the measured channels, we find that LYRRC provides latency reduction compared to current LTE transmission control ( error rate with peak power control). Compared to the best (queue-length based) rate adaption with fixed error rate (), LYRRC achieves a latency reduction.

The remainder of this paper is structured as follows. In Section II, we provide a physical layer abstraction and new network model for latency minimization problem of a single user wideband massive MIMO with retransmissions. Section III provides an algorithm to solve the proposed latency minimization problem. A simple yet latency-optimal transmission control, LYRRC, is further investigated in the large-array regime in Section IV. Additionally, we capture how the minimum latency reduces with larger antenna array sizes in closed-form. In Section V, we extend our single-user analytical results to multiuser massive MIMO systems. We provide numerical results in Section VI and conclude in Section VII.

## II System Model and Problem Formulation

In this paper, we consider a multi-user massive MIMO uplink system with retransmission. The base station only has imperfect channel knowledge from pilot estimation. The transmission of each user is under a maximum error rate constraint. Each user can control the average packet latency by designing both the target error rate and the transmission rate policy. We now formulate the reliability constrained latency minimization problem.

### II-A System Model

Fig. 2 depicts a discrete time-slotted model that consists of an $M$-antenna base-station and a single-antenna user. The extension to multi-user systems is presented in Section V. We take inspiration from the recent 5G proposals [21, 22, 23] and assume that the system operates in self-contained frames, as shown in Fig. 3. A self-contained frame consists of both the data transmission and the immediate ACK/NACK. Without loss of generality, each frame has unit duration and Frame $t$ spans the time interval $[t, t+1)$. Within each frame, the user first transmits the encoded uplink data to the base-station. The base-station then feeds back, over an error-free downlink, an ACK or NACK to signal whether a decoding error occurred.

#### II-A1 Physical Layer Model

During the (uplink) data transmission, the signal received by the base-station over the wideband channel is

$$\mathbf{y}_n = \sqrt{\gamma}\,\mathbf{h}_n x_n + \mathbf{z}_n, \quad n = 1, \dots, N, \tag{1}$$

where $n$ is the subcarrier index and $N$ is the total number of subcarriers. Here $x_n$ is the transmitted signal and $\mathbf{z}_n$ is a zero-mean circularly symmetric complex Gaussian noise vector. The frequency-independent large-scale fading channel constant is $\gamma$. The frequency-selective channel vector $\mathbf{h}_n$ follows Rayleigh fading. We adopt the block fading model to capture the channel fading processes: the small-scale fading vector $\mathbf{h}_n$ stays the same during each frame and is i.i.d. across frames and subcarriers.

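During the data transmission in each frame, the user transmits $\tau$ uplink pilots. The base-station estimates the channel with MMSE. The estimated (small-scale) channel vector is then

$$\hat{\mathbf{h}}_n = \frac{\tau p_\tau \gamma}{\tau p_\tau \gamma + 1}\,\mathbf{h}_n + \frac{1}{\tau p_\tau \gamma + 1}\,\mathbf{e}_n, \tag{2}$$

where $p_\tau$ is the power of the pilots and $\mathbf{e}_n$ is a zero-mean circularly symmetric complex Gaussian noise vector. Therefore, after applying the receive beamformer, the received signal is

$$\hat{y}_n = \hat{\mathbf{h}}_n^H \mathbf{y}_n = \frac{\tau \gamma p_\tau}{\tau \gamma p_\tau + 1}\sqrt{\gamma}\,\mathbf{h}_n^H \mathbf{h}_n x_n + \frac{\sqrt{\gamma}}{\tau p_\tau \gamma + 1}\,\mathbf{h}_n^H \mathbf{e}_n x_n + \hat{\mathbf{h}}_n^H \mathbf{z}_n.$$

Here, the first term denotes the signal and the other two terms capture the interference and noise. According to the above signal model, the received SINR on Subcarrier $n$ is

$$\mathrm{SINR}_n = \frac{\tau \gamma p_\tau\, \gamma p\, \|\mathbf{h}_n\|^2}{\tau \gamma p_\tau + \gamma p + 1}, \tag{3}$$

where $p$ is the power of uplink data transmission.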

The user is considered aware of only the large-scale channel and the distribution of the small-scale channel via the estimation of a periodic indication signal that is broadcast by the base-station [24]. We consider the transmission power of the user to satisfy a long-term power constraint of $P$. During each frame, the uplink packets to be transmitted are encoded within a single code block that spans all subcarriers. When packets are scheduled, the error rate is a function of the transmission power. We use the following model to capture the interplay between imperfect channel, error rate, transmission rate, and power. (Footnote 1: In this paper, we use $\log(\mathrm{SINR}_n)$ for notational simplicity. One can also replace it by the term $\log(1+\mathrm{SINR}_n)$ in (4), and the expressions for the effective channel gain (11) and the power mapping (12) would change accordingly. Our simulation in Fig. 4 suggests that (4) is sufficiently accurate for LDPC codes at moderate SINR.)

$$\epsilon = \mathrm{Prob}\left[\sum_{n=1}^{N} \log(\mathrm{SINR}_n) \le rL\right], \tag{4}$$

where $L$ is the number of information bits in each packet and $r$ is the number of transmitted packets per frame, which is referred to as the transmission rate. In [25, 26, 27], it was shown that, with strong channel coding, (4) closely captures the error rate in the regime studied there. Fig. 4 provides an example of (4) for the case of an LDPC-based system.
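The outage model (4) can be checked numerically. The following is a minimal Monte Carlo sketch, not the authors' simulator. It assumes that $\mathbf{h}_n \sim \mathcal{CN}(0, \mathbf{I}_M)$ independently across subcarriers, so that $\|\mathbf{h}_n\|^2$ is Gamma$(M, 1)$ distributed, and that logarithms are natural. The function name and all parameter values are hypothetical.

```python
import numpy as np

def outage_probability(M, N, r, L, gamma, p, tau, p_tau, trials=20000, seed=0):
    """Monte Carlo estimate of eq. (4): Prob[sum_n log(SINR_n) <= r * L]."""
    rng = np.random.default_rng(seed)
    # Assumption: h_n ~ CN(0, I_M), hence ||h_n||^2 ~ Gamma(shape=M, scale=1).
    gain = rng.gamma(shape=M, scale=1.0, size=(trials, N))
    a = tau * gamma * p_tau
    sinr = a * gamma * p * gain / (a + gamma * p + 1.0)  # eq. (3)
    total = np.log(sinr).sum(axis=1)                     # natural logarithm
    return float(np.mean(total <= r * L))

if __name__ == "__main__":
    common = dict(M=16, N=8, L=10, gamma=1.0, p=1.0, tau=4, p_tau=1.0)
    for r in (1.0, 1.8, 1.9, 3.0):
        print(r, outage_probability(r=r, **common))
```

With a fixed seed the same channel samples are reused for every rate, so the estimate is non-decreasing in $r$, which mirrors the rate-reliability tension discussed above.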

During Frame $t$, the transmission power is adapted, based on the transmission rate $r_t$ and the channel estimation accuracy, to achieve the selected error rate $\epsilon$. We let $p(r_t, \epsilon, \tau)$ denote the used transmission power, whose expression is provided later in (12).

#### II-A2 Buffer Dynamics with Retransmission

We assume that there is no packet in the buffer at time $t = 0$. During each frame, $\lambda$ new packets arrive in the queue, and each packet contains $L$ bits. (Footnote 2: Our model and analysis can be directly generalized to the case where the numbers of newly arriving packets across frames are independent and identically distributed.) After receiving uplink data, as shown in Fig. 3, the base-station notifies the user on whether an error has occurred via an immediate downlink ACK/NACK. In this work, we assume the feedback messages are correctly decoded due to the beamforming gain and the limited downlink ACK/NACK rate (ACK/NACK is $1$-bit for each transmission). Upon ACK, the transmitted packets are removed from the buffer. Upon NACK, the transmitted packets remain at the head of the buffer queue. (Footnote 3: It is possible to reduce the power of retransmissions via the joint decoding of failed packets and retransmissions. For mathematical tractability, we consider that the receiver will discard packets which cannot be decoded.) We use the indicator $\mathbb{1}_t$ to represent decoding success: $\mathbb{1}_t = 1$ means success and $\mathbb{1}_t = 0$ otherwise. The distribution of $\mathbb{1}_t$ is determined by the chosen error rate as $\mathrm{Prob}[\mathbb{1}_t = 1] = 1 - \epsilon$ and $\mathrm{Prob}[\mathbb{1}_t = 0] = \epsilon$.

At time $t$, let $q_t$ be the queue-length of the buffer, and $r_t$ be the number of packets to be transmitted at Frame $t$ as per the control decision. The buffer-length process follows

$$q_{t+1} = \min\left[\max\left(q_t + \lambda - \mathbb{1}_t r_t,\ \lambda\right),\ B\right], \tag{5}$$

where $B$ is the size of the buffer, which bounds the state space $\mathcal{Q}$ of the buffer. If the number of generated packets is larger than the remaining capacity of the buffer, not all packets can enter the buffer and buffer overflow occurs. Let $b_t$ denote the number of dropped packets due to overflow, which is given by

$$b_t = \max\left(q_t + \lambda - \mathbb{1}_t r_t - B,\ 0\right). \tag{6}$$

When packet overflow happens, the dropped packets induce significant latency in a realistic system. To capture the buffer overflow latency penalty, we assume that each overflowed packet introduces a large latency penalty of $D_{\mathrm{drop}}$. The average number of dropped packets due to overflow, measured in packets per frame, is $\lambda_{\mathrm{drop}}$. We are interested in minimizing the average packet latency (from arrival to successful delivery). We assume that the class of stationary policies is complete, i.e., the minimum latency can be achieved by a stationary policy. Under a stationary policy, the queueing latency of successfully served packets is

$$\lim_{T\to\infty} \frac{1}{T}\sum_{t=0}^{T-1} \frac{q_t}{\lambda - \lambda_{\mathrm{drop}}}, \tag{7}$$

by Little’s Law [29]. To summarize, if a packet is dropped, its latency is $D_{\mathrm{drop}}$, and if a packet is successfully served (not dropped), its latency is (7). The average latency is then

$$D = \frac{\lambda - \lambda_{\mathrm{drop}}}{\lambda} \lim_{T\to\infty} \frac{1}{T}\sum_{t=0}^{T-1} \frac{q_t}{\lambda - \lambda_{\mathrm{drop}}} + \frac{\lambda_{\mathrm{drop}}}{\lambda} D_{\mathrm{drop}} = \frac{\bar{q}}{\lambda} + \frac{\lambda_{\mathrm{drop}}}{\lambda} D_{\mathrm{drop}}, \tag{8}$$

where $\frac{\lambda - \lambda_{\mathrm{drop}}}{\lambda}$ is the proportion of packets that are successfully served and $\bar{q}$ is the average queue-length, i.e., $\bar{q} = \lim_{T\to\infty} \frac{1}{T}\sum_{t=0}^{T-1} q_t$.
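The buffer recursion (5), the overflow count (6), and the latency expression (8) can be exercised with a short simulation. This is a sketch under stated assumptions: integer packet counts, a user-supplied stationary policy `policy(q)` (a hypothetical callable returning the number of packets to send), and a finite horizon standing in for the limit $T \to \infty$.

```python
import random

def simulate_latency(lam, B, eps, policy, d_drop, frames=100_000, seed=0):
    """Estimate the average latency D of eq. (8) by simulating eqs. (5)-(6)."""
    rng = random.Random(seed)
    q = 0            # empty buffer at time 0, as assumed in the text
    q_sum = 0
    dropped = 0
    for _ in range(frames):
        q_sum += q
        r = min(policy(q), q)                      # cannot send more than is queued
        success = 1 if rng.random() >= eps else 0  # Prob[success] = 1 - eps
        backlog = q + lam - success * r
        dropped += max(backlog - B, 0)             # eq. (6)
        q = min(max(backlog, lam), B)              # eq. (5)
    q_bar = q_sum / frames
    lam_drop = dropped / frames
    return q_bar / lam + lam_drop / lam * d_drop   # eq. (8)

if __name__ == "__main__":
    send_all = lambda q: q
    print(simulate_latency(lam=2, B=20, eps=0.05, policy=send_all, d_drop=100.0))
```

With `eps=0` and a send-everything policy, the queue settles at $\lambda$ and the estimate approaches one frame, which matches the trivial lower bound of a single transmission attempt.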

### II-B Single-user Latency Minimization Problem

We first formulate and analyze the joint error rate and transmission rate control in a single-user system that is shown in Fig. 2. The multiuser extension will be presented in Section V.

We define the system state as the queue-length $q_t$, whose state space is $\mathcal{Q}$. The objective of the transmission controller is to minimize the average packet latency under a long-term average power constraint. Based on the power constraint, arrival rate, large-scale channel, and the array size $M$, the transmission controller chooses an error rate $\epsilon$ from a finite set $\mathcal{E}$. This selected error rate remains constant in all frames. At the beginning of Frame $t$ (time $t$), the transmission rate policy determines the number of packets to send based on the system state of Frame $t$, which is the queue-length $q_t$. Denote the above stationary transmission rate policy as $\mu: \mathcal{Q} \to \mathbb{R}_+$. Following (3) and (4), the transmission power is adapted based on the designed rate $r_t$, error rate $\epsilon$, and number of pilots $\tau$, and is denoted as $p(r_t, \epsilon, \tau)$. Both the transmission rate policy and the resulting transmission power are independent of the exact small-scale fading as it is unknown to the user.

For any error rate $\epsilon$ and transmission rate policy $\mu$, we assume that the resulting Markov chain of the system states is ergodic. The associated unique steady state of the system is denoted as $\pi$. The objective of the joint error rate and transmission rate control is to minimize the average packet latency. For the above-mentioned system, the aim is to find the optimal error rate and an optimal sequence of transmission rates (measured in packets per frame) by solving the following optimization problem:

$$\begin{aligned}
\underset{\epsilon \in \mathcal{E},\ \mu: \mathcal{Q} \to \mathbb{R}_+}{\text{minimize}} \quad & D = \frac{\bar{q}}{\lambda} + \frac{\lambda_{\mathrm{drop}}}{\lambda} D_{\mathrm{drop}} && \text{(9a)} \\
\text{subject to} \quad & \lim_{T\to\infty} \frac{1}{T}\sum_{t=0}^{T-1} p(r_t, \epsilon, \tau) \le P && \text{(9b)} \\
& \epsilon \le \epsilon_{\max} && \text{(9c)} \\
& \text{State transition model (3), (4), (5), (6),} && \text{(9d)}
\end{aligned}$$

where $\epsilon_{\max}$ is the maximum error rate (minimum reliability) required by the user service. For 5G URLLC, $\epsilon_{\max} = 3.16\%$.

By configuring the base-station with a larger array, the average latency can be potentially reduced due to increased spatial degrees of freedom. In this work, we want to capture how the minimum latency changes as a function of the array size $M$. For each given pair of large-scale fading and arrival rate, the minimum latency for different array sizes can be found by solving (9). We denote the relationship between the minimum latency and the array size as the function $D^*(M)$, which is referred to as the array-latency curve.

Notation: We use boldface to denote vectors/matrices. We use to denote the -norm and to denote complex space. The space of real value is whose positive half is denoted as . The following notations are used to compare two non-negative real-valued sequences , : if ; if .

## III Latency-Optimal Single-User Transmission Control

In this section, we first formulate the latency minimization problem (9) as a constrained average cost Markov Decision Process (MDP) and solve it by a proposed algorithm.

### III-A MDP Formulation

In each frame, the transmission power is adapted based on the channel gain, scheduled transmission rate, and the selected error rate. We now formally quantify the interplay between the power, channel and control actions. Substituting the derived SINR (3) into the channel outage model (4), we have that

$$\epsilon = \mathrm{Prob}\left[\left(\prod_{n=1}^{N} \kappa_n\right)^{1/N} \le \frac{\exp(rL/N)}{M \gamma p}\left(1 + \frac{p}{\tau p_\tau} + \frac{1}{\tau p_\tau \gamma}\right)\right], \tag{10}$$

where $\kappa_n = \|\mathbf{h}_n\|^2 / M$ is the per-antenna channel gain. Thus, the error rate is modeled as the probability that a random variable is smaller than a constant. The right-hand-side constant (of the inequality) is independent of the small-scale fading channel, and the small-scale fading determines the distribution of the left-hand side. Let $\eta$ denote this random variable, i.e.,

$$\eta \triangleq \left(\prod_{n=1}^{N} \kappa_n\right)^{1/N}, \tag{11}$$

which is referred to as the effective channel gain. Since signals received with different antennas are combined during the linear beamforming, $\kappa_n$ is the arithmetic mean of the small-scale channel gain across the antennas. For a coded system, the total mutual information is the linear sum of the per-subcarrier terms $\log(\mathrm{SINR}_n)$. Therefore, the effective channel gain is the geometric mean across the $N$ subcarriers. We let $F_\eta$ denote the cumulative distribution function (CDF) of the effective channel gain $\eta$, and $F_\eta^{-1}$ denote its inverse. With some algebraic manipulations, using (4), we have

$$p(r, \epsilon, \tau) = \left[\frac{M \gamma F_\eta^{-1}(\epsilon)}{\exp(rL/N)} \cdot \frac{\tau p_\tau \gamma}{\tau p_\tau \gamma + 1} - \frac{\gamma}{\tau \gamma p_\tau + 1}\right]^{-1}, \tag{12}$$

where $F_\eta^{-1}$ is the inverse CDF of the effective channel gain in (11). When $\tau$ increases, the base-station has a more accurate channel estimation and the needed transmission power (at the same rate with the same reliability) reduces. One can observe that the required transmission power increases with the transmission rate $r$ and the packet size $L$, and decreases with the array size $M$ and the number of subcarriers $N$.
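The power mapping (12) can be evaluated numerically once $F_\eta^{-1}$ is available. The sketch below estimates $F_\eta^{-1}$ from samples, which is an assumption of this example rather than the authors' method. It further assumes $\mathbf{h}_n \sim \mathcal{CN}(0, \mathbf{I}_M)$ with $\kappa_n = \|\mathbf{h}_n\|^2 / M$, and it returns infinity when the bracket in (12) is not positive, i.e., when the target cannot be met at any finite power under this model. All function names are hypothetical.

```python
import numpy as np

def effective_gain_samples(M, N, trials=50000, seed=0):
    """Samples of eta in eq. (11), assuming h_n ~ CN(0, I_M) i.i.d. over subcarriers."""
    rng = np.random.default_rng(seed)
    kappa = rng.gamma(shape=M, scale=1.0, size=(trials, N)) / M  # kappa_n = ||h_n||^2 / M
    return np.exp(np.log(kappa).mean(axis=1))                    # geometric mean over n

def required_power(r, eps, tau, *, M, N, L, gamma, p_tau, eta_samples):
    """Transmission power p(r, eps, tau) of eq. (12) with an empirical inverse CDF."""
    f_inv = np.quantile(eta_samples, eps)   # empirical F_eta^{-1}(eps)
    a = tau * p_tau * gamma
    inv_p = M * gamma * f_inv / np.exp(r * L / N) * a / (a + 1.0) - gamma / (a + 1.0)
    if inv_p <= 0:
        return float("inf")                 # infeasible target (handling chosen for this sketch)
    return float(1.0 / inv_p)

if __name__ == "__main__":
    eta = effective_gain_samples(M=64, N=16)
    cfg = dict(M=64, N=16, L=8, gamma=1.0, p_tau=1.0, eta_samples=eta)
    for r in (1, 2, 4):
        print(r, required_power(r, eps=0.01, tau=4, **cfg))
```

The printed powers grow with $r$, in line with the observation that the required power increases with the transmission rate.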

The system state space of the queue length is denoted as $\mathcal{Q}$. For a selected error rate $\epsilon$ and a stationary transmission rate policy $\mu$, based on the definition of average latency (8), we define the induced latency cost on each state-action pair as

$$d(q, r, \epsilon) = \frac{q}{\lambda} + \frac{b}{\lambda} D_{\mathrm{drop}},$$

where $b$ is the number of dropped packets due to buffer overflow as shown in (6). By taking expectation over the steady state of the system $\pi$, the average latency is then given by

$$D^\pi = \lim_{T\to\infty} \frac{1}{T}\sum_{t=0}^{T-1} \mathbb{E}_\pi\left[d(q_t, r_t, \epsilon)\right].$$

Similarly, utilizing the transmission power characterization in (12), the average power is

$$P^\pi = \lim_{T\to\infty} \frac{1}{T}\sum_{t=0}^{T-1} \mathbb{E}_\pi\left[p(r_t, \epsilon, \tau)\right].$$

Given an average power constraint $P$, the objective of the joint error rate selection and transmission rate control is restated as a constrained MDP:

$$\text{minimize } D^\pi \quad \text{subject to } P^\pi \le P,\ \epsilon \le \epsilon_{\max},\ \text{and the state transition model (3), (4), (5), (6)}. \tag{13}$$

The constrained MDP (13) is converted to an unconstrained MDP via Lagrangian relaxation as

$$\text{minimize } D^\pi + \beta P^\pi \quad \text{subject to } \epsilon \le \epsilon_{\max},\ \text{and the state transition model (3), (4), (5), (6)}, \tag{14}$$

where $\beta \ge 0$ is the Lagrange multiplier. The results in [30, 31] provide a sufficient condition under which the solution of the unconstrained MDP is also optimal for the original constrained problem (9). For all policies such that $P^\pi = P$, the sufficient condition provided by [30, 31] is satisfied. Thus, when the constraint is binding, there is zero duality gap between the original problem (9) and the unconstrained MDP (14), i.e., their optimal solutions are the same.

We now present the algorithm to solve (14) in Section III-B. The closed-form solution of (14) and the characterization of the array-latency curve are presented in Section IV.

### III-B An Algorithm to Solve for $\epsilon^*$, $\mu^*$, $D^*$

For each error rate $\epsilon$ that is no larger than $\epsilon_{\max}$, and each $\beta$, problem (14) is an MDP with an average cost criterion and with infinite horizon. For each $\epsilon$ and $\beta$, we thus find the optimal transmission rate policy by considering the corresponding $\alpha$-discounted problem [32]. For each system state $q$, define the value cost function as

$$V_\alpha(q) \triangleq \min_{\mu} \mathbb{E}_\pi\left[\sum_{t=0}^{\infty} \alpha^t \left(d(q_t, r_t, \epsilon) + \beta p(r_t, \epsilon, \tau)\right)\right],$$

where $\alpha$ is the discount factor. For each $\epsilon$ and $\beta$, we want to find a stationary policy that is optimal for all $\alpha$-discounted problems with $\alpha$ sufficiently close to $1$, i.e., the Blackwell optimal policy. For the considered finite state MDP, the Blackwell optimal policy [32] exists and is also optimal for the average cost problem (14). The Bellman equation of the above $\alpha$-discounted problem is then

$$V^*_\alpha(q) = \min_{\mu}\Big\{ d(q, r, \epsilon) + \beta p(r, \epsilon, \tau) + \alpha\big[(1-\epsilon) V^*_\alpha(\min(q + \lambda - r, B)) + \epsilon V^*_\alpha(\min(q + \lambda, B))\big]\Big\}, \tag{15}$$

with $r = \mu(q)$, whose state transition is described by (4), (5), and (6). Using dynamic programming with value iteration [32] over (15), we can solve the $\alpha$-discounted problem. Since the discounted cost is bounded, updating the cost value using (15) makes the solved transmission rate control converge to the optimal policy $\mu^*$ [32].

For each error rate $\epsilon$, to find the $\beta$ whose solution of (14) satisfies the long-term power constraint $P$, we can use the binary search method. Binary search converges to the optimal solution of (13) because, for each $\epsilon$, the average power is monotone in $\beta$. Finally, by solving the latency minimization problem for each $\epsilon \in \mathcal{E}$, we can find the optimal $\epsilon^*$. We summarize the above steps in Algorithm 1.
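The inner step of this procedure, value iteration on the discounted Bellman equation (15) for a fixed error rate and multiplier, can be sketched as follows. This is an illustrative sketch, not Algorithm 1 itself. It assumes integer queue lengths $0, \dots, B$, integer actions $r \le q$, $\lambda \le B$, a hypothetical callable `power(r)` standing in for $p(r, \epsilon, \tau)$, and it uses the expected overflow in the stage cost.

```python
import math

def value_iteration(lam, B, eps, beta, power, d_drop,
                    alpha=0.95, tol=1e-6, max_iter=10_000):
    """Discounted value iteration for eq. (15); returns (V, policy) over q = 0..B."""
    V = [0.0] * (B + 1)
    policy = [0] * (B + 1)
    for _ in range(max_iter):
        V_new = [0.0] * (B + 1)
        for q in range(B + 1):
            best = math.inf
            for r in range(q + 1):
                nxt_ok = min(q + lam - r, B)    # success: r packets leave the buffer
                nxt_fail = min(q + lam, B)      # failure: nothing leaves
                # Expected overflow from eq. (6), averaged over success and failure.
                overflow = ((1 - eps) * max(q + lam - r - B, 0)
                            + eps * max(q + lam - B, 0))
                stage = q / lam + overflow / lam * d_drop + beta * power(r)
                total = stage + alpha * ((1 - eps) * V[nxt_ok] + eps * V[nxt_fail])
                if total < best:
                    best, policy[q] = total, r
            V_new[q] = best
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break
    return V, policy

if __name__ == "__main__":
    convex_power = lambda r: math.exp(0.3 * r) - 1.0   # hypothetical power curve
    V, mu = value_iteration(lam=2, B=10, eps=0.1, beta=0.5,
                            power=convex_power, d_drop=50.0)
    print(mu)
```

An outer loop would then bisect on `beta` until the average power of the resulting policy meets the constraint $P$, and repeat over the candidate error rates. When `beta` is zero, power is free and the computed policy sends the whole queue in every state.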

With Algorithm 1, we can solve (13) to find the optimal error rate and transmission rate policy, and the minimum average latency. To provide insights on the structure of the optimal transmission control, we now present a closed-form characterization as $M \to \infty$ in Section IV.

## IV Large-Array Latency-Optimal Control

In this section, we evaluate how the optimal solution to the latency minimization problem (9) behaves as the array size $M \to \infty$. Specifically, we seek to find the minimum achievable latency for systems with a large array.

We start with the following assumptions on the distribution of the per-antenna channel gain $\kappa_n$ in (11).

• Its mean stays of constant order as $M$ increases (so that the array gain $M\kappa_n$ grows linearly in $M$), i.e.,

$$\lim_{M\to\infty} \mathbb{E}[\kappa_n] = O(1). \tag{16}$$
• Its variance is inversely proportional to $M$, i.e.,

$$\lim_{M\to\infty} \mathrm{Var}[\kappa_n] = O\!\left(\frac{1}{M}\right). \tag{17}$$

It is worthwhile to comment that the above assumptions on the per-antenna gain are reasonable and hold true in many practical systems. For example, for a single user system with imperfect channel and in Rayleigh fading environment, it is straightforward to check that the mean condition (16) and variance condition (17) are both satisfied. In Section V, we will show that conditions (16) and (17) also hold true in uplink multiuser systems with imperfect channel.

Based on the mean condition (16), the achievable rate of the link (on each subcarrier) converges to $\log M$ plus a fixed constant that does not increase with $M$, as the array size $M \to \infty$. Hence, $\log M$ can be viewed as the link “capacity”. In a practical coded system targeting low latency and high reliability, only part of the capacity can be achieved. In the asymptotic analysis, we define the system utilization factor to be a constant $\rho$ as

$$\rho \triangleq \frac{\lambda L}{N \log M}, \tag{18}$$

where $\lambda$ is the packet arrival rate, $L$ is the number of bits in a packet, and $N$ is the number of subcarriers. Hence, under (18), the packet arrival rate increases with the array size and equals $\lambda = \rho N \log M / L$. Conceptually, the term $N \log M$ can be viewed as the total “capacity” of the wideband link and $\lambda L$ can be viewed as the data load. Thus, the utilization factor can be interpreted as the ratio between the offered data load and the total link “capacity”.

We also make the following assumptions for mathematical tractability. We consider an infinite buffer (i.e., $B = \infty$), thus no buffer overflow or overflow latency occurs. The error rate can be chosen from a continuous set. We also consider that there does not exist a finite positive value such that . (Footnote 4: For the degenerate case where there exists a finite such that , there is a finite array size with which all incoming traffic can be instantly served without any waiting time and without any channel-induced retransmission.)

### IV-A Array-Latency Scaling Lower Bound

Notice that a trivial lower bound of $D^*(M)$ is $1$ frame, which is the first transmission attempt of a packet. This $1$-frame latency lower bound can only be achieved if the transmission power can be arbitrarily high. We now provide a tighter lower bound of the array-latency curve $D^*(M)$. We will later present a simple yet optimal transmission control policy that can achieve such a lower bound in the large-array regime.

###### Theorem 1 (Latency Scaling Lower Bound).

The optimum array-latency curve satisfies

$$D^*(M) - 1 \ge \frac{\epsilon_o}{1 - \epsilon_o}, \tag{19}$$

where $\epsilon_o$ is the minimum error rate that results in a finite average queue length. It is expressed through $F_\eta$, the CDF of the effective channel gain in (11), the utilization factor $\rho$ in (18), and the number of pilots $\tau$.

###### Proof.

The main idea is to first lower bound the average latency by considering only the packet retransmission latency. We then complete the proof by converting the latency lower bound via Jensen’s inequality. Appendix A provides the proof details. ∎

Theorem 1 presents a latency lower bound. It states that selecting any error rate smaller than $\epsilon_o$ leads to a very large queueing latency. Both $\epsilon_o$ and the latency lower bound increase as the channel knowledge reduces, i.e., as the number of pilots $\tau$ reduces. For example, if the channel estimation error is large, the retransmission latency becomes large. Therefore, without good enough channel estimation, neither the reliability target nor the latency target can be met. Given the highly complex nature of the considered lossy channel with retransmissions, it is not immediately clear whether the latency lower bound in Theorem 1 is achievable. We show that there exists a simple transmission control policy that is latency-optimal as $M \to \infty$ in Section IV-B.

### IV-B Large-Array Optimal Error Rate and Transmission Rate Control

In this subsection, we first present a simple transmission control and then prove that it is latency-optimal as $M \to \infty$.

###### Definition.

We define the Large-arraY Reliability and Rate Control (LYRRC) to be

$$\begin{cases} \epsilon^l \triangleq F_\eta\left[\dfrac{1}{M^{1-\rho}}\left(\dfrac{1}{P\gamma} + \dfrac{1}{\tau p_\tau \gamma} + \dfrac{1}{P \tau p_\tau \gamma^2}\right)\right], \\[2ex] \mu^l:\ r(q) \triangleq \min(q,\ 2\lambda). \end{cases} \tag{20}$$

The LYRRC policy contains two parts: an error rate control policy $\epsilon^l$ and a transmission rate control policy $\mu^l$. Here, the superscript $l$ stands for large-array. In the error rate control policy $\epsilon^l$, a smaller error rate is selected for systems with a larger array size or if the utilization factor reduces. In addition, the selected error rate increases with the channel estimation error. The transmission rate policy $\mu^l$ describes a simple thresholding rule: if there are more than $2\lambda$ packets in the buffer queue, i.e., $q > 2\lambda$, then $2\lambda$ packets will be transmitted. If fewer than $2\lambda$ packets are currently in the buffer, all packets in the queue will be scheduled for transmission in the frame. In each frame, based on the transmission rate $r(q)$, the user utilizes power adaption (12) to achieve the error rate target $\epsilon^l$. Under (18), the arrival rate scales as $\lambda = \rho N \log M / L$. Hence, the error rate and the transmission rate of LYRRC are both determined by the array size $M$.
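The two parts of (20) translate directly into code. The sketch below is illustrative only: it replaces $F_\eta$ with an empirical CDF over supplied samples of the effective channel gain (an assumption of this example), and the function names are hypothetical.

```python
import numpy as np

def lyrrc_error_rate(M, rho, P, gamma, tau, p_tau, eta_samples):
    """Error rate part of eq. (20), with F_eta approximated by an empirical CDF."""
    x = (1.0 / (P * gamma)
         + 1.0 / (tau * p_tau * gamma)
         + 1.0 / (P * tau * p_tau * gamma ** 2)) / M ** (1.0 - rho)
    return float(np.mean(np.asarray(eta_samples) <= x))

def lyrrc_rate(q, lam):
    """Rate part of eq. (20): send the whole queue, capped at twice the arrival rate."""
    return min(q, 2 * lam)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M, N = 16, 4
    # Assumption: kappa_n = ||h_n||^2 / M with h_n ~ CN(0, I_M); eta is the geometric mean.
    kappa = rng.gamma(shape=M, scale=1.0, size=(100_000, N)) / M
    eta = np.exp(np.log(kappa).mean(axis=1))
    print(lyrrc_error_rate(M, rho=0.5, P=1.0, gamma=1.0, tau=4, p_tau=1.0, eta_samples=eta))
    print(lyrrc_rate(q=3, lam=2), lyrrc_rate(q=10, lam=2))
```

An empirical CDF cannot resolve probabilities much smaller than the reciprocal of the sample count, so very small target error rates would need an analytical or importance-sampled $F_\eta$ instead.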

To provide insights on the reasoning behind $\mu^l$, we consider the associated Markov chain of the buffer-length. The buffer-length state transition under any error rate $\epsilon$, which is not necessarily equal to $\epsilon^l$, and the transmission rate policy $\mu^l$ is depicted in Fig. 5. By Little’s Law, the average latency equals the ratio between the average queue-length and the arrival rate $\lambda$. Notice that $\lambda$ is the difference between the adjacent states in Fig. 5. Hence, the average queue-length is proportional to $\lambda$ (see Appendix B for a rigorous proof). As a result, the average latency depends only on the error rate $\epsilon$, but not on $\lambda$.

The transmission rate control policy $\mu^l$ applies a negative drift with probability $1 - \epsilon$ towards the minimum queue-length $\lambda$. To minimize the latency as $M \to \infty$, the queue-length needs to be regulated towards the minimum queue-length $\lambda$. This regulation is achieved by selecting a smaller error rate. As mentioned above, the error rate of LYRRC (20) reduces as the array size increases. We conclude that the achieved latency under LYRRC is a function of the error rate, which reduces as $M \to \infty$. Next, we will characterize the latency under LYRRC and prove that it is asymptotically latency-optimal.

###### Lemma 1 (Latency Under Transmission Rate with Thresholding).

Under any error rate and transmission rate policy , the average latency is .

###### Proof.

The main idea is to compute the steady-state distribution of the queue length, which is a Markov chain with countably infinite states. Appendix B provides the complete proof. ∎

Lemma 1 provides a closed-form characterization of the transmission rate policy when the maximum buffer-length is infinite. By using Lemma 1, we have that the achieved latency of LYRRC is . We next prove the optimality of LYRRC (20) by comparing the achieved latency to the minimum latency lower bound in Theorem 1.
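The buffer-length chain behind Lemma 1 can be explored with a short simulation. The sketch below is illustrative and rests on several assumptions that are not spelled out in this section: exactly $\lambda$ packets arrive per frame, a failed frame leaves all scheduled packets in the queue, and the queue is sampled at the start of each frame. It compares the simulated latency (via Little's Law) with the mean of the geometric steady-state distribution (34) from Appendix B.

```python
import random


def simulate_latency(eps: float, lam: int = 1, frames: int = 200_000, seed: int = 0) -> float:
    """Estimate the average latency in frames as (mean queue length) / (arrival rate)."""
    if not 0.0 <= eps < 0.5:
        raise ValueError("the chain has a steady state only for eps < 0.5")
    rng = random.Random(seed)
    q = lam          # queue length at the start of a frame, after arrivals
    total = 0
    for _ in range(frames):
        total += q
        r = min(q, 2 * lam)        # thresholding rule r = min(q, 2 * lambda)
        if rng.random() >= eps:    # the frame is decoded with probability 1 - eps
            q -= r
        q += lam                   # assumed deterministic arrivals
    return total / frames / lam


def geometric_mean_latency(eps: float) -> float:
    """Mean of i under pi_i = (1 - x) * x**(i - 1) with x = eps / (1 - eps), as in (34)."""
    x = eps / (1.0 - eps)
    return 1.0 / (1.0 - x)


if __name__ == "__main__":
    for eps in (0.01, 0.1, 0.3):
        print(eps, simulate_latency(eps), geometric_mean_latency(eps))
```

Under these assumptions the two columns should agree closely, and changing `lam` should leave the latency unchanged, in line with the observation that the latency depends on the error rate but not on the arrival rate.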

###### Theorem 2 (Optimal Large-Array Control).

For any and positive , as , LYRRC (20) guarantees that the average latency is within a vanishing gap from optimal as

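$$D_{\epsilon^l,\mu^l}(M)-D^{*}(M)\cong(\epsilon^l)^2=\left\{F_\eta\!\left[\frac{1}{M^{1-\rho}}\left(\frac{1}{P\gamma}+\frac{1}{\tau p_\tau\gamma}+\frac{1}{P\tau p_\tau\gamma^{2}}\right)\right]\right\}^{2},\quad M\to\infty, \tag{21}$$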

where is the average latency by LYRRC. And denotes that .

###### Proof.

We first characterize the gap between latency under LYRRC and minimum latency by combining Lemma 1 and Theorem 1. The proof is complete by using the large deviation theory to show that the power constraint is satisfied. Please see Appendix C for details. ∎

Theorem 2 establishes the optimality of LYRRC. In addition, the latency gap between the lower bound and LYRRC increases as the channel estimation error increases ( reduces). Furthermore, Lemma 1 and Theorem 2 suggest that LYRRC reduces latency by selecting a lower target error rate for systems with larger array sizes. Hence, the reliability and low-latency design objectives of 5G URLLC naturally align with each other for large . Finally, we note that LYRRC can achieve the optimal latency for any , which seems to contradict the transmission rate of . This can be explained by the fact that we are considering a wireless link with power adaptation, and the probability of transmitting at reduces as . Therefore, using larger transmission power (over a few frames) can increase the peak transmission rate beyond the long-term average rate. We next combine Theorem 2 and Theorem 1 to characterize the scaling of the array-latency curve in closed form.
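The quadratic gap in (21) can be illustrated numerically. The sketch below is a hedged illustration rather than the proof in Appendix C: it assumes that the latency under the thresholding rule equals the mean of the geometric distribution (34), and it uses the per-packet expression (31) evaluated at the same error rate as the reference. Both function names are hypothetical.

```python
def retransmission_bound(eps: float) -> float:
    """Transmission plus retransmission time from (31): 1 + eps / (1 - eps)."""
    return 1.0 + eps / (1.0 - eps)


def chain_latency(eps: float) -> float:
    """Mean of the geometric steady-state distribution (34), in frames."""
    x = eps / (1.0 - eps)
    return 1.0 / (1.0 - x)


def latency_gap(eps: float) -> float:
    """Difference between the two expressions above at the same error rate."""
    return chain_latency(eps) - retransmission_bound(eps)


if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3):
        # The ratio gap / eps**2 should approach 1 as eps becomes small.
        print(eps, latency_gap(eps), latency_gap(eps) / eps**2)
```

Algebraically, the difference is $\epsilon^2/[(1-\epsilon)(1-2\epsilon)]$, which behaves like $\epsilon^2$ for small $\epsilon$.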

###### Theorem 3 (Large-Array Latency Scaling).

As , for any positive and , the optimum latency converges to frame as

 (22)

where is the CDF function of the effective channel gain . And denotes that .

###### Proof.

Theorem 1 provides a latency lower bound. The optimal joint control in Theorem 2 serves as an achievability proof and provides an upper bound. The proof is complete by showing that the ratio of the upper bound and the lower bound converges to as . ∎

Theorem 3 provides a closed-form characterization of the large-array latency. It describes the minimum latency as a function of the utilization factor , the channel estimation error, and the array size . As , . Thus, both the retransmission and queueing latency converge to frame. Additionally, we observe that serving less load (smaller ) leads to a faster latency convergence rate. Finally, we comment on the impact of imperfect channel state information. For any , the latency converges to the frame as . However, the convergence speed of is determined by the channel estimation error, which reduces as the number of pilots increases. For a practical system with finite , a more accurate channel estimate leads to smaller latency.

During the large-array analysis, we find that the CDF of the effective channel critically determines both the minimum latency and the latency-optimal target error rate. In the real world, the spatial channel correlation [33, 34] can exist because of the limited number of scatterers. Due to such spatial correlation, the distribution of  can differ from the popular Rayleigh fading model. To evaluate the real-world performance of LYRRC, we conduct numerical experiments with measured over-the-air channels in Section VI.

## V Multi-user Extension

We now consider the -user latency minimization problem over the lossy channel. The base-station still has only imperfect channel knowledge obtained from uplink pilot estimation. Fig. 6 depicts the setup. In this section, suffix denotes the user index. Each user’s transmission power is subject to an individual long-term power constraint . The multiuser controller decides the error rate and the transmission rate of User . The buffer dynamics of each user are identical to those of the single-user counterpart described in Section II-A2.

To minimize the system latency of the users at the same time, we associate positive weights  to users. The multiuser latency minimization problem is then

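$$\begin{aligned}
\min_{\epsilon[k],\,\mu[k],\,k=1,2,\dots,K}\quad & \sum_{k=1}^{K}\omega_k\left(\frac{\bar{q}_t[k]}{\lambda[k]}+\frac{\lambda_{\mathrm{drop}}[k]}{\lambda[k]}D_{\mathrm{drop}}\right)\\
\text{s.t.}\quad & \lim_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}p_t[k]\le P[k],\quad k=1,\dots,K,\\
& \epsilon[k]=\mathrm{Prob}\!\left[\sum_{n=1}^{N}\log\left(\mathrm{SINR}_{t,n}[k]\right)\le r_t[k]L\right],\quad k=1,\dots,K,\\
& \epsilon[k]\le\epsilon_{\max}[k],\quad k=1,\dots,K,
\end{aligned} \tag{23}$$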

where is the maximum error rate (minimum reliability) of User , and is the receiver SINR of the -th subcarrier in Frame for User . Here, the buffer length and buffer overflow of User are given by (5) and (6), respectively.
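To make the objective of (23) concrete, the following sketch evaluates the weighted latency for a given set of per-user statistics. It only computes the objective value and does not address the constraints or the optimization itself. The `UserStats` container, its field names, and the example numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class UserStats:
    weight: float        # omega_k, the weight of the user
    mean_queue: float    # average buffer length, in packets
    arrival_rate: float  # lambda[k], in packets per frame
    drop_rate: float     # lambda_drop[k], dropped packets per frame


def weighted_latency(users: Iterable[UserStats], drop_penalty: float) -> float:
    """Objective of (23): sum_k w_k * (q_k / lam_k + (lam_drop_k / lam_k) * D_drop)."""
    total = 0.0
    for u in users:
        if u.arrival_rate <= 0:
            raise ValueError("arrival rate must be positive")
        queueing = u.mean_queue / u.arrival_rate                  # Little's Law term
        dropping = u.drop_rate / u.arrival_rate * drop_penalty    # overflow penalty term
        total += u.weight * (queueing + dropping)
    return total


if __name__ == "__main__":
    users = [UserStats(1.0, 2.0, 1.0, 0.0), UserStats(2.0, 3.0, 2.0, 0.1)]
    print(weighted_latency(users, drop_penalty=10.0))
```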

To jointly detect signals from the users, the base-station applies (receiver) beamforming on the received signal. Let matrix denote the uplink small-scale channel fading between the -antenna base-station and the users. The channel of User on Subcarrier is and is the -th column of . Throughout this section, we assume that users are in a rich scattering environment and that user channels follow i.i.d. Rayleigh fading.

In practice, the base-station learns the uplink channel matrix by estimating uplink pilots. Denote the estimated channel as . The channel estimation accuracy depends on the number of pilots and the pilot power. With the commonly used MMSE estimator, the estimation error between each base-station antenna and User is a complex Gaussian random variable with zero mean and variance of . Here, and are the number of uplink pilots and the pilot power, respectively. Using the estimated channel, the base-station generates zero-forcing receive beamformers to detect the uplink signal of each user. The beamforming matrix is . On Subcarrier , the corresponding uplink SINR of User is

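$$\mathrm{SINR}_n[k]=\frac{\gamma[k]\,p[k]\,\left\|v_n^{H}[k]\,h_n[k]\right\|^{2}}{1+\sum_{j\neq k}\gamma[j]\,p[j]\,\left\|v_n[k]^{H}h_n[j]\right\|^{2}},\quad k=1,\dots,K, \tag{24}$$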

where is the -th column of . The second term in the denominator represents the residual inter-beam interference after the receiver beamforming. Due to channel estimation error, the residual inter-beam interference is always positive. Previous work has shown that, with an MMSE channel estimator of pilots, the uplink effective SINR is [7, Eq.]

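$$\mathrm{SINR}_n[k]=\frac{p_k\gamma_k}{\left(1+\sum_{i=1}^{K}\frac{p[i]\gamma[i]}{\tau p_\tau[i]\gamma[i]+1}\right)\left[\left(\hat{H}_n^{H}\hat{H}_n\right)^{-1}\right]_{kk}}, \tag{25}$$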

where denotes the -th diagonal element of a matrix and captures the inter-beam interference penalty.

Due to the interference, for a practical uplink system where each user is unaware of other users’ channel or queue information, the joint error rate and transmission rate policy design appears intractable. To see the difficulty of the joint policy design, let and be the (scheduled) error rate and transmission rate, respectively. Recall from (25) that the inter-beam interference of each user also depends on the power and large-scale fading of the other users. This problem is further complicated by the fact that each user’s transmission power changes in each frame based on its current queue length. Thus, it is extremely difficult for each user with only local knowledge (queue length and large-scale fading) to infer the exact value of and hence the proper transmission power. As a result, the error rate and transmission rate policy cannot be designed in a distributed manner by each user, which is undesirable for a practical uplink system.

Here, we proceed with the observation that, in real-world systems, the pilot power is usually required to be higher than the data signal power [24]. Hence, the term is upper bounded by , which can be viewed as a worst-case interference penalty. Each user then adjusts its power based on the loss upper bound. Substituting the expression (25) of the multiuser system into (4), we then have that the error rate is now

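$$\epsilon\approx\mathrm{Prob}\!\left[\left(\prod_{n=1}^{N}\kappa_n\right)^{1/N}\le\frac{\left(1+\frac{K}{\tau}\right)\exp(rL/N)}{Mp\gamma}\right], \tag{26}$$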

where the per-antenna gain is

$$\kappa_n=\left\{M\left[\left(\hat{H}_n^{H}\hat{H}_n\right)^{-1}\right]_{kk}\right\}^{-1}. \tag{27}$$
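The worst-case interference penalty $1 + K/\tau$ used in (26) can be sanity-checked numerically. The sketch below assumes one reading of the penalty term in (25), namely $1+\sum_i p[i]\gamma[i]/(\tau p_\tau\gamma[i]+1)$ with a common pilot count $\tau$ and pilot power $p_\tau$ for all users; this reading, the function names, and the example values are assumptions. When the pilot power is at least as large as every data power, each summand is at most $1/\tau$, which gives the bound.

```python
from typing import Sequence


def interference_penalty(powers: Sequence[float], gains: Sequence[float],
                         num_pilots: int, pilot_power: float) -> float:
    """Assumed penalty term of (25): 1 + sum_i p_i * g_i / (tau * p_tau * g_i + 1)."""
    return 1.0 + sum(p * g / (num_pilots * pilot_power * g + 1.0)
                     for p, g in zip(powers, gains))


def worst_case_penalty(num_users: int, num_pilots: int) -> float:
    """Upper bound 1 + K / tau, valid when pilot power >= every data power."""
    return 1.0 + num_users / num_pilots


if __name__ == "__main__":
    powers = [0.5, 1.0, 0.2]   # hypothetical per-user data powers
    gains = [2.0, 0.1, 10.0]   # hypothetical large-scale fading gains
    tau, p_tau = 4, 1.0        # hypothetical pilot count and pilot power
    print(interference_penalty(powers, gains, tau, p_tau),
          worst_case_penalty(len(powers), tau))
```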

Similarly to the single-user case, we also compute the per-frame transmission power as

$$p(r,\epsilon)=\frac{\left(1+\frac{K}{\tau}\right)\exp(rL/N)}{F_\eta^{-1}(\epsilon)\,M\gamma}, \tag{28}$$

where is the scheduled reliability (error rate) target and is the transmission rate (in units of packets). Here, the approximation in (26) arises because each user considers the upper bound of the inter-beam interference. The per-antenna gain (27) is independent of the large-scale channel, transmission power, and hence queue length of the other users. For each user, the distribution of the effective channel in (11) then becomes independent of the channel, queue length, and power of the other users. Therefore, we can decouple the multiuser problem. By adopting a new distribution of the effective channel gain (generated by (27)) and the new power mapping (28), the multiuser problem is decoupled into independent single-user problems (9). Each of the single-user problems can be solved by Algorithm 1. We now further demonstrate that the large-array analytical results in Section IV also apply to the considered multiuser system with imperfect channel knowledge.
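The relation between the power mapping (28) and the error event in (26) can be sketched as follows. The inverse CDF $F_\eta^{-1}$ is passed in as a callable because its form depends on the channel statistics; the constant used in the example is a placeholder, not a channel model. All function and parameter names are illustrative assumptions.

```python
import math
from typing import Callable, Sequence


def tx_power(rate: float, eps: float, inv_cdf: Callable[[float], float],
             M: int, gamma: float, K: int, tau: int, L: float, N: int) -> float:
    """Power mapping (28): (1 + K/tau) * exp(r L / N) / (F^{-1}(eps) * M * gamma)."""
    threshold = inv_cdf(eps)
    if threshold <= 0:
        raise ValueError("inverse CDF must return a positive gain threshold")
    return (1.0 + K / tau) * math.exp(rate * L / N) / (threshold * M * gamma)


def in_outage(kappas: Sequence[float], rate: float, power: float,
              M: int, gamma: float, K: int, tau: int, L: float) -> bool:
    """Error event of (26): geometric mean of kappa_n at or below the threshold."""
    N = len(kappas)
    log_geo_mean = math.fsum(math.log(k) for k in kappas) / N
    log_threshold = math.log1p(K / tau) + rate * L / N - math.log(M * power * gamma)
    return log_geo_mean <= log_threshold


if __name__ == "__main__":
    inv_cdf = lambda eps: 0.5  # placeholder for F_eta^{-1}; not a real channel model
    p = tx_power(2.0, 0.01, inv_cdf, M=64, gamma=1.0, K=4, tau=2, L=1.0, N=4)
    print(p, in_outage([0.4] * 4, 2.0, p, 64, 1.0, 4, 2, 1.0))
```

With the power chosen by (28), the threshold in (26) reduces to $F_\eta^{-1}(\epsilon)$, so the frame fails exactly when the geometric mean of the per-antenna gains falls at or below that value.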

###### Theorem 4.

For multiuser downlink systems, LYRRC becomes

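$$\begin{cases}\epsilon^l[k]=F_\eta\!\left[\dfrac{1}{M^{1-\rho[k]}}\left(1+\dfrac{K}{\tau[k]}\right)\right]\\[2ex] \mu^l[k]:\ r[k]=\min\left(q[k],\,2\lambda[k]\right).\end{cases} \tag{29}$$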

As , for positive and , each user operating under LYRRC achieves the minimum latency of

$$D^{*}[k]-1\cong\epsilon^l[k],\quad k=1,2,\dots,K,\quad M\to\infty. \tag{30}$$

Here, denotes that .

###### Proof.

With random matrix theory, we prove the result by showing that (27) satisfies the mean (16) and variance (17) conditions. Please find the proof in Appendix D. ∎

LYRRC therefore provides the latency-optimal error rate and transmission rate policies for the multiuser massive MIMO system, and Theorem 3 also captures the optimal latency of each user. In conclusion, for any non-negative weights , we can convert the user latency minimization problem into parallel single-user optimization problems. For finite , Algorithm 1 solves each of the single-user problems and provides the optimal error rate and transmission rate policy. Furthermore, we showed that each user operating under LYRRC in a distributed manner is latency-optimal as . Section VI will use numerical experiments to evaluate the proposed transmission control.

## VI Numerical Results

In this section, we utilize measured over-the-air channels to confirm our previous analysis in Section III and Section V. During the numerical evaluation, the base-station still has only imperfect channel knowledge from pilot estimation. The latency is reported in seconds, obtained by multiplying the frame duration by the latency measured in frames. We measure the over-the-air channels between mobile clients and a -antenna massive MIMO base-station with the Argos system [2] on the campus of Rice University. Fig. (a) and (b) describe the Argos array and the over-the-air measurement setup. We measured the GHz Wi-Fi channel ( MHz, non-empty data subcarriers) for four pedestrian users at different locations, which are marked in Fig. (c). For each location, we take channel measurements over frames of all subcarriers. During measurements, the effective measured between each mobile user and each base-station antenna is higher than dB. In simulations, we treat the measured over-the-air channel traces as the perfect channel.

The base-station adopts an MMSE estimator to estimate uplink pilots, each of power dBm, from the users. Using the imperfect (estimated) channel, the base-station generates zero-forcing receive beamforming vectors to decode the signal of each user. Each user is assumed to follow an average power constraint of dBm with large-scale fading of dB. The maximum buffer length is . The packet arrival rate is uniform over time at the rate of packets per frame, and the packet size is bits per OFDM symbol. The latency penalty of dropped packets from buffer overflow is s, and each self-contained frame has a duration of ms. The state space of the error rate is , , and . Each user is under a maximum error rate constraint of %, which is equivalent to the 5G URLLC reliability constraint of % (over ms).

Fig. 8 provides the latency performance comparison of four different policies over the measured over-the-air channels with different channel estimation accuracies. The blue lines are the optimal array-latency curves under the proposed joint reliability and transmission rate policy, which are obtained by Algorithm 1. The red lines are the proposed low-complexity LYRRC (20), which was discussed in Section IV. The green lines capture the latency under optimal transmission rate adaptation but fixed reliability (error rate of ). The black lines are the latencies of fixed reliability ( error rate) and transmission rate adaptation under a peak power constraint, which is currently deployed in LTE and Wi-Fi systems.

The proposed joint control (blue and red lines) clearly provides better latency performance than the two fixed-reliability counterparts. Allowing the error rate to adapt to the array size turns out to reduce the latency significantly. Compared to the fixed error rate with peak power control, a latency reduction is observed for base-stations with array size larger than . Additionally, when the array size is larger than , we find that the proposed joint control can provide a latency reduction compared to the state-of-the-art control that fixes the error rate and adapts the transmission rate [16, 17, 18, 19] (based on array size and queue length). Our large-array asymptotically latency-optimal control, LYRRC, turns out to be near latency-optimal when the array size is larger than . It is worthwhile to note that the above-described policy impacts on latency are consistent across different channel accuracies, i.e., systems with different numbers of uplink pilots (). Finally, we find that fixed error rate (at ) policies lead to at least ms latency and cannot satisfy the URLLC latency requirement.

Fig. 8 also captures the influence of imperfect channel state information on latency. For a multiuser uplink system, the inter-beam interference (27) reduces as the channel estimation error reduces, i.e., as the number of pilots increases. Achieving the same reliability (error rate) becomes more power expensive with larger inter-beam interference. Therefore, under all policies, the latency increases as the number of pilots reduces. For example, a base-station with antennas and perfect channel knowledge can reduce latency to near the frame first transmission time of ms. But with an estimated channel from pilots, achieving the same latency needs a larger array size of . Additionally, even for systems with a single uplink pilot (), the proposed joint control satisfies the latency requirement of URLLC with larger than .

We now comment on the optimal error rate that minimizes the latency. Fig. (a) describes the latency-optimal error rate obtained when solving the latency minimization problems in Fig. 8. The latency-optimal error rate reduces as reduces due to less accurate channel estimation, which agrees with LYRRC. Additionally, due to the reliability constraint, the solved latency-optimal error rates satisfy the 5G reliability requirement (error rate of ).

Finally, we use simulations to verify our structural analysis in Section IV. Fig. 8 confirms that LYRRC (20) is near latency-optimal for larger than a finite number of . One technical contribution independent of the massive MIMO system is a simple transmission rate policy as , which is referred to as the “rule of double” and is part of LYRRC. Lemma 1 captures that, when buffer size , the resulting latency by using and a error rate is . Fig. (b) shows the resulting latency by using with a finite buffer size. The (large-buffer) asymptotic latency turns out to accurately approximate the system latency when is larger than . As the target reliability increases (error rate reduces), buffer overflow is less likely to happen and the latency approximation in Lemma 1 becomes increasingly accurate.

## VII Conclusion

In this work, we study latency-optimal cross-layer control over wideband massive MIMO channels. By identifying a tradeoff between queueing and retransmission latency, we find that a lower physical-layer target error rate does not always guarantee lower latency. We present algorithms that generate the optimal error rate and transmission rate policies. We show that, to achieve the minimum latency, the target error rate can no longer be considered fixed and needs to be adapted based on the base-station array size, the channel estimation accuracy, and the traffic arrival rate. Our results also demonstrate that massive MIMO systems have the potential to achieve both high reliability and low latency and are promising candidates for 5G URLLC.

## Appendix A Proof of Theorem 1

We use a per-packet argument. Since an infinite buffer is assumed in this section, no packet is dropped and all packets will be successfully received with a variable number of transmissions due to the potential channel-induced error. For any selected error rate , let be the average number of retransmissions. The average retransmission latency plus transmission time is then

$$1+\sum_{r=0}^{\infty}\mathrm{Prob}(r)\,r=1+\sum_{r=0}^{\infty}r(1-\epsilon)\epsilon^{r}=1+\frac{\epsilon}{1-\epsilon}, \tag{31}$$

which is a lower bound on the total latency. To finish the proof, we now lower bound under the long-term power constraint . In the steady state, the average transmission rate must equal the packet arrival rate, that is,

$$\lambda=E_\pi\left[r(1-\epsilon)\right]=E_\pi[r]\,(1-\epsilon). \tag{32}$$

Since the power function (12) is convex in , we apply Jensen’s inequality and obtain

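$$P=E_\pi\left[p(r,\epsilon,\gamma)\right]\ \ge\ \left\{\frac{\tau p_\tau\gamma}{\tau p_\tau\gamma+1}\cdot\frac{\gamma F_\eta^{-1}(\epsilon)}{\exp\!\left[\left(\frac{\rho}{1-\epsilon}-1\right)\log M\right]}-\frac{\gamma}{\tau\gamma p_\tau+1}\right\}^{-1}.$$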

Here, the second step follows from the considered arrival packet scaling (18). Function is an inverse CDF and is non-decreasing. A lower bound of is computed as

Using the monotonicity of the CDF, a lower bound on the error rate is then

 (33)

We finish the proof by combining (33) and (31).
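The closed form in (31) can be checked with a truncated series. The sketch below assumes the geometric retransmission model stated in (31), where a packet needs $r$ retransmissions with probability $(1-\epsilon)\epsilon^{r}$; the function names and the truncation length are illustrative choices.

```python
def expected_transmission_time(eps: float, terms: int = 10_000) -> float:
    """Truncated series of (31): 1 + sum over r of r * (1 - eps) * eps**r."""
    if not 0.0 <= eps < 1.0:
        raise ValueError("error rate must lie in [0, 1)")
    total = 0.0
    for r in range(terms):
        total += r * (1.0 - eps) * eps**r
    return 1.0 + total


def closed_form(eps: float) -> float:
    """Right-hand side of (31): 1 + eps / (1 - eps)."""
    return 1.0 + eps / (1.0 - eps)


if __name__ == "__main__":
    for eps in (0.001, 0.05, 0.2):
        print(eps, expected_transmission_time(eps), closed_form(eps))
```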

## Appendix B Proof of Lemma 1

We compute the queueing latency by considering the steady state. Under transmission rate policy , the buffer length process (5) is rewritten as The buffer length process under thus constitutes a Markov chain with countably infinite states [36]. The distribution of is determined by error rate as and . The state transition is shown in Fig. 5. Denote the steady state distribution of the buffer length as . We then have that

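$$\begin{cases}\pi_{\lambda}=(1-\epsilon)\pi_{\lambda}+(1-\epsilon)\pi_{2\lambda}\\ \pi_{i\lambda}=\epsilon\,\pi_{(i-1)\lambda}+(1-\epsilon)\pi_{(i+1)\lambda},\quad i\ge 2,\end{cases}$$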

where . The steady state distribution is then computed as

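$$\pi_{i\lambda}=\left(1-\frac{\epsilon}{1-\epsilon}\right)\left(\frac{\epsilon}{1-\epsilon}\right)^{i-1},\quad i=1,2,\dots. \tag{34}$$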

Using (34), the average latency is then computed as

 1λEπq[q]=1λ(∞∑i=1πiλiλ)=∞∑i=1(ϵ1−ϵ)i−1i−∞∑i=1(ϵ1−