Large MIMO Detection Schemes Based on Channel Puncturing: Performance and Complexity Analysis

by H. Sarieddeen, et al.
American University of Beirut

A family of low-complexity detection schemes based on channel matrix puncturing, targeted for large multiple-input multiple-output (MIMO) systems, is proposed. It is well known that the computational cost of MIMO detection based on QR decomposition is directly proportional to the number of non-zero entries involved in back-substitution and slicing operations in the triangularized channel matrix, which can be too high for low-latency applications involving large MIMO dimensions. By systematically puncturing the channel to have a specific structure, it is demonstrated that the detection process can be accelerated by employing standard schemes such as chase detection, list detection, nulling-and-cancellation detection, and sub-space detection on the transformed matrix. The performance of these schemes is characterized and analyzed mathematically, and bounds on the achievable diversity gain and probability of bit error are derived. Surprisingly, it is shown that puncturing does not negatively impact the receive diversity gain in hard-output detectors. The analysis is extended to soft-output detection when computing per-layer bit log-likelihood ratios; it is shown that significant performance gains are attainable by ordering the layer of interest to be at the root when puncturing the channel. Simulations of coded and uncoded scenarios confirm that the proposed schemes scale up efficiently both in the number of antennas and constellation size, as well as in the presence of correlated channels. In particular, soft-output per-layer sub-space detection is shown to achieve a 2.5 dB SNR gain at a bit error rate of $10^{-4}$ in 256-QAM 16×16 MIMO, while saving 77% of nulling-and-cancellation computations.



I Introduction

Multiple-input multiple-output (MIMO) technology exploits the spatial dimension by adding more antennas [1] to increase spectral efficiency and network capacity. However, conventional MIMO configurations fall short of providing the required spatial diversity in the upcoming fifth generation (5G) mobile communication standard, which promises to connect billions of devices and achieve data rates of several gigabits per second. To this end, massive MIMO has been introduced [2], in which a few hundred antennas serve tens of terminals over time and frequency resources.

Despite the extensive work on massive MIMO, large MIMO will also play an important role in the future. Large MIMO systems use tens of antennas in communication terminals, and can afford a large number of antennas on both the transmitter and the receiver sides [3]. Large point-to-point MIMO wireless links are of specific interest in 5G for high-speed wireless backhaul connectivity between base stations (BSs). Also, multipoint-to-point large multiuser MIMO can be used in the 5G uplink when the number of served transmitting users is less than, but comparable to, the number of BS antennas. Large MIMO can also be considered for point-to-multipoint downlink multiuser MIMO (MU-MIMO) [4], whether in enhanced versions of the current wireless communications standards or in 5G, where users sharing the same physical resource blocks are chosen based on the degree of orthogonality of their cascaded precoder and channel.

After being traditionally driven by diversity-multiplexing tradeoffs, recent wireless communication system designs have been driven by two factors: system performance in terms of throughput and bit error rate, and system complexity in terms of processing latency and computational complexity. The performance of MIMO systems is largely determined by the detection scheme at the receiver side; various schemes provide different performance-complexity tradeoffs [5]. Linear detectors, such as zero forcing (ZF) and minimum mean square error (MMSE), are the least complex, but also the least optimal. On the other hand, maximum likelihood (ML) detectors are optimal but the most computationally intensive, with complexity that grows exponentially with the number of antennas. Several sub-optimal detectors fill the spectrum in between, including sphere decoders (SD) and their variants [6, 7, 8, 9]. Moreover, in addition to conventional hard-output (HO) detectors, soft-output (SO) detectors play an important role in near-capacity achieving systems, but are more complex because they require processing significantly more signal combinations to generate reliability information.

In massive MIMO systems, linear detectors achieve near-optimal performance by exploiting the channel hardening effect [10], and approximate matrix inversions via Neumann series approximations [11] are used for practical implementations. However, large MIMO systems do not have very large receive-to-transmit antenna ratios. Hence, they cannot achieve the performance gains of asymmetric massive MIMO systems, and they do not allow for similar practical implementations, because Neumann series expansions fail to converge in this regime. For large MIMO systems, the detection schemes in the literature are grouped into several areas: detection based on local search [12, 13]; detection based on meta-heuristics [14, 15]; detection via message passing on graphical models [16, 17]; lattice reduction (LR) aided detection [18, 19]; and detection using Monte Carlo sampling [20]. However, for these schemes to achieve near-ML performance with high orders of antennas and modulation constellations, the entailed complexity would be prohibitive.

A popular family of MIMO detectors that achieves good performance-complexity tradeoffs employs non-linear subset-stream detection. The nulling-and-cancellation (N/C) detector [21] is a low-complexity member of this family; it consists of linear nulling followed by successive interference cancellation (SIC). The chase detector (CD) [22] is a more complex member of this family; it first creates a list of candidate decision vectors, and then chooses the best candidate from this list as a final decision. Chase detection is considered a special case of list detection. However, it differs from list sphere decoding (LSD) [23], for example, in the way the list is generated and administered: in LSD, list admission is based on proximity to an initial solution, while in CD, list generation is deterministic, and is done by spanning all possible sub-tree symbols emanating from the root symbol in a specific layer of interest. Furthermore, other popular subset-stream detectors exist (e.g., [24, 25, 26]) that decompose the channel matrix into lower-order sub-channels to reduce the number of jointly detected streams.

All aforementioned subset-stream detectors make use of QR decomposition (QRD). However, the SO sub-space detector (SSD) [27] transforms the channel matrix via a punctured QRD, which we refer to in this paper as WR decomposition (WRD). In [28, 29, 30], WRD-based SSD is generalized to allow for joint detection of arbitrary-sized subsets of decoupled streams, and efficient implementation methods are presented. The QRD-based version of this detector is called the layered orthogonal lattice detector (LORD) [31, 32], and both are special cases of CD. To the best of our knowledge, the use of punctured QRD in MIMO detectors has not been studied analytically in the literature, and its applicability to large MIMO systems has not been addressed.

The contributions of this paper are summarized as follows:

  1. We present a family of WRD-based detectors that build on popular QRD-based detectors. In particular, we propose a punctured ML (PML) detector, a punctured N/C (PN/C) detector, a punctured CD (PCD), as well as a hard-output sub-space detector.

  2. We analyze mathematically the bit error rate (BER) performance of the proposed HO detectors. First, the diversity gain is characterized and used to show that channel matrix puncturing does not negatively affect the diversity gain in HO detection. Second, the performance of these detectors is studied via a probabilistic BER characterization.

  3. We extend the study to several variations of SO detection schemes, and show that significant performance gains can be achieved with channel puncturing.

  4. We propose efficient architectures and analyze the computational complexity of the proposed detectors. We show that the computational savings are much more pronounced with large MIMO dimensions.

  5. We study the performance of the proposed detectors in the context of large MIMO with high order modulations, and in the presence of spatial channel correlation. We show that the performance of these schemes scales up efficiently with high orders, and that they are superior to their QRD-based counterparts in the presence of channel correlation.

The remainder of the paper is organized as follows. The system model and basic reference detectors are presented in Sec. II. The proposed WRD-based ML detector, N/C detector, CD, and SSD detection algorithms are presented in Sec. III. The achievable diversity gains of these detectors are derived in Sec. IV, followed by a probabilistic BER characterization that describes the behaviour of the proposed approaches in Sec. V. The SO versions of the detectors are then proposed in Sec. VI, and an efficient architecture is proposed in Sec. VII alongside a complexity study. Finally, simulation results are presented in Sec. VIII.

Regarding notation, bold upper case, bold lower case, and lower case letters correspond to matrices, vectors, and scalars, respectively. Scalar norms, vector norms, and Frobenius norms are denoted by $|\cdot|$, $\|\cdot\|$, and $\|\cdot\|_F$, respectively. $\mathbb{E}[\cdot]$, $\mathrm{tr}(\cdot)$, $\Re(\cdot)$, $(\cdot)^T$, and $(\cdot)^H$ stand for the expected value, trace function, real part, transpose, and conjugate transpose, respectively. $\mathcal{N}(\cdot,\cdot)$ refers to the normal distribution, and $Q(\cdot)$ refers to the Q-function, where $Q(x) = \frac{1}{\sqrt{2\pi}}\int_x^{\infty} e^{-u^2/2}\,du$. $\tilde{\mathbf{R}}$ is a punctured matrix with entries $\tilde{r}_{ij}$, and $\mathbf{I}_N$ is an identity matrix of size $N$. Detector ML optimality is in the log-max sense.

II System Model and Reference Detectors

II-A System Model

We consider spatial multiplexing in a MIMO system with $N$ transmit antennas and $M \geq N$ receive antennas. The equivalent complex baseband input-output system relation is given by

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n}, \qquad (1)$$

where $\mathbf{y} \in \mathbb{C}^{M \times 1}$ is the received complex vector, $\mathbf{H} \in \mathbb{C}^{M \times N}$ is the channel matrix with entries that are assumed to be i.i.d. complex, circularly symmetric Gaussian random variables, $\mathbf{x}$ is the transmitted symbol vector, and $\mathbf{n}$ is a complex-valued circularly symmetric Gaussian random vector with zero mean and variance $\sigma^2$ ($\mathbb{E}[\mathbf{n}\mathbf{n}^H] = \sigma^2\mathbf{I}_M$). Each symbol $x_n$, $n = 1, \dots, N$, belongs to a normalized complex constellation $\mathcal{X}$ ($\mathbb{E}[|x_n|^2] = 1$), and we have $\mathbf{x} \in \Lambda$, where $\Lambda$ is the finite set of points on an $N$-dimensional lattice generated by all possible symbol vectors. For simplicity, we assume a uniform modulation constellation on all layers, and hence $|\Lambda| = |\mathcal{X}|^N$. Each symbol $x_n$ carries a coded bit representation of $\log_2|\mathcal{X}|$ bits. The signal-to-noise ratio (SNR) is defined in terms of the noise variance $\sigma^2$.

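At the receiver side, and assuming perfect knowledge of the channel, QRD decomposes $\mathbf{H}$ as $\mathbf{H} = \mathbf{Q}\mathbf{R}$, where $\mathbf{Q} \in \mathbb{C}^{M \times N}$ has orthonormal columns and $\mathbf{Q}^H\mathbf{Q} = \mathbf{I}_N$, and $\mathbf{R} \in \mathbb{C}^{N \times N}$ is a square upper-triangular matrix (UTM) with real and positive diagonal entries. The transformed receive symbol vector can then be equivalently expressed as

$$\tilde{\mathbf{y}} = \mathbf{Q}^H\mathbf{y} = \mathbf{R}\mathbf{x} + \mathbf{Q}^H\mathbf{n}, \qquad (2)$$

where $\mathbf{n}$ and $\mathbf{Q}^H\mathbf{n}$ are statistically identical since $\mathbf{Q}$ is orthonormal.

An “exhaustive” log-max ML detector searches the complete lattice $\Lambda$, computing $|\mathcal{X}|^N$ Euclidean distance metrics, to solve for

$$\hat{\mathbf{x}}^{\mathrm{ML}} = \arg\min_{\mathbf{x} \in \Lambda} \|\tilde{\mathbf{y}} - \mathbf{R}\mathbf{x}\|^2. \qquad (3)$$

Note that the SD achieves exact log-max ML performance with fewer computations, by executing a tree-based search on a subset of $\Lambda$, skipping vectors in the space whose partial distance already exceeds the current best distance.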
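The QRD transformation and the exhaustive search described above can be illustrated with a minimal NumPy sketch. The QPSK constellation, the small 4×3 dimensions, and the helper names `qr_positive_diag` and `ml_detect` are assumptions made for illustration only; they are not taken from the paper.

```python
import itertools
import numpy as np

# QPSK with unit average energy (an illustrative choice of constellation).
QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

def qr_positive_diag(H):
    """Thin QR of H with the diagonal of R made real and positive."""
    Q, R = np.linalg.qr(H)                # reduced mode: Q is M x N, R is N x N
    phases = np.diag(R) / np.abs(np.diag(R))
    Q = Q * phases                        # scale column k of Q by phases[k]
    R = np.conj(phases)[:, None] * R      # scale row k of R by conj(phases[k])
    return Q, R

def ml_detect(y, H, constellation):
    """Exhaustive search over all |X|^N symbol vectors (small N only)."""
    Q, R = qr_positive_diag(H)
    y_t = Q.conj().T @ y                  # transformed receive vector
    best_x, best_d = None, np.inf
    for cand in itertools.product(constellation, repeat=H.shape[1]):
        x = np.array(cand)
        d = np.linalg.norm(y_t - R @ x) ** 2
        if d < best_d:
            best_x, best_d = x, d
    return best_x, best_d

rng = np.random.default_rng(0)
M, N = 4, 3
H = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
x_true = rng.choice(QPSK, size=N)
y = H @ x_true                            # noiseless, so the search must return x_true
x_hat, dist = ml_detect(y, H, QPSK)
```

The loop visits every point of the lattice, which is exactly the exponential cost that motivates the reduced-complexity detectors discussed next.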

II-B Nulling-and-Cancellation (N/C) Detector

The N/C detector [21] is used in the widely known vertical Bell Labs layered space-time (V-BLAST) architecture [33]. When combined with QRD, N/C becomes a computationally efficient procedure that is highly sensitive to layer ordering. Nulling is performed by linearly pre-multiplying the received vector with $\mathbf{Q}^H$, which suppresses the interference from $x_j$, $j < n$, at the $n$th layer. This is followed by SIC (back-substitution and slicing) to suppress the remaining co-antenna interference; hence, $\hat{x}_n$ is computed as

$$\hat{x}_n = \left\lfloor \frac{\tilde{y}_n - \sum_{j=n+1}^{N} r_{nj}\hat{x}_j}{r_{nn}} \right\rceil_{\mathcal{X}}, \qquad (4)$$

for $n = N, N-1, \dots, 1$, where $\lfloor \cdot \rceil_{\mathcal{X}}$ is the slicing operator on the constellation $\mathcal{X}$. The error rate of N/C serves as an upper bound on that of the other detection schemes considered.
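As a rough illustration of nulling followed by back-substitution and slicing, the sketch below runs SIC from the last layer up to the first. The QPSK constellation and the function names are illustrative assumptions; the demo is noiseless so that the expected outcome is unambiguous.

```python
import numpy as np

QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

def slice_symbol(z, constellation):
    """Slicing operator: nearest constellation point to z."""
    return constellation[np.argmin(np.abs(constellation - z))]

def nc_detect(y_t, R, constellation):
    """Back-substitution with slicing, from the last layer up to the first."""
    N = R.shape[0]
    x_hat = np.zeros(N, dtype=complex)
    for n in range(N - 1, -1, -1):
        # Cancel the contribution of the layers that were already decided.
        interference = R[n, n + 1:] @ x_hat[n + 1:]
        x_hat[n] = slice_symbol((y_t[n] - interference) / R[n, n], constellation)
    return x_hat

rng = np.random.default_rng(1)
N = 4
H = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)
Q, R = np.linalg.qr(H)                    # the diagonal of R may be complex here;
                                          # the division by R[n, n] handles that
x_true = rng.choice(QPSK, size=N)
y_t = Q.conj().T @ (H @ x_true)           # nulling step, noiseless for clarity
x_hat = nc_detect(y_t, R, QPSK)
```

With noise, a wrong decision at a lower layer propagates upwards through the cancellation term, which is the error-propagation effect that list-based detectors try to mitigate.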

II-C Chase Detector (CD)

The CD [22] mitigates error propagation in SIC by populating a list of candidate symbol vectors for final decision. It first partitions , , and in (2) as


where , , , , , , is a vector of zero-valued entries, and . Then, for each at the root layer, a candidate vector is calculated as in (4) and added to . The maximum number of candidate vectors in is , and the final HO decision vector is chosen from to be


Note that CD differs from LSD [23] in several aspects. For example, LSD list admission depends on run-time channel conditions, which makes it nondeterministic and more complex. Also, in a SO setting, LSD does not guarantee computing all the required distance metrics.
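The list construction described in this subsection can be sketched as follows: every constellation symbol is tried at the root (last) layer, the remaining layers are filled in by SIC, and the candidate with the smallest distance is retained. The constellation, noise level, and function name are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

def chase_detect(y_t, R, constellation):
    """Return the best candidate, its distance, and the full candidate list."""
    N = R.shape[0]
    candidates = []
    for root in constellation:            # span every symbol at the root layer
        x = np.zeros(N, dtype=complex)
        x[-1] = root
        for n in range(N - 2, -1, -1):    # SIC on the remaining layers
            z = (y_t[n] - R[n, n + 1:] @ x[n + 1:]) / R[n, n]
            x[n] = constellation[np.argmin(np.abs(constellation - z))]
        candidates.append((np.linalg.norm(y_t - R @ x) ** 2, x))
    best_d, best_x = min(candidates, key=lambda c: c[0])
    return best_x, best_d, candidates

rng = np.random.default_rng(2)
N = 4
H = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)
Q, R = np.linalg.qr(H)
x_true = rng.choice(QPSK, size=N)
noise = 0.05 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
y_t = Q.conj().T @ (H @ x_true + noise)
x_hat, d_hat, cand_list = chase_detect(y_t, R, QPSK)
```

The list size equals the constellation size and does not depend on the channel realization, which reflects the deterministic list generation that distinguishes CD from LSD.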

II-D Layered Orthogonal Lattice Detector (LORD)

Instead of executing the CD routine once, LORD repeats chase detection with different layer orderings, each time with a different layer as root, by cyclically shifting the columns of . The best output from these trials is the final solution. Each permuted at step , , is QR-decomposed into and according to (5). Let denote the output CD solution from step . Then, the final solution is , where


Since distances are preserved under different layer orderings with QRD, the accumulated candidate vectors across different partitions form an “extended” candidate list, despite the potential overlap of lists from each partition. Therefore, the added gain with LORD compared to CD is significant.

III Detection Schemes Based on Punctured Channel Matrix

III-A Punctured QR Decomposition (WRD)

WRD transforms $\mathbf{H}$ into a punctured UTM $\tilde{\mathbf{R}}$, whose entries between the diagonal and the last column are zero, through a matrix $\mathbf{W}$ such that $\mathbf{W}^H\mathbf{H} = \tilde{\mathbf{R}}$. A brute-force approach for computing $\mathbf{W}$ [27] involves matrix inversions, which is complex and prone to round-off error. However, an alternative approach that employs QRD followed by elementary matrix operations can be used to derive $\mathbf{W}$ and $\tilde{\mathbf{R}}$ [29].

Let be QR-decomposed such that . Obviously, and for all , hence, . Now assume the entry in row of is to be nulled, for and . We have and , from which it follows that . Hence, with , the equations


when repeated for , would puncture the required entry and update the entry in row of , as well as update the column of accordingly, while


would normalize in and update the non-zero entries in row of accordingly. All these operations are to be carried for . The resultant is , and the resultant is . The transformed received symbol vector after applying can then be expressed as


such that


where in this case is a diagonal matrix. For example, in the special case of MIMO, is obtained from by puncturing entries :


Note that the column at the root layer in (layer here), remains orthogonal to all other columns. Hence, taking the expectation of over , we have:


Therefore, although the resultant noise after puncturing is colored, WRD preserves the noise variance at the layer of interest. However, the statistical properties of the elements of get distorted under puncturing. The non-zero elements of (given i.i.d. Rayleigh fading) are known to be independent random variables with the following distributions [34, 21]:

where chi-squared comes from the sum of squares of Rayleigh distributed random variables. While the distributions of non-zero off-diagonal elements remain intact, the distributions of diagonal elements at upper layers , lose degrees of freedom from down to , as depicted in Fig. 1 for a channel matrix. This is caused by the fact that each puncturing operation at layer renders the column of dependent on one of the remaining columns, thus eliminating two degrees of freedom from the corresponding distribution of .
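The QRD-followed-by-elementary-operations procedure of this subsection can be sketched in NumPy as below. This is a minimal reading of the description, assuming that rows are processed from the bottom up so that each row used for elimination is already punctured; the function name `wrd` and the variable names are hypothetical, and the original algorithm in [29] may order or organize the operations differently.

```python
import numpy as np

def wrd(H):
    """Return (W, Rp) with W^H H = Rp, where Rp is nonzero only on its
    diagonal and in its last column, and every column of W has unit norm."""
    Q, R = np.linalg.qr(H)
    phases = np.diag(R) / np.abs(np.diag(R))
    Wh = (Q * phases).conj().T            # rows of Wh are w_i^H; starts at Q^H
    Rp = np.conj(phases)[:, None] * R     # starts at R with a real positive diagonal
    N = Rp.shape[0]
    for i in range(N - 2, -1, -1):        # rows from the bottom up
        for j in range(i + 1, N - 1):     # entries between diagonal and last column
            c = Rp[i, j] / Rp[j, j]
            Rp[i, :] -= c * Rp[j, :]      # row j is already punctured
            Wh[i, :] -= c * Wh[j, :]      # same row operation keeps Wh @ H == Rp
            Rp[i, j] = 0.0                # remove round-off residue
        scale = np.linalg.norm(Wh[i, :])  # renormalize the updated column of W
        Rp[i, :] /= scale
        Wh[i, :] /= scale
    return Wh.conj().T, Rp

rng = np.random.default_rng(3)
H = (rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))) / np.sqrt(2)
W, Rp = wrd(H)
pattern = (np.abs(Rp) > 1e-12).astype(int)   # nonzero structure of the punctured matrix
```

For a 4×4 channel, `pattern` shows nonzeros only on the diagonal and in the last column. Because the last column of `W` is never modified and the other columns only combine the first three columns of `Q`, the root-layer column stays orthogonal to the rest.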

Fig. 1: Empirical cumulative distribution functions (CDFs) of the diagonal elements of (a) the QRD triangular matrix, and (b) the punctured WRD matrix, shown in dotted lines, compared to theoretical chi-squared CDFs in solid lines.

Similar to the ML detector, an “exhaustive” PML detector searches to find


Pre-multiplying by , unlike , modifies Euclidean distances, hence we have


Note that this minimum distance detector is not optimal due to the presence of colored noise.

III-B Punctured N/C Detector (PN/C)

With PN/C, we null by pre-multiplying by instead of , and perform SIC as


for , where , and . Note that slicing on layers can be done in parallel since is diagonal.
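The parallel slicing noted above can be sketched as follows, assuming the punctured system is already available. Here the punctured matrix `Rp` is a synthetic stand-in (positive diagonal plus a random last column) rather than the output of an actual WRD, and all names are illustrative.

```python
import numpy as np

QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

def slice_vector(z, constellation):
    """Slice each entry of z to its nearest constellation point."""
    idx = np.argmin(np.abs(z[:, None] - constellation[None, :]), axis=1)
    return constellation[idx]

def pnc_detect(y_p, Rp, constellation):
    """Decide the root layer first, then slice all other layers at once."""
    root = slice_vector(np.array([y_p[-1] / Rp[-1, -1]]), constellation)[0]
    # Each upper layer only sees interference from the root layer.
    z = (y_p[:-1] - Rp[:-1, -1] * root) / np.diag(Rp)[:-1]
    return np.append(slice_vector(z, constellation), root)

# Synthetic punctured system (assumption: stands in for a WRD output).
rng = np.random.default_rng(4)
N = 4
Rp = np.diag(rng.uniform(0.5, 1.5, N)).astype(complex)
Rp[:-1, -1] = (rng.standard_normal(N - 1) + 1j * rng.standard_normal(N - 1)) / np.sqrt(2)
x_true = rng.choice(QPSK, size=N)
y_p = Rp @ x_true                         # noiseless for clarity
x_hat = pnc_detect(y_p, Rp, QPSK)
```

Compared with QRD-based SIC, there is no sequential dependency among the upper layers, so the slicing step vectorizes directly.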

III-C Punctured Chase Detector (PCD)

The PCD builds on the partition in equation (15), and performs the operations of a CD (Sec. II-C). A modified list of candidate symbol vectors is thus created. The distance of a vector is given by


For a given , the distance in (22) is minimized as


where , which is a vectorized slicing operation, and . The symbol vector is then added to , together with its distance . The final HO symbol vector is found from as the one with smallest distance.
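A minimal sketch of this procedure is given below: for each root symbol the remaining layers are sliced in one vectorized step, and the candidate with the smallest punctured distance is kept. The sketch also compares the result against an exhaustive search on the same punctured system. The synthetic punctured matrix, noise level, and function names are assumptions for illustration.

```python
import itertools
import numpy as np

QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

def pcd_detect(y_p, Rp, constellation):
    """Punctured chase detection: one vectorized slicing per root symbol."""
    diag = np.diag(Rp)[:-1]
    best_x, best_d = None, np.inf
    for root in constellation:
        z = (y_p[:-1] - Rp[:-1, -1] * root) / diag
        idx = np.argmin(np.abs(z[:, None] - constellation[None, :]), axis=1)
        x = np.append(constellation[idx], root)
        d = np.linalg.norm(y_p - Rp @ x) ** 2
        if d < best_d:
            best_x, best_d = x, d
    return best_x, best_d

def pml_exhaustive(y_p, Rp, constellation):
    """Brute-force minimum of the punctured distance over the whole lattice."""
    cands = (np.array(c) for c in itertools.product(constellation, repeat=Rp.shape[0]))
    return min(((np.linalg.norm(y_p - Rp @ x) ** 2, x) for x in cands),
               key=lambda t: t[0])

# Synthetic punctured system with noise (assumption: stands in for a WRD output).
rng = np.random.default_rng(5)
N = 4
Rp = np.diag(rng.uniform(0.5, 1.5, N)).astype(complex)
Rp[:-1, -1] = (rng.standard_normal(N - 1) + 1j * rng.standard_normal(N - 1)) / np.sqrt(2)
x_true = rng.choice(QPSK, size=N)
y_p = Rp @ x_true + 0.3 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
x_pcd, d_pcd = pcd_detect(y_p, Rp, QPSK)
d_pml, x_pml = pml_exhaustive(y_p, Rp, QPSK)
```

The two searches agree because, once the root symbol is fixed, the punctured distance splits into independent per-layer terms, each minimized by slicing.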

While the PCD computes distances only to candidate symbol vectors, for a given layer ordering and channel partition, it is clear from (23) that it achieves the exact performance as that of the PML detector. In other words, there is no vector in the lattice , outside the set , that can have a smaller distance metric than that of the PCD solution. The proof goes as follows:


III-D Vector-Based Sub-Space Detector (VSSD)

The VSSD is an extension to PCD, the same way LORD is an extension to CD. The columns of are cyclically shifted, and punctured UTMs are generated as shown in Fig. 2. Each permuted at step , , is WR-decomposed into and according to (15). Let denote the PCD solution from step . The final solution is , where is defined as:


Note that we revert to the original space of the channel matrix to compute the true Euclidean distance metrics in (30). The gain achieved by VSSD compared to PCD is limited, since each punctured matrix generates an independent space, and hence we end up taking the best output from independent trials. The VSSD is in effect the HO version of the reference SO SSD [28], and we refer to it simply as SSD in the remainder of this paper.
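The cyclic-shift procedure of this subsection can be sketched end to end as follows. The helpers `wrd` and `pcd_detect` are compact illustrative versions written for this sketch (hypothetical names, not the authors' code), and the demo is noiseless so that the expected result is unambiguous.

```python
import numpy as np

QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

def wrd(H):
    """Punctured QRD sketch: returns (W, Rp) with W^H H = Rp."""
    Q, R = np.linalg.qr(H)
    Wh, Rp = Q.conj().T.copy(), R.copy()
    N = Rp.shape[0]
    for i in range(N - 2, -1, -1):
        for j in range(i + 1, N - 1):
            c = Rp[i, j] / Rp[j, j]
            Rp[i, :] -= c * Rp[j, :]
            Wh[i, :] -= c * Wh[j, :]
            Rp[i, j] = 0.0
        scale = np.linalg.norm(Wh[i, :])
        Rp[i, :] /= scale
        Wh[i, :] /= scale
    return Wh.conj().T, Rp

def pcd_detect(y_p, Rp, constellation):
    """Punctured chase detection on an already punctured system."""
    best_x, best_d = None, np.inf
    for root in constellation:
        z = (y_p[:-1] - Rp[:-1, -1] * root) / np.diag(Rp)[:-1]
        idx = np.argmin(np.abs(z[:, None] - constellation[None, :]), axis=1)
        x = np.append(constellation[idx], root)
        d = np.linalg.norm(y_p - Rp @ x) ** 2
        if d < best_d:
            best_x, best_d = x, d
    return best_x

def vssd_detect(y, H, constellation):
    """Run PCD once per cyclic shift and keep the best true-distance output."""
    N = H.shape[1]
    best_x, best_d = None, np.inf
    for t in range(N):
        perm = np.roll(np.arange(N), t)       # layer perm[-1] becomes the root
        W, Rp = wrd(H[:, perm])
        x_perm = pcd_detect(W.conj().T @ y, Rp, constellation)
        x = np.empty(N, dtype=complex)
        x[perm] = x_perm                      # undo the column permutation
        d = np.linalg.norm(y - H @ x) ** 2    # distance in the original space
        if d < best_d:
            best_x, best_d = x, d
    return best_x, best_d

rng = np.random.default_rng(6)
N = 4
H = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)
x_true = rng.choice(QPSK, size=N)
x_hat, d_hat = vssd_detect(H @ x_true, H, QPSK)
```

The final comparison uses the unpunctured channel because punctured distances from different permutations are not directly comparable.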

Fig. 2: Punctured channel decomposition structures under different permutations: (a) ; (b) ; (c) ; (d) .

III-E Symbol-Based Sub-Space Detector (SSSD)

As a variation of SSD, the SSSD selects at each step , only the root symbol of the output vector as a component of the final output vector. Thus, the output vector gets assembled one symbol at a time over executions of PCD, where


For example, in a MIMO system, we have , where is the HO solution of a PCD following the partition in Fig. 2(a). Similarly , , and , are obtained following the partitions (b), (c), and (d), respectively. Note that we can define symbol-based LORD (SLORD) in a similar manner:


IV Analysis of Achievable Diversity Gain

It is known that ML detection achieves full receive diversity, equal to the number of receive antennas $M$, and it can be shown that the N/C and PN/C detectors, being special cases of ZF with decision feedback, can only achieve a receive diversity gain of $M - N + 1$, where $N$ is the number of transmit antennas. Moreover, it can be argued that both SSD (VSSD) and LORD also achieve full diversity, since they exploit the full channel matrix to compute distance metrics. In what follows, we study the achievable diversity gains of PML (PCD), SSSD, and SLORD.

IV-A Punctured ML Detector / Punctured Chase Detector (PML/PCD)

To capture the diversity order of PML, we derive the pairwise error probability (PEP). Suppose that is transmitted, while is erroneously detected, the PEP can be expressed as


where is the probability that event occurs, and . Since consists of circular symmetric complex Gaussian random variables, then so is . It is easy to show that


where is introduced since is a scalar. Hence, we have

and therefore,


where the inequality holds since (section 5.2 in [35]). Moreover, using union bound, we have


where , and . Finally, using the Chernoff bound, the average PEP is upper bounded as


where since the columns of were normalized in (11).

For regular ML detection [36, 5, 37], we have


where the expected value over the elements of results in full receive diversity , because each column of contains

independent Rayleigh distributed random variables, whose square is exponentially distributed. However, with

instead of in PML detection, the first columns have single diagonal elements, whose squares are chi-squared distributed with degrees of freedom, which corresponds to two exponentially distributed complex random variables, and hence a receive diversity order equal to . Only column of provides a diversity equal to . Therefore, by analogy with (41), the average PEP for the PML detector is


and hence PML detection cannot achieve a receive diversity gain of order greater than 2. However, noting that PML and PCD are identical (Sec. III-C), and knowing that the regular CD achieves a receive diversity order of 2 (more on that in Sec. V-B), we conclude that channel puncturing does not reduce the diversity gain of the CD.

IV-B Symbol-Based Sub-Space Detector (SSSD)

To capture the diversity order of SSSD, we derive a modified PEP. Without loss of generality, we assume that layer is the root layer of interest. Hence, an error occurs when is transmitted and is erroneously detected, with probability


where , is computed as in Sec. III-C, and are the Nth column of and , and and are the first columns of and , respectively. Let , and let ; we have


Since and the columns of are circular symmetric complex Gaussian, then so is . Thus, it can be shown that


Hence, continuing from (IV-B), we have


Then, using union and Chernoff bounds, with (), the average PEP can be upper bounded as


where the last approximation holds since the second exponential term is less than , with equality at high SNR. Finally, taking the expectation over all squared elements of , which are exponentially distributed, we obtain


The denominator represents noise plus interference; hence, SSSD appears to achieve a full receive diversity gain at the layer of interest when BERs are plotted in terms of signal-to-interference-plus-noise ratio (SINR). In the case of SLORD, following a similar derivation, the average PEP can be expressed as


where , and consists of the first columns of . Note that can be expressed as