# Capacity-Achieving MIMO-NOMA: Iterative LMMSE Detection

This paper considers a low-complexity iterative Linear Minimum Mean Square Error (LMMSE) multi-user detector for the Multiple-Input and Multiple-Output system with Non-Orthogonal Multiple Access (MIMO-NOMA), where multiple single-antenna users simultaneously communicate with a multiple-antenna base station (BS). Although LMMSE, being a linear detector, has low complexity, its performance is suboptimal in multi-user detection scenarios because of the mismatch between LMMSE detection and multi-user decoding. Therefore, in this paper, we provide the matching conditions between the detector and decoders for MIMO-NOMA, which are then used to derive the achievable rate of the iterative detection. We prove that a matched iterative LMMSE detector can achieve (i) the optimal capacity of symmetric MIMO-NOMA with any number of users, (ii) the optimal sum capacity of asymmetric MIMO-NOMA with any number of users, (iii) all the maximal extreme points in the capacity region of asymmetric MIMO-NOMA with any number of users, and (iv) all points in the capacity region of two-user and three-user asymmetric MIMO-NOMA systems. In addition, a practical low-complexity error-correcting multiuser code, called irregular repeat-accumulate code, is designed to match the LMMSE detector. Numerical results show that the bit error rate performance of the proposed iterative LMMSE detection outperforms the state-of-the-art methods and is within 0.8 dB of the associated capacity limit.

## I Introduction

Recent investigations have shown that Multi-user Multiple-Input Multiple-Output (MU-MIMO), where multiple single-antenna users communicate with a multi-antenna Base Station (BS), has become increasingly important due to its potential applications in 5G cellular systems and beyond [1, 3, 2, 4, 6, 5]. In particular, massive MU-MIMO has been shown to bring significant improvements in throughput and energy efficiency [3, 4].

Multiple access schemes, the fundamental techniques of coordinated multi-user communication in the physical layer, play a central role in each cellular generation. Frequency Division Multiple Access (FDMA), Time Division Multiple Access (TDMA), Code Division Multiple Access (CDMA), and Orthogonal Frequency-Division Multiple Access (OFDMA) are the conventional Orthogonal Multiple Access (OMA) schemes, which orthogonalize users in the time/frequency/code domain to avoid multi-user interference [7, 8]. Due to the orthogonality of OMA, no inter-user interference exists at the receiver side. Hence, the simple single-user signal processing of conventional point-to-point communication can be directly used for OMA. However, there is no free lunch. First, OMA is not able to achieve all points in the capacity region of the multiple access channel (MAC). Second, massive connectivity will be a key scenario in future wireless communication, and the limited radio resources can no longer support a massive number of orthogonally accessing devices. Third, user scheduling such as resource allocation is required for orthogonal users in OMA, which leads to heavy additional overhead and results in large latency and high processing complexity in massive-connectivity systems.

Recently, Non-Orthogonal Multiple Access (NOMA), where all the users can be served concurrently in the same time/frequency/code domain, has been identified as one of the key radio access technologies to increase the spectral efficiency and reduce latency in 5G mobile networks [10, 9, 14, 13, 8, 16, 15, 11, 12, 17]. As opposed to OMA, the key concepts behind NOMA are summarized as follows [18, 16, 20, 19, 17].

• All the users are allowed to be superimposed at the receiver in the same time/code/frequency domain.

• All points in the capacity region of MAC are achievable.

• Interference cancellation is performed at the receiver, using either Successive Interference Cancellation (SIC) or Parallel Interference Cancellation (PIC).

More recently, to enhance spectral efficiency and reduce latency, MIMO-NOMA that employs NOMA techniques over MU-MIMO is considered as a key air interface technology in the fifth-generation (5G) communication system [23, 22, 19, 21, 20, 18, 17]. Therefore, we focus on MIMO-NOMA in this paper.

### I-A Challenge of Multi-User Detection in MIMO-NOMA

Unlike MIMO-OMA, signal processing in MIMO-NOMA incurs higher complexity and higher energy consumption at the BS [3, 2]. Low-complexity uplink detection for MIMO-NOMA is a challenging problem due to the non-orthogonal interference between the users [3, 13, 11, 12], especially when the number of users and the number of BS antennas are large. Optimal multiuser detection (MUD) for MIMO-NOMA, such as maximum a-posteriori probability (MAP) detection or maximum likelihood (ML) detection, was proven to be an NP-hard and non-deterministic polynomial-time complete (NP-complete) problem [24, 25]. Furthermore, the complexity of the optimal MUD grows exponentially with the number of users or the number of BS antennas, and polynomially with the size of the signal constellation [25, 26].

### I-B Background of Low-Complexity Multi-User Detector

Several low-complexity multi-user detectors have been proposed in the literature. They are mainly divided into three categories: uncoded detection, coded SIC detection, and coded PIC detection.

#### I-B1 Uncoded Low-Complexity Detection

Many low-complexity detectors such as the Matched Filter (MF), the Zero-Forcing (ZF) receiver, the Minimum Mean Square Error (MMSE) detector, and the Message Passing Detector (MPD) [7, 27] have been proposed for practical systems. In addition, iterative methods such as the Jacobi method, the Richardson method [28, 29, 30], the Belief Propagation (BP) method, and the iterative MPD [6, 5, 31, 32] have been put forward to further reduce the computational complexity by avoiding the costly matrix inversion in linear detection. Although attractive from the complexity viewpoint, these stand-alone detectors are sub-optimal MUDs, because decoding results are not fed back to the detector. As a result, the multi-user interference is not sufficiently cancelled.

#### I-B2 Coded SIC Detection

SIC, where correct decoding results are fed back to the detector for perfect interference cancellation, is one of the key technologies to improve the detection performance. It is well known that for the MAC, the SIC is an optimal strategy and can achieve all points in the capacity region of MIMO-NOMA with time-sharing technology [33, 34]. Besides, the MMSE-SIC detector [37, 38] has been proposed to achieve the optimal performance [7]. Nevertheless, the following disadvantages make SIC infeasible when applying to the practical MIMO-NOMA [3, 7, 35].

• The users are decoded one by one, which greatly increases the time delay.

• The decoding order is required to be known at both the transmitter and receiver, which results in additional overhead cost.

• It assumes that all the previous users’ messages are recovered correctly and thus can be completely removed from the received signals. In practice, however, error-free recovery cannot be guaranteed, which leads to error propagation during the interference cancellation.

• To achieve all points in the capacity region of MIMO-NOMA, time-sharing should be used, which needs cooperation between the users.

• The decoding order of SIC changes with the different channel state and different Quality of Service (QoS), which brings a higher overhead cost.

#### I-B3 Coded PIC Detection

PIC, where the users are recovered in parallel and the messages exchanged between the detector and decoders are soft, is another promising technique for practical MIMO-NOMA systems [32, 30, 6, 37]. This technique has been commonly used for non-orthogonal MAC systems such as Code Division Multiple Access (CDMA) systems [7, 35] and Interleave Division Multiple Access (IDMA) systems [39, 40]. Various iterative detectors, such as the iterative Linear MMSE (LMMSE) detector, the iterative BP detector and the iterative MPD, have been proposed [41, 42, 43]. (For the uncoded iterative detector in Section I-B-1, the iteration is processed inside the detector. For the coded PIC detector, however, the iterative detection is performed between the detector and decoders, i.e., outside the detector.) The advantages of iterative detection are listed as follows.

• The complexity is very low, since the overall receiver is decomposed into many parallel low-complexity processors.

• Time delay is much lower than SIC, since the users are recovered in parallel.

• Error propagation is greatly mitigated, since inter-user interference is cancelled softly and thus perfect interference cancellation is not required.

• System overhead is reduced, since the preset decoding order is not required.

• User cooperation is removed, since time-sharing is not required.

The existing PIC detectors perform well in simulations, but are regarded as suboptimal due to a performance gap to the associated capacity limit [35]. The reason is that the detector and the decoders are designed separately and are not matched with each other, which results in performance loss even though decoding feedback is used in the detection.

#### I-B4 Principles of A Good Iterative Multi-User Detector

From the review above, we conclude the key principles in designing a good iterative multi-user detector.

• Multi-user interference cancellation and discrete signal reconstruction are performed by the MUD and the user decoders, respectively.

• The decoding results should be fed back to the detector for a thorough interference cancellation.

• The detector and multi-user code should be jointly designed and matched with each other to avoid rate loss. In particular, the multi-user channel code should be optimized for the super-channel that encompasses the MIMO-NOMA channel and the multi-user detector.

The achievable rate analysis of such iterative detection for MIMO-NOMA is an intriguing problem.

### I-C Relationship with Interference Channel and Vector Multiple Access Channel

To clarify the relationship between the interference channel (IC), the vector multiple-access channel (VMAC) and the MIMO-NOMA channel, we first give the definitions of IC and VMAC below.

• IC considers multiple transmitters and multiple receivers, and transmitter cooperation and receiver cooperation are not allowed (i.e. multiple scalar/vector inputs and multiple scalar/vector outputs).

• VMAC considers multiple transmitters and a single receiver, and both transmitters and receiver are equipped with multiple antennas (i.e. multiple vector inputs and a vector output).

Hence, the MIMO-NOMA channel (multiple scalar inputs and a vector output) discussed in this paper is different from IC because only a single receiver is considered. Moreover, the MIMO-NOMA channel is the special case of VMAC in which each transmitter is equipped with only a single antenna.

It is well known that the capacity of IC [44] is still an open issue. In addition, the capacity of general VMAC is only solved by a numerical algorithm [45]. In contrast, MIMO-NOMA channel (or VMAC with single-antenna transmitters) has a closed-form capacity region, which has been solved in [52], see also [7] and [34] for more details.

### I-D Gap Between P2P MIMO and MIMO-NOMA

The Extrinsic Information Transfer (EXIT) chart [46, 47], the MSE-based Transfer Chart (MSTC) [49, 48], the area theorem and the matching theorem [46, 47, 48, 49] are the main tools to analyse the achievable rate or the Bit Error Rate (BER) performance of MIMO systems. It is proven that a well-designed single code with linear precoding and iterative LMMSE detection achieves the capacity of MIMO systems [43]. However, this result only applies to point-to-point (P2P) MIMO systems.

Since there is no user collaboration in MIMO-NOMA, the precoding in P2P MIMO [43] cannot be used. Besides, the singular value decomposition (SVD) and water-filling in [43] are not applicable to MIMO-NOMA either, since there is no channel information at the transmitters. Furthermore, only one user rate is analyzed in P2P MIMO [43], whereas in MIMO-NOMA the whole achievable rate region that contains all the user rates needs to be established. Apart from that, the non-orthogonal multi-user interference makes the problem more complicated. For example, the decoding processes of the non-orthogonal users in MIMO-NOMA interfere with each other, which results in much more complicated MSTC functions and area theorems. In summary, the results in P2P MIMO (e.g. [43]) cannot be straightforwardly applied to analyze the achievable rates of the iterative detection for MIMO-NOMA.

### I-E Contributions

In this paper, the achievable rate analysis of the iterative LMMSE detection is provided for MIMO-NOMA, which shows that the low-complexity iterative LMMSE can be rate region optimal if it is properly designed. The contributions of this paper are summarized as follows. (In points a, b, c and d of item 3, the ideal SCM codes with infinite layers and infinite length, which are designed to match the SINR-variance transfer curves of LMMSE detection, are used for the multiuser codes.)

1. Matching conditions and area theorems of the iterative detector are proposed for MIMO-NOMA.

2. Achievable rate analysis of iterative LMMSE detection is provided.

3. Analytical proofs are derived for the designed iterative LMMSE detection to achieve:

• the capacity of symmetric MIMO-NOMA with any number of users,

• the sum capacity of the asymmetric MIMO-NOMA with any number of users,

• all the maximal extreme points in the capacity region of the asymmetric MIMO-NOMA with any number of users, and

• all points in the capacity region of two-user and three-user asymmetric MIMO-NOMA.

4. We prove that the elementary signal estimator (ESE) of IDMA in Multiple-Input Single-Output (MISO) systems and the maximal ratio combiner (MRC) in Single-Input Multiple-Output (SIMO) systems are two special cases of the iterative LMMSE receiver. Hence, both the ESE of IDMA in MISO and the MRC in SIMO are sum capacity achieving.

5. An algorithm is provided to design a practical iterative LMMSE detection.

6. A kind of capacity-approaching multi-user NOMA code for the LMMSE detection, in the form of a special (non-standard) Irregular Repeat-Accumulate (IRA) multiuser code, is systematically constructed. This special IRA multi-user code must be designed in conjunction with the LMMSE detection to produce extrinsic transfer functions that satisfy a certain constraint among the different users.

7. Numerical results show that our iterative LMMSE detection with optimized IRA code outperforms the existing methods, and is within 0.8dB from the associated capacity limit.

From the information theoretic point of view, to the best of our knowledge, this is the first work that proves that a properly designed PIC (joint design of the iterative LMMSE detection and the multi-user code) can achieve the capacity of MIMO-NOMA with low complexity. From the practical point of view, the jointly designed iterative LMMSE detection (PIC) offers a significant improvement in BER performance over the existing iterative receivers (including both SIC and PIC) under a variety of system loads.

Comments: It is well known that finite-length coding leads to rate loss. In this paper, when we refer to the proposed iterative LMMSE achieving the capacity (sum capacity or all points in the capacity region) of MIMO-NOMA, infinite-length channel codes are considered by default. Specifically, we use an ideal SCM code (with infinite layers and infinite length), which is designed to match the SINR-variance transfer curves of LMMSE detection. The existence of such a code is rigorously proved in APPENDIX D.

This paper is organized as follows. In Section II, the MIMO-NOMA system and iterative LMMSE detection are introduced. The matching conditions and area theorems for the MIMO-NOMA are elaborated in Section III. Section IV provides the achievable rate analysis. Important properties and special cases of the iterative LMMSE detection are given in Section V. Practical multiuser code design is provided in Section VI. Numerical results are shown in Section VII.

## II System Model and Iterative LMMSE Detection

Consider an uplink MU-MIMO system as shown in Fig. 1, in which $N_u$ autonomous single-antenna terminals simultaneously communicate with an array of antennas at the BS [4, 3]. Here, the number of users and the number of BS antennas can be any finite positive integers. Since all the users interfere with each other at the receiver and are non-orthogonal in the time, frequency and code domains, the system is named MIMO-NOMA. (Here, MIMO-NOMA is different from IC since only a single receiver is considered. Moreover, MIMO-NOMA is also different from the general VMAC since each transmitter is equipped with only a single antenna.) The system is represented as

$$\mathbf{y}_t = \mathbf{H}\mathbf{x}^{tr}(t) + \mathbf{n}(t), \quad t \in \mathcal{N},\ \mathcal{N} = \{1, \cdots, N\}, \tag{1}$$

where $\mathbf{H}$ is the channel matrix, $\mathbf{n}(t)$ an independent additive white Gaussian noise (AWGN) vector, $\mathbf{x}^{tr}(t)$ the transmission, and $\mathbf{y}_t$ the received vector at time $t$. In this paper, we consider the block fading channel [7], i.e., $\mathbf{H}$ is fixed during one block transmission and known at the BS. When the channel is block fading, in time-division duplexing (TDD) mode, it is possible for the BS to estimate the downlink channel when receiving messages from the uplink. In frequency-division duplexing (FDD) mode, it is possible for the receiver to feed back the channel to the BS. As these are standard assumptions in the literature, we do not describe them in detail.

### II-A Transmitters

As illustrated in Fig. 1, at user $i$ ($i \in \mathcal{N}_u$), an information sequence is encoded by an error-correcting code into a length-$N$ coded sequence, which is permuted by a length-$N$ independent random interleaver to get $\mathbf{x}_i$. (The interleavers improve the system performance by enhancing the randomness of the messages or the channel noise, and by avoiding short cycles in the system factor graph [39, 40, 50].) We assume that each $x_{i,t}$ is taken from the points of a discrete signaling constellation. After that, $x_{i,t}$ is scaled by $w_i$ to give the transmitted symbol $x^{tr}_{i,t} = w_i x_{i,t}$, $i \in \mathcal{N}_u$. Let the variance of $x_{i,t}$ be normalized, and let $\mathbf{K}_x$ be the diagonal power-constraint matrix whose diagonal elements are $w_i^2$. Therefore, the system can be rewritten as

$$\mathbf{y}_t = \mathbf{H}\mathbf{K}_x^{1/2}\mathbf{x}(t) + \mathbf{n}(t) = \mathbf{H}'\mathbf{x}(t) + \mathbf{n}(t), \quad t \in \mathcal{N}, \tag{2}$$

where $\mathbf{H}' = \mathbf{H}\mathbf{K}_x^{1/2}$.

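### II-B Capacity Region of MIMO-NOMA

Let $\mathbf{Y}$ denote the received random vector, and $\mathbf{X}$ represent the transmitted random vector. Assuming $S \subseteq \mathcal{N}_u$ and $S^c = \mathcal{N}_u \setminus S$, the partial channel matrix is denoted as $\mathbf{H}'_S = [\mathbf{h}'_i, i \in S]$, where $\mathbf{h}'_i$ is the $i$-th column of $\mathbf{H}'$. A similar definition applies to $\mathbf{X}_S$. Let $R_i$ be the rate of user $i$ and $R_S = \sum_{i \in S} R_i$ represent the sum rate of the users in set $S$. Then, the capacity region of the MIMO-NOMA system is given by [33, 34]

$$R_S \le I(\mathbf{Y}; \mathbf{X}_S \mid \mathbf{X}_{S^c}) = \log\left|\mathbf{I}_{|S|} + \frac{1}{\sigma_n^2}\mathbf{H}'^H_S\mathbf{H}'_S\right|, \quad \forall S \subseteq \mathcal{N}_u, \tag{3}$$

where $|\cdot|$ applied to a matrix denotes its determinant. (Different from the interference channel, whose capacity is still an open issue, and the vector multiple-access channel, whose capacity only has a numerical solution, the capacity calculation of MIMO-NOMA is straightforward and has been well studied in [52, 7].) The sum rate is

$$R_{sum} = R_{\mathcal{N}_u} = \log\left|\mathbf{I}_{N_u} + \frac{1}{\sigma_n^2}\mathbf{H}'^H\mathbf{H}'\right|. \tag{4}$$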
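To make the bounds in (3) and (4) concrete, the following minimal sketch evaluates them for a two-user system. The function name `capacity_bounds`, the example channel values and the choice of base-2 logarithms (bits per channel use) are assumptions made for illustration only; they do not come from the paper.

```python
import math

def capacity_bounds(H_eff, sigma2):
    """Evaluate the bounds in (3) for S={1}, S={2} and S={1,2} (two users), in bits.

    H_eff: rows [h_r1, h_r2] of the effective channel H' (float or complex entries).
    sigma2: noise variance sigma_n^2.
    """
    # Entries of the 2x2 Gram matrix G = H'^H H' / sigma_n^2
    g11 = sum(abs(row[0]) ** 2 for row in H_eff) / sigma2
    g22 = sum(abs(row[1]) ** 2 for row in H_eff) / sigma2
    g12 = sum(row[0].conjugate() * row[1] for row in H_eff) / sigma2
    r1 = math.log2(1.0 + g11)                                   # S = {1}
    r2 = math.log2(1.0 + g22)                                   # S = {2}
    r_sum = math.log2((1.0 + g11) * (1.0 + g22) - abs(g12) ** 2)  # S = {1,2}, eq. (4)
    return r1, r2, r_sum

# Hypothetical 2x2 effective channel, used only as an example
H_eff = [[1.0 + 0.5j, 0.3 - 0.2j],
         [0.2 - 0.1j, 0.8 + 0.4j]]
r1, r2, r_sum = capacity_bounds(H_eff, sigma2=0.5)
print(f"R1 <= {r1:.3f}, R2 <= {r2:.3f}, R1 + R2 <= {r_sum:.3f} bits per channel use")
```

When the two channel columns are orthogonal the sum bound equals the sum of the individual bounds; otherwise it is strictly smaller, which is what gives the two-user region its pentagon shape.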

### II-C Iterative Receiver

We adopt a joint detection-decoding iterative receiver, which is widely used in multiple-access systems [31, 37, 43]. The messages exchanged between the detector and the decoders are defined as estimates of the transmissions. As illustrated in Fig. 1, at the BS, the received signals and the a-priori messages fed back from the decoders are passed to an LMMSE detector, which produces an extrinsic message for each decoder $i$, $i \in \mathcal{N}_u$; this message is then deinterleaved. The corresponding single-user decoder outputs an extrinsic message based on the deinterleaved input. Similarly, this message is interleaved to obtain the a-priori message for the detector. This process is repeated iteratively until the maximum number of iterations is reached.

In the rest of this paper, we will not distinguish between a sequence and its interleaved version, as they are the same sequence under different permutations. In fact, the messages exchanged between the detector and the decoders can be replaced by their means and variances, respectively.

#### II-C1 Key Assumptions

For simplicity, we make the following assumptions, which are widely used in iterative decoding and turbo equalization algorithms [51, 47, 41, 43].

Assumption 1: For the LMMSE detector, each $x_{i,t}$ is independently chosen from the signaling constellation for any $i$ and $t$; the messages fed back from the decoders are independent of each other, and the entries of each message are i.i.d. given the corresponding transmitted sequence.

Assumption 2: For the decoder, the messages received from the detector are independent of each other, and the entries of each message are i.i.d. given the corresponding transmitted sequence.

Assumptions 1 and 2 decompose the overall process into the local processors such as the detector and decoders, which simplifies the analysis of the iterative process. In detail, Assumption 1 simplifies the LMMSE estimation (see Section II-D-1), and Assumption 2 simplifies the transfer function of decoders (see Section II-C-2).

#### II-C2 A-posteriori Probability (APP) Decoder

We assume each decoder employs APP decoding at the receiver. (Although the computational complexity of APP decoding is too high for practical systems, low-complexity message-passing algorithms can be used to achieve near-optimal performance [51]. The APP decoding assumption is included to simplify our analysis.) The extrinsic variance output of the APP decoder is defined as

$$v_{i,t} = \mathrm{MMSE}\big(x_{i,t} \mid \tilde{l}_{DEC}(\mathbf{x}_{i,\sim t})\big). \tag{5}$$

From Assumption 2, we have . Therefore, we can define the SINR-Variance transfer function of the decoders as

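$$\mathbf{v}_{\bar{x}} = \psi(\boldsymbol{\rho}), \tag{6}$$

where $\mathbf{v}_{\bar{x}} = [v_1, \cdots, v_{N_u}]$, $\boldsymbol{\rho} = [\rho_1, \cdots, \rho_{N_u}]$ and $\psi(\boldsymbol{\rho}) = [\psi_1(\rho_1), \cdots, \psi_{N_u}(\rho_{N_u})]$.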

### II-D LMMSE Detector

In MIMO-NOMA, the complexity of the optimal MAP detector is too high, and the LMMSE detector is a low-complexity alternative.

#### II-D1 A-posteriori LMMSE Estimation

Message is de-mapped to with variance . Assumption 1 indicates that is invariant with respect to . Hence,

 (7)

where denotes the expectation of given . Let and . The a-posteriori LMMSE estimation [7, 43, 5, 31] is

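$$\hat{\mathbf{x}}(t) = \mathbf{V}_{\hat{x}}\left[\mathbf{V}_{\bar{x}}^{-1}\bar{\mathbf{x}}(t) + \sigma_n^{-2}\mathbf{H}'^H\mathbf{y}_t\right], \tag{8}$$

where $\mathbf{V}_{\hat{x}} = (\sigma_n^{-2}\mathbf{H}'^H\mathbf{H}' + \mathbf{V}_{\bar{x}}^{-1})^{-1}$ denotes the a-posteriori deviation of the estimation. A derivation of (8) is given in APPENDIX A. For more details of LMMSE, please refer to Section II-C-2 and Section IV-F of [5].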

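#### II-D2 Extrinsic LMMSE Detector

Let $\hat{x}_{i,t}$ and $v_{\hat{x}_i}$ be the $i$-th entry of $\hat{\mathbf{x}}(t)$ and the $i$-th diagonal entry of $\mathbf{V}_{\hat{x}}$, respectively. The a-posteriori estimate in (8) cannot be passed to the decoders directly because it is correlated with the prior message. The LMMSE detector therefore outputs the extrinsic mean $u_{i,t}$ and the associated extrinsic variance for $x_{i,t}$ by excluding the prior message with the message combining rule [27]:

$$\phi_i(\mathbf{v}_{\bar{x}}) = v_{\hat{x}_i}^{-1}(\mathbf{v}_{\bar{x}}) - v_i^{-1} \quad \text{and} \quad u_{i,t} = \frac{\hat{x}_{i,t}}{\phi_i v_{\hat{x}_i}} - \frac{\bar{x}_{i,t}}{\phi_i v_i}, \tag{9}$$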

where .
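The a-posteriori estimate (8) and the extrinsic rule (9) can be sketched for two users as follows. This is an illustrative sketch only: the function name `lmmse_two_user`, the closed-form 2x2 inverse and the example numbers are assumptions for the demonstration, and a practical receiver would use a general matrix solver.

```python
def lmmse_two_user(H_eff, y, x_bar, v_bar, sigma2):
    """A-posteriori LMMSE estimate (8) and extrinsic output (9) for two users.

    H_eff: rows [h_r1, h_r2] of H'; y: received vector; x_bar, v_bar: prior
    means and variances fed back by the decoders; sigma2: noise variance.
    """
    # G = H'^H H' / sigma_n^2 and matched-filter output z = H'^H y / sigma_n^2
    g11 = sum(abs(r[0]) ** 2 for r in H_eff) / sigma2
    g22 = sum(abs(r[1]) ** 2 for r in H_eff) / sigma2
    g12 = sum(r[0].conjugate() * r[1] for r in H_eff) / sigma2
    z = [sum(r[k].conjugate() * y_r for r, y_r in zip(H_eff, y)) / sigma2
         for k in range(2)]
    # V_hat = (G + V_bar^{-1})^{-1}: closed-form inverse of a 2x2 Hermitian matrix
    m11 = g11 + 1.0 / v_bar[0]
    m22 = g22 + 1.0 / v_bar[1]
    det = m11 * m22 - abs(g12) ** 2
    V_hat = [[m22 / det, -g12 / det],
             [-g12.conjugate() / det, m11 / det]]
    # Eq. (8): a-posteriori mean
    b = [x_bar[k] / v_bar[k] + z[k] for k in range(2)]
    x_hat = [V_hat[k][0] * b[0] + V_hat[k][1] * b[1] for k in range(2)]
    v_hat = [V_hat[0][0], V_hat[1][1]]
    # Eq. (9): remove the prior to obtain the extrinsic SINR and mean
    phi = [1.0 / v_hat[k] - 1.0 / v_bar[k] for k in range(2)]
    u = [x_hat[k] / (phi[k] * v_hat[k]) - x_bar[k] / (phi[k] * v_bar[k])
         for k in range(2)]
    return x_hat, v_hat, u, phi

# Hypothetical inputs for the first iteration (no prior knowledge: mean 0, variance 1)
H_eff = [[1.0 + 0.5j, 0.3 - 0.2j], [0.2 - 0.1j, 0.8 + 0.4j]]
y = [0.9 - 0.3j, 0.4 + 0.7j]
x_hat, v_hat, u, phi = lmmse_two_user(H_eff, y, [0.0, 0.0], [1.0, 1.0], 0.5)
print("extrinsic SINRs:", phi)
```

For orthogonal channel columns the extrinsic SINR reduces to the per-user matched-filter SNR and no longer depends on the priors, which is a convenient sanity check of the combining rule.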

#### II-D3 Extrinsic Transfer Function

The following proposition is proved in APPENDIX B.

Proposition 1 [53, 54]: Let , . The output of the LMMSE detector is an observation from an AWGN channel, i.e., $u_{i,t} = x_{i,t} + n^*_{i,t}$, with Signal-to-Interference-plus-Noise Ratio (SINR) $\rho_i = \phi_i(\mathbf{v}_{\bar{x}})$. (The "*" indicates that $n^*_{i,t}$ is not the channel noise, but an equivalent noise that includes the interference.)

With Proposition 1, we can define the extrinsic LMMSE SINR-Variance transfer function of user as

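$$\phi_i(\mathbf{v}_{\bar{x}}) = v_{\hat{x}_i}^{-1} - v_i^{-1}, \quad \text{for } i \in \mathcal{N}_u. \tag{10}$$

The a-posteriori MSE of the LMMSE detector for user $i$ is

$$\mathrm{mmse}^{est}_{ap,i}(\mathbf{v}_{\bar{x}}) = v_{\hat{x}_i}. \tag{11}$$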

Furthermore, Proposition 1 will be used to derive the area properties of MIMO-NOMA (see Section III-B).

Remark: The variance $v_i$ varies from 0 to 1, because the signal power is normalized to 1. From (4), the output estimation of user $i$ depends on the input variances of all the users. Thus, the SINR-Variance transfer functions of all users interfere with each other. In addition, $\phi_i(\mathbf{v}_{\bar{x}})$ is monotonically decreasing in $\mathbf{v}_{\bar{x}}$, which means that the lower the input variances of the users, the higher the output SINR of the detector.

### II-E Complexity of Iterative LMMSE Detection

From (8), the complexity of LMMSE estimator is , where (or ) arises from the matrix inverse calculation, (or ) from the matrix multiplication, and “” from Matrix Inversion Lemma. Hence, the total complexity of iterative LMMSE detection is , where is the number of iterations and denotes the single-user decoding complexity per iteration. Note that the complexity of LMMSE detector is much lower than the optimal MUD whose complexity grows exponentially with and , and polynomially with .

## III Matching Conditions and Area Theorems

In [49, 48, 43], the I-MMSE theorem and the area theorems for the P2P communication systems are proposed. In this section, these results are generalized to the MIMO-NOMA systems.

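### III-A Matching Conditions of MIMO-NOMA

#### III-A1 SINR-Variance Transfer Chart

The iterative receiver performs iteration between the detector and the decoders, which are described by $\phi$ and $\psi$ respectively. Hence, the iteration is tracked by

$$\boldsymbol{\rho}(\tau) = \phi(\mathbf{v}_{\bar{x}}(\tau - 1)), \quad \mathbf{v}_{\bar{x}}(\tau) = \psi(\boldsymbol{\rho}(\tau)), \quad \tau = 1, 2, \cdots. \tag{12}$$

Eq. (12) converges to a fixed point $\mathbf{v}^*_{\bar{x}}$, which satisfies

$$\phi(\mathbf{v}^*_{\bar{x}}) = \psi^{-1}(\mathbf{v}^*_{\bar{x}}) \quad \text{and} \quad \phi(\mathbf{v}_{\bar{x}}) > \psi^{-1}(\mathbf{v}_{\bar{x}}), \quad \text{for } \mathbf{v}^*_{\bar{x}} < \mathbf{v}_{\bar{x}} \le \mathbf{1},$$

where $\psi^{-1}$ denotes the inverse of $\psi$, which exists since $\psi$ is continuous and monotonic [55]. In this paper, all the inequalities for vectors or matrices are component-wise inequalities. The inequality $\mathbf{v}_{\bar{x}} \le \mathbf{1}$ comes from the normalized signal power of $x_{i,t}$, $i \in \mathcal{N}_u$.

As shown in Fig. 2, if $\mathbf{v}^*_{\bar{x}} = \mathbf{0}$, then all the transmissions can be correctly recovered, which means that $\phi(\mathbf{v}_{\bar{x}}) > \psi^{-1}(\mathbf{v}_{\bar{x}})$ for any available $\mathbf{v}_{\bar{x}}$, i.e., the decoders’ transfer function $\psi^{-1}$ lies below that of the detector $\phi$.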
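The recursion (12) is easy to trace numerically in the scalar case where all users share one variance. The sketch below is only an illustration under stated assumptions: `phi_toy` is the LMMSE curve of a hypothetical system with two equal-power users whose unit-norm channel columns have correlation `CORR` (obtained here by inverting the corresponding 2x2 matrix by hand), and `psi_toy` is an arbitrary monotonically decreasing stand-in for a decoder curve, not the transfer function of any real code.

```python
import math

SNR, CORR = 4.0, 0.5   # hypothetical per-user SNR and column correlation

def phi_toy(v):
    """Detector: extrinsic SINR as a function of the common input variance v."""
    return SNR - (SNR * CORR) ** 2 / (SNR + 1.0 / v)

def psi_toy(rho):
    """Decoder stand-in: extrinsic variance as a function of the input SINR."""
    return math.exp(-rho)

def track_iteration(phi, psi, v0=1.0, max_iter=50, tol=1e-12):
    """Run (12) from v0 and return the sequence of decoder output variances."""
    history = [v0]
    v = v0
    for _ in range(max_iter):
        rho = phi(v)        # detector step: rho(tau) = phi(v(tau - 1))
        v_new = psi(rho)    # decoder step:  v(tau)   = psi(rho(tau))
        history.append(v_new)
        converged = abs(v_new - v) < tol
        v = v_new
        if converged:
            break
    return history

history = track_iteration(phi_toy, psi_toy)
print(f"fixed point v* ~ {history[-1]:.6f} after {len(history) - 1} iterations")
```

With this toy decoder the fixed point is small but non-zero, so the curves intersect before the variance reaches zero; a decoder curve lying entirely below the detector curve would instead drive the variance to zero, which is the situation described for Fig. 2.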

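#### III-A2 Matching Conditions

The detector and decoders are matched if

$$\phi(\mathbf{v}_{\bar{x}}) = \psi^{-1}(\mathbf{v}_{\bar{x}}), \quad \text{for } \mathbf{0} < \mathbf{v}_{\bar{x}} \le \mathbf{1}. \tag{13}$$

Therefore, we obtain the following proposition.

Proposition 2: For any $i \in \mathcal{N}_u$, the matching conditions of the iterative MIMO-NOMA systems can be rewritten as

$$\psi_i(\rho_i) = \phi_i^{-1}(\phi_i(1)) = 1, \quad \text{for } 0 \le \rho_i < \phi_i(1); \tag{14}$$

$$\psi_i(\rho_i) = \phi_i^{-1}(\rho_i), \quad \text{for } \phi_i(1) \le \rho_i < \phi_i(0); \tag{15}$$

$$\psi_i(\rho_i) = 0, \quad \text{for } \phi_i(0) \le \rho_i < \infty. \tag{16}$$

###### Proof:

Eq. (13) means that $\psi_i(\rho_i) = \phi_i^{-1}(\rho_i)$ for any $\rho_i$ in the range of $\phi_i$. First, we have $\phi_i(1) > 0$, since the detector always uses the information from the channel. Hence, we get $\psi_i(\rho_i) = 1$ for $0 \le \rho_i < \phi_i(1)$. Second, we have $\phi_i(0) < \infty$, since the detector cannot remove the uncertainty introduced by the channel noise. Hence, we get $\psi_i(\rho_i) = 0$ for $\rho_i \ge \phi_i(0)$. Finally, $\phi_i^{-1}$ exists due to the monotonicity of $\phi_i$. Therefore, we have (14)-(16).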

Proposition 2 will be used in the area properties and rate analysis of MIMO-NOMA.

### III-B Area Properties

Let $snr^{dec}_{pri,i}$ denote the SNR of the a-priori message for decoder $i$, $snr^{est}_{ext,i}$ be the SNR of the extrinsic message for user $i$ at the detector, $\mathrm{mmse}^{est}_{ap,i}$ be the a-posteriori variance of the message for user $i$ at the detector, and $\mathrm{mmse}^{dec}_{ap,i}$ be the a-posteriori variance of the message at decoder $i$. Besides, . The following proposition gives the area properties of the iterative detection, which will be used to derive the user rate of MIMO-NOMA.

Proposition 3: The achievable rate $R_i$ of user $i$ and an upper bound of $R_i$ are given by

$$R_i = \int_0^\infty \mathrm{mmse}^{dec}_{ap,i}(snr^{dec}_{pri,i})\, d\,snr^{dec}_{pri,i}, \tag{17}$$

$$R_i^{max} = \int_0^\infty \mathrm{mmse}^{est}_{ap,i}(snr^{est}_{ext})\, d\,snr^{est}_{ext,i}, \tag{18}$$

where $R_i \le R_i^{max}$, $i \in \mathcal{N}_u$, and the equality holds if and only if the SINR-Variance transfer functions of the detector and user decoders are matched with each other, i.e., the matching conditions (13)-(16) hold.

From (4), (6) and Proposition 1, we have , , and . Therefore, we have the following corollary from Proposition 3.

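Corollary 1: With the SINR-Variance transfer functions $\phi$ and $\psi$, the achievable rate $R_i$ of user $i$ and an upper bound of $R_i$ are

$$R_i = \int_0^\infty \left(\rho_i + \psi_i(\rho_i)^{-1}\right)^{-1} d\rho_i, \tag{19}$$

$$R_i^{max} = \int_0^\infty v_{\hat{x}_i}(\mathbf{v}_{\bar{x}})\, d\phi_i(\mathbf{v}_{\bar{x}}), \tag{20}$$

respectively, and $R_i \le R_i^{max}$, $i \in \mathcal{N}_u$, where the equality holds if and only if the matching conditions (13)-(16) hold.

Now, the achievable rates can be calculated by (20) or (19) together with (13) and the matching conditions (14)-(16).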
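As a numerical illustration of (19) with the matched decoder (14)-(16), the sketch below integrates the rate for a hypothetical symmetric two-user system and compares twice the per-user rate with the sum capacity (4). The detector curve `phi`, its inverse, the parameters `SNR` and `CORR`, and the use of natural logarithms (nats) are assumptions of this example; the closed form of `phi` was obtained by hand for two equal-power users with unit-norm channel columns of correlation `CORR`.

```python
import math

SNR, CORR = 4.0, 0.5            # hypothetical per-user SNR and column correlation
K2 = (SNR * CORR) ** 2

def phi(v):
    """Detector SINR-variance curve of the toy symmetric two-user system."""
    return SNR - K2 / (SNR + 1.0 / v)

def phi_inv(rho):
    """Inverse of phi on [phi(1), phi(0)), where phi(0) = SNR."""
    return 1.0 / (K2 / (SNR - rho) - SNR)

def psi_matched(rho):
    """Matched decoder curve following (14)-(16)."""
    if rho < phi(1.0):
        return 1.0
    if rho < SNR:
        return phi_inv(rho)
    return 0.0

def rate_by_area(n=20000):
    """Midpoint-rule evaluation of (19) in nats.

    The integrand vanishes for rho >= phi(0) because psi = 0 there,
    so the integration stops at phi(0) = SNR.
    """
    step = SNR / n
    total = 0.0
    for k in range(n):
        rho = (k + 0.5) * step
        total += step / (rho + 1.0 / psi_matched(rho))
    return total

rate = rate_by_area()
sum_capacity = math.log((1.0 + SNR) ** 2 - K2)   # eq. (4) for this 2x2 Gram matrix
print(f"2 * R = {2 * rate:.6f} nats, sum capacity = {sum_capacity:.6f} nats")
```

In this toy case the two numbers agree to within the quadrature error, which is consistent with the claim that a matched iterative LMMSE receiver achieves the capacity of symmetric MIMO-NOMA.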

## IV Achievable Rate of Iterative LMMSE Detector

User achievable rate is derived for the iterative MIMO-NOMA in this section. The Superposition Coded Modulation (SCM) code is employed for the Forward Error Correction (FEC) code. We show that the achievable rate of iterative LMMSE can achieve the capacity of symmetric MIMO-NOMA and sum capacity of asymmetric MIMO-NOMA.

### IV-A Achieving the Sum Capacity of Asymmetric MIMO-NOMA

For a general asymmetric MIMO-NOMA, achievable rate analysis becomes more complicated due to the following challenges.

• All the users’ transfer functions interfere with each other at the detector, i.e., any output of the detector relies on every variance of the input messages from the decoders.

• All the transfer curves of the decoders are required to lie below that of the detector.

• The detector and decoders are associated with each other. It is intractable to optimize over an abstract class of decoder transfer functions for each user.

#### IV-A1 Transfer-Constraint Parameter

The area theorem tells us that the achievable rate of every user is maximized if and only if its transfer function matches that of the detector. Therefore, we can fix the transfer functions of the detector, and then obtain the users’ achievable rates by matching the decoders’ transfer functions with the detector.

To make the analysis feasible, we consider a transfer constraint for the input variances of the detector:

$$\gamma_i(v_i^{-1} - 1) = \gamma_j(v_j^{-1} - 1), \quad \text{for any } i, j \in \mathcal{N}_u. \tag{21}$$

Let $\boldsymbol{\gamma} = [\gamma_1, \cdots, \gamma_{N_u}]$ be the transfer-constraint parameter of the iterative LMMSE detection. Without loss of generality, we assume and , that is, for any .

Different values of $\boldsymbol{\gamma}$ give different variance tracks. Furthermore, different variance tracks correspond to different achievable rates of the users, i.e., the users’ achievable rates can be adjusted by the transfer-constraint parameter $\boldsymbol{\gamma}$.

Fig. 3 and Fig. 4 present the variance tracks with different values of $\boldsymbol{\gamma}$ for two-user and three-user MIMO-NOMA systems respectively. As we can see, (21) includes the symmetric case (i.e. ) and all the SIC points (maximal extreme points of the capacity region). If , for any , we obtain the SIC points with the decoding order , which is a permutation of . The blue curve and green curves in Fig. 3 and Fig. 4 correspond to the SIC cases.
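The variance tracks defined by (21) can be computed directly: once one user's variance is fixed, the constraint determines all the others. The helper below is a small sketch with hypothetical names (`variance_track`, `ref`) and example values of $\boldsymbol{\gamma}$ chosen only for illustration.

```python
def variance_track(v_ref, gammas, ref=0):
    """Return all variances v_i that satisfy (21), given the variance of one user.

    v_ref: variance of the reference user (0 < v_ref <= 1).
    gammas: transfer-constraint parameters gamma_i (all positive).
    ref: index of the reference user.
    """
    # Common value gamma_i * (1 / v_i - 1) shared by every user
    common = gammas[ref] * (1.0 / v_ref - 1.0)
    return [1.0 / (1.0 + common / g) for g in gammas]

# Example: trace a three-user track as the reference variance decreases from 1
gammas = [1.0, 0.5, 0.1]
for v1 in (1.0, 0.5, 0.25, 0.05):
    track = variance_track(v1, gammas)
    print(v1, [round(v, 4) for v in track])
```

Users with a smaller $\gamma_i$ see their variance fall faster along the track, so a very small ratio between two parameters approximates decoding one user almost completely before the other starts to improve.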

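#### IV-A2 Transfer Function

With the transfer constraint in (21), we have

$$\mathbf{V}_{\bar{x}}^{-1} = \mathbf{I}_{N_u} + \gamma_i(v_i^{-1} - 1)\boldsymbol{\Lambda}_\gamma^{-1} = \mathbf{V}_{\bar{x}}^{-1}(v_i) \tag{22}$$

and

$$\mathbf{V}_{\hat{x}} = (\sigma_n^{-2}\mathbf{H}'^H\mathbf{H}' + \mathbf{V}_{\bar{x}}^{-1})^{-1} = (\sigma_n^{-2}\mathbf{H}'^H\mathbf{H}' + \mathbf{V}_{\bar{x}}^{-1}(v_i))^{-1} = \mathbf{V}_{\hat{x}}(v_i), \tag{23}$$

where $\boldsymbol{\Lambda}_\gamma$ is a diagonal matrix whose diagonal entries are $\gamma_1, \cdots, \gamma_{N_u}$. Thus, we have

$$\phi_i(\mathbf{v}_{\bar{x}}) = v_{\hat{x}_i}(v_i)^{-1} - v_i^{-1} = \phi_i(v_i) = \rho_i. \tag{24}$$

For example, if we take $v_1$ as the free variable, we have

$$\mathbf{V}_{\bar{x}}^{-1} = \mathbf{V}_{\bar{x}}^{-1}(v_1), \quad \mathbf{V}_{\hat{x}} = \mathbf{V}_{\hat{x}}(v_1), \quad \text{and} \quad \phi_i(\mathbf{v}_{\bar{x}}) = \phi_i(v_1). \tag{25}$$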

#### IV-A3 Asymmetric Matching Condition

With the transfer constraint, the matching conditions are simplified as follows.

Proposition 5: Based on (24), for $i \in \mathcal{N}_u$, the matching conditions (13) can be rewritten as

$$\psi_i(\rho_i) = \phi_i^{-1}(\phi_i(1)) = 1, \quad \text{for } 0 \le \rho_i < \phi_i(1); \tag{26}$$

$$\psi_i(\rho_i) = \phi_i^{-1}(\rho_i), \quad \text{for } \phi_i(1) \le \rho_i < \phi_i(0); \tag{27}$$

$$\psi_i(\rho_i) = 0, \quad \text{for } \phi_i(0) \le \rho_i < \infty. \tag{28}$$

###### Proof:

From (25), we have and . Substituting these into (14)-(16), we obtain Proposition 5.

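#### IV-A4 User Achievable Rate

The users’ achievable rates are given by the following lemma.

Lemma 1: For the asymmetric MIMO-NOMA with any number of users and any number of BS antennas, the achievable rate of user $i$ for iterative LMMSE detection is

$$R_i = \int_{v_1 = 1}^{v_1 = 0}\left[v_1 - \gamma_i^{-1}\left[\mathbf{V}_{\hat{x}}(v_1)\right]_{i,i}\right] dv_1^{-1} - \log(\gamma_i), \tag{29}$$

where $i \in \mathcal{N}_u$, and $[\cdot]_{i,i}$ denotes the $i$-th diagonal entry of the matrix.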

###### Proof:

See APPENDIX C.

Lemma 1 gives the achievable rate of each user, but it is a complicated integral from which the specific relationship between the achievable rates and $\boldsymbol{\gamma}$ is not apparent.

Remark: When for , and for a symmetric system with: (i) the same rate for ; (ii) the same power , Theorem 1 degenerates to Corollary 2.

#### IV-A5 Achievable Sum Rate

Although it is difficult to give the exact achievable rate region, the iterative LMMSE detection is shown to be sum-capacity achieving.

Theorem 1: For any and , the iterative LMMSE detection achieves the sum capacity of MIMO-NOMA, i.e., .

###### Proof:

See APPENDIX F.

Theorem 1 shows that for a general asymmetric MIMO-NOMA, from the sum rate perspective, the LMMSE detector is an optimal detector without losing any useful information during the estimation.

#### IV-A6 Monotonicity of Achievable Rate

The following lemma shows the monotonicity of the achievable rate in (29).

Lemma 2: The achievable rate of user increases monotonically with and decreases monotonically with , where and .

###### Proof:

It is easy to see that (or ) increases monotonically with and decreases monotonically for and . Thus, based on Proposition 3, we have that increases monotonically with and decreases monotonically for .

Lemma 2 is important for user rate adjustment, i.e., if we want to increase the rate of user , we only need to increase . Besides, the monotonicity is also important for the practical iterative detection design.

### IV-B Achieving the Capacity of Symmetric MIMO-NOMA

Next, we consider a simple symmetric MIMO-NOMA system, in which all users have the same power and the same rate, i.e., and , for .

#### IV-B1 Transfer Function

Since all the users have the same conditions, they also have the same transfer functions, which means and , for any . Therefore, the transfer functions are derived as:

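$$\begin{aligned}
v_{\hat{x}_i}(\mathbf{v}_{\bar{x}}) &\overset{(a)}{=} \frac{1}{N_u}\,\mathrm{mmse}^{est}_{ap}(\mathbf{v}_{\bar{x}}) = \frac{1}{N_u}\,\mathrm{Tr}\{\mathbf{V}_{\hat{x}}\} \\
&= \frac{1}{N_u}\,\mathrm{Tr}\left\{\left(\sigma_n^{-2} w^2\,\mathbf{H}^{H}\mathbf{H} + v^{-1}\mathbf{I}_{N_u}\right)^{-1}\right\} \\
&= v_{\hat{x}}(v),
\end{aligned} \tag{30}$$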

and

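$$\begin{aligned}
\phi_i(\mathbf{v}_{\bar{x}}) &\overset{(b)}{=} v_{\hat{x}}(v)^{-1} - v^{-1} \\
&= \left(\frac{1}{N_u}\,\mathrm{Tr}\left\{\left(\sigma_n^{-2} w^2\,\mathbf{H}^{H}\mathbf{H} + v^{-1}\mathbf{I}_{N_u}\right)^{-1}\right\}\right)^{-1} - v^{-1} \\
&= \phi(v) = \rho,
\end{aligned} \tag{31}$$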

where equalities (a) and (b) follow from the symmetry assumption. Similarly, we have , .

#### IV-B2 Matching Condition

Since all the users are symmetric, Proposition 2 can be simplified as follows.

Proposition 4: The matching conditions of the iterative symmetric MIMO-NOMA system are given by

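$$\psi(\rho) = \phi^{-1}(\phi(1)) = 1, \quad \text{for } 0 \le \rho < \phi(1); \tag{32}$$

$$\psi(\rho) = \phi^{-1}(\rho), \quad \text{for } \phi(1) \le \rho < \phi(0); \tag{33}$$

$$\psi(\rho) = 0, \quad \text{for } \phi(0) \le \rho < \infty. \tag{34}$$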
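The piecewise rule in (32)-(34) can be sketched numerically as follows. This is an illustrative sketch, not the authors' implementation: `make_phi` and `psi` are hypothetical helper names, and $\phi(v)$ from (31) is rewritten through the eigenvalues $\lambda_k$ of $\sigma_n^{-2} w^2 \mathbf{H}^H\mathbf{H}$ in the algebraically equivalent form $\phi(v) = \mathrm{mean}_k[\lambda_k/(1+\lambda_k v)] \,/\, \mathrm{mean}_k[1/(1+\lambda_k v)]$, which stays finite at $v = 0$. The inverse $\phi^{-1}$ is found by bisection, which assumes $\phi$ is non-increasing in $v$.

```python
import numpy as np

def make_phi(H, sigma2, w=1.0):
    """Return phi(v) of (31), written via eigenvalues so that v = 0 is allowed."""
    lam = np.linalg.eigvalsh((w ** 2 / sigma2) * (H.conj().T @ H))

    def phi(v):
        q = 1.0 / (1.0 + lam * v)
        # Equivalent to ((1/N_u) Tr{(A + I/v)^{-1}})^{-1} - 1/v for v > 0.
        return float(np.mean(lam * q) / np.mean(q))

    return phi

def psi(rho, phi, n_iter=60):
    """Matched decoder transfer function following (32)-(34)."""
    if rho < phi(1.0):          # (32): the decoder output variance stays at 1
        return 1.0
    if rho >= phi(0.0):         # (34): the decoder output variance is 0
        return 0.0
    lo, hi = 0.0, 1.0           # invariant: phi(lo) > rho >= phi(hi)
    for _ in range(n_iter):     # (33): invert phi by bisection on v
        mid = 0.5 * (lo + hi)
        if phi(mid) > rho:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For a toy channel whose Gram matrix has eigenvalues 1 and 3, $\phi(v) = (2+3v)/(1+2v)$, so $\phi(1) = 5/3$, $\phi(0) = 2$, and `psi(1.8, phi)` returns approximately 1/3.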

#### IV-B3 Achievable Rate

In this case, the analysis of symmetric MIMO-NOMA degenerates into that of a single-user, single-antenna system. From the transfer functions and matching conditions above, we obtain the following corollary.

Corollary 2: For a symmetric MIMO-NOMA with any and that: (i) ; (ii) ; the iterative LMMSE detection achieves the capacity, i.e., , and .

Corollary 2 shows that for a symmetric MIMO-NOMA system, the iterative detection structure is optimal, i.e., the LMMSE detector is an optimal detector without losing any useful information during the estimation.
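The claim can be checked numerically on a toy example. The sketch below rests on two assumptions that are not stated in this section: the rate of a matched code is taken to be the area expression $R = \int_0^\infty (\rho + \psi(\rho)^{-1})^{-1}\,\mathrm{d}\rho$ (in nats), and the per-user capacity is taken to be $\frac{1}{N_u}\log\det(\mathbf{I} + \sigma_n^{-2} w^2 \mathbf{H}^H\mathbf{H})$. The function name `symmetric_rate_check` is hypothetical, and $\phi(v)$ is the transfer function (31) expressed through eigenvalues.

```python
import numpy as np

def symmetric_rate_check(H, sigma2, w=1.0, n_grid=20001):
    """Compare the area-integral rate of a matched code with the per-user capacity."""
    lam = np.linalg.eigvalsh((w ** 2 / sigma2) * (H.conj().T @ H))
    v = np.linspace(1.0, 0.0, n_grid)            # track psi = v from 1 down to 0
    q = 1.0 / (1.0 + lam[:, None] * v[None, :])
    rho = np.mean(lam[:, None] * q, axis=0) / np.mean(q, axis=0)   # rho = phi(v)
    # For rho < phi(1), psi = 1 and the integrand is 1/(rho + 1).
    flat_part = np.log1p(rho[0])
    # For phi(1) <= rho < phi(0), psi = v and the integrand is v / (1 + rho * v).
    f = v / (1.0 + rho * v)
    curved_part = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(rho))    # trapezoid rule
    capacity = np.sum(np.log1p(lam)) / lam.size
    return flat_part + curved_part, capacity
```

For a Gram matrix with eigenvalues 1 and 3, both returned values agree with $\tfrac{1}{2}\ln 8$ up to the discretization error of the trapezoid rule.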

### IV-C Practical Iterative LMMSE Detection Design

It should be noted that the code design also depends on $\boldsymbol{\gamma}$. Since we cannot get a closed-form solution of the user rate with respect to $\boldsymbol{\gamma}$, it is hard to obtain the proper $\boldsymbol{\gamma}$ for the given user rates. Nevertheless, Algorithm 1 provides a numeric solution of $\boldsymbol{\gamma}$ that satisfies the user rate requirement.

For any and , Algorithm 1 gives a numeric search of given rate , where is the maximum number of iterations, and indicate the allowed precision, and denotes the 1-norm. It should be noted that in step 6 definitely exists and can be easily found by bisection (dichotomy) or quadratic interpolation, since increases monotonically with (Lemma 2). In addition, steps ensure that the new is always better than the previous one, and the search will not stop until the requirement is met. Experimentally, we find that the points in the system capacity region are always achievable.
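The one-dimensional search mentioned for step 6 can be sketched as a plain bisection on a monotonically increasing function. This is a generic illustration and not Algorithm 1 itself: `bisect_increasing` and `rate_fn` are hypothetical names, with `rate_fn` standing in for a user's achievable rate as a function of the single parameter being tuned.

```python
from math import log1p

def bisect_increasing(rate_fn, target, lo, hi, tol=1e-9, max_iter=200):
    """Find x in [lo, hi] with rate_fn(x) close to target, assuming rate_fn is increasing."""
    if not (rate_fn(lo) <= target <= rate_fn(hi)):
        raise ValueError("target rate is not bracketed by [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if rate_fn(mid) < target:
            lo = mid            # rate too small: move the lower end up
        else:
            hi = mid            # rate large enough: move the upper end down
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Toy usage with a stand-in rate function log(1 + x), in nats.
x_star = bisect_increasing(log1p, 1.0, 0.0, 10.0)
```

Monotonicity is what guarantees that the bracket always contains the solution, which is why Lemma 2 matters for this step.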

## V Important Properties and Special Cases of Iterative LMMSE detection

Can the iterative LMMSE detection achieve all points in the capacity region of asymmetric MIMO-NOMA? To answer this question, we derive some properties and show that:

• for the two-user MIMO-NOMA, all points in the capacity region can be achieved by iterative LMMSE detection;

• all the maximal extreme points in the capacity region of MIMO-NOMA with any number of users can be achieved by iterative LMMSE detection.

Furthermore, MISO and SIMO are discussed as two special cases, which show that the ESE in IDMA and MRC are sum capacity optimal for MISO and SIMO respectively.

### V-A Achieving the Maximal Extreme Point

As stated in the Capacity Region Domination Lemma in APPENDIX G, the system capacity region is dominated by a convex combination of the maximal extreme points, which can be achieved by SIC.

Here, we show that all these maximal extreme points can be achieved by iterative LMMSE detection when the transfer-constraint parameter is properly chosen.

Corollary 3: For any and , all the maximal extreme points in the capacity region of MIMO-NOMA can be achieved by iterative LMMSE detection.

###### Proof:

See APPENDIX H.

This corollary shows that if the parameter is properly chosen, the iterative LMMSE detection degenerates into the SIC methods, i.e., the SIC methods are special cases of the proposed iterative LMMSE detection.
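For reference, the SIC corner points themselves can be computed with the standard Gaussian multiple-access chain rule. The sketch below uses that textbook expression rather than anything quoted from APPENDIX H; `sic_corner_point` and `H_eff` are hypothetical names, `H_eff` is assumed to carry each user's power weight in its column, and rates are in nats.

```python
import numpy as np

def sic_corner_point(H_eff, sigma2, order):
    """Rates at the SIC corner point for a given decoding order (first entry decoded first)."""
    n_rx, n_users = H_eff.shape
    rates = np.zeros(n_users)

    def logdet(cols):
        # log det(I + sigma^{-2} * sum of h_j h_j^H over the listed users)
        Hs = H_eff[:, cols]
        return np.linalg.slogdet(np.eye(n_rx) + Hs @ Hs.conj().T / sigma2)[1]

    for k, user in enumerate(order):
        remaining = list(order[k:])     # users not yet decoded, including `user`
        # The current user treats the still-undecoded users as interference.
        rates[user] = logdet(remaining) - logdet(remaining[1:])
    return rates
```

Whatever the decoding order, the returned rates add up to $\log\det(\mathbf{I} + \sigma_n^{-2}\mathbf{H}'\mathbf{H}'^{H})$, which is why every such corner point lies on the sum-capacity face of the region.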

### V-B Two-User MIMO-NOMA

As mentioned above, it is hard to calculate the specific achievable user rates from (29). However, in the two-user case, the achievable rate region can be calculated, and it equals the capacity region of MIMO-NOMA.

Theorem 3: Iterative LMMSE detection achieves the whole capacity region of two-user MIMO-NOMA:

 ⎧⎪ ⎪ ⎪ ⎪ ⎪⎨⎪ ⎪ ⎪ ⎪ ⎪⎩R1≤