I Introduction
With the popularization of the Internet and intelligent technology, the number of communication devices is predicted to reach 40.9 billion in 2020 [1], driven by new communication scenarios such as machine-to-machine communications [2, 3], the Internet of things [4], and vehicle-to-vehicle (V2V) communications [5]. Because available spectrum resources are limited, the orthogonal multiple access technology of the fourth-generation (4G) communication system cannot satisfy such massive access demands. As a result, Non-Orthogonal Multiple Access (NOMA) [6, 8, 7, 9, 10, 11, 12, 13, 16, 17, 14, 15, 18, 19], which allows multiple users to share the same time and frequency resources, has emerged to support heavily overloaded communications. To further improve spectral efficiency and reduce latency, NOMA combined with Multiple-Input Multiple-Output (MIMO) [20, 21], termed MIMO-NOMA [22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39], is considered a key air interface technology in the fifth-generation (5G) communication system [40, 41].
Theoretical analysis has proved that MIMO-NOMA systems can achieve higher capacity than the orthogonal multi-user MIMO systems of 4G [23]. From the application perspective, multiple users in MIMO-NOMA are separated by different transmission powers [24, 25, 26] or different channel codes [28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39], where the former employs a Successive Interference Cancellation (SIC) receiver and the latter relies on joint iterative multi-user decoding.
In MIMO-NOMA systems with an SIC receiver [24, 25, 26], different users are allocated different power levels, and the SIC receiver decodes and then removes the interference of each user in descending order of channel gain [20]. Although the power-allocation system is simple to implement, the SIC receiver has three inherent problems in practice: (1) error propagation, i.e., residual errors of earlier decoded users still affect the decoding of later users; (2) sensitivity of the receiver performance to the accuracy of channel state information (CSI); and (3) potentially large decoding latencies for later users, especially when the number of users is large.
In MIMO-NOMA systems with joint iterative multi-user decoding [28, 29, 30, 31, 32, 33, 34], different users are allocated different codes before transmission, and the joint iterative multi-user decoder detects the signals of all users simultaneously. On the transmitter side, Sparse Code Multiple Access (SCMA), a kind of NOMA, is considered in [28, 30, 29, 31], in which multiple users are allocated different sparse signature codes for user separation. Works [30] and [31] considered codebook design for SCMA based on the criteria of maximizing the minimum code distance and the mutual information, respectively. However, the design involves a joint optimization of multiple users' codes, which becomes extremely difficult when the number of users is large. Works [32, 33, 34, 35, 36, 37, 38] proposed several low-complexity multi-user detection schemes for MIMO-NOMA with near-optimal performance, such as Gaussian message passing detection (GMPD), integer-forcing detection, and Linear Minimum Mean-Square Error (LMMSE) detection. In particular, work [39] proved that the LMMSE detector can achieve the capacity region of the MIMO-NOMA system when the employed channel code possesses an EXtrinsic Information Transfer (EXIT) property that perfectly matches that of the LMMSE detector. Unfortunately, these works did not provide any practical channel code design for MIMO-NOMA with these multi-user detection schemes. This motivates us to design practical channel codes with low-complexity encoding and decoding to achieve this goal.
In this paper, we consider practical code design for the uplink MIMO-NOMA system that takes implementation complexity and performance into account at the same time. The major contributions of this paper are summarized as follows.

An asymptotic analysis is proposed to trace the EXIT property between the LMMSE detector and the message-passing decoders.

Based on the asymptotic EXIT analysis, we design multi-user encoders such that the message-passing decoders match the LMMSE detector from the iterative decoding perspective. The asymptotic performance of the proposed code is only 0.2 dB away from the channel capacity.

We show that the proposed system is robust to various code lengths, iteration numbers, and channel conditions via simulations, and is implementable with a low decoding complexity of , where , , and denote the number of users, receive antennas, and iterative detections respectively.
It should be emphasized that, compared with precoded MIMO-NOMA, where precoding is generally used for beamforming, power allocation, and user pairing [25, 26, 27], the proposed system does not require instantaneous CSI at the user transmitters. Moreover, since our code design aligns with the joint iterative multi-user decoding, a significant coding gain is achieved compared with a MIMO-NOMA system that uses a conventional channel code designed for the point-to-point channel. Therefore, the proposed system can be an attractive solution for the MIMO-NOMA uplink in 5G communications.
The rest of this paper is organized as follows. In Section II, the model and challenges of MIMO-NOMA are presented. The asymptotic EXIT analysis between the LMMSE detector and the message-passing decoders is detailed in Section III. Section IV provides a practical coding scheme for the MIMO-NOMA system together with analyses of its complexity and performance. Section V presents various simulations to validate the reliability and robustness of the proposed MIMO-NOMA system. Finally, Section VI concludes this paper and outlines future work.
II System Model and Challenges
In this section, the system model of uplink coded MIMO-NOMA is presented. Subsequently, the challenges in the designs of the transmitter and receiver are discussed, which motivate the overall system design with the goal of achieving capacity-approaching performance at low implementation complexity.
II-A System Model
Figure 1 illustrates an uplink coded MIMO-NOMA system, which includes $K$ single-antenna users and a base station (BS) equipped with $M$ antennas. At the user transmitters, the information sequence of each user is encoded by that user's encoder and then passed to the subsequent modulator. The generated symbol sequence is then transmitted over the channel. Here, we assume that all users have the same transmit rate and that the transmit power of each user is normalized.
When all transmitted signals from the users arrive at the BS synchronously, the received signal is

$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n}$,  (1)

where $\mathbf{x}$ denotes the transmitted signals from the users, $\mathbf{H}$ is the channel matrix from the users to the BS, and $\mathbf{n}$ is an additive Gaussian noise vector. Here, $(\cdot)^T$ denotes the transposition of a vector or matrix. We assume that $\mathbf{H}$ is available at the BS but unknown to the user transmitters.
The goal of the receiver at the BS is to recover the signals of all users. As shown in Fig. 1, the employed receiver consists of a multi-user detector (MUD) and a bank of single-user decoders, with iterative detection of all users' signals performed between the MUD and the single-user decoders. Specifically, based on the received signal and the a priori estimates derived from the decoders, the MUD outputs soft estimates for each transmitted symbol of each user. Based on these estimates from the MUD, single-user decoding is performed in each decoder, which feeds its output estimates back to the MUD. The iterative process stops when all signals are recovered successfully or the maximum iteration number is reached.
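As a concrete illustration of the system model in Eq. (1), the following minimal sketch generates one channel use for a small overloaded system. The bit-to-symbol mapping (0 to +1, 1 to -1), the unit-variance channel entries, and all function names are illustrative assumptions rather than choices stated in this paper.

```python
import random

def bpsk(bits):
    """Map bits to BPSK symbols (assumed mapping: 0 -> +1, 1 -> -1)."""
    return [1.0 - 2.0 * b for b in bits]

def random_channel(M, K, rng):
    """M x K channel matrix with i.i.d. real Gaussian entries (unit variance assumed)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(K)] for _ in range(M)]

def received_signal(H, x, sigma, rng):
    """Return y = Hx + n, where n has i.i.d. Gaussian entries with standard deviation sigma."""
    return [
        sum(h * s for h, s in zip(row, x)) + rng.gauss(0.0, sigma)
        for row in H
    ]

rng = random.Random(0)
K, M = 4, 2                      # hypothetical sizes: 4 users, 2 receive antennas
H = random_channel(M, K, rng)
x = bpsk([rng.randint(0, 1) for _ in range(K)])
y = received_signal(H, x, sigma=0.5, rng=rng)
```

Each of the `M` entries of `y` superimposes the symbols of all `K` users, which is the multi-user interference that the iterative receiver has to resolve.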
II-B Challenges
To enable the system to achieve capacity-approaching performance at low complexity, we discuss the challenges in the designs of the transmitter and receiver, and propose the corresponding solutions.
For the user transmitters, the challenge is to conceive an encoding scheme for each user so that the users' messages can be efficiently decoded by the multi-user decoder. Since this multi-user decoding involves separating the users' signals from a compound received signal, sophisticated encoding or preprocessing of each user's signal is required to realize this goal. Works [28, 29, 30, 31, 32, 33, 34] assign user-specific modulation schemes so that the signals of each user can be physically identified at the receiver. In contrast, we apply the same modulation scheme, the simplest BPSK modulation, for each user, and show that the signals of each user can be well recovered solely through our proposed channel coding and decoding. Unlike conventional point-to-point codes [42], which are designed specifically to overcome channel noise, the proposed code is designed to overcome not only the noise but also the multi-user interference from other non-orthogonal users. Therefore, we name the proposed code a multi-user code. The detailed design process for the multi-user code will be discussed below.
For the receiver, the challenge is to achieve capacity-approaching signal detection with low complexity. Although each component of the receiver can adopt an optimal algorithm [43], i.e., the Maximum A Posteriori (MAP) algorithm in the MUD and the A Posteriori Probability (APP) algorithm in the decoders, this optimal solution is severely limited by its prohibitive complexity, which increases exponentially with the number of users and the code length. As a result, we employ low-complexity LMMSE detection in the MUD and message-passing decoding in the decoders, which decompose the overall signal recovery into distributed low-complexity calculations [44, 45, 47, 46]. Meanwhile, since LMMSE detection is proved to be capacity-approaching from the EXIT point of view under iterative decoding [39], our objective is to design a practical capacity-approaching code.
III Asymptotic Analysis of Iterative Receiver
In this section, an asymptotic EXIT analysis is proposed to trace the EXIT property between the LMMSE detector and the message-passing decoders. Based on this asymptotic analysis, the guideline for multi-user code design is provided.
Here, we consider a real-domain system, where each modulator employs BPSK, the elements of channel matrix obey a real Gaussian distribution , and the elements of channel noise obey a real Gaussian distribution . The analysis can be extended accordingly to complex-domain systems with high-order modulations.
III-A LMMSE Detection
The LMMSE detection is used for estimating the transmitted signals of each user. Since the signal estimation for each user is similar, we only focus on the detection of of user .
Based on the a priori log-likelihood ratio (LLR) from decoder , the mean and variance associated with are
where denotes the conditional expectation of variable when given variable . Let and , where denotes the diagonal matrix with diagonal elements . Based on received signal in Eq. (1), the a posteriori estimation of the LMMSE detector is [36]
(2) 
where denotes the deviation between the a posteriori estimated signal and the exact signal .
According to the message combining rule [48], the extrinsic mean and variance are obtained by excluding the a priori mean and variance from the a posteriori mean and variance :
(3) 
On the other hand, Eq. (2) can be rewritten as where is an identity matrix. Thus, can be rewritten as
(4)  
where is the th column of and denotes that the th element of the vector is set as zero.
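The two scalar steps used in this subsection, converting an a priori LLR into a mean and variance and then excluding the prior from the posterior, can be sketched as follows. The closed forms are the standard BPSK and Gaussian-message identities and are assumed here rather than copied from the equations of this paper; the LLR convention ln[P(x=+1)/P(x=-1)] and the function names are likewise assumptions.

```python
import math

def prior_from_llr(llr):
    """Mean and variance of a BPSK symbol x in {+1, -1} given its LLR.

    Assumes llr = ln(P(x=+1) / P(x=-1)), so E[x] = tanh(llr / 2)
    and Var[x] = 1 - E[x]^2.
    """
    mean = math.tanh(llr / 2.0)
    return mean, 1.0 - mean * mean

def combine(mean_a, var_a, mean_b, var_b):
    """Product of two Gaussian messages about the same variable."""
    var = 1.0 / (1.0 / var_a + 1.0 / var_b)
    return var * (mean_a / var_a + mean_b / var_b), var

def exclude_prior(post_mean, post_var, prior_mean, prior_var):
    """Extrinsic message: the posterior with the prior divided out.

    Requires post_var < prior_var, which holds whenever the detector
    adds information to the prior.
    """
    ext_var = 1.0 / (1.0 / post_var - 1.0 / prior_var)
    ext_mean = ext_var * (post_mean / post_var - prior_mean / prior_var)
    return ext_mean, ext_var
```

`exclude_prior` is the inverse of `combine`: combining an extrinsic message with a prior and then excluding the same prior returns the original extrinsic message, which is what prevents the decoder's own information from being fed back to it.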
III-B Asymptotic Analysis of LMMSE Detector
According to Eq. (4), a Gaussian assumption is employed to simplify the asymptotic analysis, which is commonly used in [36, 37].
Assumption 1: The output estimated signal of the LMMSE detector is equivalent to an observation from AWGN channel, i.e., , where , and is an equivalent Gaussian noise with mean and variance .
With Assumption 1, the output signal of the LMMSE detector can be estimated by tracing the variance of equivalent Gaussian noise . That is, when extrinsic variance decreases to gradually, , the estimated signals become more accurate. Note that the update of in Eq. (4) is determined by a priori variance , a posteriori variance , and extrinsic variance . Therefore, we need to trace the variance updates for the input-output signals of the LMMSE detector in the iterative detection process.
Based on a priori variance , a posteriori variance of signal from the LMMSE detector [36] is calculated by
where . Then, extrinsic variance of signal is obtained as
(5) 
where . Furthermore, for large-scale systems, i.e., , , and fixed system load , the asymptotic extrinsic variance is
(6) 
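For a finite-size check of the variance transfer described above, the following sketch computes the a posteriori and extrinsic variances of an LMMSE detector for a given channel realization. It assumes the standard LMMSE posterior covariance for the model in Eq. (1) with an identical a priori variance for all users, and it averages the posterior variances over users; these modelling choices and all names are assumptions, not the paper's exact expressions.

```python
def invert(A):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(A)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def lmmse_variances(H, sigma2, prior_var):
    """Average a posteriori and extrinsic variances of the LMMSE detector.

    Posterior covariance (assumed standard form):
        V = (H^T H / sigma2 + I / prior_var)^(-1)
    The extrinsic variance excludes the prior from the averaged posterior.
    """
    M, K = len(H), len(H[0])
    A = [[sum(H[m][i] * H[m][j] for m in range(M)) / sigma2
          + (1.0 / prior_var if i == j else 0.0)
          for j in range(K)] for i in range(K)]
    V = invert(A)
    post_var = sum(V[k][k] for k in range(K)) / K
    ext_var = 1.0 / (1.0 / post_var - 1.0 / prior_var)
    return post_var, ext_var
```

In the single-user, single-antenna case with unit channel gain, the extrinsic variance reduces to the noise variance, which gives a quick sanity check of the prior exclusion.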
III-C Asymptotic Analysis of Decoders
Although the asymptotic performance analysis of a channel decoder is usually given by the standard EXIT method with a mutual information measure as in [49, 50, 51, 52, 53], to match our variance-transfer analysis of the LMMSE detector, we need to transform the mutual information measure into a variance measure.
III-C1 LMMSE to DEC
Based on Assumption 1, the output estimated signal of LMMSE detector is equivalent to an observation from AWGN channel, i.e., , so that input LLR of decoder associated with is calculated by
where is the exponential function and is the natural logarithm. Then, the mean of is . According to the Gaussian assumption in EXIT analysis [51, 52], obeys Gaussian distribution . Thus, the a priori mutual information for decoder can be calculated by
(7) 
where .
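The conversion from the detector's extrinsic variance to the decoder's a priori mutual information can be evaluated numerically. The sketch below assumes the usual consistent-Gaussian LLR model for BPSK over an equivalent AWGN channel with noise variance `v` (LLR mean 2/v and variance 4/v) and the standard EXIT expression I = 1 - E[log2(1 + exp(-l))]; the function names and integration settings are illustrative assumptions.

```python
import math

def log2_one_plus_exp_neg(l):
    """Numerically stable log2(1 + exp(-l))."""
    if l >= 0.0:
        return math.log1p(math.exp(-l)) / math.log(2.0)
    return (-l + math.log1p(math.exp(l))) / math.log(2.0)

def mutual_information_from_variance(v, steps=2000):
    """A priori mutual information of a decoder fed by an AWGN-like detector output.

    Assumes BPSK with equivalent noise variance v, so the LLR is Gaussian
    with mean m = 2 / v and variance 2 * m. Trapezoidal integration over
    +/- 10 standard deviations around the mean.
    """
    m = 2.0 / v
    sd = math.sqrt(2.0 * m)
    lo, hi = m - 10.0 * sd, m + 10.0 * sd
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        l = lo + i * h
        pdf = math.exp(-(l - m) ** 2 / (4.0 * m)) / math.sqrt(4.0 * math.pi * m)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * pdf * log2_one_plus_exp_neg(l)
    return 1.0 - total * h
```

A small extrinsic variance maps to mutual information close to 1 bit, and a large variance maps to a value close to 0, so the function is monotonically decreasing in `v`.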
III-C2 EXIT Function of DEC
Based on the a priori mutual information, output mutual information is calculated by the EXIT function of decoder and is fed back to the LMMSE detector. Considering that the EXIT functions of low-density parity-check (LDPC)-like codes are simple [49, 50], the proposed multi-user code is designed based on the structures of LDPC-like codes, such that the corresponding EXIT function can be obtained readily.
III-C3 DEC to LMMSE
When the single-user decoding is finished, LLR associated with output signal of decoder is obtained. Then, the mean and variance of can be calculated by
According to the Gaussian approximation in EXIT analysis [51, 52], obeys Gaussian distribution , where and function is the inverse of . As a result, the variance of that is fed back to the LMMSE detector is
(8) 
where is calculated by the Monte Carlo simulations depending on the Gaussian distribution of .
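A Monte Carlo estimate of the variance fed back to the detector can be sketched as follows. It assumes that the decoder output LLR follows the consistent Gaussian distribution with mean `m` and variance `2 * m`, and that the conditional variance of a BPSK symbol given its LLR is 1 - tanh^2(l/2); in the analysis above `m` would come from inverting the mutual information function, whereas here it is simply passed in. The sample count and seed are arbitrary illustrative values.

```python
import math
import random

def decoder_output_variance(m, samples=20000, seed=0):
    """Estimate E[1 - tanh(l / 2)^2] for l ~ N(m, 2 * m) by Monte Carlo.

    m must be positive; a larger m means more reliable decoder outputs
    and therefore a smaller residual symbol variance.
    """
    rng = random.Random(seed)
    sd = math.sqrt(2.0 * m)
    acc = 0.0
    for _ in range(samples):
        l = rng.gauss(m, sd)
        acc += 1.0 - math.tanh(l / 2.0) ** 2
    return acc / samples
```

Because the estimate depends only on `m`, it can be tabulated once and reused across iterations of the variance-tracing procedure.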
The complete asymptotic EXIT analysis for the iterative receiver is provided in Algorithm 1. Note that the iterative detection between the LMMSE detector and the decoders is evaluated statistically by tracing the variances of the estimated signals, which is easy to implement. Meanwhile, by exploiting the proposed asymptotic analysis, the multi-user code based on the structures of LDPC-like codes can be designed and optimized readily.
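The variance-tracing loop can be illustrated generically as below. This is not a reproduction of Algorithm 1: the two transfer functions `toy_detector` and `toy_decoder` are hypothetical stand-ins chosen only so that the loop has something to iterate, and the real curves would come from the LMMSE analysis and the decoder EXIT function.

```python
def trace_variances(detector, decoder, v0=1.0, max_iter=50, tol=1e-6):
    """Alternate detector and decoder variance transfers until convergence.

    detector: maps the decoder's output variance to the detector's extrinsic variance.
    decoder:  maps the detector's extrinsic variance to the decoder's output variance.
    Returns the trajectory of decoder output variances, starting from v0.
    """
    v = v0
    trajectory = [v]
    for _ in range(max_iter):
        v_new = decoder(detector(v))
        trajectory.append(v_new)
        if abs(v_new - v) < tol:
            break
        v = v_new
    return trajectory

def toy_detector(v):
    """Hypothetical detector curve: noise floor plus residual interference."""
    return 0.1 + 0.5 * v

def toy_decoder(v_ext):
    """Hypothetical decoder curve, capped at the BPSK symbol variance of 1."""
    return min(1.0, v_ext ** 2)

trajectory = trace_variances(toy_detector, toy_decoder)
```

If the trajectory stalls at a large variance, the two transfer curves intersect early and decoding fails; a trajectory that descends to a small fixed point indicates an open tunnel between the curves.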
IV Practical Coding Scheme for MIMO-NOMA
In this section, we present a practical coding scheme for the MIMO-NOMA system. Subsequently, we analyze the complexity and the asymptotic performance of the overall MIMO-NOMA system.
IV-A Coding Scheme and Message-Passing Decoder
Since the transmitted signals are degraded by channel noise and multi-user interference at the same time, we propose a Multi-User Irregular Repeat-Accumulate (MU-IRA) code for the MIMO-NOMA system to overcome both kinds of interference. Fig. 2 shows the graph of the MU-IRA code structure of user , which consists of a repetition code, a non-systematic IRA code [42], and a user-specific interleaver . The parameters of the MU-IRA code include repetition number , combiner , degree distributions of information sequence , and code rate .
To explain the advantages of the proposed MU-IRA code, we briefly present the effect of each of its components.

Although a repetition code provides no coding gain in the point-to-point channel, it can provide multi-user coding gains in multi-user channels to overcome the multi-user interference. For example, spreading in CDMA systems is in fact a repetition code, which can achieve coding gains. Previous work [54] shows theoretically that repetition encoding increases the superposed signal distance of a multi-user code. Meanwhile, we will show that introducing repetitions in the codeword benefits the iterative processing between the LMMSE detector and the channel decoder.

Here, we set combiner in the MU-IRA code, while is in the IRA code designed for the Multiple-Access Channel (termed MAC-IRA code) [16, 17]. With this modification, the proposed MU-IRA code is a generalization of the MAC-IRA code. Note that the multi-user scenarios in [16, 17] and in this paper are different: the receiver in [16, 17] has a single antenna, whereas the receiver in this paper has multiple antennas. For the single-antenna receiver, each user needs to employ a very low-rate MAC-IRA code with combiner to overcome the severe multi-user interference. In this paper, since the multiple antennas at the receiver provide power gains that overcome part of the multi-user interference, each user can employ a higher-rate code. Therefore, the proposed MU-IRA code with combiner gives more flexibility for high-rate code design. On the other hand, from the EXIT chart point of view, also provides more flexibility for the decoder's EXIT characteristics, so that we can find a better code whose EXIT curve matches that of the LMMSE detector more closely.

Different interleavers , are employed by different users for user identification [55].
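The three components listed above can be illustrated with a simplified encoder sketch. It assumes that the transmitted codeword is the concatenation of the repetition-coded bits and the parity bits of a non-systematic repeat-accumulate code, that all information bits share one repeat degree (a regular stand-in for the irregular degree distribution), and that the user-specific interleaver is a pseudo-random permutation derived from a per-user seed. None of these choices, nor the names `mu_ira_encode`, `repeat_degree`, and `seed`, are taken from the paper.

```python
import random

def mu_ira_encode(bits, q, repeat_degree, a, seed):
    """Simplified MU-IRA-style encoder returning BPSK symbols.

    bits:          information bits (0/1)
    q:             repetition number of the repetition component
    repeat_degree: repeat degree of every information bit in the IRA component
    a:             combiner size (number of repeated bits XORed per check)
    seed:          per-user seed defining the interleavers (hypothetical)
    """
    rng = random.Random(seed)
    # Repetition component: each information bit is sent q times.
    repetition_part = [b for b in bits for _ in range(q)]
    # IRA component: repeat, interleave, combine groups of a bits, accumulate.
    repeated = [b for b in bits for _ in range(repeat_degree)]
    if len(repeated) % a != 0:
        raise ValueError("len(bits) * repeat_degree must be divisible by a")
    rng.shuffle(repeated)
    combined = [sum(repeated[i:i + a]) % 2 for i in range(0, len(repeated), a)]
    parity, acc = [], 0
    for c in combined:
        acc ^= c                  # accumulator: running XOR of combiner outputs
        parity.append(acc)
    # User-specific interleaving of the whole codeword, then BPSK mapping.
    codeword = repetition_part + parity
    rng.shuffle(codeword)
    return [1 - 2 * c for c in codeword]
```

Under these assumptions, the code rate is 1 / (q + repeat_degree / a), so increasing the combiner size `a` raises the rate while increasing `q` lowers it.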
Now we present the message-passing decoding in the MU-IRA decoder, in preparation for the code parameter optimization. As shown in Fig. 2, the MU-IRA decoder consists of a repetition decoder, a non-systematic IRA decoder, and an information combiner. Based on the estimated signals from the LMMSE detector, the repetition decoding and the IRA decoding are each performed once in parallel, where the IRA decoding is realized with the sum-product algorithm [42]. Then, the obtained estimates are combined in the information combiner, and the generated extrinsic estimates are fed back to the repetition decoder and the IRA decoder according to the message update rules [47]. Afterwards, the repetition decoding and the IRA decoding are performed once more, and the output estimates are fed back to the LMMSE detector.
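The extrinsic-message rule at an information bit can be sketched in a few lines. The sketch assumes the standard sum-product update at a variable node, where the LLRs of all branches attached to the same bit (the repeated copies and, in the combiner, the IRA decoder output) are added and each branch receives the sum of the others; the function name is illustrative.

```python
def variable_node_extrinsic(llrs):
    """Extrinsic LLR for each branch: the sum of the LLRs on all other branches.

    The full sum is the a posteriori LLR of the bit; subtracting a branch's
    own LLR avoids feeding its information back to it.
    """
    total = sum(llrs)
    return [total - l for l in llrs]

# Three branches carrying LLRs 1.0, -0.5, and 2.0 for the same bit.
extrinsic = variable_node_extrinsic([1.0, -0.5, 2.0])
```

A branch with a weak or wrong-signed LLR, such as the second one here, receives a strong extrinsic message from the other branches, which is how repetition helps at low input SNR.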
[Table I: optimized MU-IRA code parameters for full loading, over loading, and severe loading; rows: K, M, R, q, MIMO-NOMA capacity (dB)]
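[Table II: implementation complexity by component (name, complexity), covering the user transmitter, the receiver components including exp/log operations, and the overall proposed system]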
IV-B Optimization of Coding Scheme
With the goal of maximizing sum rate , the asymptotic EXIT analysis of the iterative receiver is employed to optimize the code parameters so that the EXIT property of the code matches that of the LMMSE detector. Fig. 3 shows the asymptotic EXIT process between the LMMSE detector and the MU-IRA decoder, where the EXIT function of the MU-IRA decoder is obtained using a method similar to that in [17]. According to Algorithm 1, a priori variance and extrinsic variance of the LMMSE detector are updated according to Eq. (5), Eq. (7), and Eq. (8). Based on a priori mutual information and the message-passing decoding, the EXIT function of the MU-IRA decoder is evaluated to obtain extrinsic mutual information , which are combined as extrinsic and fed back to the LMMSE detector. Because each repetition number yields its own optimized parameters, we should choose the best among them. To be specific, let the maximum repetition number be . For , based on the asymptotic EXIT analysis, the optimized code is obtained by optimizing and . Among these candidate codes, the code with the maximum sum rate is selected.
For example, we optimize MU-IRA codes over MIMO-NOMA systems with three types of system loads, i.e., full loading (), over loading (), and severe system loading (), (), (), and (). Note that in MIMO-NOMA scenarios, over loading and severe loading mean that the number of transmitted data streams is larger than the spatial degrees of freedom, i.e., the product of the number of users and the number of transmit antennas is larger than the number of receive antennas. In this case, the system multiplexing gain is maximal. The corresponding noise variance is , , , , , and . The optimized code parameters are presented in Table I, which shows that the decoding thresholds of MU-IRA coded MIMO-NOMA systems are within dB from the corresponding system capacities. We also observed that the optimal value of could increase to 4 as the system load increases to 8. This is because when the multiplexing gain is large, more repetitions are needed to deal with the signal interference.
IV-C Complexity Analysis
To verify the practicability of the proposed system, we investigate the implementation complexity of the overall system. In the user transmitters, since encoding and modulation involve only additions and modulo operations, the implementation complexity is . In the iterative receiver, the complexity of the LMMSE detector is , where is the maximum iteration number. In the MU-IRA decoder, the average number of addition/subtraction (), multiplication/division (), and exponent/logarithm (exp/log) operations is , , and for decoding one information bit of the MU-IRA code in one iteration. As a result, the decoding complexity of information bits in user decoders is approximately . Note that the implementation complexity of the user decoders decreases with the increase of . In summary, the complexities of the overall system and each component are given in Table II.
IV-D Asymptotic Performance Analysis
To illustrate that the proposed system can achieve capacity-approaching performance, we provide the input-output variance-transfer curves of the LMMSE detector and those of the MU-IRA codes in Table I over full loading and severe loading MIMO-NOMA, i.e., () and ().
As shown in Fig. 4(a) and Fig. 4(d), the variance-transfer curves of the rate and rate MU-IRA codes match well with those of the LMMSE detector in the full loading and severe loading cases. According to the capacity-achieving proof of LMMSE detection in [39], the proposed MU-IRA coded system can approach the capacity of the MIMO-NOMA system. In addition, Fig. 4(a) also shows the iterative decoding trajectory between the LMMSE detector and the MU-IRA decoder.
To see the advantage of the MU-IRA code in the iterative decoding process, Fig. 4(b) shows the variance-transfer curves for the component codes of the rate MU-IRA code in Table I, i.e., a rate repetition code and a rate non-systematic IRA code. Notice that the repetition code can provide a high output SNR (i.e., ) when the input SNR (i.e., ) is relatively low. By contrast, the non-systematic IRA code can provide a very high output SNR when the input SNR is medium or large. As a result, the proposed MU-IRA code combines the advantages of these two component codes, which helps its variance-transfer curve match well with that of the LMMSE detector over the entire SNR region.
To confirm the importance of matching between the LMMSE detector and the proposed code from the perspective of EXIT analysis, we present a rate MAC-IRA code [16, 17] for comparison, whose parameters are , , and . As shown in Fig. 4(c) and Fig. 4(d), the variance-transfer curves of the MAC-IRA code seriously mismatch those of the LMMSE detector in the full loading and severe loading cases, where the large gaps between the variance-transfer curves of the MAC-IRA code and those of the proposed MU-IRA codes indicate large rate losses.
To emphasize the necessity of multi-user code design, we consider conventional Single-User (SU) IRA codes [42] for comparison. For a fair comparison with our code, we design SU-IRA codes for the point-to-point channel by using the EXIT analysis. The optimized parameters of the SU-IRA codes are given in Table III, which shows that the decoding thresholds are about dB from the capacity of the point-to-point channel. However, when these codes are used in the MIMO-NOMA system, their variance-transfer curves prematurely intersect those of the LMMSE detector, as shown in Fig. 4(b) and Fig. 4(c), which results in decoding failures. Moreover, as the system load increases, the decoding failure regions of the SU-IRA codes become larger. This indicates that SU-IRA codes that are well designed for the point-to-point channel are not suitable for the MIMO-NOMA system with the LMMSE detector.
[Table III: optimized SU-IRA code parameters; rows: R, q, point-to-point channel capacity (dB)]
V Numerical Results
The above analyses and optimizations are based on the assumptions of infinite code length and an unlimited number of iterations. To verify the practicability and reliability of the proposed MIMO-NOMA system, in this section we provide extensive finite-length simulations covering various aspects.
V-A Comparisons of Decoding Complexity
The complexity comparisons of the proposed MU-IRA codes and the SU-IRA codes are given in Table IV, which focuses on the decoding of one information bit of each user per iteration. According to Table II, the numbers of operations including , , and exp/log are calculated, where the parameters of the MU-IRA codes and the SU-IRA codes are given in Table I and Table III, respectively. Table IV demonstrates that the proposed MU-IRA codes achieve lower decoding complexities than the SU-IRA codes. As a result, the MU-IRA coded MIMO-NOMA systems have lower implementation complexities than the SU-IRA coded MIMO-NOMA systems.
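Table IV: Number of operations for decoding one information bit of each user per iteration.
System load  | Code  | Rate  | Addition/subtraction  | Multiplication/division  | exp/log  
Full loading ()  | MU-IRA  | 0.2  | 768  | 576  | 576  
Full loading ()  | SU-IRA  | 0.2  | 1000  | 768  | 768  
Over loading ()  | MU-IRA  | 0.15  | 1915  | 1344  | 1344  
Over loading ()  | SU-IRA  | 0.15  | 2283  | 1632  | 1632  
Severe loading ()  | MU-IRA  | 0.13  | 2668  | 1639  | 1639  
Severe loading ()  | SU-IRA  | 0.13  | 3076  | 1927  | 1927  
Severe loading ()  | MU-IRA  | 0.1  | 4960  | 3072  | 3072  
Severe loading ()  | SU-IRA  | 0.1  | 5504  | 3456  | 3456  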
V-B Performance Comparison
To confirm the advantage of the proposed MIMO-NOMA system, we compare three coded MIMO-NOMA systems: the proposed MU-IRA coded system with the LMMSE detector, denoted as LMMSE+MU-IRA; the SU-IRA coded system with the LMMSE detector, denoted as LMMSE+SU-IRA; and the SU-IRA coded system with an MUD consisting of the LMMSE detector and the SIC detector [20], denoted as LMMSE-SIC+SU-IRA.
We consider that the information length of each user is , the repetition pattern of the repetition code is , and a random interleaver is employed in the MU-IRA code. Assume that each user employs BPSK and SNR , where is the transmitted power of each user. The elements of channel matrix obey a real Gaussian distribution and the maximum iteration number is .
Fig. 5 provides the bit-error rate (BER) simulations of these three coded systems over the full loading and over loading MIMO-NOMA, where and . Note that the gaps between the BER curves at of the proposed LMMSE+MU-IRA systems and the corresponding Shannon limits are dB and dB, respectively. This verifies that the proposed system can achieve capacity-approaching performance.
Compared with the LMMSE+SU-IRA systems, the proposed systems have dB and dB performance gains in the full loading and over loading cases. This indicates that the proposed system achieves larger performance gains as the system load increases. Comparing the two SU-IRA coded systems, Fig. 5 shows that the LMMSE+SU-IRA systems achieve dB and dB performance gains over the LMMSE-SIC+SU-IRA systems, which indicates that joint iterative multi-user decoding is more reliable than the SIC receiver.
V-C The Importance of EXIT Matching between LMMSE Detector and Message-Passing Decoders
Fig. 6 compares the rate MAC-IRA [16, 17] coded MIMO-NOMA with the rate and rate MU-IRA coded MIMO-NOMA. Note that the MU-IRA coded systems have dB and dB performance gains as well as and sum-rate gains over the MAC-IRA coded systems in the full loading and over loading cases, respectively. This demonstrates the necessity of EXIT matching between the LMMSE detector and the message-passing decoders.
V-D Impact of Code Length
In practical applications, different code lengths might be required. Hence, we investigate the impact of finite-length MU-IRA codes on the proposed system, where the information lengths are , , , and . Fig. 7 shows the BER performances of the MU-IRA codes obtained in Table I over the full loading () and severe loading () MIMO-NOMA. Shortening the code length causes some performance loss. Nevertheless, the gaps between the BER curves at of the MU-IRA code with the shortest information length, i.e., , and the corresponding Shannon limits are still within dB. This further confirms the practicability of the proposed system.
V-E Impact of Iteration Number
Because low-latency communication is required, a low iteration number should be considered. To investigate the impact of the iteration number on the proposed system, Fig. 8 shows the BER performances of the MU-IRA coded systems with the maximum iteration number over full loading MIMO-NOMA () and over loading MIMO-NOMA (). Note that the performance gaps between the MU-IRA codes with and the MU-IRA codes with are just dB. When , the gaps between the BER curves at of the MU-IRA codes and the corresponding Shannon limits are within dB. This validates the fast convergence of the proposed codes.
V-F Dynamic System Load
In practice, some users leave the system after finishing their communications, and new users join the system when they are ready to communicate. As a result, the system load varies over time. To investigate the robustness of the proposed system under a changing system load, Fig. 9 shows the BER performances of the MU-IRA codes over the different load cases and respectively, where the MU-IRA code designed for the full loading case ()() is simulated for and the MU-IRA code designed for the over loading case ()() is simulated for . Note that the gaps between the BER curves at of the MU-IRA codes and the corresponding capacities are still within 1.45 dB, which illustrates that the proposed system is robust and can provide reliable performance in the low load and changing load cases.
V-G Impact of Channel Correlation and Imperfect CSI
In the above simulations, we consider the fast fading channel and assume that the receiver can obtain perfect CSI. To investigate the robustness of the proposed system, we examine the impacts of channel correlation and imperfect CSI as follows.
V-G1 Block Fading Channel
We consider block fading channels, where the channel fading parameters remain unchanged for every 200 and 400 transmitted symbols of all users. Fig. 10 shows that the BER curves at of the proposed systems over block fading channels are about dB from those of the fast fading cases, and are still within dB from the corresponding Shannon limits of the fast fading channels.
V-G2 Imperfect CSI
In practical applications, the channel is difficult to estimate exactly. Therefore, we consider the fast fading channel with estimated channel error variances of and . As shown in Fig. 10, although imperfect channel estimation causes some performance loss, the gaps between the BER curves at of the proposed systems with imperfect CSI and the corresponding Shannon limits are within dB. This demonstrates that the proposed system is robust to the simulated channel conditions and imperfect channel estimations, which is favourable for practical applications.
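The two channel impairments considered in this subsection can be emulated with a short sketch. It assumes unit-variance real Gaussian channel entries and models the estimation error as additive zero-mean Gaussian noise on each channel coefficient; both modelling choices and all function names are assumptions made for illustration.

```python
import random

def block_fading_channels(num_symbols, block_len, M, K, rng):
    """One M x K channel matrix per symbol time, redrawn every block_len symbols."""
    channels = []
    H = None
    for t in range(num_symbols):
        if t % block_len == 0:
            H = [[rng.gauss(0.0, 1.0) for _ in range(K)] for _ in range(M)]
        channels.append(H)        # symbols in the same block share one matrix
    return channels

def imperfect_csi(H, err_var, rng):
    """Channel estimate equal to H plus Gaussian error of variance err_var per entry."""
    sd = err_var ** 0.5
    return [[h + rng.gauss(0.0, sd) for h in row] for row in H]

rng = random.Random(0)
chans = block_fading_channels(num_symbols=400, block_len=200, M=2, K=4, rng=rng)
H_est = imperfect_csi(chans[0], err_var=0.01, rng=rng)   # hypothetical error variance
```

With `block_len=1` the generator reduces to the fast fading case, so the same simulation loop can cover both channel models.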
VI Conclusion and Future Works
In this paper, we proposed a practical MIMO-NOMA system for 5G communications, where the transmitters and receiver were designed to achieve low complexity. The asymptotic EXIT analysis for the receiver consisting of the LMMSE detector and message-passing decoders was provided to trace the statistical characteristics of the estimated signals. Based on the asymptotic EXIT analysis, an MU-IRA coded MIMO-NOMA system was provided, whose implementation complexity was low and whose asymptotic BER performances were within dB from the system capacity. Moreover, various numerical results were presented to validate the practicability and robustness of the proposed system. These results suggest that the proposed system would be an attractive solution for the MIMO-NOMA uplink in 5G communications.
There are two possible extensions of our work. One is finite-length code design, for which the multi-user code distance analysis [54] or scattered EXIT analysis [56] could be utilized. The other is to further improve the iterative decoding threshold based on spatial coupling techniques [57, 58, 59].
References
 [1] ABI Research, “The Internet of things will drive wireless connected devices to 40.9 billion in 2020,” 2014.
 [2] G. Wu, S. Talwar, K. Johnsson, N. Himayat, and K. Johnson, “M2M: From mobile to embedded Internet,” IEEE Commun. Mag., vol. 49, no. 4, pp. 36-43, Apr. 2011.
 [3] M. Dohler and C. Anton-Haro, Machine-to-Machine (M2M) Communications: Architecture, Performance and Applications. Waltham, MA, USA: Woodhead Publishing, Jan. 2015.
 [4] B. P. L. Lau, N. Wijerathne, B. K. K. Ng, and C. Yuen, “Sensor fusion for public space utilization monitoring in a smart city,” IEEE Internet of Things Journal, vol. 5, no. 2, pp. 473-481, Apr. 2018.
 [5] J. Guo, B. Song, Y. He, F. R. Yu, and M. Sookhak, “A survey on compressed sensing in vehicular infotainment systems,” IEEE Commun. Surveys Tutorials, vol. 19, no. 4, pp. 2662-2680, 2017.
 [6] Y. Saito, Y. Kishiyama, A. Benjebbour, T. Nakamura, A. Li, and K. Higuchi, “Non-orthogonal multiple access (NOMA) for cellular future radio access,” in Proc. IEEE VTC-Spring, Jun. 2013, pp. 1-5.
 [7] L. Dai, B. Wang, Y. Yuan, S. Han, C.-L. I, and Z. Wang, “Non-orthogonal multiple access for 5G: Solutions, challenges, opportunities, and future research trends,” IEEE Commun. Mag., vol. 53, no. 9, pp. 74-81, Sep. 2015.
 [8] Z. Ding, Y. Liu, J. Choi, Q. Sun, M. Elkashlan, C.-L. I, and H. V. Poor, “Application of non-orthogonal multiple access in LTE and 5G networks,” IEEE Commun. Mag., vol. 55, no. 2, pp. 185-191, Feb. 2017.
 [9] Y. Liu, Z. Qin, M. Elkashlan, Z. Ding, A. Nallanathan, and L. Hanzo, “Non-orthogonal multiple access for 5G and beyond,” Proceedings of the IEEE, vol. 105, no. 12, pp. 2347-2381, Dec. 2017.
 [10] Y. Wang, Y. Wu, F. Zhou, Z. Chu, Y. Wu, and F. Yuan, “Multi-objective resource allocation in a NOMA cognitive radio network with a practical non-linear energy harvesting model,” IEEE Access, vol. 6, pp. 12973-12982, 2018.
 [11] S. Abeywickrama, L. Liu, Y. Chi, and C. Yuen, “Over-the-air implementation of uplink NOMA,” in Proc. IEEE GLOBECOM, Dec. 2017, pp. 1-6.
 [12] Z. Chen, Z. Ding, X. Dai, and R. Zhang, “An optimization perspective of the superiority of NOMA compared to conventional OMA,” IEEE Trans. Signal Process., vol. 65, no. 19, pp. 5191-5202, Oct. 2017.
 [13] G. Liu, X. Chen, Z. Ding, Z. Ma, and F. R. Yu, “Hybrid half-duplex/full-duplex cooperative non-orthogonal multiple access with transmit power adaptation,” IEEE Trans. Wireless Commun., vol. 17, no. 1, pp. 506-519, Jan. 2018.
 [14] C. Xu, Y. Hu, C. Liang, J. Ma, and L. Ping, “Massive MIMO, non-orthogonal multiple access and interleave division multiple access,” IEEE Access, vol. 5, pp. 14728-14748, Jul. 2017.
 [15] G. Song, J. Cheng, and Y. Watanabe, “Maximum sum rate of repeat-accumulate interleave-division system by fixed-point analysis,” IEEE Trans. Commun., vol. 60, no. 10, pp. 3011-3022, Oct. 2012.
 [16] G. Song and J. Cheng, “Low-complexity coding scheme to approach multiple-access channel capacity,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Jun. 2015, pp. 2106-2110.
 [17] G. Song, X. Wang, and J. Cheng, “A low-complexity multiuser coding scheme with near-capacity performance,” IEEE Trans. Veh. Technol., vol. 66, no. 8, pp. 6775-6786, Aug. 2017.
 [18] Y. Chi, Y. Li, G. Song, and Y. Sun, “Partially repeated SC-LDPC codes for multiple-access channel,” IEEE Commun. Lett., vol. 20, no. 10, pp. 1947-1950, Oct. 2016.
 [19] D. Fang, Y. Huang, Z. Ding, G. Geraci, S. L. Shieh, and H. Claussen, “Lattice partition multiple access: A new method of downlink non-orthogonal multiuser transmissions,” in Proc. IEEE GLOBECOM, Dec. 2016, pp. 1-6.
 [20] D. Tse and P. Viswanath, Fundamentals of wireless communication. Cambridge Univ. Press, 2005.
 [21] X. Gao, L. Dai, S. Han, C.-L. I, and R. W. Heath, “Energy-efficient hybrid analog and digital precoding for mmWave MIMO systems with large antenna arrays,” IEEE J. Sel. Areas Commun., vol. 34, no. 4, pp. 998-1009, Apr. 2016.
 [22] B. Wang, L. Dai, Z. Wang, N. Ge, and S. Zhou, “Spectrum and energy efficient beamspace MIMO-NOMA for millimeter-wave communications using lens antenna array,” IEEE J. Sel. Areas Commun., vol. 35, no. 10, pp. 2370-2382, Oct. 2017.
 [23] Y. Liu, G. Pan, H. Zhang, and M. Song, “On the capacity comparison between MIMO-NOMA and MIMO-OMA,” IEEE Access, vol. 4, pp. 2123-2129, 2016.
 [24] Z. Ding, F. Adachi, and H. V. Poor, “The application of MIMO to non-orthogonal multiple access,” IEEE Trans. Wireless Commun., vol. 15, no. 1, pp. 537-552, Jan. 2016.
 [25] Z. Ding, R. Schober, and H. V. Poor, “A general MIMO framework for NOMA downlink and uplink transmission based on signal alignment,” IEEE Trans. Wireless Commun., vol. 15, no. 6, pp. 4438-4454, Jun. 2016.
 [26] Z. Ding, P. Fan, and H. V. Poor, “Impact of user pairing on 5G non-orthogonal multiple-access downlink transmissions,” IEEE Trans. Veh. Technol., vol. 65, no. 8, pp. 6010-6022, Aug. 2016.
 [27] H. Wang, R. Zhang, R. Song, and S. H. Leung, “A novel power minimization precoding scheme for MIMO-NOMA uplink systems,” IEEE Commun. Lett., vol. 22, no. 5, pp. 1106-1109, May 2018.
 [28] H. Nikopour and H. Baligh, “Sparse code multiple access,” in Proc. IEEE PIMRC, Sep. 2013, pp. 332-336.
 [29] M. Taherzadeh, H. Nikopour, A. Bayesteh, and H. Baligh, “SCMA codebook design,” in Proc. IEEE Veh. Technol. Conf. (VTC), Sep. 2014, pp. 1-5.
 [30] G. Song, X. Wang, and J. Cheng, “Signature design of sparsely spread CDMA based on superposed constellation distance analysis,” IEEE Access, vol. 5, pp. 23809-23821, 2017.
 [31] K. Xiao, B. Xia, Z. Chen, J. Wang, D. Chen, and S. Ma, “On optimizing multicarrier-low density codebook for GMAC with finite alphabet inputs,” IEEE Commun. Lett., vol. 21, no. 8, pp. 1811-1814, Aug. 2017.
 [32] S. Tang, L. Hao, and Z. Ma, “Low complexity joint MPA detection for downlink MIMO-SCMA,” in Proc. IEEE GLOBECOM, Dec. 2016, pp. 1-4.
 [33] Y. Du, B. Dong, Z. Chen, P. Gao, and J. Fang, “Joint sparse graph-detector design for downlink MIMO-SCMA systems,” IEEE Wireless Commun. Lett., vol. 6, no. 1, pp. 14-17, Feb. 2017.
 [34] J. Dai, G. Chen, K. Niu, and J. Lin, “Partially active message passing receiver for MIMO-SCMA systems,” IEEE Wireless Commun. Lett., vol. 7, no. 2, pp. 222-225, Apr. 2018.
 [35] L. Liu, C. Yuen, Y. L. Guan, Y. Li, and Y. Su, “A low-complexity Gaussian message passing iterative detection for massive MU-MIMO systems,” in Proc. IEEE ICICS, Dec. 2015, pp. 1-5.
 [36] L. Liu, C. Yuen, Y. L. Guan, Y. Li, and Y. Su, “Convergence analysis and assurance Gaussian message passing iterative detection for massive MU-MIMO systems,” IEEE Trans. Wireless Commun., vol. 15, no. 9, pp. 6487-6501, Sep. 2016.
 [37] L. Liu, C. Yuen, Y. L. Guan, Y. Li, and C. Huang, “Gaussian message passing iterative detection for MIMO-NOMA systems with massive users,” in Proc. IEEE GLOBECOM, Dec. 2016, pp. 1-6.
 [38] S. H. Chae, M. Jang, S. K. Ahn, J. Park, and C. Jeong, “Multilevel coding scheme for integer-forcing MIMO receivers with binary codes,” IEEE Trans. Wireless Commun., vol. 16, no. 8, pp. 5428-5441, Aug. 2017.
 [39] L. Liu, C. Yuen, Y. L. Guan, and Y. Li, “Capacity-achieving iterative LMMSE detection for MIMO-NOMA systems,” in Proc. IEEE Int. Conf. Commun. (ICC), May 2016, pp. 1-6.
 [40] G. Wunder, M. Kasparick, S. ten Brink, F. Schaich, T. Wild, I. Gaspar, E. Ohlmer, S. Krone, N. Michailow, A. Navarro, G. Fettweis, D. Ktenas, V. Berg, M. Dryjanski, S. Pietrzyk, and B. Eged, “5GNOW: Challenging the LTE design paradigms of orthogonality and synchronicity,” in Mobile and Wireless Commun. Syst. for 2020 and Beyond, Workshop @ 77th IEEE Veh. Technol. Conf. Spring (VTC 13 Spring), Jun. 2013.
 [41] White Paper, “Rethink mobile communications for 2020+,” FuTURE Mobile Communication Forum 5G SIG, Nov. 2014. http://www.futureforum.org/dl/141106/whitepaper.zip.
 [42] W. E. Ryan and S. Lin, Channel Codes: Classical and Modern, Cambridge, Cambridge University Press, 2009.
 [43] S. Verdú, “Optimum multiuser signal detection,” Ph.D. dissertation, Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL, Aug. 1984.
 [44] Q. Guo and L. Ping, “LMMSE turbo equalization based on factor graphs,” IEEE J. Sel. Areas Commun., vol. 26, no. 2, pp. 311-319, Feb. 2008.
 [45] T. J. Richardson and R. L. Urbanke, Modern Coding Theory. Cambridge, U.K.: Cambridge Univ. Press, 2008.
 [46] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 498-519, Feb. 2001.
 [47] T. J. Richardson and R. L. Urbanke, “The capacity of low-density parity-check codes under message-passing decoding,” IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 599-618, Feb. 2001.
 [48] X. Yuan, L. Ping, C. Xu, and A. Kavcic, “Achievable rates of MIMO systems with linear precoding and iterative LMMSE detector,” IEEE Trans. Inf. Theory, vol. 60, no. 11, pp. 7073-7089, Oct. 2014.
 [49] Y. Fang, S. C. Liew, and T. T. Wang, “Design of distributed protograph LDPC codes for multi-relay coded-cooperative network,” IEEE Trans. Wireless Commun., vol. 16, no. 11, pp. 7235-7251, Nov. 2017.
 [50] Y. Fang, P. Chen, L. Wang, and F. C. M. Lau, “Design of protograph LDPC codes over partial response channels,” IEEE Trans. Commun., vol. 60, no. 10, pp. 2809-2819, Oct. 2012.
 [51] S. ten Brink, “Convergence behavior of iteratively decoded parallel concatenated codes,” IEEE Trans. Commun., vol. 49, no. 10, pp. 1727-1737, Oct. 2001.
 [52] A. E. Ashikhmin, G. Kramer, and S. ten Brink, “Extrinsic information transfer functions: model and erasure channel properties,” IEEE Trans. Inf. Theory, vol. 50, no. 11, pp. 2657-2673, Nov. 2004.
 [53] K. Bhattad and K. R. Narayanan, “An MSE-based transfer chart for analyzing iterative decoding schemes using a Gaussian approximation,” IEEE Trans. Inf. Theory, vol. 53, no. 1, pp. 22-38, Jan. 2007.
 [54] G. Song and J. Cheng, “Distance enumerator analysis for interleave-division multiuser codes,” IEEE Trans. Inf. Theory, vol. 62, no. 7, pp. 4039-4053, Jul. 2016.
 [55] L. Ping, L. Liu, K. Wu, and W. K. Leung, “Interleave division multiple-access,” IEEE Trans. Wireless Commun., vol. 5, no. 4, pp. 938-947, Apr. 2006.
 [56] M. Ebada, A. Elkelesh, S. Cammerer, and S. ten Brink, “Scattered EXIT charts for finite length LDPC code design,” [online] https://arxiv.org/pdf/1706.09239, Jun. 2017.
 [57] D. Truhachev and C. Schlegel, “Spatially coupled streaming modulation,” in Proc. IEEE Int. Conf. Commun. (ICC), Jun. 2013, pp. 3418-3422.
 [58] C. Liang, J. Ma, and L. Ping, “Towards Gaussian capacity, universality and short block length,” in Proc. 9th Int. Symp. Turbo Codes (ISTC), Sep. 2016, pp. 412-416.
 [59] C. Liang, J. Ma, and L. Ping, “Compressed FEC codes with spatial-coupling,” IEEE Commun. Lett., vol. 21, no. 5, pp. 987-990, May 2017.