With the rapid traffic growth in telecommunications, systems using multiple-input multiple-output (MIMO) configurations with a large number of antennas have attracted considerable attention in both academia and industry. The massive MIMO system achieves increased data rates, higher spectral efficiency, and enhanced link reliability and coverage over conventional MIMO, which makes it a key technology for 5G wireless. However, its large scale puts heavy pressure on signal detection in terms of computational complexity. In recent years, deep learning has led to a revolution in many fields. With deep learning techniques, computers can recognize relations between input and output data sets and further detect unknown objects from future inputs. The goal of this paper is to apply deep learning to the MIMO detection problem and propose a deep neural network-aided massive MIMO detector.
I-A Belief Propagation MIMO Detectors
Many massive MIMO detection methods have been presented, e.g., [3, 4, 5, 6], among which the message passing approach, belief propagation (BP), has received intensive attention and been broadly researched in recent years. BP detectors provide superior performance compared with the aforementioned detection algorithms due to their lower complexity, strong robustness, and the so-called large-dimension behavior, i.e., the detection performance approaches the optimum as the MIMO dimension increases [7, 8]. However, BP has some drawbacks when dealing with practical problems:
Loopy factor graph: The factor graphs defined by typical MIMO channels are fully connected and hence heavily loopy. The BER performance of BP suffers severe degradation due to this loopiness, especially in practical channels with spatially correlated fading.
Complexity: BP detectors are still of high complexity, which implies large delays and implementation difficulties; these are critical for delay-sensitive applications.
Some modifications of BP have been proposed to handle these issues, among which we focus on the following methods:
I-A1 Damped BP
BP with damping, or damped BP, is an efficient way to overcome the poor performance caused by cycles in factor graphs. It is a BP variant that averages two successive messages with a weighting factor (also called the damping factor). It was observed in many works, e.g., [7, 8, 9, 10], that damping can improve the convergence of BP algorithms. Indeed, damping is also applied in other message passing methods, such as approximate message passing (AMP), to facilitate convergence.
Challenges: The optimal damping factors are difficult to find. The available method relies on Monte Carlo simulations, which brings an overwhelming computational burden. In , a heuristic automatic damping (HAD) method is proposed to automatically calculate the damping factor in each BP iteration, which improves the efficiency but still requires extra online computation.
I-A2 Max-Sum Algorithm
In , a max-sum (MS) algorithm is proposed to further reduce the computational complexity of BP with an approximation strategy. The normalized MS (NMS) and offset MS (OMS) algorithms are presented as extensions of MS to compensate for the performance degradation resulting from the approximation.
Challenges: The normalization factor in NMS and the offset factor in OMS strongly influence the performance improvement but are hard to determine.  provides a method to update the factors based on the approximated prior probabilities and pre-computed errors, which also requires extra computation at each iteration.
Overall, the enhancements achieved by the modified BP algorithms mentioned above rely on the selection of the correction parameters, including the damping, normalization, and offset factors. Further improvements are needed in:
A framework to optimize the correction factors efficiently with acceptable computational complexity;
Improved robustness against different channel conditions;
Outperforming or matching linear detectors under various antenna configurations and modulations.
I-B Deep Neural Network
Deep learning (DL) has attracted worldwide attention due to its powerful capability to solve complex tasks. With advances in big data, optimization algorithms, and stronger computing resources, such networks are currently the state of the art in various problems, including speech processing and computer vision. In recent years, deep learning methods have been proposed for communication problems. For instance, various channel decoders using deep learning techniques were proposed in [17, 18, 19]. There have also been many works on learning to invert linear channels and reconstruct signals [20, 21, 22].  proposed to learn a channel auto-encoder via deep learning technologies.
In the context of massive MIMO detection, research has also been done. In , a deep learning network for MIMO detection named DetNet is derived by unfolding a projected gradient descent method based on a linear detection algorithm. The work in  considers a clustered WSN system with virtual MIMO blind detection and applies an improved Hopfield neural network (HNN) blind algorithm to this system. Deep learning techniques have also been applied to symbol detection in MIMO-OFDM systems, as introduced in [26, 27].
In particular, one promising approach to designing deep architectures is to unfold an existing iterative algorithm . Each iteration is considered a layer, and the algorithm is viewed as a network. Learning begins with the existing algorithm as an initial starting point and uses optimization methods to find optimal parameters and improve the algorithm. From this point of view, deep learning techniques provide a powerful tool to determine the optimal correction factors for the modified BP algorithms and thus achieve improved performance.
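As a concrete illustration (not from the paper), unfolding can be sketched on plain gradient descent: each iteration becomes a layer whose step size is a trainable parameter, and initializing every layer with the original constant step recovers the algorithm exactly.

```python
import numpy as np

def iterative_step(x, grad, step):
    # one iteration of plain gradient descent
    return x - step * grad(x)

def unfolded_network(x0, grad, steps):
    # "Unfolding": each iteration becomes a layer with its own trainable
    # step size; a constant schedule reproduces the original algorithm.
    x = x0
    for step in steps:            # one entry per layer
        x = iterative_step(x, grad, step)
    return x

# toy quadratic objective f(x) = 0.5 * ||x - t||^2, so grad(x) = x - t
t = np.array([1.0, -2.0])
grad = lambda x: x - t

x0 = np.zeros(2)
baseline = unfolded_network(x0, grad, [0.5] * 10)  # original 10-step algorithm
learned = unfolded_network(x0, grad, [1.0])        # a "learned" 1-layer schedule
```

For this toy objective, the learned single step of size 1.0 reaches the minimizer in one layer, while the constant schedule needs many iterations; training over data is what discovers such schedules in general.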
In this paper, to the best of the authors’ knowledge, a novel DNN MIMO detector based on the modified BP detectors is proposed for the first time. The main contributions are:
We propose a formal framework to design a DNN MIMO detector by unfolding the BP iterations. Two DNN MIMO detectors are introduced, based on the damped BP and MS algorithms, respectively. Deep learning techniques are utilized to determine the optimal correction factors.
Numerical results are presented to show the improved robustness and performance of the DNN detectors compared with other BP variants and linear methods such as the minimum mean-squared error (MMSE) approach.
We show that the proposed framework is universal for various channel conditions and antenna configurations.
The computational complexity of the DNN detectors is discussed. For online detection, the DNN detectors achieve improved performance at the same level of complexity as the other BP variants.
The training methodology is discussed in detail. We show the ability of the proposed DNN detector to handle multiple channel conditions with a single training.
I-D Paper Outline
The remainder of this paper is organized as follows. The background of BP MIMO detectors is introduced in Section II, where the modified BP methods, including damped BP, MS, NMS, and OMS, are also presented. In Section III, we propose the corresponding deep neural network MIMO detector based on the modified BP algorithms. Section IV gives details of the proposed deep neural network detector, its training procedure, and numerical results. Section V concludes this paper.
Throughout the paper, we use the following notations. Lowercase letters (e.g., ) denote scalars, bold lowercase letters (e.g., ) denote column vectors, and bold uppercase letters (e.g., ) denote matrices. Also,  denotes the identity matrix,  denotes the natural logarithm, and  denotes the complex Gaussian function.
II-A MIMO System Model
In this paper, we consider a MIMO system with transmitting and receiving antennas. Each user sends an independent data stream and the base station detects the spatially multiplexed data through MIMO detection. The received signal vector, , reads
where  is the transmitted symbol vector whose constellation  is determined by the modulation mode;  is the additive white Gaussian noise (AWGN) following ; and  denotes the channel matrix, which can be described by the Kronecker model according to , where  and  are the antenna correlation matrices at the receiver and transmitter sides, respectively, and  is the i.i.d. Rayleigh-fading channel matrix whose entries follow an independent Gaussian distribution.
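A minimal sketch of this system model follows, with assumed antenna counts, QPSK symbols, and an exponential correlation profile for the Kronecker factors (the paper's exact dimensions, correlation model, and noise level are not specified here, so all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
Nt, Nr, rho = 4, 8, 0.3          # assumed antenna counts and correlation

def exp_corr(n, rho):
    # exponential correlation matrix R[i, j] = rho^|i-j| (a common choice)
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

R_r, R_t = exp_corr(Nr, rho), exp_corr(Nt, rho)
H_iid = (rng.standard_normal((Nr, Nt))
         + 1j * rng.standard_normal((Nr, Nt))) / np.sqrt(2)
# Kronecker model: H = R_r^{1/2} H_iid R_t^{1/2}, via Cholesky square roots
chol = np.linalg.cholesky
H = chol(R_r) @ H_iid @ chol(R_t).conj().T

# unit-energy QPSK symbols and complex AWGN
x = rng.choice(np.array([1+1j, 1-1j, -1+1j, -1-1j]), size=Nt) / np.sqrt(2)
sigma = 0.1
n = sigma * (rng.standard_normal(Nr) + 1j * rng.standard_normal(Nr)) / np.sqrt(2)
y = H @ x + n                    # received signal vector
```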
II-B Belief Propagation Detector
MIMO systems can be modeled by a factor graph as in Fig. 1 according to . BP allows observation nodes to exchange belief information with symbol nodes back and forth to iteratively improve the reliability of the decision. The message updating at the observation and symbol nodes in the -th iteration is summarized in the following equations:
where  denotes the prior log-likelihood ratio (LLR),  denotes the posterior LLR, and  is the prior probability of each symbol. The soft output after  iterations is given by
and the symbol  that maximizes  is chosen as the final decision for the received signal. More details of BP are given in .
Since the factor graph defined by the dense MIMO channel matrix is loopy, as shown in Fig. 1, BP is not guaranteed to converge to the MAP solution. Antenna correlation can further aggravate the looping effect due to the reduced randomness in the channel matrix, which degrades the results . Also, in each iteration, one division operation is needed to calculate the prior messages in Eq. (4), which complicates hardware implementation. From this viewpoint, two modifications of BP have been proposed to enhance the performance.
II-C Modified BP Detectors
II-C1 Damped BP
Message damping is a judicious option to mitigate the problem of loopy BP without additional complexity. With damped BP, the messages at the -th iteration in Eq. (4) can be smoothed as
where the symbol  denotes assignment, and  is the damping factor that takes a weighted average of the currently calculated messages and the previously calculated messages.
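The smoothing step can be sketched as follows; the weighting convention shown is one common form (the paper's exact equation is elided above), with a damping factor of zero recovering plain BP:

```python
import numpy as np

def damp(m_new, m_prev, delta):
    # Damped message update: weighted average of the freshly computed
    # message and the previous iteration's message.
    # delta = 0 recovers plain BP; larger delta means heavier smoothing.
    return (1.0 - delta) * m_new + delta * m_prev

m_prev = np.array([0.2, -1.5, 0.8])   # messages from iteration t-1 (toy LLRs)
m_new  = np.array([1.0, -0.5, 0.0])   # freshly computed messages at iteration t
m_damped = damp(m_new, m_prev, delta=0.4)
```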
It was observed in the aforementioned works, e.g., [7, 8], that damping can improve the convergence of BP algorithms. However, the optimal damping factor is difficult to find, and the available method relies on bulky Monte Carlo simulations. In , the HAD method is proposed to automatically calculate the damping factor in each BP iteration. Specifically, the convergence of the messages can be measured by the closeness between two successive messages,  and , via the Kullback-Leibler divergence:
As there are  message vectors in total, the Kullback-Leibler divergences of the two successive messages can finally be averaged as
The heuristic damping factor in the -th iteration is then defined as
where  is a positive constant determined from the first iteration. This method shows improved convergence compared with BP, but requires online updates of the damping factor at each iteration, which incurs extra computational cost. More details can be found in .
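The convergence measure behind HAD can be sketched as the average KL divergence between corresponding message vectors of two successive iterations; mapping this average to the heuristic damping factor then follows the (elided) equation above:

```python
import numpy as np

def kl_div(p, q, eps=1e-12):
    # Kullback-Leibler divergence D(p || q) of two discrete distributions,
    # clipped away from zero for numerical safety
    p, q = np.clip(p, eps, None), np.clip(q, eps, None)
    return float(np.sum(p * np.log(p / q)))

def avg_successive_kl(msgs_prev, msgs_curr):
    # Average KL divergence between corresponding message vectors of two
    # successive BP iterations -- the convergence measure used by HAD.
    return float(np.mean([kl_div(p, q) for p, q in zip(msgs_prev, msgs_curr)]))

# toy example: two message vectors, one of them nearly converged
prev = [np.array([0.5, 0.5]), np.array([0.9, 0.1])]
curr = [np.array([0.6, 0.4]), np.array([0.9, 0.1])]
d_bar = avg_successive_kl(prev, curr)  # small value means messages converging
```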
II-C2 Max-Sum Algorithm
The max-sum (MS) algorithm is an approximation strategy for BP. The calculation of the prior probability at each iteration is simplified to eliminate the division operation, which greatly relieves the difficulty of hardware implementation at the cost of some performance loss. Specifically, by taking the logarithm of both sides of Eq. (4) and substituting the resulting summation with its dominant term , we get
It is clearly seen that eliminating the division in Eq. (11) greatly reduces the hardware complexity. However, the prior probabilities are overestimated owing to the approximation, which results in performance degradation. To compensate for the loss while keeping similar computational complexity, we can apply two modified approaches: the normalized MS (NMS) and the offset MS (OMS) algorithms.
Let  and  denote the prior probability values calculated by Eqs. (4) and (11), respectively. As discussed above,  will be slightly larger than . NMS multiplies  by a positive scale factor to obtain a better approximation, while OMS subtracts an offset factor from . Combining both modifications, the prior probability is computed as follows:
To achieve the performance enhancement, the values of  and  should be carefully selected.  proposed an interpolation method to choose the optimal factors. Basically,  and  are pre-computed at sampled values of , and the corresponding correction factors are computed to minimize the error of  at each sampled value. During the detection iterations, the correction factors are picked from the pre-computed list by nearest-neighbor interpolation of . In , this method shows promising performance with QPSK.
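One common form of the combined NMS/OMS correction, written in the log domain with clipping at zero as in min-sum decoding practice, can be sketched as follows (the factor names `alpha` and `beta` and the clipping convention are illustrative; the paper's exact expression is elided):

```python
import numpy as np

def corrected_prior(p_ms, alpha=1.0, beta=0.0):
    # Combined NMS/OMS correction of an overestimated max-sum value:
    # scale the magnitude by alpha (normalization), subtract the offset
    # beta, and clip at zero so the correction never flips the sign.
    return np.maximum(alpha * np.abs(p_ms) - beta, 0.0) * np.sign(p_ms)

p_ms = np.array([2.0, -1.5, 0.1])       # overestimated MS priors (toy values)
p_nms = corrected_prior(p_ms, alpha=0.8)  # normalized MS
p_oms = corrected_prior(p_ms, beta=0.3)   # offset MS
```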
III Proposed DNN MIMO Detector
In this section, we propose a deep neural network (DNN) MIMO detector based on the modified BP algorithms introduced in Section II-C. The neural network is constructed by unfolding the BP algorithms, mapping each iteration to a layer in the network. The damping, normalization, and offset factors are the parameters to be optimized, and will be "learned" by deep learning techniques.
III-A Deep Neural Network
A deep neural network (DNN), also often called a deep feedforward neural network, is one of the quintessential deep learning models. A DNN can be abstracted as a function that maps the input  to the output ,
where  denotes the parameters that yield the best function approximation mapping the input data to the desired outputs.
In general, a DNN has a multi-layer structure, composed of many layers of function units (see Fig. 2). Between the input and output layers, there are multiple hidden layers. For an -layer feed-forward neural network, the mapping function in the -th layer, with input from the -th layer and output propagated to the next layer, can be defined as
where  denotes the parameters of the -th layer, and  is the mapping function of the -th layer.
According to , a DNN can be designed by unfolding the BP algorithm, mapping each iteration to a layer in the network. This results from the similarities between the BP factor graph and a deep neural network, which are summarized in Table I. The BP algorithm is then improved by deep learning optimization methods. Hence, a DNN-aided MIMO detector can be developed by unfolding the BP detection algorithm, as introduced in the following section.
| BP factor graph | Deep neural network |
| Transmitted signals | Input data |
| Received signals | Output data |
| -th iteration | -th hidden layer |
| Belief messages , ,  | Hidden signals |
| Message updating rules Eq. (3)-(5) | Mapping function between layers Eq. (17) |
| Correction factors , ,  | Parameters |
III-B Multiscale Correction Factors
The purpose of the damping, normalization, and offset factors is to "correct" the iterated BP messages; hence we call them the correction factors. In damped BP, the damping factors vary at each iteration. In the selection of the normalization/offset factors for MS, we further extend those factors to be different for each message . In fact, all the correction factors can be set distinct for each message at each iteration, and the calculation of the prior probability can be expressed in a more generalized way.
Specifically, by extending the damping factors, Eq. (7) can be re-written as
which is a multiscale damped MS approximation.
These extensions aim at further performance improvement. However, they also result in a greater number of parameters to be optimized, especially when the number of antennas is large. This is a complex optimization problem for traditional approaches, but it can be handled by the powerful tools of deep learning.
III-C The DNN Detector
As described in Section II-B, at the -th iteration of BP, with the messages  and  from the previous layer , we update  at the observation nodes, and then  and  are updated at the symbol nodes. This process constitutes a full iteration step of BP, which can be mapped to a hidden layer in a deep neural network. In this way, the BP detector is unfolded to construct a DNN detector.
Let  denote the set of the parameters to be optimized; our DNN detector can then be described as follows,
where  summarizes the -th iteration of the modified BP algorithms with Eqs. (3), (5), and (15) or (16);  is the soft output, with  denoting Eq. (6); and  is the output of the DNN, where  denotes a sigmoid or softmax function that rescales  into the range .
DNN-dBP: When we derive the DNN based on damped BP, Eq. (15) is used and , where are the damping factors at each layer. For simplicity, we denote this method as DNN-dBP.
DNN-MS: When the damped MS is applied, the 's are computed by Eq. (16). In this case, , where  are the damping factors,  are the normalization factors, and  are the offset factors at each iteration. This algorithm is called DNN-MS in what follows.
An example of the structure of the proposed DNN detectors is shown in Fig. 3, with three BP iterations presented. Suppose the MIMO system considered includes  transmitting and  receiving antennas. In general, the input layer has  elements, which are initialized with the prior information. For a detector with  BP iterations, the DNN contains  hidden layers; each layer contains  blue neurons that correspond to  in Eq. (17), representing a full BP iteration of updating the posterior and then the prior messages. The choice of  depends on the chosen modified BP algorithm. Finally, the output layer contains the sigmoid/softmax neurons. To increase the number of iterations in the DNN detector, we only need to concatenate additional identical hidden layers with blue neurons in Fig. 3 between the input and output layers.
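The layered structure can be sketched schematically as follows, with a placeholder message update standing in for the actual BP equations (Eqs. (3), (5), (15)); only the shape of the computation, one damped layer per BP iteration followed by a sigmoid output, is meant to match:

```python
import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def bp_layer(msgs, delta):
    # Stand-in for one full BP iteration (posterior then prior update).
    # The real layer implements the BP message equations; here a toy
    # contraction mimics its damped message-passing shape.
    msgs_new = np.tanh(msgs)                        # placeholder update rule
    return (1.0 - delta) * msgs_new + delta * msgs  # damping with factor delta

def dnn_dbp_forward(prior_msgs, deltas):
    # One trainable damping factor per hidden layer; depth = len(deltas)
    msgs = prior_msgs
    for delta in deltas:
        msgs = bp_layer(msgs, delta)
    return sigmoid(msgs)   # output layer rescales soft values into (0, 1)

prior = np.array([0.5, -1.0, 2.0])                    # toy prior messages
out = dnn_dbp_forward(prior, deltas=[0.2, 0.3, 0.4])  # 3-layer unfolded BP
```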
The cross entropy is adopted to express the expected loss between the neural network output  and the transmitted symbol , which evaluates the performance of the detector as follows:
The mini-batch stochastic gradient descent (SGD) method is used to minimize the loss function and determine the optimal correction factors . With the aid of advanced DL libraries like TensorFlow, the optimization can be done efficiently.
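A toy illustration of the training loop: binary cross-entropy loss minimized by mini-batch SGD over a single scalar stand-in parameter (the real detector optimizes the full set of correction factors, typically via a library optimizer such as Adam in TensorFlow):

```python
import numpy as np

def cross_entropy(y_hat, x, eps=1e-12):
    # binary cross-entropy between network output y_hat in (0, 1)
    # and transmitted bits x in {0, 1}
    y_hat = np.clip(y_hat, eps, 1 - eps)
    return float(-np.mean(x * np.log(y_hat) + (1 - x) * np.log(1 - y_hat)))

# minimal mini-batch SGD on one logistic "correction factor" w:
# y_hat = sigmoid(w * s) stands in for the unfolded detector's output
rng = np.random.default_rng(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
s = rng.standard_normal(256)            # toy soft detector inputs
x = (s > 0).astype(float)               # "transmitted bits" correlated with s
w, lr = 0.0, 0.5
for _ in range(200):
    batch = rng.choice(256, size=32, replace=False)   # draw a mini-batch
    y_hat = sigmoid(w * s[batch])
    grad_w = np.mean((y_hat - x[batch]) * s[batch])   # d(loss)/dw
    w -= lr * grad_w                                  # SGD step
loss = cross_entropy(sigmoid(w * s), x)
```

Since the bits are perfectly correlated with the soft inputs here, the loss drops well below its initial value of log 2 as `w` grows.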
IV Numerical Results
Numerical results of the proposed DNN detectors are given for i.i.d. Rayleigh and correlated fading MIMO channels with different antenna configurations. Both the DNN detector based on damped BP (DNN-dBP) and the DNN detector with MS (DNN-MS) are considered. MMSE results are set as benchmarks, and the performance of the DNN detectors is compared with the plain BP algorithm, the original MS algorithm, and HAD. The BP algorithms in this paper are all based on the real-domain single-edged BP introduced in . -QAM modulation is used for all simulations. No channel coding is considered.
IV-A DNN Architecture and Training Details
To numerically demonstrate the performance of the proposed DNN MIMO detector, the architecture of the neural network should be carefully selected. The settings of DNN-dBP and DNN-MS in our simulations are summarized in Table III, and details of these settings are discussed in this section.
IV-A1 Configurations and neurons
As described in Section III-C, the number of neurons is selected simply according to the number of transmitting antennas . Define  as the system loading factor. Two types of antenna configurations are considered in our simulations: the symmetric configuration () with  and the asymmetric configuration () with .
IV-A2 The depth of DNN
The depth of the DNN relates to the number of BP iterations, which is another vital factor for implementation. As mentioned in Section III-C, if the number of iterations is , the depth of the network will also be . To properly select , it is important to keep a good balance between BER performance and complexity. In our case,  is decided with a greedy search method as follows: (i) A search range of possible values of , , is decided by the BER performance of the original BP. This is based on the observation in previous research that, with the same number of iterations, damped BP should show better performance. (ii) Starting with the smallest value , we train the DNN detectors and test the trained networks to obtain the BER performance, until it plateaus. (iii) For simplicity, this process is done once for each antenna configuration of DNN-dBP and DNN-MS in i.i.d. channels. For instance, in the asymmetric configuration of DNN-dBP, we set  as the search range, and the BER performance of the trained DNN-dBP is shown in Fig. 4, from which we pick .
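The greedy search in steps (i)-(iii) can be sketched as follows; the BER values and the plateau tolerance are illustrative, not from the paper:

```python
def choose_depth(candidate_depths, train_and_eval, tol=1e-4):
    # Greedy depth selection: walk the candidate depths in increasing
    # order, train/evaluate the detector at each, and stop once the BER
    # stops improving by more than `tol` (the plateau criterion).
    # train_and_eval(L) returns the test BER of an L-layer detector.
    best_depth = candidate_depths[0]
    best_ber = train_and_eval(best_depth)
    for L in candidate_depths[1:]:
        ber = train_and_eval(L)
        if best_ber - ber <= tol:      # performance has plateaued
            break
        best_depth, best_ber = L, ber
    return best_depth, best_ber

# toy BER curve that plateaus after depth 6 (illustrative values only)
fake_ber = {2: 0.08, 4: 0.03, 6: 0.012, 8: 0.012, 10: 0.012}
depth, ber = choose_depth([2, 4, 6, 8, 10], lambda L: fake_ber[L])
```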
IV-A3 Training details
The DNN is implemented with the advanced deep learning framework TensorFlow . We train the network using a variant of the SGD method for optimizing deep networks, the Adam optimizer . The signal-to-noise ratios (SNRs) range from dB to dB (in steps of dB). We use batch training with  random data samples ( for each SNR step) at each iteration. For DNN-dBP, the network was trained for  iterations, and the DNN-MS case was trained for  iterations. Notice that only one offline training is performed for each antenna configuration in each case, and all the simulation results in different channel conditions are calculated with this trained network. The training parameters are all initialized as .
| SNRs for training | 0, 5, 10, 15, 20, 25 dB |
| Size of training data |  |
| Optimization method | Adam optimizer |
IV-B Numerical Results
IV-B1 Asymmetric Antenna Configuration
In the simulations with the asymmetric antenna configuration, ,  and . The depth of the DNN is set as  for DNN-dBP and  for DNN-MS. Fig. 5 shows the BER performance of DNN-dBP and DNN-MS in i.i.d. Rayleigh fading channels; the results of MMSE, original BP, MS, and HAD are also shown for comparison, together with the BER performance in a single-input single-output (SISO) channel with AWGN. The proposed DNN-dBP achieves similar performance to the original BP, shows improved stability, and outperforms the original BP and MMSE at higher SNRs. For instance, at a BER of , the performance gap between BP and DNN-dBP is negligible, while the HAD result has a degradation of 1 dB. Meanwhile, the MS detection shows a very large performance degradation due to the prior approximation, but DNN-MS achieves a great improvement. However, the loss is still large compared with BP: at a BER of , the degradation of DNN-MS already reaches 4 dB.
The simulation results in correlated channels are shown in Figs. 6 and 7, in which the correlation coefficient of the transmitting (Tx) or receiving (Rx) antennas is set as 0.3. In Fig. 6, the proposed DNN-dBP is compared with the original BP and HAD. With the correlations considered, all the algorithms except MMSE suffer a performance loss compared with the i.i.d. channels, among which the Tx and Rx-Tx correlated channels show larger degradation. However, DNN-dBP greatly outperforms the other methods in all the correlation types, especially at higher SNRs. The results of the proposed DNN-MS are shown in Fig. 7 along with the original BP and MS. In the correlated cases, the performance of MS shows a larger gap to BP, while DNN-MS achieves improvements. In the Rx correlated channels, the results of DNN-MS still show a large degradation from BP. However, in the Tx and Rx-Tx correlated channels, the results of DNN-MS are close to BP, with some degradation at low and medium SNR but better performance at high SNR.
IV-B2 Symmetric Antenna Configuration
In the symmetric antenna configuration, we consider , and hence . The depth of the network is set as . In Fig. 8, the simulation results of MMSE, BP, HAD, MS, and the DNN detectors in i.i.d. channels are given. The performance of BP, HAD, DNN-dBP, and MMSE is similar in this case. The MS results show a large degradation from BP. The DNN-MS results achieve some improvement but are still far from satisfactory. Fig. 9 shows the results of BP, HAD, and DNN-dBP in correlated channels with correlation coefficients set as 0.3. In all the different types of correlation, DNN-dBP outperforms BP while showing slightly better results than HAD. Fig. 10 shows the results of BP, MS, and DNN-MS in the correlated channels. Similar to the i.i.d. cases, the DNN-MS curves show great improvements compared with MS, but still have a large degradation from the BP results.
IV-C Performance Evaluation of the Proposed DNN Detectors
IV-C1 DNN-dBP reduces BER in correlated channels
As presented in Figs. 5 and 8, DNN-dBP shows similar performance to the original BP in i.i.d. channels. However, in Figs. 6 and 9, DNN-dBP achieves great improvements in channels with different correlations. This is consistent with the purpose of damping: to mitigate the problem of loopy BP in spatially correlated channels.
IV-C2 DNN-MS achieves better performance compared to MS
The results of the original MS show large degradation due to the approximation of the priors. With DNN-MS, the BER curves get much closer to the BP results, especially in the correlated channels, according to Figs. 7 and 10. However, the detection performance of DNN-MS is still far from satisfactory in the tests.
IV-C3 DNN detectors perform better with
With the asymmetric antenna configuration, both DNN-dBP and DNN-MS achieve great performance improvements. DNN-dBP outperforms BP and HAD, as presented in Figs. 5 and 6, while DNN-MS reaches results comparable with BP. However, when , the gain of the DNN detectors is limited, as shown in Figs. 8 and 10.
IV-D Complexity Analysis
IV-D1 Offline Training
In our numerical tests, we train the network once for each antenna configuration with each DNN detector. The training requires a large amount of data, as listed in Table III. The total computational cost of training depends on the amount of these inputs, , and hence is of high complexity, as shown in Table IV. However, the training is done offline, and its complexity can be handled by powerful computation and storage devices. The trained network can be stored for multiple online uses. Another inevitable issue of the DNN is that the "optimized" network depends on the range of the training data. In practical problems, the training data should be generated for the specific scenarios of interest to reach optimal performance.
IV-D2 Online Detection
The computational complexity of the proposed DNN detectors is compared with that of the other BP algorithms in Table IV. The BP modifications we consider are based on the real-domain single-edged BP detector proposed in , which achieves a reduced complexity of order  at each iteration. All the presented methods, including the original BP, HAD, MS, and the proposed DNN-dBP and DNN-MS, share the same posterior message updating rule, which requires  complexity per iteration. In the calculation of the prior probabilities, the original BP, HAD, and DNN-dBP require  division operations at each iteration, which are unnecessary in MS and DNN-MS. In HAD, the computation of the adaptive damping factors brings an extra complexity of order  at each iteration. However, the overall complexity of all the methods is of order . Hence, the proposed DNN-dBP achieves improved BER performance with the same computational complexity as the original BP, and DNN-MS reduces the complexity by eliminating the divisions that are difficult to implement while significantly outperforming the MS algorithms without extra computational cost.
The recently proposed DNN-based MIMO detector, DetNet , shows an advantage in that knowledge of the channel noise variance or SNR level is not required. It is based on a linear method, which is not our focus, and is hence fundamentally different from our work, which requires channel estimation knowledge. It achieves great performance at a similar level of complexity for online detection of . However, a large number of hidden layers is needed to obtain satisfactory results, which also adds to the offline training cost.
V Conclusion
In this paper, we present a novel framework of deep neural network MIMO detectors. The two proposed DNN detectors, DNN-dBP and DNN-MS, are designed by unfolding the damped BP and MS BP algorithms, respectively. The architecture of the DNN detectors and the training strategies are discussed for implementation. Numerical results with different antenna configurations and various channel conditions illustrate the improved performance of the proposed detection methods. Future work will be directed towards further optimization of the DNN structure and efficient training methods. This framework can also be applied to improve other iterative algorithms such as AMP.
The authors would like to thank Alex Yufit for useful discussion.
-  T. L. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station antennas,” IEEE Trans. Wireless Commun., vol. 9, no. 11, pp. 3590–3600, 2010.
-  F. Rusek, D. Persson, B. K. Lau, E. G. Larsson, T. L. Marzetta, O. Edfors, and F. Tufvesson, “Scaling up MIMO: Opportunities and challenges with very large arrays,” IEEE Signal Process. Mag., vol. 30, no. 1, pp. 40–60, 2013.
-  X. Yuan, L. Ping, C. Xu, and A. Kavcic, “Achievable rates of MIMO systems with linear precoding and iterative LMMSE detection,” IEEE Trans. Inf. Theory, vol. 60, no. 11, pp. 7073–7089, 2014.
-  A. K. Sah and A. Chaturvedi, “An MMP-based approach for detection in large MIMO systems using sphere decoding,” IEEE Wireless Commun. Mag., vol. 6, no. 2, pp. 158–161, 2017.
-  P. Li and R. D. Murch, “Multiple output selection-LAS algorithm in large MIMO systems,” IEEE Commun. Lett., vol. 14, no. 5, 2010.
-  N. Srinidhi, S. K. Mohammed, A. Chockalingam, and B. S. Rajan, “Low-complexity near-ML decoding of large non-orthogonal STBCs using reactive tabu search,” in Proc. of IEEE International Symposium on Information Theory (ISIT), 2009, pp. 1993–1997.
-  J. Yang, C. Zhang, X. Liang, S. Xu, and X. You, “Improved symbol-based belief propagation detection for large-scale MIMO,” in Proc. of IEEE Workshop on Signal Processing Systems (SiPS), 2015, pp. 1–6.
-  J. Yang, W. Song, S. Zhang, X. You, and C. Zhang, “Low-complexity belief propagation detection for correlated large-scale MIMO systems,” Journal of Signal Processing Systems, pp. 1–15, 2017.
-  K. P. Murphy, Y. Weiss, and M. I. Jordan, “Loopy belief propagation for approximate inference: An empirical study,” in Proc. of the 15th Conference on Uncertainty in Artificial Intelligence. Morgan Kaufmann Publishers Inc., 1999, pp. 467–475.
-  Q. Su and Y.-C. Wu, “On convergence conditions of Gaussian belief propagation,” IEEE Trans. Signal Process., vol. 63, no. 5, pp. 1144–1155, 2015.
-  C. Jeon, R. Ghods, A. Maleki, and C. Studer, “Optimality of large MIMO detection via approximate message passing,” in Proc. of IEEE International Symposium on Information Theory (ISIT), 2015, pp. 1227–1231.
-  Mhlaliseni, Khumalo, Wan-Ting, and Chao-Kai, “Fixed-point implementation of approximate message passing (AMP) algorithm in massive MIMO systems,” Digital Communications and Networks, vol. 2, no. 4, pp. 218–224, 2016.
-  Y. Gao, H. Niu, and T. Kaiser, “Massive MIMO detection based on belief propagation in spatially correlated channels,” in Proc. of 11th International ITG Conference on Systems, Communications and Coding (SCC), 2017, pp. 1–6.
-  Y. Zhang, L. Ge, X. You, and C. Zhang, “Belief propagation detection based on max-sum algorithm for massive MIMO systems,” in Proc. of 9th International Conference on Wireless Communications and Signal Processing (WCSP), 10 2017, pp. 1–6.
-  K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770–778.
-  G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-r. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath et al., “Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups,” IEEE Signal Process. Mag., vol. 29, no. 6, pp. 82–97, 2012.
-  E. Nachmani, Y. Be’ery, and D. Burshtein, “Learning to decode linear codes using deep learning,” in Proc. of 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2016, pp. 341–346.
-  W. Xu, Z. Wu, Y.-L. Ueng, X. You, and C. Zhang, “Improved polar decoder based on deep learning,” in Proc. of IEEE Workshop on Signal Processing Systems (SiPS), 2017, pp. 1–6.
-  L. Lugosch and W. J. Gross, “Neural offset min-sum decoding,” in Proc. of IEEE International Symposium on Information Theory (ISIT), 2017, pp. 1361–1365.
-  K. Gregor and Y. LeCun, “Learning fast approximations of sparse coding,” in Proc. of the 27th International Conference on Machine Learning (ICML), 2010, pp. 399–406.
-  M. Borgerding and P. Schniter, “Onsager-corrected deep learning for sparse linear inverse problems,” in Proc. of IEEE Global Conference on Signal and Information Processing (GlobalSIP), 2016, pp. 227–231.
-  A. Mousavi and R. G. Baraniuk, “Learning to invert: Signal recovery via deep convolutional networks,” in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017, pp. 2272–2276.
-  T. J. O’Shea, K. Karra, and T. C. Clancy, “Learning to communicate: Channel auto-encoders, domain specific regularizers, and attention,” in Proc. of IEEE International Symposium on Signal Processing and Information Technology (ISSPIT), 2016, pp. 223–228.
-  N. Samuel, T. Diskin, and A. Wiesel, “Deep MIMO detection,” arXiv preprint arXiv:1706.01151, 2017.
-  C. Jin, Y. Zhang, S. Yu, R. Hu, and C. Chen, “Virtual MIMO blind detection clustered WSN system,” in Proc. of Asia-Pacific Microwave Conference (APMC), vol. 3, 2015, pp. 1–3.
-  S. Mosleh, L. Liu, C. Sahin, Y. R. Zheng, and Y. Yi, “Brain-inspired wireless communications: Where reservoir computing meets MIMO-OFDM,” vol. PP, no. 99, pp. 1–15, 2017.
-  X. Yan, F. Long, J. Wang, N. Fu, W. Ou, and B. Liu, “Signal detection of MIMO-OFDM system based on auto encoder and extreme learning machine,” in Proc. of International Joint Conference on Neural Networks (IJCNN), 2017, pp. 1602–1606.
-  J. Proakis, Digital Communications, ser. Electrical engineering series. McGraw-Hill, 2001.
-  W. Fukuda, T. Abiko, T. Nishimura, T. Ohgane, Y. Ogawa, Y. Ohwatari, and Y. Kishiyama, “Low-complexity detection based on belief propagation in a massive MIMO system,” in Proc. of IEEE 77th Vehicular Technology Conference (VTC Spring), 2013, pp. 1–5.
-  A. Chockalingam and B. S. Rajan, Large MIMO Systems. New York, NY, USA: Cambridge University Press, 2014.
-  M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin et al., “Tensorflow: Large-scale machine learning on heterogeneous distributed systems,” arXiv preprint arXiv:1603.04467, 2016.
-  D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.