I Introduction
Deploying large antenna arrays has been considered one of the key ingredients of future communications, such as massive MIMO for sub-6GHz systems [1, 2] and millimeter wave communications [3, 4]. Excessive power consumption, however, has arisen as a bottleneck in realizing such systems because of the large number of high-precision analog-to-digital converters (ADCs) at receivers. In this regard, employing low-precision ADCs has been widely studied to reduce the power consumption at receivers [5, 6, 7, 8]. As an extreme case of low-resolution ADCs, one-bit ADC systems have attracted considerable attention because they significantly simplify the analog processing of receivers [9, 10, 11, 12, 13, 14, 15, 16].
State-of-the-art detection methods were developed for one-bit ADC systems [11, 12, 13]. In , an iterative multiuser detection method was proposed using a message-passing de-quantization algorithm. In , a high-complexity one-bit maximum likelihood (ML) detection method and low-complexity zero-forcing (ZF)-type detection methods were developed for a quantized distributed reception scenario. Converting the ML estimation problem in  to convex optimization, an efficient near-ML detection method was proposed in . Such detection methods, however, require the estimation of channel state information (CSI) with one-bit quantized signals. Although high-performance channel estimation techniques were developed for one-bit ADC systems [14, 13, 15], channel estimation with one-bit quantized signals still suffers from degraded estimation accuracy compared to high-precision ADC systems. In this regard, we investigate a learning-based detection approach which replaces one-bit channel estimation with a probability learning process.
Recently, such a learning-based detection approach was studied [17, 18]. Since the primary challenge of such learning-based detection is its strong dependency on the training length, different detection techniques, such as empirical ML-like detection and minimum-center-distance detection, were developed in  to overcome the challenge. In , however, channel estimation was used to initialize likelihood functions for ML detection, and a learning-based likelihood function was used for post update of the likelihood functions. Unlike previous approaches  that focused on developing robust detection methods, we instead focus on developing robust learning methods for the likelihood functions to overcome the strong dependency of the learning process on the training length.
In this paper, we investigate a learning-based ML detection approach which replaces one-bit channel estimation with a robust probability learning process as shown in Fig. 1. We propose a biased-learning algorithm which sets a small minimum probability for each likelihood function to prevent zero-probability likelihood functions from wiping out the information obtained through training. With the knowledge of the SNR, we further propose a dithering-and-learning technique to infer likelihood functions from dithered signals: we first add a dithering signal to the quantization input and then estimate the true likelihood function from the dithered quantized signals. The proposed method makes it possible to estimate the likelihood probability with a reasonable training length by drawing out changes of sign in the sequence of quantized signals within the training phase. Accordingly, the base station (BS) can directly perform the ML detection, which is optimal in minimizing the probability of detection error for equally probable transmit symbols. The likelihood probability can be further updated by utilizing correctly decoded data symbols as pilot symbols. Simulation results demonstrate that, unlike the conventional learning-based one-bit ML detection, the proposed detection techniques show robust detection performance in terms of symbol error rate, achieving performance comparable to the optimal one-bit ML detection that requires the estimation of the CSI.
II System Model
We consider uplink multiuser MIMO communications in which users, each with a single antenna, transmit signals to the BS with antennas. We assume the number of receive antennas is much larger than that of users, . The uplink transmission is composed of a pilot transmission phase and a data transmission phase: the users first transmit pilot symbols during symbol time, and then transmit data symbols during symbol time. The total number of pilot symbol vectors is denoted as , , and each pilot symbol vector is transmitted times during the pilot transmission phase, i.e., .
Let , , denote a data symbol vector at time . Then, the received signal vector at time is
where denotes the average user transmit power, is the channel matrix, and represents the additive noise vector at time , . Here, represents the identity matrix with proper dimensions. Each user symbol is generated from the set of symbols and is assumed to have zero mean and unit variance, i.e., and , where denotes the th element of . We assume a block fading narrowband channel where the channel is invariant during the transmission of symbol time. (Although we assume a narrowband channel model for convenience, the proposed methods are applicable to any block fading channel model.) We define the signal-to-noise ratio (SNR) as .
The received signals in (1) are quantized at one-bit ADCs. Accordingly, the real and imaginary parts of each received signal are quantized separately with one-bit ADCs, which output only the sign of the quantization input, i.e., either 1 or -1. The quantized signal can be represented as
where is an element-wise quantizer, and and denote the real and imaginary parts of a complex vector , respectively. The received signal in the complex-vector form can be rewritten in a real-vector form as
Accordingly, we can rewrite the quantized signal in a real-vector form as
and each element is quantized to 1 if the corresponding quantization input is positive, or to -1 otherwise.
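The real-vector model and the sign quantizer described above can be sketched in a few lines. This is only an illustration under assumed notation: the names `h_re`, `h_im`, `rho`, and `noise_var` are hypothetical, the noise is assumed circularly symmetric so that each real component has variance `noise_var / 2`, and the real-valued channel is assumed to take the usual stacked form [[Re(H), -Im(H)], [Im(H), Re(H)]].

```python
import math
import random


def to_real_channel(h_re, h_im):
    """Build the real-valued channel [[Re(H), -Im(H)], [Im(H), Re(H)]]."""
    top = [re_row + [-v for v in im_row] for re_row, im_row in zip(h_re, h_im)]
    bottom = [im_row + re_row for re_row, im_row in zip(h_re, h_im)]
    return top + bottom


def one_bit_receive(h_real, s_real, rho, noise_var, rng):
    """Quantize sqrt(rho) * H s + n element-wise to +1 or -1."""
    sigma = math.sqrt(noise_var / 2.0)  # std of each real noise component (assumption)
    quantized = []
    for row in h_real:
        r = math.sqrt(rho) * sum(h * s for h, s in zip(row, s_real))
        r += rng.gauss(0.0, sigma)
        quantized.append(1 if r > 0 else -1)  # one-bit ADC keeps only the sign
    return quantized


rng = random.Random(0)
# One receive antenna and one user; s_real stacks [Re(s), Im(s)].
h_real = to_real_channel([[1.0]], [[0.5]])
y = one_bit_receive(h_real, [1.0, -1.0], rho=1.0, noise_var=1e-12, rng=rng)
```

With the nearly noiseless setting used here, the two real-valued inputs are about 1.5 and -0.5, so only their signs survive quantization.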
III Robust One-Bit ML Detection
In this section, we propose robust learning-based ML detection methods for one-bit ADC systems to achieve the ML detection performance without estimating channels. Because it coincides with maximum a posteriori estimation in that case, ML estimation is optimal in minimizing the probability of detection error when all possible transmit symbols are equally likely to be transmitted. We first introduce the conventional one-bit ML detection with the CSI in the following subsection.
III-A One-Bit ML Detection with CSI
Let be the th pilot symbol of pilot symbol vectors in a real-vector form, which is the th element in the set of all possible symbol vectors . The likelihood probability of the quantized signal vector for a given channel and transmit symbol vector can be approximated as
where denotes the likelihood function for the th element when the symbol is transmitted for the given channel, and it is defined as
Here, where is the th row of , and
is the cumulative distribution function (CDF) of a standard Gaussian distribution. Note that (7) becomes an exact representation of when all elements in are independent of each other. Based on (7), the ML detection rule in (9) selects, among all candidate symbol vectors, the one that maximizes this likelihood probability.
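As a rough illustration of this detection rule, the sketch below evaluates the per-element likelihoods with the standard Gaussian CDF and picks the candidate with the largest log-likelihood. It works in the real-vector form and assumes each real noise component has variance `noise_var / 2`; the function names, the small floor inside the logarithm, and the toy channel are assumptions made for the example, not part of the method as published.

```python
import math
from itertools import product
from statistics import NormalDist

PHI = NormalDist().cdf  # standard Gaussian CDF


def prob_plus(h_row, s, rho, noise_var):
    """P(quantized output = +1) for one real-valued channel row and symbol vector."""
    mean = math.sqrt(rho) * sum(h * x for h, x in zip(h_row, s))
    return PHI(mean / math.sqrt(noise_var / 2.0))


def ml_detect_with_csi(y, h_real, candidates, rho, noise_var):
    """Return the candidate maximizing the product of per-element likelihoods."""
    best, best_ll = None, -math.inf
    for s in candidates:
        ll = 0.0
        for y_i, h_row in zip(y, h_real):
            p = prob_plus(h_row, s, rho, noise_var)
            p = p if y_i == 1 else 1.0 - p
            ll += math.log(max(p, 1e-300))  # numerical guard against log(0)
        if ll > best_ll:
            best, best_ll = s, ll
    return best


# Toy example: identity channel and two real-valued symbols in {-1, +1}.
h_real = [[1.0, 0.0], [0.0, 1.0]]
candidates = list(product([-1.0, 1.0], repeat=2))
detected = ml_detect_with_csi([1, -1], h_real, candidates, rho=1.0, noise_var=1.0)
```

Working in the log domain avoids underflow when the number of receive antennas is large, which is a design choice of this sketch rather than something the text prescribes.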
III-B Robust One-Bit ML Detection without CSI
In this subsection, we introduce a learning-based one-bit ML detection approach which does not require channel estimation and propose robust learning techniques with respect to the training length . During the pilot transmission of length , each pilot symbol vector is transmitted times and the BS learns the likelihood functions by measuring the frequency of 1 and -1 during the transmission as
where and is an indicator function which is 1 if is true or 0 otherwise. After learning the likelihood functions by using (10), the BS has the estimate of the likelihood probability for the quantized signal vector in the data transmission phase as
and can perform the ML detection in (9).
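The frequency-counting step and the resulting detector can be sketched as follows. The data layout (`pilot_outputs[k][t][i]`) and all function names are assumptions chosen for the example; only the probability of observing +1 is stored, since the probability of -1 is its complement.

```python
def learn_likelihoods(pilot_outputs):
    """Estimate P(output i = +1 | pilot vector k) by relative frequency.

    pilot_outputs[k][t][i] is the i-th quantized output (+1 or -1) observed
    at the t-th repetition of pilot symbol vector k.
    """
    p_plus = []
    for repetitions in pilot_outputs:
        n_tr = len(repetitions)
        n_out = len(repetitions[0])
        p_plus.append(
            [sum(1 for y in repetitions if y[i] == 1) / n_tr for i in range(n_out)]
        )
    return p_plus


def sequence_likelihood(y, p_row):
    """Product of per-element likelihoods of the observed vector y."""
    prob = 1.0
    for y_i, p in zip(y, p_row):
        prob *= p if y_i == 1 else 1.0 - p
    return prob


def ml_detect_learned(y, p_plus):
    """Return the index of the most likely pilot vector and all scores."""
    scores = [sequence_likelihood(y, p_row) for p_row in p_plus]
    return max(range(len(scores)), key=scores.__getitem__), scores


pilot_outputs = [
    [[1, 1], [1, -1]],      # pilot vector 0, two repetitions
    [[-1, -1], [-1, -1]],   # pilot vector 1, two repetitions
]
p_plus = learn_likelihoods(pilot_outputs)
index, scores = ml_detect_learned([1, 1], p_plus)
```

In this toy table, the observation `[-1, 1]` receives a likelihood of exactly zero under both candidates, so the detector cannot distinguish them from the learned table alone.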
With a limited length of training , however, the empirical likelihood function for may have probability of zero after learning through transmissions if the change of signs of quantized output sequences , , is not observed during transmissions for the symbol . The likelihood functions with zero probability make the likelihood probability of the observed signal in (11) zero for many candidate symbols which may include the desired symbol. Consequently, the zero-probability likelihood functions wipe out the entire information obtained during the pilot-based learning phase, thereby severely degrading the detection performance. Note that zero-probability likelihood functions are even more likely at high SNR, where the noise rarely flips the sign of the quantization input.
Fig. 2 shows the symbol error rate (SER) performance of the learning-based ML detection with the number of training for receive antennas, users and -QAM modulation. The optimal one-bit ML detection introduced in Section III-A is also evaluated. Note that the optimal one-bit ML detection requires the CSI. As discussed, the SER increases as the number of training for each symbol decreases. In addition, the gaps between the optimal one-bit ML case and the learning-based one-bit ML cases become larger as the SNR increases, which also corresponds to the intuition. Accordingly, the primary challenge of such learning-based detection is to make it robust to the training length over any SNR range. Therefore, we propose robust learning methods for one-bit ML detection with respect to the training length .
III-B1 Robust Learning-Based One-Bit ML without SNR
To address this challenge without requiring knowledge of either the SNR or the CSI, we propose a biased-learning ML detection approach, which is simple but highly robust to the training length . We then extend the proposed detection approach to the case with SNR knowledge to improve the learning performance. In this approach, we limit the minimum likelihood function to be . The bias probability needs to be as no change in the sign of the quantization output sequences is observed within transmissions for . The proposed biased-learning ML detection approach is summarized as follows:
1) In the pilot transmission phase, the BS computes the likelihood functions in (10).
2) If zero probability is observed for any likelihood function , the BS sets and .
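The two steps above can be sketched as a post-processing pass over the learned table. The table layout, the function names, and in particular the numerical choice of `p_bias` are assumptions for illustration; the only property relied on is that the bias is a small probability below the resolution of the empirical estimate.

```python
def apply_bias(p_plus, p_bias):
    """Replace degenerate entries: 0 becomes p_bias and 1 becomes 1 - p_bias."""
    biased = []
    for p_row in p_plus:
        row = []
        for p in p_row:
            if p == 0.0:        # +1 was never observed
                row.append(p_bias)
            elif p == 1.0:      # -1 was never observed
                row.append(1.0 - p_bias)
            else:
                row.append(p)
        biased.append(row)
    return biased


def sequence_likelihood(y, p_row):
    """Product of per-element likelihoods of the observed vector y."""
    prob = 1.0
    for y_i, p in zip(y, p_row):
        prob *= p if y_i == 1 else 1.0 - p
    return prob


n_tr = 2
p_bias = 0.1 / n_tr  # hypothetical choice, kept below 1 / n_tr
p_plus = [[1.0, 0.5], [0.0, 0.0]]  # empirical table with degenerate entries
biased = apply_bias(p_plus, p_bias)
scores = [sequence_likelihood([-1, 1], row) for row in biased]
```

After biasing, every candidate keeps a nonzero likelihood for the observation `[-1, 1]`, so the candidates can again be ranked instead of all collapsing to zero.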
Although the proposed biased-learning ML detection approach prevents the loss of information obtained from the measurement during the pilot transmission, it cannot capture the variance of probabilities among the zero-probability likelihood functions. In addition, the number of likelihood functions with zero probability tends to increase as the SNR increases, and the proposed biased-learning ML detection then needs to replace a large number of the zero-probability likelihood functions with the bias probability . Accordingly, although the proposed biased-learning ML detection improves the detection performance, the strong dependency on the bias probability at high SNR may not be desirable. To resolve such challenges, we further propose a dithering-and-learning one-bit ML detection method that exploits SNR knowledge.
III-B2 Robust Learning-Based One-Bit ML with SNR
Now, we propose a dithering-and-learning one-bit ML detection under the SNR knowledge at the BS, where the noise variance is known to the BS as well as the transmit power . As shown in Fig. 1, the BS adds dithering signals to the quantization inputs during only the pilot transmission phase to draw the change in the sign of output sequences . After dithering, the quantization input in the real-vector form becomes
We use which follows a Gaussian distribution with zero mean and variance of , i.e., , and the variance is known to the BS. Then, the dithered and quantized signal becomes
Finally, the BS utilizes the estimated and known to estimate the true (non-dithered) likelihood function by using (8). Since the likelihood function of the dithered signal in (15) is much less likely to have zero probability than the non-dithered case, the BS can learn the majority of the likelihood functions with a reasonable training length .
Likelihood functions with zero probability can still exist even after dithering when is insufficient. If the BS observes zero-probability likelihood functions, it can also apply the biased-learning approach to the likelihood functions. Unlike conventional dithering approaches [19, 20], the dithered signal is not removed after quantization in the dithering-and-learning one-bit ML detection method since the dithering is used only for drawing the change in the signs of the sequence of received signals within . In addition, the BS only needs to know the variance of the dithering distribution, and the dithering is not used during the data transmission phase.
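One way to read the conversion from the dithered likelihood back to the non-dithered one is sketched below. It assumes that the noise and the dither are independent zero-mean Gaussians, so that dithering only inflates the effective noise variance; `noise_var` and `dither_var` are hypothetical names and must be expressed in the same units (only their ratio matters). Entries that are still exactly 0 or 1 are returned unchanged so that the biased-learning step can handle them.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard Gaussian


def undither_likelihood(p_dithered, noise_var, dither_var):
    """Map P(+1) learned from dithered signals to the non-dithered P(+1).

    Under the Gaussian assumption, p_dithered = Phi(m / sqrt(noise_var + dither_var))
    and the non-dithered value is Phi(m / sqrt(noise_var)) for the same mean m.
    """
    if p_dithered <= 0.0 or p_dithered >= 1.0:
        return p_dithered  # no sign change observed; leave for the bias step
    scale = math.sqrt(1.0 + dither_var / noise_var)
    return _STD.cdf(_STD.inv_cdf(p_dithered) * scale)


# Consistency check with a known normalized mean of 2.0 and dither_var = 3 * noise_var.
p_true = _STD.cdf(2.0)
p_dith = _STD.cdf(2.0 / math.sqrt(1.0 + 3.0))
p_rec = undither_likelihood(p_dith, noise_var=1.0, dither_var=3.0)
```

In this example the dithered probability is much further from 1 than the true one, which is why sign changes become observable within a short training length.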
The estimation of the SNR (equivalently, the noise variance in this paper) can be performed by offline training as shown in Fig. 3(a). The offline training first collects training data and measures the SNR by estimating channels. Then, the BS obtains data sets of over tested SNR values, where denotes the average number of zero-probability likelihood functions out of for the SNR . Using the collected data sets, the offline training provides the mapping between the average number of likelihood functions with zero probability and the SNR level, providing the estimated value of the SNR [21]. Fig. 3 shows an example of offline training with 5th-order linear regression, which we use for simulations.
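A minimal sketch of such an offline mapping is given below, using an ordinary least-squares polynomial fit solved through the normal equations. The helper names are hypothetical, the input is assumed to be rescaled to a small range (here the fraction of zero-probability likelihood functions), and the training pairs are synthetic placeholders rather than measured data.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares fit; returns c with y ~ sum(c[j] * x**j for j in 0..degree)."""
    n = degree + 1
    # Normal equations A c = b.
    a = [[sum(x ** (j + l) for x in xs) for l in range(n)] for j in range(n)]
    b = [sum(y * x ** j for x, y in zip(xs, ys)) for j in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        acc = b[r] - sum(a[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = acc / a[r][r]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** j for j, c in enumerate(coeffs))


# Synthetic placeholder data: fraction of zero-probability entries -> SNR in dB.
zero_fraction = [i / 10 for i in range(11)]
snr_db = [30.0 * x ** 2 - 5.0 for x in zero_fraction]
coeffs = fit_polynomial(zero_fraction, snr_db, degree=5)
snr_estimate = evaluate(coeffs, 0.5)
```

At run time, the BS would count the zero-probability entries of its learned table, rescale the count the same way as in training, and evaluate the fitted polynomial to obtain the SNR estimate.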
III-C Post Update of Likelihood Functions
The performance of the proposed algorithms can be further improved by adopting the post update approach which exploits the correctly decoded data symbols to update the initially estimated likelihood functions . To this end, the BS divides the data transmission into subframes of length , i.e., , and appends cyclic-redundancy-check (CRC) bits at each data subframe. Then, when each data subframe is correctly decoded, which can be determined by checking CRC, the BS uses the decoded symbols as pilot symbols to update the initial likelihood functions.
Using the correctly decoded symbols, the likelihood functions for the biased-learning approach can directly be updated: after decoding each data subframe , the likelihood functions are updated as (10) by counting the number of out of , where denotes the number of cases where the decoded data is in the successfully decoded data subframes during the first data subframes, . For the dithering-and-learning method, the likelihood functions are updated after decoding each data subframe as shown in .
where is the number of successfully decoded data frames during the first data subframes, is the update rate for after decoding the th data subframe, is the initially estimated likelihood function from the training phase, and is the likelihood function for the candidate symbol vector at the th quantized signal learned from the correctly decoded data subframes. The optimal value of the parameter , however, needs to be empirically determined. Accordingly, such an update approach can provide more benefit to the biased-learning method than to the dithering-and-learning method.
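For the biased-learning method, the post update amounts to pooling the observations from CRC-verified subframes with the pilot observations and recomputing the relative frequencies. The sketch below keeps running counts for that purpose; the data layout, the function names, and the value of `p_bias` are assumptions for illustration, and the weighted update used for the dithering-and-learning method is not covered here.

```python
def pool_subframe(plus_counts, totals, decoded_indices, outputs, crc_ok):
    """Add the observations of one data subframe to the running counts.

    plus_counts[k][i]: number of +1 outputs seen at element i for symbol vector k
    totals[k]:         number of observations collected for symbol vector k
    """
    if not crc_ok:
        return  # discard subframes that fail the CRC
    for k, y in zip(decoded_indices, outputs):
        totals[k] += 1
        for i, y_i in enumerate(y):
            if y_i == 1:
                plus_counts[k][i] += 1


def biased_likelihoods(plus_counts, totals, p_bias):
    """Recompute P(+1) from pooled counts and bias any degenerate entries.

    Every totals[k] is assumed positive, which holds after the pilot phase.
    """
    table = []
    for counts, total in zip(plus_counts, totals):
        row = []
        for c in counts:
            if c == 0:
                row.append(p_bias)
            elif c == total:
                row.append(1.0 - p_bias)
            else:
                row.append(c / total)
        table.append(row)
    return table


# Counts after a pilot phase with two repetitions of a single symbol vector.
plus_counts, totals = [[2, 0]], [2]
before = biased_likelihoods(plus_counts, totals, p_bias=0.05)
pool_subframe(plus_counts, totals, [0], [[-1, 1]], crc_ok=True)
after = biased_likelihoods(plus_counts, totals, p_bias=0.05)
```

In this toy case, both entries were degenerate before the update and take interior values afterwards, which illustrates why the biased-learning method gains more from the update.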
IV Simulation Results
In this section, we evaluate the performance of the proposed learning-based algorithms in terms of the SER. In simulations, we compare the following learning-based detection methods, which do not require channel estimation:
- Learning 1-bit ML: conventional learning-based ML
- Empirical ML detection (eMLD) in 
- Minimum-mean-distance (MMD) in 
- Minimum-center-distance (MCD) in 
- Biased learning 1-bit ML (proposed)
- Dithered learning 1-bit ML (proposed) with perfect SNR knowledge and with estimated SNR
In addition, we evaluate one-bit ADC detection methods that require channel estimation to provide reference performance: one-bit zero forcing (ZF) detection in  and optimal one-bit ML introduced in Section III-A. We consider receive antennas, users with -QAM transmission, Rayleigh channels whose elements each follow , and bias probability for simulations. In addition, we use the proposed offline training with th order linear regression to estimate the SNR for the dithering-and-learning method.
Fig. 5 shows the SER for (a) and (b) with the dithering noise variance of . We note that the proposed algorithms closely follow the SER performance of the optimal one-bit ML case over the considered SNR range in both Fig. 5(a) and Fig. 5(b). Although the one-bit ZF detection shows better performance than the other methods at low SNR, it shows a large performance degradation at medium to high SNR. The proposed methods outperform the one-bit ZF detection and the other learning-based methods with the same , such as conventional learning-based one-bit ML, eMLD, MMD, and MCD, in most cases. In particular, as the number of training increases, the performance gap between the proposed methods and the other learning-based methods increases.
The performance improvement is achieved because the proposed methods provide robust likelihood function learning with the same , and thus, the ML detection can be directly performed, which is optimal for certain cases. We further note that the dithering-and-learning ML method with the estimated SNR achieves similar performance to the perfect SNR case, which shows the effectiveness of the offline learning and robustness of the proposed detection method to the SNR estimation error. Accordingly, the proposed learning-based detection methods achieve a near optimal detection performance over the low to high SNR regime, providing the robust performance with respect to the training length. In addition, the training length can be reduced to half of the desired length by utilizing a symmetric property of constellation mapping and quantization .
Fig. 6 shows the average number of zero-probability likelihood functions versus the SNR level for the non-dithering case and dithering case with training and dithering noise variance. As the SNR increases, the number of zero-probability likelihood functions for the non-dithering case rapidly increases, and more than out of (about ) likelihood functions have zero probability at high SNR. For the dithering case, however, the number of zero-probability likelihood functions slowly increases with the SNR and converges to about (about ) due to the dithering effect. Accordingly, the dithering case provides about nonzero likelihood functions while the non-dithering case offers only about nonzero likelihood functions at high SNR. Therefore, with dithering, the proposed algorithm can estimate many more likelihood functions,  in this case, thereby increasing the detection accuracy. This corresponds to the discussion provided in Section III-B.
The proposed method with the post likelihood function update approach is also evaluated with , , , and -bit CRC in Fig. 7. In the high SNR regime where most subframes can be correctly decoded, the biased-learning method shows noticeable SER improvement and outperforms the dithering-and-learning method while the dithering-and-learning method shows marginal or no improvement. This corresponds to the intuition that the post update approach provides more opportunity for the biased-learning one-bit ML detection to improve its detection accuracy. The dithering-and-learning method, however, still shows high detection accuracy and robustness to the training length and the SNR level. Therefore, the proposed algorithms provide near optimal one-bit ML detection performance with a reasonable training length.
V Conclusion
In this paper, we proposed robust learning-based one-bit ML detection methods for uplink massive MIMO communications. Since the performance of a learning-based one-bit detection approach can be severely degraded when the number of training symbols is insufficient, the proposed methods addressed this challenge by adopting a bias probability and a dithering technique. Without requiring the channel knowledge, the biased-learning method and the dithering-and-learning method perform ML detection through learning likelihood functions in a way that is robust to the number of training symbols. Simulation results demonstrate the detection performance of the proposed methods in terms of symbol error rate. Therefore, the proposed robust learning-based one-bit ML detection methods can potentially achieve an improved performance-power tradeoff for one-bit massive MIMO systems.
References
- [1] H. Q. Ngo, E. G. Larsson, and T. L. Marzetta, “Energy and spectral efficiency of very large multiuser MIMO systems,” IEEE Trans. on Commun., vol. 61, no. 4, pp. 1436–1449, 2013.
- [2] E. G. Larsson, O. Edfors, F. Tufvesson, and T. L. Marzetta, “Massive MIMO for next generation wireless systems,” IEEE Commun. Mag., vol. 52, no. 2, pp. 186–195, 2014.
- [3] Z. Pi and F. Khan, “An introduction to millimeter-wave mobile broadband systems,” IEEE Commun. Mag., vol. 49, no. 6, 2011.
- [4] J. G. Andrews, S. Buzzi, W. Choi, S. V. Hanly, A. Lozano, A. C. Soong, and J. C. Zhang, “What will 5G be?” IEEE Journal on Sel. Areas in Commun., vol. 32, no. 6, pp. 1065–1082, 2014.
- [5] C.-K. Wen, C.-J. Wang, S. Jin, K.-K. Wong, and P. Ting, “Bayes-optimal joint channel-and-data estimation for massive MIMO with low-precision ADCs,” IEEE Trans. on Signal Process., vol. 64, no. 10, pp. 2541–2556, 2016.
- [6] C. Studer and G. Durisi, “Quantized massive MU-MIMO-OFDM uplink,” IEEE Trans. on Commun., vol. 64, no. 6, pp. 2387–2399, 2016.
- [7] J. Choi, B. L. Evans, and A. Gatherer, “Resolution-adaptive hybrid MIMO architectures for millimeter wave communications,” IEEE Trans. on Signal Process., vol. 65, no. 23, pp. 6201–6216, 2017.
- [8] J. Choi, J. Sung, B. L. Evans, and A. Gatherer, “Antenna Selection for Large-Scale MIMO Systems with Low-Resolution ADCs,” IEEE Int. Conf. on Acoustics, Speech, and Signal Process., 2018.
- [9] A. Mezghani and J. A. Nossek, “On ultra-wideband MIMO systems with 1-bit quantized outputs: Performance analysis and input optimization,” in IEEE Int. Symposium on Inform. Theory. IEEE, 2007, pp. 1286–1289.
- [10] J. Mo and R. W. Heath, “Capacity analysis of one-bit quantized MIMO systems with transmitter channel state information,” IEEE Trans. on Signal Process., vol. 63, no. 20, pp. 5498–5512, 2015.
- [11] S. Wang, Y. Li, and J. Wang, “Multiuser detection in massive spatial modulation MIMO with low-resolution ADCs,” IEEE Trans. on Wireless Commun., vol. 14, no. 4, pp. 2156–2168, 2015.
- [12] J. Choi, D. J. Love, D. R. Brown III, and M. Boutin, “Quantized Distributed Reception for MIMO Wireless Systems Using Spatial Multiplexing,” IEEE Trans. Signal Process., vol. 63, no. 13, pp. 3537–3548, 2015.
- [13] J. Choi, J. Mo, and R. W. Heath, “Near maximum-likelihood detector and channel estimator for uplink multiuser massive MIMO systems with one-bit ADCs,” IEEE Trans. on Commun., vol. 64, no. 5, pp. 2005–2018, 2016.
- [14] J. Mo, P. Schniter, N. G. Prelcic, and R. W. Heath, “Channel estimation in millimeter wave MIMO systems with one-bit quantization,” in Asilomar Conf. on Signals, Systems and Comp., 2014, pp. 957–961.
- [15] Y. Li, C. Tao, G. Seco-Granados, A. Mezghani, A. L. Swindlehurst, and L. Liu, “Channel estimation and performance analysis of one-bit massive MIMO systems,” IEEE Trans. Signal Process., vol. 65, no. 15, pp. 4075–4089, 2017.
- [16] C. Mollén, J. Choi, E. G. Larsson, and R. W. Heath Jr, “Uplink Performance of Wideband Massive MIMO With One-Bit ADCs,” IEEE Trans. Wireless Commun., vol. 16, no. 1, pp. 87–100, 2017.
- [17] Y.-S. Jeon, S.-N. Hong, and N. Lee, “Supervised-Learning-Aided Communication Framework for MIMO Systems with Low-Resolution ADCs,” IEEE Trans. on Veh. Technol., 2018.
- [18] Y.-S. Jeon, M. So, and N. Lee, “Reinforcement-learning-aided ML detector for uplink massive MIMO systems with low-precision ADCs,” in IEEE Wireless Commun. and Networking Conf., 2018, pp. 1–6.
- [19] L. Schuchman, “Dither signals and their effect on quantization noise,” IEEE Trans. on Commun. Technol., vol. 12, no. 4, pp. 162–165, 1964.
- [20] R. M. Gray and T. G. Stockham, “Dithered quantizers,” IEEE Trans. on Inform. Theory, vol. 39, no. 3, pp. 805–812, 1993.
- [21] K. Hornik, M. Stinchcombe, and H. White, “Multilayer feedforward networks are universal approximators,” Neural Networks, vol. 2, no. 5, pp. 359–366, 1989.