I Introduction
The basic procedure of demapping received symbols back into their embedded soft bits, which feed a following stage of error correction decoding, is a crucial component in any modern communication system. The soft bit, quantifying the level of confidence in the quality of the symbol's demodulation process, is typically expressed in terms of the log-likelihood ratio (LLR).
An exact evaluation of the LLR is achieved via a scheme known as the log-MAP [1], which computes the logarithm of the ratio between the maximum a posteriori (MAP) probabilities of the latent bit's two hypotheses given the observed symbol. Though statistically optimal, the computational complexity of the log-MAP algorithm scales with the size of the symbol constellation, making its direct implementation impractical in realistic systems.
A popular approximation of the optimal log-MAP rule, serving as a feasible golden target in designing practical systems, is the well-known max-log-MAP algorithm [2]. However, although it eliminates the need to compute complex exponential and logarithmic functions, the approximated max-log-MAP algorithm in general still consumes, on a symbol-rate basis, an extensive number of operations which scales with the modulation order.
There are certain constellation schemes, like the Gray-coded quadrature amplitude modulation (QAM), for which the max-log-MAP boils down to a reduced-complexity piecewise linear function and can thus be implemented, for instance, via a lookup table [3]. This simplification is attributed to the QAM's inherent constellation mapping symmetries and its separability into two independent pulse amplitude modulations (PAM) in the real and imaginary parts of the symbol. However, as shall be exemplified in the sequel, such computational ease in the operation of the max-log-MAP algorithm on QAM constellations comes at the expense of significant performance degradation in the estimation of the LLRs compared to the exact log-MAP, especially in the low and intermediate signal-to-noise ratio (SNR) regimes. Further computationally simplified versions of the max-log-MAP for either QAM [3] or other modulation schemes, like phase-shift keying (PSK) (e.g., [4] and references therein), suffer from an additional performance penalty w.r.t. the original "full-blown" max-log-MAP algorithm.
In this contribution, a machine learning architecture for efficient universal soft demodulation, dubbed "LLRnet", is proposed. One should bear in mind that the demapping procedure can be abstracted as nothing but a function from a symbol to its bit LLRs. Neural networks are well known as an effective tool for approximating functions (viz. Cybenko's universal approximation theorem [5]). Hence it seems natural and beneficial to train a neural network to directly learn the functionality of either the impractical exact log-MAP, the cumbersome max-log-MAP, or any other expert-based target demapping rule. As shall be shown, the LLRnet provides an excellent mechanism for achieving the performance of the target demodulation rule. For the family of QAM constellations, LLRnet is shown to practically reproduce the exact log-MAP algorithm with a substantially reduced computational burden (e.g., roughly ten times fewer operations for 1024-QAM).
The paper is organized as follows. The LLR estimation problem is described in Section II. Section III introduces the proposed LLRnet architecture for soft demodulation. Section IV provides simulation results and discusses the performance of LLRnet in the context of two commercial system applications, namely the cellular 5G and the satellite DVB-S.2 standards. Finally, Section V contains some concluding remarks.
II Problem Formulation
Consider a generic communication system (as depicted in Fig. 1) with a modulator at the transmitter mapping an incoming stream of encoded bits, $c_i \in \{0,1\}$, to an outgoing stream of modulated symbols, $\mathbf{s}$, chosen from an arbitrary finite set of constellation points in a (possibly hyperdimensional) complex domain, $\mathbf{s} \in \mathcal{C} \subset \mathbb{C}^N$. To this end, let $\mathbf{c} = [c_1, \ldots, c_M]^T$ be a vector of some arbitrary $M$ consecutive bits in the stream being modulated to an $N$-dimensional complex symbol $\mathbf{s}$ (note that in most commercial systems $N = 1$ is used). It is assumed that the modulator's output power is normalized, namely $\mathbb{E}\{\|\mathbf{s}\|_2^2\} = 1$. Hereinafter, it is assumed that all possible bit vectors, $\mathbf{c}$, are equiprobable, thus all the modulated symbols in the constellation set $\mathcal{C}$ are equally likely to be transmitted. Incorporating knowledge of (unequal) prior probabilities of the modulated symbols in the discussed demodulation schemes is straightforward.
The symbol $\mathbf{s}$ is transmitted through a composite channel comprising, in addition to the physical channel, an abstraction of other critical transceiver processing steps. On the transmitter side, these may include, but are not limited to, precoding for transmission over multiple-input multiple-output (MIMO) systems, mapping to orthogonal frequency-division multiplexing (OFDM), digital-to-analog conversion (DAC) and analog mixing on a radio carrier. On the receiver side, the composite channel may encapsulate, e.g., analog-to-digital conversion (ADC), down-mixing to baseband, synchronization, filtering, OFDM demodulation, MIMO combining, and channel estimation and equalization. These transceiver processing steps are standard and do not explicitly pertain to the problem under study of demapping, or soft demodulation, of the received symbol to bit LLRs.
The demodulator at the receiver observes the composite channel's complex-valued symbol estimate, $\hat{\mathbf{s}}$, where the effect of the composite channel is captured by the relation between the transmitted and observed symbols, $\mathbf{s}$ and $\hat{\mathbf{s}}$, respectively. Although not necessarily accurate, for the purpose of soft demodulation to bit LLRs, the composite channel is typically modeled as an additive white Gaussian noise (AWGN) channel, to yield

$$\hat{\mathbf{s}} = \mathbf{s} + \mathbf{n},$$

where $\mathbf{n} \in \mathbb{C}^N$ is the $N$-dimensional AWGN vector and $\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \sigma^2 \mathbf{I})$. The scalar $\sigma$ denotes the standard deviation of the ambient noise in the physical channel, corresponding to some SNR. It is important to note that this work does not focus on optimizing the so-called "composite channel" itself (e.g., via improving MIMO demodulation), but on streamlining symbol demapping by adopting a machine learning approach.
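The AWGN abstraction above can be simulated in a few lines. The following is only a minimal sketch, not part of the original work: it assumes a single complex dimension, a unit-power 16-QAM constellation, and circularly symmetric complex noise whose total variance is $\sigma^2 = 1/\mathrm{SNR}$. The helper names `qam16_constellation` and `awgn_channel` are hypothetical.

```python
import numpy as np

def qam16_constellation():
    """Unit-average-power 16-QAM points (bit labelling is not needed here)."""
    levels = np.array([-3.0, -1.0, 1.0, 3.0])
    points = (levels[:, None] + 1j * levels[None, :]).ravel()
    return points / np.sqrt(np.mean(np.abs(points) ** 2))

def awgn_channel(symbols, snr_db, rng):
    """Return (s_hat, sigma2) with s_hat = s + n and n ~ CN(0, sigma2).

    Assumption: unit signal power, so sigma2 = 1 / SNR (linear scale).
    """
    sigma2 = 10.0 ** (-snr_db / 10.0)
    noise = np.sqrt(sigma2 / 2.0) * (
        rng.standard_normal(symbols.shape)
        + 1j * rng.standard_normal(symbols.shape)
    )
    return symbols + noise, sigma2

rng = np.random.default_rng(0)
constellation = qam16_constellation()
tx = rng.choice(constellation, size=100_000)   # equiprobable symbols
rx, sigma2 = awgn_channel(tx, snr_db=10.0, rng=rng)

# The empirical noise power should be close to the configured sigma2.
measured_sigma2 = np.mean(np.abs(rx - tx) ** 2)
print(f"sigma2 = {sigma2:.3f}, measured = {measured_sigma2:.3f}")
```

In a real receiver the composite channel is of course richer than this; the sketch only mirrors the model assumed for the purpose of computing LLRs.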
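Now, mainly for the purpose of facilitating improved decoding in a following stage, the demodulator demaps the observed symbol, $\hat{\mathbf{s}}$, into a vector of estimates of the bit LLRs, $\mathbf{l} = [l_1, \ldots, l_M]^T$, corresponding to the confidence in the inference of each of the original coded bits in $\mathbf{c}$. The $i$'th entry of the LLRs vector, $l_i$, adheres to the logarithm of the MAP relation

$$l_i \triangleq \log\left(\frac{\Pr(c_i = 0 \mid \hat{\mathbf{s}})}{\Pr(c_i = 1 \mid \hat{\mathbf{s}})}\right), \quad i = 1, \ldots, M.$$

An exact computation of this log-MAP expression, under the examined AWGN model, yields

$$l_i = \log\frac{\sum_{\mathbf{s} \in \mathcal{C}_i^0} \exp\left(-\frac{\|\hat{\mathbf{s}} - \mathbf{s}\|_2^2}{\sigma^2}\right)}{\sum_{\mathbf{s} \in \mathcal{C}_i^1} \exp\left(-\frac{\|\hat{\mathbf{s}} - \mathbf{s}\|_2^2}{\sigma^2}\right)}, \quad i = 1, \ldots, M,$$

where $\mathcal{C}_i^b$ is the subset of constellation points, known at the receiver side, for which the $i$'th bit is equal to $b \in \{0, 1\}$. Applying the approximation $\log\left(\sum_j \exp(-x_j^2)\right) \approx \max_j(-x_j^2)$ to the exact log-MAP operation, approximated LLR estimates can be derived from a simplified rule, well known as the max-log-MAP,

$$l_i \approx \frac{1}{\sigma^2}\left(\min_{\mathbf{s} \in \mathcal{C}_i^1} \|\hat{\mathbf{s}} - \mathbf{s}\|_2^2 - \min_{\mathbf{s} \in \mathcal{C}_i^0} \|\hat{\mathbf{s}} - \mathbf{s}\|_2^2\right), \quad i = 1, \ldots, M.$$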
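The two demapping rules can be sketched as brute-force NumPy routines. This is an illustrative sketch under stated assumptions rather than a reference implementation: it assumes a single complex dimension, noise of total variance $\sigma^2$ (so the exponent is $-|\hat{s} - s|^2/\sigma^2$), and a hypothetical Gray labelling of 16-QAM in which the first two bits select the real level and the last two the imaginary level. The names `gray_qam16`, `llr_logmap` and `llr_maxlogmap` are illustrative.

```python
import numpy as np

def gray_qam16():
    """Hypothetical Gray-labelled, unit-power 16-QAM.

    Returns points of shape (16,) and bit labels of shape (16, 4).
    """
    pam = {(0, 0): -3.0, (0, 1): -1.0, (1, 1): 1.0, (1, 0): 3.0}
    points, bits = [], []
    for bits_re, re in pam.items():
        for bits_im, im in pam.items():
            points.append(re + 1j * im)
            bits.append(bits_re + bits_im)      # tuple concatenation
    return np.array(points) / np.sqrt(10.0), np.array(bits)

def _logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def llr_logmap(rx, points, bits, sigma2):
    """Exact log-MAP LLRs, one column per bit."""
    metric = -np.abs(rx[:, None] - points[None, :]) ** 2 / sigma2
    llr = np.empty((rx.size, bits.shape[1]))
    for i in range(bits.shape[1]):
        llr[:, i] = (_logsumexp(metric[:, bits[:, i] == 0], axis=1)
                     - _logsumexp(metric[:, bits[:, i] == 1], axis=1))
    return llr

def llr_maxlogmap(rx, points, bits, sigma2):
    """Max-log-MAP LLRs: nearest-point distances replace the log-sum-exp."""
    d2 = np.abs(rx[:, None] - points[None, :]) ** 2
    llr = np.empty((rx.size, bits.shape[1]))
    for i in range(bits.shape[1]):
        llr[:, i] = (d2[:, bits[:, i] == 1].min(axis=1)
                     - d2[:, bits[:, i] == 0].min(axis=1)) / sigma2
    return llr

points, bits = gray_qam16()
rx = np.array([0.1 + 0.2j, -0.9 + 0.4j])
exact = llr_logmap(rx, points, bits, sigma2=0.1)
approx = llr_maxlogmap(rx, points, bits, sigma2=0.1)
print(np.round(exact, 2))
print(np.round(approx, 2))
```

Both routines visit every constellation point for every received symbol, which is exactly the brute-force cost that grows with the constellation size.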
The computational complexity per received symbol of a brute-force implementation of either the log-MAP or the max-log-MAP algorithm scales with the size of the constellation, $|\mathcal{C}| = 2^M$, to yield $\mathcal{O}(2^M)$ operations. However, the approximated max-log-MAP demapping rule has the advantageous property of eliminating the need for computing complex exponential and logarithmic functions. For the popular (Gray-coded) QAM constellations, the max-log-MAP, although severely suboptimal, is typically the demapper of choice because it can be reduced to a single lookup table (LUT) implementation of piecewise linear functions, which scales linearly with SNR. Note that the exact log-MAP algorithm can be only crudely approximated via multiple LUTs, essentially one for each SNR working point [6].
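To see why the max-log-MAP output is linear in the received symbol and scales with SNR, the simplest case can be worked out by hand. The following derivation is an illustration added here, assuming BPSK with the hypothetical labelling $\mathcal{C}^0 = \{+1\}$, $\mathcal{C}^1 = \{-1\}$ and noise of total variance $\sigma^2$. Since each subset contains a single point, the log-MAP and max-log-MAP rules coincide:

```latex
\begin{align}
l &= \log\frac{\exp\left(-|\hat{s} - 1|^2/\sigma^2\right)}
              {\exp\left(-|\hat{s} + 1|^2/\sigma^2\right)}
   = \frac{|\hat{s} + 1|^2 - |\hat{s} - 1|^2}{\sigma^2} \\
  &= \frac{\left(\mathrm{Re}\{\hat{s}\} + 1\right)^2 - \left(\mathrm{Re}\{\hat{s}\} - 1\right)^2}{\sigma^2}
   = \frac{4\,\mathrm{Re}\{\hat{s}\}}{\sigma^2}.
\end{align}
```

The LLR is a single linear segment in $\mathrm{Re}\{\hat{s}\}$ whose slope is proportional to $1/\sigma^2$, so one stored segment can be rescaled for any SNR. For multi-level Gray-coded PAM and QAM, the minimizing constellation points change with the region in which $\hat{s}$ falls, which is what turns the max-log-MAP into a piecewise linear function, whereas the exact log-MAP changes shape with $\sigma^2$.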
III Soft Demodulation with LLRnet
LLRnet is a neural network designed to learn a desired soft demodulation scheme, demapping a complex symbol to the real-valued LLRs of its conveyed bits. The architecture of LLRnet is illustrated in Fig. 2. The input layer of the LLRnet is fed by the received symbol estimate vector, $\hat{\mathbf{s}}$, which is divided into its real and imaginary parts. The output vectors of the two input nodes are injected into a hidden layer of $K$ neurons. The scalar output of the $k$'th neuron is determined by $h_k = f(\mathbf{w}_k^T \mathbf{x} + b_k)$, where $\mathbf{x}$, $\mathbf{w}_k$ and $b_k$ are the $k$'th neuron's input vector, weights vector and bias, respectively. The operator $f(\cdot)$ denotes the neuron's transfer, or activation, function. Unless otherwise stated, in its hidden layer LLRnet uses a rectified linear unit (ReLU), namely $f(z) = \max(0, z)$. The outputs of the hidden neurons are then fed to a linear output layer of $M$ nodes. The $m$'th output of the LLRnet, estimating the $m$'th bit LLR, is determined by $\hat{l}_m = \mathbf{v}_m^T \mathbf{h} + d_m$, where $\mathbf{h}$ is a concatenation of the hidden layer outputs, while $\mathbf{v}_m$ and $d_m$ are the $m$'th output node's weights vector and bias, respectively. The set of all trainable parameters of the LLRnet, namely all the weights $\mathbf{w}_k$, $\mathbf{v}_m$ and biases $b_k$, $d_m$, is denoted by $\theta$.

Fig. 3 schematically describes the training process of the neural demodulator. Since the LLRnet's input-to-output function is fully differentiable w.r.t. the set of all trainable parameters, $\theta$, a gradient-based training approach can be adopted. The trainable parameters are first randomly initialized. For a certain batch of received symbols, $\hat{\mathbf{s}}^{(b)}$, $b = 1, \ldots, B$, compute the corresponding bit LLRs via a desired demapping algorithm, e.g., log-MAP or max-log-MAP (for implementing these specific target demodulation algorithms, an estimate of $\sigma^2$ is also required). Then feed the conventionally computed LLRs, $\mathbf{l}^{(b)}$, along with the LLRs computed through LLRnet, $\hat{\mathbf{l}}^{(b)}$, into a loss function, which is evidently a function of the trainable set of parameters $\theta$. Such a loss function could be, for instance, the mean-squared error (MSE)

$$L^{\mathrm{MSE}}(\theta) \triangleq \frac{1}{B} \sum_{b=1}^{B} \left\|\hat{\mathbf{l}}^{(b)} - \mathbf{l}^{(b)}\right\|_2^2,$$

or alternatively the cross-entropy function

$$L^{\mathrm{CE}}(\theta) \triangleq -\frac{1}{B} \sum_{b=1}^{B} \sum_{m=1}^{M} \left[ l_m^{(b)} \log\left(\hat{l}_m^{(b)}\right) + \left(1 - l_m^{(b)}\right) \log\left(1 - \hat{l}_m^{(b)}\right) \right].$$

Define a stop criterion, which can be either a fixed number of iterations, a threshold on the loss, or a number of iterations during which the loss has not decreased. Unless the stop criterion is met, update the parameter set, $\theta$, based on a learning algorithm using gradient descent, $\theta \leftarrow \theta - \eta \nabla_{\theta} L(\theta)$, where $\eta$ is the learning rate. Compute the loss function again under the newly learned set of trained parameters. When the stop criterion is met and training is complete, one moves to the inference stage, where the LLRnet serves as the sole demodulator, efficiently imitating the functionality of the desired soft demodulator. In the following section LLRnet is simulated and its performance is evaluated in two end-to-end system use cases.

IV Simulation Examples
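As a warm-up before the system-level use cases, the training procedure of Section III can be sketched in plain NumPy. This is a toy illustration under several assumptions that are not taken from the reported experiments: a hypothetical Gray-labelled 16-QAM, an SNR of 0 dB ($\sigma^2 = 1$), $K = 8$ hidden ReLU neurons, an exact log-MAP target, and full-batch gradient descent on the MSE with an arbitrarily chosen learning rate. All function names are illustrative, and no claim is made about how closely this short run fits the target.

```python
import numpy as np

def gray_qam16():
    """Hypothetical Gray-labelled, unit-power 16-QAM (points, bit labels)."""
    pam = {(0, 0): -3.0, (0, 1): -1.0, (1, 1): 1.0, (1, 0): 3.0}
    pts = np.array([re + 1j * im for re in pam.values() for im in pam.values()])
    bits = np.array([b_re + b_im for b_re in pam for b_im in pam])
    return pts / np.sqrt(10.0), bits

def llr_logmap(rx, pts, bits, sigma2):
    """Exact log-MAP target LLRs, shape (B, M)."""
    metric = -np.abs(rx[:, None] - pts[None, :]) ** 2 / sigma2
    cols = [np.logaddexp.reduce(metric[:, bits[:, i] == 0], axis=1)
            - np.logaddexp.reduce(metric[:, bits[:, i] == 1], axis=1)
            for i in range(bits.shape[1])]
    return np.stack(cols, axis=1)

def init_params(K, M, rng):
    """Random initialisation of hidden (W, b) and output (V, d) layers."""
    return {"W": 0.5 * rng.standard_normal((K, 2)), "b": np.zeros(K),
            "V": 0.5 * rng.standard_normal((M, K)), "d": np.zeros(M)}

def forward(p, x):
    z = x @ p["W"].T + p["b"]            # hidden pre-activations, (B, K)
    h = np.maximum(z, 0.0)               # ReLU
    return h @ p["V"].T + p["d"], h, z   # linear output layer, (B, M)

def mse_and_grads(p, x, target):
    pred, h, z = forward(p, x)
    err = pred - target
    loss = np.sum(err ** 2) / x.shape[0]
    g_pred = 2.0 * err / x.shape[0]      # dL/dpred
    g_z = (g_pred @ p["V"]) * (z > 0)    # back-propagate through the ReLU
    grads = {"V": g_pred.T @ h, "d": g_pred.sum(axis=0),
             "W": g_z.T @ x, "b": g_z.sum(axis=0)}
    return loss, grads

def train(p, x, target, lr=2e-3, max_iters=2000, loss_tol=1e-3, patience=50):
    """Gradient descent with the three stop criteria named in the text."""
    history, best, stall = [], np.inf, 0
    for _ in range(max_iters):                    # fixed iteration budget
        loss, grads = mse_and_grads(p, x, target)
        history.append(loss)
        if loss < loss_tol:                       # threshold on the loss
            break
        stall = 0 if loss < best else stall + 1
        best = min(best, loss)
        if stall >= patience:                     # loss stopped decreasing
            break
        for name in p:
            p[name] -= lr * grads[name]
    return history

rng = np.random.default_rng(0)
pts, bits = gray_qam16()
sigma2 = 1.0
tx = rng.choice(pts, size=2000)
rx = tx + np.sqrt(sigma2 / 2) * (rng.standard_normal(2000)
                                 + 1j * rng.standard_normal(2000))
x = np.column_stack([rx.real, rx.imag])           # two real-valued inputs
target = llr_logmap(rx, pts, bits, sigma2)
params = init_params(K=8, M=4, rng=rng)
history = train(params, x, target)
print(f"MSE: {history[0]:.3f} -> {history[-1]:.3f}")
```

After training, `forward(params, x)[0]` alone plays the role of the demodulator; the target demapper is no longer evaluated at inference time.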
IV-A Throughput of PDSCH in 5G NR
Fig. 4: LLR, derived via log-MAP, max-log-MAP and LLRnet, as a function of the real part of the symbol for the odd bits in 16/64/256-QAM.
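Table II: Number of real operations per symbol required by log-MAP and LLRnet for 16-, 64- and 256-QAM.

| Operation | 16-QAM log-MAP | 16-QAM LLRnet (K=8) | 64-QAM log-MAP | 64-QAM LLRnet (K=16) | 256-QAM log-MAP | 256-QAM LLRnet* (K=32) |
| --- | --- | --- | --- | --- | --- | --- |
| Multiplication & division | 52 | 40 | 198 | 112 | 776 | 288 (320) |
| Addition & subtraction | 104 | 40 | 564 | 112 | 2800 | 288 (352) |
| Exponent & logarithm | 20 | 0 | 70 | 0 | 264 | 0 (64) |
| Comparator | 0 | 8 | 0 | 16 | 0 | 32 (0) |
| Total | 176 | 88 | 832 | 240 | 3840 | 608 (736) |

* Values in parentheses correspond to using a hyperbolic tangent (tanh), rather than ReLU, activation function at the hidden layer.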
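Table III: Number of real operations per symbol required by log-MAP and LLRnet for 1024-QAM.

| Operation | log-MAP | LLRnet (K=64) |
| --- | --- | --- |
| Multiplication & division | 3082 | 704 |
| Addition & subtraction | 13292 | 704 |
| Exponent & logarithm | 1034 | 0 |
| Comparator | 0 | 64 |
| Total | 17408 | 1472 |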
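The relative savings implied by the totals in Tables II and III can be tallied directly. The short script below is only a convenience added here; the operation counts are copied from the two tables and the dictionary name `totals` is arbitrary.

```python
# Total real operations per symbol, copied from Tables II and III:
# constellation -> (log-MAP total, LLRnet total with ReLU activation).
totals = {
    "16-QAM": (176, 88),
    "64-QAM": (832, 240),
    "256-QAM": (3840, 608),
    "1024-QAM": (17408, 1472),
}

# Percentage of operations saved by LLRnet relative to log-MAP.
savings = {
    name: 100.0 * (1.0 - llrnet / logmap)
    for name, (logmap, llrnet) in totals.items()
}

for name, pct in savings.items():
    print(f"{name}: LLRnet uses {pct:.1f}% fewer operations than log-MAP")
```

The saving grows with the modulation order, from one half at 16-QAM to more than 90% at 1024-QAM.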
Table I: Key configuration parameters of the simulated 5G NR PDSCH link.

| Parameter | Value |
| --- | --- |
| # of 10 ms frames | 200 |
| Bandwidth | 20 MHz |
| # of (12 subcarriers) resource blocks | 51 |
| Subcarrier spacing | 30 kHz |
| Cyclic prefix | Normal |
| # of Tx. antennas | 8 |
| # of Rx. antennas | 2 |
| # of layers | 2 |
| Transport channel coding | LDPC |
| Target code rate | 0.4785 |
| Modulation | 16-QAM, 64-QAM, 256-QAM |
| PDSCH precoding | SVD, single matrix |
| Waveform | CP-OFDM |
| Channel model | Clustered delay line (CDL) |
| Channel estimation | Ideal |
| MIMO equalization | Linear MMSE |
| Synchronization | Perfect |
| HARQ | Enabled, 16 processes |
The proposed LLRnet demodulation engine is utilized in the decoding of the physical downlink shared channel (PDSCH) in a 5G New Radio (NR) link, as defined by the 3GPP NR standard. The key configuration parameters of the simulated link are listed in Table I. In the simulated configuration, a frame (10 ms) is divided into 10 subframes, where each subframe (1 ms) is composed of 2 slots, and each slot (0.5 ms) consists of 14 symbols. For each slot, only a subset of the resource elements in the received resource grid conveys PDSCH symbols.
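The size of the resource grid follows from the Table I parameters by simple arithmetic. The tally below is an illustration added here; it assumes that the grid is counted per receive antenna and per slot, and it does not attempt to count how many of these resource elements carry PDSCH symbols.

```latex
\begin{align}
\text{subcarriers} &= 51 \ \text{resource blocks} \times 12 = 612, \\
\text{occupied bandwidth} &= 612 \times 30\ \text{kHz} = 18.36\ \text{MHz} \ (< 20\ \text{MHz}), \\
\text{resource elements per slot} &= 612 \times 14\ \text{symbols} = 8568, \\
\text{slots per frame} &= 10\ \text{subframes} \times 2 = 20, \\
\text{simulated duration} &= 200\ \text{frames} \times 10\ \text{ms} = 2\ \text{s}.
\end{align}
```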
The training of the LLRnet relies on only a small fraction of the received PDSCH symbols, taken evenly across a single slot, which is, in the reported results, the first slot in a 2-second transmission per evaluated SNR point. These symbols are randomly divided into three sets: (1) a training set, (2) a validation set, used to verify that the trained network has a low enough generalization error and to avoid excessive training that may result in undesirable overfitting, and (3) a test set, used for a completely independent testing of the LLRnet generalization.
The chosen training algorithm in this example is the Levenberg-Marquardt backpropagation algorithm [7], and the training process continues until the validation error fails to decrease for a preset number of consecutive iterations. The chosen loss function is the MSE, where the desired, or target, demodulation algorithm that the LLRnet is trained to reproduce is the exact, yet costly, log-MAP scheme. The number of neurons in the hidden layer is set to $K = 8, 16, 32$ for the 16-, 64- and 256-QAM constellations (thus $M = 4, 6, 8$), respectively. As mentioned previously, the neuron's activation function is a ReLU. Upon completion of the training stage, PDSCH symbols in the consecutive slots are demodulated exclusively by the LLRnet. Retraining, so that the LLRnet can readapt and learn the new demapping functions, is required only when a (non-marginal) change in the SNR working point is identified. Alternatively, training for different SNR values can be performed offline or in a quasi-offline manner, with the obtained trained sets being stored in the receiver's memory. Another option is to provide the SNR level as an additional input to the LLRnet and train it within a wide range of SNRs. By doing so, the LLRnet can then also generalize across SNR points.

Figs. 4(a)-(c) present, for 16-QAM, 64-QAM and 256-QAM (in which each symbol conveys 4, 6 and 8 bits, respectively), at three different SNR levels, the explicit demapping functions of the real part of the (received) symbol into the LLRs of its carried odd-numbered bits for three different soft demodulation implementations: the exact log-MAP, the approximate max-log-MAP and the proposed LLRnet. As expected, in the higher SNR regimes (lower row in each figure), for the three examined QAM modulations the max-log-MAP provides a good estimate of the optimal log-MAP rule. However, for the low and intermediate SNR regimes (two upper rows in each figure), the max-log-MAP only serves as a crude (low SNR) to reasonable (intermediate SNR) approximation to the exact demapper. On the other hand, the soft bits inferred by the LLRnet practically coincide with the optimal ones in all three modulation cases across the entire relevant SNR range.
Next, the performance of LLRnet is evaluated in terms of the PDSCH throughput. Figs. 5(a)-(c) plot the PDSCH throughput as a function of SNR. The throughput is displayed both in terms of absolute values, in Mbps, and relative throughput, in percentage, w.r.t. the link's maximum possible throughput under the given configuration. It is evident that LLRnet (again, with only $K = 8, 16, 32$ neurons populating the hidden layer, respectively) essentially provides the same throughput as the optimal log-MAP algorithm. (Footnote 2: For high SNR points under 256-QAM modulation, it was observed that using a hyperbolic tangent sigmoid transfer function yields slightly better performance than ReLU, at the expense of marginally more operations, as depicted in Table II.) It is also observed that at the low and intermediate SNR levels the throughput corresponding to the tractable max-log-MAP substantially lags behind the throughput associated with the LLRnet (and log-MAP).
The capability of LLRnet to successfully imitate the operation of the log-MAP algorithm is even more remarkable when comparing their computational complexity. Table II lists the number of (real) operations, per symbol, required for the execution of log-MAP and LLRnet for the three different QAM constellations. Note that, without any loss in throughput, LLRnet costs roughly 50%, 71% and 84% fewer total operations than the log-MAP algorithm for 16-, 64- and 256-QAM, respectively. Bear in mind, again, that these algorithms run at symbol rate. For completeness, the complexity savings (more than 90%) of LLRnet vs. log-MAP for 1024-QAM, which is not yet an integral part of 5G NR but is part of Wi-Fi 6 (802.11ax), are listed in Table III. The overhead complexity of the training phase itself in this application is relatively small, since the learning relies on only a few symbols within a single received slot. In addition, training the LLRnet typically required only a limited number of epochs. In this particular case of QAM constellations, which can be divided into two separate PAM constellations, it is interesting to observe how LLRnet successfully learns, as anticipated, to null half of the hidden layer's weights. Alternatively, in such a case, one can simply use two smaller LLRnet architectures separately processing the real and imaginary parts of the symbol.

IV-B Packet Error Rate in DVB-S.2
In this example, LLRnet is incorporated in the second-generation digital video broadcasting standard (DVB-S.2) for broadband satellite communications. In this application example, LLRnet, with $K$ neurons in the hidden layer, is trained to mimic the operation of the approximate max-log-MAP algorithm in the demapping of 8-PSK modulation (thus $M = 3$). An LDPC code (decoded with a maximum of 50 iterations) and a BCH code serve as the inner and outer codes, respectively, in an AWGN channel. DVB-S.2 frames were simulated for each SNR point. The training stage for this application example follows the procedure described for the training in Section IV-A.
Fig. 6 plots the simulated packet error rate as a function of the SNR using two implementations of soft demodulation: max-log-MAP and LLRnet, this time trained to imitate the approximate max-log-MAP rather than the exact log-MAP as in Section IV-A. It can be observed that a perfect alignment in the performance curves of the two algorithms is achieved. However, it should be noted that in this particular case the two implementations require roughly the same total number of operations (about 64 per symbol). Deep, rather than shallow, learning architectures may also yield computational savings for 8-PSK and higher-order PSK, or amplitude and PSK (APSK), constellations.
V Conclusion
In this paper, a "machine LLRning" approach is presented, in which a simple neural network architecture is trained to efficiently soft demodulate symbols to their bit LLRs, thus bringing an artificial intelligence paradigm directly into one of the most generic building blocks of physical layer processing. The proposed concept of LLRnet can not only be extended to multilayer deep learning architectures, but also be integrated in a more holistic trainable receiver structure, jointly carrying out the tasks of neural demodulation (as proposed in this contribution), trainable quantization of the LLRs, and trainable channel decoding. Furthermore, by feeding the channel estimate itself into the LLRnet, the tasks of MIMO and multiuser detection could also potentially be tackled in a straightforward manner in such a framework.
References
[1] J. Erfanian, S. Pasupathy, and G. Gulak, "Reduced complexity symbol detectors with parallel structure for ISI channels," IEEE Transactions on Communications, vol. 42, no. 2/3/4, pp. 1661–1671, February 1994.
[2] P. Robertson, E. Villebrun, and P. Hoeher, "A comparison of optimal and sub-optimal MAP decoding algorithms operating in the log domain," in Proceedings IEEE International Conference on Communications ICC '95, vol. 2, June 1995, pp. 1009–1013.
[3] F. Tosato and P. Bisaglia, "Simplified soft-output demapper for binary interleaved COFDM with application to HIPERLAN/2," in 2002 IEEE International Conference on Communications (ICC 2002), vol. 2, April 2002, pp. 664–668.
[4] Q. Wang, Q. Xie, Z. Wang, S. Chen, and L. Hanzo, "A universal low-complexity symbol-to-bit soft demapper," IEEE Transactions on Vehicular Technology, vol. 63, no. 1, pp. 119–130, January 2014.
[5] G. Cybenko, "Approximation by superpositions of a sigmoidal function," Mathematics of Control, Signals and Systems, vol. 2, no. 4, pp. 303–314, 1989.
[6] Y. Yao, Y. Su, J. Shi, and J. Lin, "A low-complexity soft QAM demapper based on first-order linear approximation," in 2015 IEEE 26th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), August 2015, pp. 446–450.
[7] M. T. Hagan and M. B. Menhaj, "Training feedforward networks with the Marquardt algorithm," IEEE Transactions on Neural Networks, vol. 5, no. 6, pp. 989–993, November 1994.