Deep Learning has been studied for many decades. It experienced two technological revolutions in the past and we can affirm that it is currently empowering a third one [Sejnowski2018]. The versatility of Deep Learning algorithms is pushing new applications in many areas of research.
In particular, Image Processing and Acoustics are studying different types of Deep Neural Networks (DNN) and their benefits in a wide variety of scenarios, such as image classification [Hou2015a], acoustics classification [Mun2017, Akbulut2017, Zeng2017] and regression problems [Lathuiliere2019, Sainath2013], while Long-Short Term Memory (LSTM) networks are specially indicated for exploiting time series patterns [Karim2018]. All these cases belong to the supervised learning area.
In the unsupervised learning area, Autoencoding Neural Networks (ANN) are specially interesting for data compression and reconstruction [Deng2014]. They enable a wide range of applications, such as image compression [Toderici2015, Sakurada2014]. The basic idea of ANN is to reduce the dimensionality of the data through consecutive hidden layers and then recompose it. The results are particularly interesting not only for the aforementioned cases, but also for denoising [Vincent2008].
Although the majority of Deep Learning progress has been made in the signal processing areas, radio communications can also benefit from the aforementioned approaches. For instance, DNN may be used for interference detection and classification [Henarejos2019], for theoretical mutual information computation by regression algorithms [Tato2018, Tato2018a], for anomaly and outlier detection [Rajendran2019] or for modulation identification [Karra2017, Rajendran2018], among others.
In this paper, we apply the combination of unsupervised learning (ANN for denoising) and supervised learning (DNN for regression) to signal processing. This approach was introduced in [LinZhou2016, Xie2017] for biomedics and we extend it to signal processing, aiming at increasing the performance of demapping and decoding tasks. In detail, we employ ANN for denoising and highlighting hidden features of radio signals, and DNN for regression and prediction in the current wireless standard 5G New Radio (5G-NR), which defines the channel coding by using Low-Density Parity-Check (LDPC) codes. The proposed scheme, named Autoencoding Deep Neural Networks (ADNN), replaces the classical approach of symbol demapping and decoding, since both operations are performed by the ADNN. Finally, our results unveil that the combination of ANN and DNN produces lower Bit Error Rate (BER) compared with traditional demapping and decoding implementations.
2 System Model
We consider a single-input single-output (SISO) point-to-point wireless system with an Additive White Gaussian Noise (AWGN) channel. The system is composed of a constellation mapper of order $M$, which maps $m = \log_2 M$ bits to complex baseband symbols, and a channel encoder, implemented as a typical LDPC encoder. For a particular time instant $t$, the system model is described as follows
$$y_t = x_t + n_t, \qquad (1)$$
where $y_t$ is the received symbol, $x_t$ is the encoded symbol and $n_t$ is the AWGN. The input vector is obtained by mapping the bits generated by the LDPC encoder to a constellation known by the receiver. The LDPC encoder has a rate of $r = k/n$, where $n$ coded bits are produced for every $k$ information bits, with $k < n$. Fig. 1 illustrates the system model.
At the receiver side, the received symbol is demapped from the constellation to obtain the log-likelihood values of the bits, the logbits. The logbits are decoded by the LDPC decoder, whose outputs are the information bits.
The LDPC channel coding is based on the transmission of blocks of bits. The length of the input blocks is denoted by $k$ and, therefore, the length of the output blocks is $n$. As explained in the next section, this per-block behaviour can be exploited by Convolutional Neural Networks (CNN).
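The transmit chain and AWGN channel of (1) can be sketched as follows in NumPy. The constellation order, block length and the identity placeholder standing in for the LDPC encoder are illustrative assumptions (the paper uses the 5G-NR LDPC code, which is out of scope here); a QPSK mapping is chosen only as a minimal example of order $M = 4$.

```python
# Sketch of the per-block system model: coded bits are mapped to complex
# baseband symbols and passed through an AWGN channel (y_t = x_t + n_t).
# The LDPC encoder is replaced by a placeholder (random coded bits).
import numpy as np

rng = np.random.default_rng(0)

def map_qpsk(bits):
    """Map pairs of bits to unit-energy QPSK symbols."""
    b = bits.reshape(-1, 2)
    return ((1 - 2 * b[:, 0]) + 1j * (1 - 2 * b[:, 1])) / np.sqrt(2)

def awgn(x, snr_db):
    """Add complex AWGN for a given SNR, assuming unit-energy symbols."""
    n0 = 10 ** (-snr_db / 10)
    noise = np.sqrt(n0 / 2) * (rng.standard_normal(x.shape)
                               + 1j * rng.standard_normal(x.shape))
    return x + noise

n = 1024                       # coded block length in bits (illustrative)
coded = rng.integers(0, 2, n)  # placeholder for the LDPC encoder output
x = map_qpsk(coded)            # n/2 complex baseband symbols
y = awgn(x, snr_db=5.0)        # received symbols
```

The receiver described below operates on `y`; with real LDPC-coded bits instead of the placeholder, the same chain reproduces Fig. 1.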
3 Autoencoding and Deep Neural Networks for Denoising and Decoding
In this section we introduce the novel approach for denoising and decoding LDPC encoded blocks, though it can be generalized to other channel codes. ANN can be used for denoising signals and enhancing the Signal-to-Noise Ratio (SNR). Although they are mainly used in image and speech processing, we propose to apply the fundamental concept of denoising to radio communications systems. After the denoising stage, the DNN decodes the symbols by performing a regression, producing soft decoded bits, the logbits.
The autoencoder is composed of an encoding CNN and a decoding CNN, stacked sequentially. The encoder compresses the data blocks and reduces the dimensionality of the input. The data is then passed to the decoder, which increases the dimensions and restores the original size. The crucial aspect of this approach is that the autoencoder is trained with known sequences and is able to recreate the input if it is similar to one of the trained sequences.
CNN are specially indicated for fixed-size inputs and are capable of extracting hidden patterns from the data. At the same time, LDPC codes are perfectly suitable for this approach, since they have fixed size and use parity matrices that introduce predefined patterns, given a particular 5G-NR numerology. Hence, this motivates the use of ANN for denoising radio signals that are encoded with LDPC codes.
The DNN is composed of several hidden layers, whose neurons perform the following operation:
$$\mathbf{y} = f\left(\mathbf{W}\mathbf{x} + \mathbf{b}\right), \qquad (2)$$
where $\mathbf{y}$ is the output vector, $\mathbf{W}$ is the weight matrix, $\mathbf{x}$ is the input, $\mathbf{b}$ is the bias and $f$ is the activation function.
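Operation (2) can be sketched directly in NumPy; the layer width, input size and small random weights below are illustrative, and tanh is chosen as $f$ because it is the activation used later in the paper.

```python
# Minimal sketch of the dense-layer operation y = f(Wx + b) of (2).
import numpy as np

rng = np.random.default_rng(1)

def dense(x, W, b, f=np.tanh):
    """One fully connected layer: activation of an affine map."""
    return f(W @ x + b)

W = rng.standard_normal((4, 8)) * 0.1  # weight matrix: 4 neurons, 8 inputs
b = np.zeros(4)                        # bias vector
x = rng.standard_normal(8)             # input vector
y = dense(x, W, b)                     # output vector, one value per neuron
```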
3.1 Architecture

Our proposed ANN encoder is composed of several convolutional and decimating layers, placed sequentially:
Input layer. This layer defines the entry point to the encoder and specifies the size of the input.
Noise layer. This layer adds Gaussian white noise to the inputs to avoid overfitting.
Filter layers. These layers are composed of consecutive convolutional 2D layers and pooling layers.
Note that the filter sizes (i.e., the number of neurons) decrease sequentially by a fixed factor, so that each successive filter layer operates on a smaller representation.
The ANN’s decoder is composed of convolutional layers and interpolating layers, placed sequentially:
Input layer. This layer defines the entry point of the decoder and the size is the same as the output of the encoder.
Defiltering layers. These layers are the counterpart of filtering layers.
In contrast to the encoder, the filter sizes increase sequentially by the same factor until the original size is recovered.
The DNN is composed of the input layer, dense layers performing (2) and the activation layer.
Fig. 3 depicts the proposed architecture of the ADNN for denoising and decoding. The size of the data is shown in brackets between layers. The inputs of the ADNN are the baseband complex symbols (real and imaginary parts) and the outputs are the decoded bits. The layer names are taken from the Keras API of TensorFlow 2.0.
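The layer sequence above can be sketched with the Keras API of TensorFlow 2.x as follows. The block length (128), number of output bits (64), filter counts, kernel sizes and noise standard deviation are illustrative assumptions, not the paper's hyperparameters.

```python
# Sketch of the ADNN: noise layer, convolutional encoder with vertical
# max-pooling, convolutional decoder with upsampling, and a dense regressor
# producing soft bits (logbits). Sizes are illustrative.
import tensorflow as tf
from tensorflow.keras import layers

def build_adnn(block_len=128, k_bits=64, noise_stddev=0.1):
    inp = layers.Input(shape=(block_len, 2, 1))   # I and Q as two columns
    x = layers.GaussianNoise(noise_stddev)(inp)   # active only while training
    # Encoder: convolution + pooling, shrinking the time axis
    x = layers.Conv2D(16, (3, 2), padding="same", activation="tanh")(x)
    x = layers.MaxPooling2D(pool_size=(2, 1))(x)
    x = layers.Conv2D(8, (3, 2), padding="same", activation="tanh")(x)
    x = layers.MaxPooling2D(pool_size=(2, 1))(x)
    # Decoder: convolution + upsampling, restoring the original length
    x = layers.Conv2D(8, (3, 2), padding="same", activation="tanh")(x)
    x = layers.UpSampling2D(size=(2, 1))(x)
    x = layers.Conv2D(16, (3, 2), padding="same", activation="tanh")(x)
    x = layers.UpSampling2D(size=(2, 1))(x)
    # DNN regressor: dense layers performing (2), with tanh activation
    x = layers.Flatten()(x)
    x = layers.Dense(256, activation="tanh")(x)
    out = layers.Dense(k_bits, activation="tanh")(x)
    return tf.keras.Model(inp, out)

model = build_adnn()
```

Pooling and upsampling act only on the time axis (pool and size `(2, 1)`), leaving the I/Q axis untouched, matching the vertical-only operations described in Section 3.4.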
3.2 Training, Validation and Testing
One of the most critical aspects of Deep Learning is the training process. During training, the ADNN is fed with the baseband symbols (namely, IQ samples) and its output is compared with the bits at the transmitting side. All weights and biases are adjusted to produce an output close to the bits at the transmitter. To produce a valid output, we first define the type of loss. In our case, we define two different losses:
Mean Squared Error (MSE): this metric computes the MSE between the output and the input.
BER: computes the BER between the output bits and the input bits at the transmitter.
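The BER metric can be sketched in NumPy as follows. The sign convention (positive tanh output decoded as bit 0) is an assumption for illustration; the opposite mapping works identically.

```python
# Sketch of the BER metric: hard-slice the soft (tanh) outputs against zero
# and count the fraction of positions that disagree with the transmitted bits.
import numpy as np

def bit_error_rate(tx_bits, soft_out):
    """Fraction of bit positions where the hard decision disagrees."""
    rx_bits = (soft_out < 0).astype(int)  # assumed mapping: +1 -> 0, -1 -> 1
    return np.mean(rx_bits != tx_bits)

tx = np.array([0, 1, 1, 0])
soft = np.array([0.9, -0.7, 0.2, -0.4])  # last two decisions are wrong
ber = bit_error_rate(tx, soft)           # -> 0.5
```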
The inputs of the ADNN are the baseband complex symbols, split into two columns (real and imaginary parts). From the neural network perspective, all inputs are real numbers, arranged in a matrix with one column per component.
These processes are repeated sequentially until an acceptable convergence is reached. Each iteration, named epoch, refines the weights and biases in order to reduce the previously described losses (MSE and BER). To validate the output, validation data is provided, which is not used during the training but at the end of each epoch to compute the validation losses.
Finally, the last data set is used for testing. This last process uses the input in order to emulate the production environment. At this stage, no loss analysis is performed.
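The epoch loop with a held-out validation set can be sketched on a toy model (a linear regressor trained by gradient descent on the MSE, not the ADNN itself; all sizes and the learning rate are illustrative assumptions):

```python
# Sketch of the train/validation split and epoch loop: the training set
# refines the weights each epoch; the validation set is only evaluated at
# the end of each epoch and never used for the updates.
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((200, 4))
w_true = np.array([1.0, -2.0, 0.5, 0.0])
y = X @ w_true + 0.01 * rng.standard_normal(200)

X_tr, y_tr = X[:160], y[:160]      # training set
X_val, y_val = X[160:], y[160:]    # validation set (held out)

w = np.zeros(4)
val_losses = []
for epoch in range(200):
    grad = 2 * X_tr.T @ (X_tr @ w - y_tr) / len(y_tr)  # MSE gradient
    w -= 0.05 * grad                                    # weight update
    val_losses.append(np.mean((X_val @ w - y_val) ** 2))  # end-of-epoch check
```

A decreasing `val_losses` curve is the convergence signal; a validation loss that rises while the training loss keeps falling would indicate overfitting.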
3.3 Noise Layer
In order to work with noisy scenarios, the training data set has to contain noise. However, since every epoch uses the same inputs, this may lead to overfitting: using a noisy signal at the input is equivalent to using the same noise realization at each epoch. To circumvent this, we employ the noise layer to generate a different realization for each epoch [Vincent2010]. Hence, the inputs do not contain noise; it is added by the Noise Layer instead. This approach decreases the probability of overfitting, even though the network is trained for a single SNR value.
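The effect of the noise layer can be sketched as follows (the signal and noise standard deviation are illustrative): the clean input is reused every epoch, but a fresh noise realization is drawn each time, so the network never sees the exact same noisy example twice.

```python
# Sketch of per-epoch noise injection: same clean input, fresh noise.
import numpy as np

rng = np.random.default_rng(2)
clean = np.ones(1000)   # clean training input, reused across epochs
sigma = 0.3             # noise std corresponding to the training SNR

epoch_1 = clean + sigma * rng.standard_normal(clean.shape)
epoch_2 = clean + sigma * rng.standard_normal(clean.shape)
# epoch_1 and epoch_2 differ sample by sample, yet share the same statistics.
```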
3.4 Convolution and Pooling Layers
The Convolution Layer performs the convolution in two dimensions: IQ and time. The convolution process is described by
$$Y_{i,j} = \sum_{u=0}^{K-1} \sum_{v=0}^{K-1} W_{u,v}\, X_{i+u,\, j+v}, \qquad (3)$$
where $Y$ is the output, $X$ is the input, $W$ is the kernel and $K$ is the kernel size. For the sake of simplicity, we use a square convolution.
After the convolution process, the Pooling Layer takes a value from the output based on a criterion. In our proposal we employ the max-pooling criterion, though there are other mechanisms, such as average-pooling.
At a given position $(i, j)$, the max-pooling criterion outputs the maximum value of the input that falls within the kernel. It can be described as
$$Y_{i,j} = \max_{0 \le u < s_v} X_{i s_v + u,\; j}, \qquad (4)$$
where $s_v$ is the stride in the vertical dimension. Note that we only consider two features, the I and Q samples, which are completely independent. Hence, we do not perform max-pooling in the horizontal dimension, only in the vertical one.
The Upsampling Layer performs a repetition pattern of the input. It can be expressed as
$$Y = X \otimes \mathbf{1}_{s_v}, \qquad (5)$$
where $\otimes$ is the Kronecker product and $\mathbf{1}_{s_v}$ is the all-ones column vector of size $s_v$.
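Both vertical-only operations can be sketched in NumPy (a stride of 2 and a small input are illustrative): max-pooling keeps the maximum of each window of rows per column, and upsampling repeats rows via a Kronecker product with an all-ones column.

```python
# Sketch of vertical max-pooling and Kronecker upsampling: both act on the
# time axis (rows) only, leaving the I/Q axis (columns) untouched.
import numpy as np

def max_pool_vertical(X, s):
    """Max over non-overlapping windows of s rows; columns are independent."""
    return X.reshape(-1, s, X.shape[1]).max(axis=1)

def upsample_vertical(X, s):
    """Repeat every row s times: Kronecker product with a ones column."""
    return np.kron(X, np.ones((s, 1)))

X = np.array([[1.0, 5.0],
              [2.0, 4.0],
              [3.0, 0.0],
              [0.0, 2.0]])
P = max_pool_vertical(X, 2)   # [[2, 5], [3, 2]]
U = upsample_vertical(P, 2)   # each row of P repeated twice, back to 4 rows
```

Note that upsampling restores the original number of rows but not the pooled-away values, which is why the decoder also needs convolutional layers to reconstruct the signal.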
3.5 Activation Layer
The Activation Layer is the last layer and produces the outputs of the ADNN. There are several activation functions, such as ReLU, sigmoid or hyperbolic tangent. Whilst ReLU and sigmoid are usually used in image processing, since image values are positive real numbers, in radio communications the IQ samples are zero-mean, making the distribution centered at zero. Hence, we employ the hyperbolic tangent (namely, tanh).
Since all neurons manage positive and negative real numbers, all of them use tanh as activation function, and the output of each neuron is symmetric with respect to the origin. The hyperbolic tangent has the advantage not only of this symmetry, but also of a continuous derivative, which produces a smooth gradient and avoids discontinuities at the output.
In the next section we implement the proposed ADNN and we simulate a AWGN channel, where the receiver uses the proposed ADNN for denoising and decoding jointly.
4 Simulation Results

In this section we simulate the proposed system. At the transmitter side, we generate random bits, uniformly distributed. The bits are grouped into segments of length $k$. Each segment is the input of the LDPC encoder, whose state is reset for every segment. At the output of the LDPC encoder there are $n$ bits. All bits are grouped into columns of size $m$, corresponding to the bits necessary for the constellation mapping. We implement the classical approach [Chang2008] depicted in Fig. 1, which is considered the benchmark, and compare it with the system proposed in Fig. 3.
The symbols at the output of constellation mapper are the baseband complex symbols, which are passed through an AWGN channel. At the receiver side, we assume perfect synchronization. For the sake of clarity, we do not consider other imperfections (carrier frequency offset, timing misalignments, etc.).
To test the proposed architecture, we use the TensorFlow 2.0 framework. This framework provides an API and a set of layers that can be used for deploying these types of architectures. BER results are plotted by using the MATLAB Python API. In addition to the software, two dedicated NVIDIA RTX 2080 Ti GPUs with CUDA 10.1 are used. Thanks to this hardware, all the simulations are accelerated by orders of magnitude compared to traditional computation on CPUs.
For the simulations, we fix the block lengths, the training SNR and the number of epochs, and use two $(k, n)$ configurations with their resulting coding rates. For the sake of space, we restrict the analysis to these two coding rates, though it can be extended to other coding rates.
The system is tested over a range of SNR values. The Adam optimizer and the MSE and BER losses are used for the training process. Note that, even though the ADNN is trained for a single SNR value, the testing process accepts a wide range of SNR at the receiver.
Figs. 6(a) and 6(b) depict the evolution of the accuracy (where 1 means the output is exactly equal to the input) and the BER at each epoch, for the training (green) and validation (gray) data sets. At each epoch, the network refines the weights and biases, and convergence is achieved after several tens of epochs. Note that, since we are using a noise layer, the training accuracy cannot reach its maximum.
Fig. 7 depicts the BER of the proposed architecture (circle marker) compared with the benchmark (diamond marker), an implementation of the demapper and decoder using the Belief Propagation algorithm for AWGN channels [Chang2008]. We can observe that, for low and medium SNR, the performance of the proposed network is higher than that of the classical approach: it obtains a lower BER and is specially indicated for low and medium SNR regimes. In detail, for a given BER target, the proposed scheme requires less SNR. The immediate consequence is the possibility of using the proposed ADNN with higher coding rates, increasing the network throughput. Although the most time-consuming process is the training, once it is finished, decoding the production set is faster than iterative LDPC decoding.
In this paper we introduced a novel approach for decoding 5G data frames based on Deep Learning neural networks. Although ANN are commonly used for anomaly detection and image compression, we detailed how ANN combined with DNN can be applied to future 5G and beyond radio communications. Furthermore, we introduced a meticulous architecture for the implementation of the proposed scheme. Finally, we showed the performance of the proposed system compared with traditional implementations based on constellation demapping and LDPC decoders. Our approach unveiled higher performance and lower BER compared with the classical approach, obtaining a gain in SNR. As future work, the approach can be extended to multipath channels, for exploiting frequency correlations, and to the spatial domain through a Multiple-Input Multiple-Output (MIMO) approach.