I. Introduction
I-A. Background and Motivations
Detection of modulated signals based on noisy channel observations in the presence of interference is one of the most basic building blocks in communication systems. It has been a longstanding challenge to design a high-accuracy and low-complexity signal detection method that performs well for general communication systems. Intensive research efforts have focused on exploiting special structures of communication systems to design efficient signal detection methods. For example, the channel sparsity in massive multiple-input multiple-output (MIMO) systems [zhang2017blind] and cloud radio access networks [fan2017scalable] has been utilized to design message-passing-based detection algorithms with low complexity. Liu et al. proposed a discrete first-order detection method for large-scale MIMO detection with provable guarantees based on independent and identically distributed (i.i.d.) channel coefficients [liu2017discrete]. In this paper, we focus on banded linear systems, in which the channel matrices are banded matrices. The banded structure of a system can be caused by, e.g., inter-carrier interference (ICI) and inter-symbol interference (ISI). For instance, in frequency-selective channels, ISI arises between adjacent received symbols, yielding a banded channel matrix [leus2011estimation]. Similarly, two-dimensional magnetic recording (TDMR) systems typically suffer from 2D banded ISI caused by a combination of down-track ISI and inter-track interference at the read head [carosino2015iterative]. In doubly selective channels, orthogonal frequency division multiplexing (OFDM) systems may experience significant ICI from adjacent subcarriers, which implies that the channel matrix in the frequency domain can be approximated as a banded matrix [liu2015banded]. Traditional detectors that ignore the banded structure exhibit inferior performance. For example, detectors designed for interference-free systems suffer from low estimation accuracy, whereas detectors designed for general interference systems usually have very high computational complexity.
The banded structure of the channel has been extensively studied to reduce the complexity of signal detection. For example, the well-known Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm [cocke1974optimal] can be employed in a banded system to achieve optimal maximum a posteriori probability (MAP) detection. Nevertheless, this approach is disadvantageous in communication systems with large signal dimensions due to its inherently serial nature and a complexity that grows exponentially with the bandwidth. In [rugini2005simple], Rugini et al. proposed to reduce the complexity of the linear minimum mean square error (LMMSE) detector through LDL factorization. However, there exists a considerable performance gap between the linear detector and the MAP detector. Iterative algorithms, including iterative MMSE [schniter2004low] and belief propagation [ochandiano2011iterative], have been proposed as near-optimal solutions. These iterative algorithms typically require a large number of iterations to obtain an estimate with high accuracy. Moreover, it is difficult to implement the iterative algorithms efficiently in parallel, which significantly limits their computational efficiency. In a nutshell, there is a fundamental trade-off between computational complexity and detection accuracy in signal detection problems. It is highly desirable to design a detection algorithm that achieves both high accuracy and low complexity for banded linear systems, which is the focus of this paper.
I-B. Contributions
Motivated by the recent advances in deep learning [lecun2015deep], we aim to design high-accuracy, low-complexity signal detectors based on deep neural networks (DNNs). Instead of using a general DNN, we propose to design the detector based on a convolutional neural network (CNN) that consists of only convolutional layers. The reasons for CNN-based signal detection are as follows. First, it is well known that DNNs with fully-connected layers suffer from the curse of dimensionality, i.e., the number of tunable parameters grows significantly as the system size increases. In a CNN, all neurons in a layer share the same set of tunable parameters, which addresses the curse of dimensionality. Second, a DNN with fully-connected layers has to be retrained once the system size changes. In contrast, once its tunable parameters are well trained, a CNN can be applied to systems with different sizes without retraining.
Despite the advantages of being scalable and robust to the system size, it is nontrivial to employ a CNN for signal detection. The success of CNNs is based on the assumption that if one set of parameters is useful to extract a feature at a certain spatial position, then the same set of parameters is also useful to extract the feature at other positions. Such a shift-invariance assumption, although it holds in many computer vision problems, does not hold in a signal detection problem. To address this challenge, we propose a novel CNN-based detection architecture consisting of three modules: an input preprocessing module, a CNN module, and an output postprocessing module. The input preprocessing module reorganizes the input (i.e., the channel matrix and the received signals in this paper) based on the banded structure to obtain the shift-invariance property. Then, the shift-invariant input is fed into the CNN, the output of which is processed through the output postprocessing module to give an estimate of the transmitted signals. To the best of our knowledge, our work is the first attempt to design a CNN-based detector for banded linear systems.
We conduct extensive numerical experiments to show that the CNN-based detector performs much better than existing detectors with comparable complexity. Moreover, the proposed CNN demonstrates outstanding robustness to different system sizes. It achieves high accuracy even if there is a mismatch between the system sizes in the training set and the testing set. In addition, we extend the proposed CNN-based detector to near-banded channels, such as 1D near-banded channels in doubly selective OFDM systems and 2D near-banded channels in TDMR systems with 2D ISI. Specifically, we propose a cyclic CNN (CCNN) for 1D near-banded channels, and a 2D CNN-based detector for 2D near-banded channels. Through simulations, we show that the proposed detector still performs well in these systems, where the channel matrix does not have a strictly banded structure.
In summary, the proposed CNN-based detector offers at least four benefits.

The proposed CNN approach relieves the burden of establishing a sophisticated mathematical model for the communication system, since it provides a universal detector that automatically adapts to any channel and noise distributions.

The CNN-based detector achieves much better error performance than other detectors with comparable computational complexity, and is well suited to parallel computing.

Thanks to the parameter-sharing property, the proposed CNN is robust to mismatched system sizes in the training set and the testing set.

The CNN-based detector can be readily extended to systems without a strictly banded structure. As such, the proposed CNN approach sheds light on how to design CNN-based algorithms for other problems in communication systems with a near-banded structure.
I-C. Related Work
Recently, there have been two threads of research on the application of deep learning to signal detection in communication systems. The first thread is to design deep-learning-based detectors by unfolding existing iterative detection algorithms. That is, the iterations of the original algorithm are unfolded into a DNN, with each iteration being mimicked by a layer of the neural network. Instead of being predetermined by the communication model (i.e., the channel matrix, the modulation scheme, the distribution of noise, etc.), the updating rule at each layer is controlled by some tunable parameters, which are learned from the training data. For example, [gregor2010learning] unfolded two well-known algorithms, namely the iterative shrinkage and thresholding algorithm (ISTA) [beck2009fast] and approximate message passing (AMP) [rangan2011generalized], for a fixed channel matrix. It is shown that the proposed neural networks significantly outperform the original algorithms in both computational time and accuracy [gregor2010learning]. The second thread is to treat the transmission procedure as a black box, and utilize conventional DNNs for signal detection. [ye2018power] showed that a fully-connected neural network is able to detect signals for various channel realizations. Specifically, [ye2018power] utilized deep learning to realize joint channel estimation and signal detection in OFDM systems, where the channel matrix is diagonal. It is demonstrated that the deep learning approach achieves a higher detection accuracy than existing model-based detection approaches with comparable complexity. In [farsad2018neural], Farsad et al. presented a recurrent neural network (RNN) for the detection of data sequences in a Poisson channel model, which is applicable to both optical and chemical communication systems. The proposed RNN can achieve a performance close to that of the Viterbi detector with perfect channel state information (CSI).
Besides signal detection, deep learning has demonstrated its potential in other areas of communication systems. Nachmani et al. studied the problem of channel decoding through unfolding traditional belief propagation (BP) decoders [nachmani2016learning]. Most recently, Liang et al. proposed an iterative BP-CNN architecture for channel decoding under a certain correlated noise model [liang2018iterative]. A standard BP decoder is used to estimate the coded bits, followed by a CNN that removes the estimation errors of the BP decoder and yields a more accurate estimate. In [dorner2017deep], Dorner et al. presented an end-to-end communication system to demonstrate the feasibility of over-the-air communication with deep neural networks. As shown in [dorner2017deep], the performance is comparable with that of traditional model-based communication systems.
I-D. Organization
The rest of the paper is organized as follows. In Section II, we present the system model as well as its extensions to near-banded systems, and discuss the challenges of utilizing traditional DNNs to detect signals. In Section III, we propose the CNN-based detector based on the banded structure of the channel matrix, and illustrate the robustness of the proposed detector. In Section IV, we extend the proposed detector to near-banded systems. In Section V, the performance of the proposed deep learning approach is evaluated in different channel models and compared with existing algorithms. In Section V, we also show the performance of the proposed CNN in practical OFDM systems and TDMR systems. Conclusions and future work are presented in Section VI.
II. System Model
II-A. Linear Banded Systems
In this paper, we consider a linear channel model with the received signal written as
$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n}, \tag{1}$$
where $\mathbf{H} \in \mathbb{C}^{N \times N}$ is the channel matrix, $\mathbf{x} \in \{-1, +1\}^{N}$ is the vector of transmitted signals, and $\mathbf{n} \in \mathbb{C}^{N}$ is the noise vector. (Footnote 1: For simplicity, we use BPSK as the modulation method, but the proposed deep learning approach can be readily extended to systems with other modulation methods.) Furthermore, we assume that the channel matrix is a banded matrix with bandwidth $B$. That is,
$$h_{i,j} = 0 \quad \text{for } |i - j| > B, \tag{2}$$
where $h_{i,j}$ is the $(i,j)$th element of the channel matrix and $B$ is the bandwidth of the channel matrix (see Figure 1). Under this assumption, the $i$th entry of $\mathbf{y}$ in (1) can be rewritten as
$$y_i = \sum_{j=i-B}^{i+B} h_{i,j}\, x_j + n_i, \tag{3}$$
where $y_i$ and $n_i$ are the $i$th entries of $\mathbf{y}$ and $\mathbf{n}$, respectively. (Footnote 2: In (3), we assume $x_j = 0$ for $j < 1$ and $j > N$.) We assume perfect channel state information at the receiver, i.e., the channel matrix is exactly known by the receiver.
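To make the relation between (1) and (3) concrete, the following sketch draws a random banded channel and checks that the in-band sum of (3) reproduces the full matrix-vector product of (1). It is a minimal illustration rather than the simulation setup of this paper: the helper names, the Gaussian channel and noise draws, and the sizes `N = 8` and `B = 2` are all assumptions made for the example.

```python
import random

def banded_channel(n, b, rng):
    """Return an n x n complex matrix with h[i][j] = 0 whenever |i - j| > b."""
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) if abs(i - j) <= b else 0j
             for j in range(n)] for i in range(n)]

def received_full(h, x, noise):
    """y = H x + n computed with the full matrix-vector product, as in (1)."""
    n = len(x)
    return [sum(h[i][j] * x[j] for j in range(n)) + noise[i] for i in range(n)]

def received_banded(h, x, noise, b):
    """y_i computed from the (at most) 2b + 1 in-band neighbours only, as in (3)."""
    n = len(x)
    y = []
    for i in range(n):
        lo, hi = max(0, i - b), min(n - 1, i + b)  # symbols outside the block count as zero
        y.append(sum(h[i][j] * x[j] for j in range(lo, hi + 1)) + noise[i])
    return y

rng = random.Random(0)
N, B = 8, 2                                   # hypothetical system size and bandwidth
H = banded_channel(N, B, rng)
x = [rng.choice([-1, 1]) for _ in range(N)]   # BPSK symbols
noise = [complex(rng.gauss(0, 0.1), rng.gauss(0, 0.1)) for _ in range(N)]
y_full = received_full(H, x, noise)
y_band = received_banded(H, x, noise, B)
```

The banded form touches at most `2 * B + 1` channel coefficients per received symbol, which is the locality that the detector design in later sections relies on.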
The banded system in (3) is an idealization of practical scenarios. We next introduce two near-banded systems with channel matrices obtained from real applications. We will show that the CNN-based detector can be readily modified to handle these near-banded systems.
II-B. Near-Banded Systems
II-B1. 1D Near-Banded Systems
In certain systems, such as systems with a doubly selective OFDM channel, in addition to the nonzero entries on the diagonal band, the channel matrix has nonzero entries in the bottom-left corner and the top-right corner due to the non-negligible ICI [schniter2004low]. The structure of the channel matrix is shown in Figure 2, where the entries of the channel matrix satisfy
$$h_{i,j} = 0 \quad \text{for } B < |i - j| < N - B, \tag{4}$$
with $h_{i,j}$ denoting the $(i,j)$th entry of the $N \times N$ channel matrix and $B$ the bandwidth.
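The support pattern described above (a diagonal band plus the bottom-left and top-right corners) can be visualized with a short sketch. It assumes that the corners are the cyclic wrap-around of the band, so that an entry may be nonzero when the cyclic distance between its row and column indices is at most the bandwidth; the function names and the sizes used below are hypothetical.

```python
def in_cyclic_band(i, j, n, b):
    """True if entry (i, j) may be nonzero: inside the band or in a wrapped corner."""
    d = abs(i - j)
    return min(d, n - d) <= b  # cyclic distance between the row and column indices

def support_pattern(n, b):
    """One string per row: 'x' marks a possibly nonzero entry, '.' a structural zero."""
    return ["".join("x" if in_cyclic_band(i, j, n, b) else "." for j in range(n))
            for i in range(n)]

for row in support_pattern(6, 1):  # hypothetical size 6 and bandwidth 1
    print(row)
```

Under this assumption, every row has the same number of possibly nonzero entries, which is what allows a cyclic convolution to treat all receiving positions alike.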
II-B2. 2D Near-Banded Systems
A TDMR system usually suffers from 2D banded ISI modeled by convolving the data with a 2D spatial impulse response [wu2003iterative]. The output of the channel is a matrix with the $(i,j)$th element given by
$$y_{i,j} = \sum_{k=0}^{M-1} \sum_{l=0}^{M-1} h_{k,l}\, x_{i-k,\,j-l} + n_{i,j}, \tag{5}$$
where $x_{i,j}$ is the transmitted data, $n_{i,j}$ is the noise, $h_{k,l}$ is a 2D read head impulse response, and $M$ is the number of elements over which the ISI extends in each dimension. As shown in Figure 3, the TDMR ISI system is in fact a 2D extension of the banded linear system. That is, each received signal in a TDMR system is a linear combination of the neighbouring transmitted signals in the 2D space.
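The 2D ISI model can be sketched as a direct 2D convolution of the data with the read head impulse response. The sketch assumes a causal index convention (each output depends on the data at the same position and at smaller row and column indices) and treats data outside the grid as zero; both choices, as well as the function name and the toy values, are assumptions for illustration.

```python
def isi_2d(x, h, noise):
    """y[i][j] = sum_k sum_l h[k][l] * x[i-k][j-l] + noise[i][j]; data outside the grid is zero."""
    rows, cols = len(x), len(x[0])
    m = len(h)  # ISI extent in each dimension (h is m x m)
    y = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = noise[i][j]
            for k in range(m):
                for l in range(m):
                    if i - k >= 0 and j - l >= 0:
                        acc += h[k][l] * x[i - k][j - l]
            y[i][j] = acc
    return y

x = [[1, -1], [-1, 1]]                 # BPSK data on a 2 x 2 grid
h = [[1.0, 0.5], [0.5, 0.25]]          # hypothetical 2 x 2 impulse response
noise = [[0.0, 0.0], [0.0, 0.0]]       # noiseless, to make the result easy to check
y = isi_2d(x, h, noise)
```

Each output sample depends only on a small 2D neighbourhood of the data, which is the 2D counterpart of the band in (3).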
Signal detection in a near-banded system is usually more challenging due to the more complicated structure of the interference. As shown in Section IV, the proposed CNN-based detector can be readily extended to these near-banded systems, and hence is more flexible than traditional model-based detectors.
II-C. Architecture of a DNN-Based Detector
In this subsection, we briefly introduce the architecture of a DNN-based detector. As shown in Figure 4, the DNN-based detector treats both the channel matrix and the received signal as input and outputs a vector of estimated symbols. This implies that, once well trained, the proposed DNN-based detector can adapt to various channel realizations. Moreover, unlike most existing detection approaches based on the probability model of the system in (1), the DNN-based approach does not rely on the probability distributions of the channel coefficients and the noise. Instead, the proposed neural networks are able to learn the model information from the training data.
Typically, a DNN may consist of fully-connected layers, densely-connected layers, convolutional layers, or a mixture of them. Due to the huge number of connections between neurons, a DNN with fully-connected or densely-connected layers suffers from the curse of dimensionality, and does not scale well to large systems. More specifically, the number of weights and biases associated with each fully-connected or densely-connected neuron grows linearly with the size of the input. Since the number of neurons also grows with the system size, the total number of tunable parameters increases quadratically with the size of the input, which makes it difficult to train a DNN for a large system. Moreover, a DNN has to be retrained once the system size changes, because the number of tunable parameters varies with the system size. Notably, DNN training is a time-consuming task, as it usually involves a large amount of data and requires high computational complexity. To deal with these challenges, we propose to detect signals through a DNN that consists of only convolutional layers (i.e., a CNN). In a CNN, all neurons in a layer share the same set of tunable parameters, implying that the number of tunable parameters does not scale with the system size. Nonetheless, to achieve good performance with a CNN, the input is required to have shift-invariant properties, and the convolutional filter needs to be carefully designed. In the next section, we introduce the proposed CNN-based detector for strictly banded linear systems. The extension to near-banded systems will be discussed in Section IV.
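The scaling argument above can be checked with a simple parameter count. The sketch compares one fully-connected layer, whose input and output sizes both grow with the system size, against one convolutional layer whose filter is shared across positions. The per-position input length of 12, the filter span of three positions, and the 16 output channels are hypothetical values chosen only for illustration.

```python
def dense_layer_params(n_inputs, n_neurons):
    """Weights plus biases of one fully-connected layer."""
    return n_inputs * n_neurons + n_neurons

def conv_layer_params(filter_length, in_channels, out_channels):
    """Weights plus biases of one 1D convolutional layer (shared across all positions)."""
    return filter_length * in_channels * out_channels + out_channels

FEATURES_PER_POSITION = 12  # hypothetical number of input values per receiving position

for n in (64, 128, 256):    # hypothetical system sizes
    dense = dense_layer_params(n * FEATURES_PER_POSITION, n)
    conv = conv_layer_params(3 * FEATURES_PER_POSITION, 1, 16)
    print(f"N={n}: fully-connected={dense}, convolutional={conv}")
```

Doubling the system size roughly quadruples the fully-connected count, while the convolutional count stays fixed, which is also why a trained convolutional layer can be reused when the system size changes.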
III. CNN-Based Detector
In this section, we first describe the design details of the CNN-based detector. Then, we demonstrate the robustness of the proposed detector in the sense of adapting to various system sizes.
To address the challenges discussed in Section II, we propose to use a CNN consisting of only convolutional layers for signal detection in a banded linear system. In a convolutional layer, each neuron is only connected to a small portion of the neurons in the previous layer, and all neurons in a layer share the same set of parameters (i.e., weights and biases). This significantly reduces the total number of parameters to be learned. CNNs are a very efficient class of DNNs for solving problems with large-sized inputs, such as image/video recognition [he2016deep], [kim2014convolutional], speech recognition [abdel2014convolutional], etc. Nonetheless, the success of CNNs is based on the shift-invariance assumption. That is, if one set of parameters is useful to extract a feature at a certain spatial or temporal position, then it is also useful to extract the feature at other positions. Such an assumption generally holds for image, video, and audio inputs. However, it does not necessarily hold in the signal detection problem over the channel in (1). For example, directly shifting the channel matrix will significantly change the transmission model and thereby change the detection result. Hence, in the proposed CNN-based detector, the input as well as the tunable convolutional filter needs to be appropriately organized before being fed into the CNN. As illustrated in Figure 5, we propose a CNN-based detector that consists of three modules: an input preprocessing module, a CNN module, and an output postprocessing module. The input preprocessing module is used to reorganize the input to ensure the shift-invariance property. The CNN module extracts the features from the shift-invariant input. The output postprocessing module is applied to obtain an estimate of the transmitted signals based on the features extracted by the CNN. In the following subsections, we discuss the detailed design of the three modules.
III-A. Input Preprocessing
In the input preprocessing module, we use an input reshaping approach to ensure the shift-invariance property of the input, i.e., the channel matrix and the received signal. (Footnote 3: The realization of the input preprocessing module to achieve shift-invariance is not unique. The input reshaping approach proposed in this paper is just one example.)
As illustrated in Figure 6(a), we reshape the channel coefficients and the received signals into a single vector. Recall that the channel matrix is a banded matrix, with the nonzero entries confined to a diagonal band. Hence, we only need to store the nonzero entries on the band in this vector. Specifically, the nonzero channel coefficients and the received signal corresponding to each receiving position are stored in a subvector with the entries given by
(6) 
and
(7) 
where $\mathrm{Re}(\cdot)$ and $\mathrm{Im}(\cdot)$ represent the real and imaginary parts of the complex input, respectively. Then, the resulting vector is fed as an input into the subsequent CNN module. With the above preprocessing, the input vector has a certain shift-invariance property. For example, if we shift the input vector by the length of one subvector, we only need to shift the output vector by one position to obtain the same input-output relationship. With this preprocessing, a CNN can be employed to extract features of the input.
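One possible realization of this reshaping is sketched below. The layout is an assumption, not necessarily the ordering used in this paper: for each receiving position, the sketch stores the real and imaginary parts of the in-band channel coefficients of that row (zero where the band leaves the matrix), followed by the real and imaginary parts of the received sample. All names and the toy values are hypothetical.

```python
def subvector_length(b):
    """Real values per receiving position: 2b + 1 complex coefficients plus one complex sample."""
    return 2 * (2 * b + 1) + 2

def subvector(h, y, i, b):
    """In-band row entries of h and the received sample y[i], split into real and imaginary parts."""
    n = len(y)
    z = []
    for j in range(i - b, i + b + 1):
        c = complex(h[i][j]) if 0 <= j < n else 0j  # pad where the band leaves the matrix
        z += [c.real, c.imag]
    yi = complex(y[i])
    z += [yi.real, yi.imag]
    return z

def preprocess(h, y, b):
    """Concatenate the subvectors of all receiving positions into one flat input vector."""
    flat = []
    for i in range(len(y)):
        flat += subvector(h, y, i, b)
    return flat

H = [[1 + 1j, 2, 0],
     [3, 4 - 2j, 5],
     [0, 6, 7 - 1j]]       # hypothetical 3 x 3 channel with bandwidth 1
y = [1j, 2, 3 + 0.5j]      # hypothetical received samples
z = preprocess(H, y, 1)
```

With this layout every subvector has the same length and the same internal ordering, so a filter that slides over the flat vector in steps of one subvector sees the same kind of data at every receiving position.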
III-B. CNN
As shown in Figure 6(b), the CNN module consists of multiple convolutional hidden layers and one convolutional output layer. The input is and the output is the symbols , where
(8) 
We use ReLU as the activation function for the hidden layers:
$$f(x) = \max(0, x), \tag{9}$$
where $x$ is the input, and $f(x)$ is the output of the activation function. To map the output to the interval $(0, 1)$, we choose the sigmoid function as the activation function for the output layer:
$$g(x) = \frac{1}{1 + e^{-x}}. \tag{10}$$
In the first convolutional layer, we use zero-padding with stride size , and set the filter size to , where is the depth of the filter. That is, the th output subvector of the first layer is given by
(11) 
where and are the learnable weight and bias of the first layer, and with for or . As such, each filter takes subvectors as the input. This setting is based on the observation that each subvector is strongly correlated with its neighbouring subvectors due to the banded structure of the channel. Hence, we propose to extract features from every consecutive subvectors. Similarly, in the th layer (), the filter is applied over subvectors with stride size and filter size . To summarize, the structure of a CNN is determined by its number of layers and the filter depth in each layer. These parameters need to be decided before training the network. As shown in the simulations later, such a CNN outperforms a DNN consisting of fully-connected layers in both accuracy and complexity.
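The first convolutional layer can be sketched as a filter that slides over the flat input in steps of one subvector, so that each receiving position yields one output feature vector. The sketch makes several assumptions that are not taken from the paper: the stride equals the subvector length `s`, each filter spans an odd number `k` of consecutive subvectors centred on the current position, zero-padding covers the edges, and ReLU is applied to each output. The names `first_layer`, `s`, and `k` are hypothetical.

```python
def relu(v):
    """ReLU activation: max(0, v)."""
    return max(0.0, v)

def first_layer(z, s, k, weights, biases):
    """Strided 1D convolution over a flat input.

    z: flat input made of subvectors of length s
    k: odd number of consecutive subvectors covered by each filter
    weights: one list of k * s weights per filter; biases: one bias per filter
    Returns one feature list (one entry per filter) for every receiving position.
    """
    n = len(z) // s
    pad = (k // 2) * s                        # zero-padding so that every position has k neighbours
    zp = [0.0] * pad + list(z) + [0.0] * pad
    out = []
    for i in range(n):
        window = zp[i * s: i * s + k * s]     # stride of one subvector
        out.append([relu(sum(w * v for w, v in zip(wd, window)) + bd)
                    for wd, bd in zip(weights, biases)])
    return out

z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]            # three toy subvectors of length 2
filters = [[1.0] * 6, [-1.0] * 6]             # two hypothetical filters spanning 3 subvectors
features = first_layer(z, 2, 3, filters, [0.0, 0.0])
```

Because the same filters are applied at every position, the number of weights depends only on `s`, `k`, and the number of filters, not on the number of receiving positions.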
Remark 1.
A conventional CNN typically consists of convolutional layers as well as pooling layers and fully-connected layers. However, fully-connected layers and pooling layers are not used in our design for the following reasons. First, a pooling layer is typically used after a convolutional layer to perform a downsampling operation along the spatial dimensions. Recall that in the proposed CNN, the filter in the convolutional layer is used to extract features for each receiving position, which means that every output of the filter is useful. Discarding features would cause performance loss. Second, fully-connected layers involve high complexity and are also difficult to train. As shown in the simulation section, fully-connected layers do not provide any performance gain over convolutional layers. Hence, we have not included any pooling layers or fully-connected layers in the proposed CNN. Dropout and batch normalization are also important components of the conventional CNN architecture. However, we have tested them and found that they do not provide any gain either.
III-C. Output Postprocessing
In the output postprocessing module, we map the output of the CNN to the estimate of the transmitted signals. Recall that we use the sigmoid function as the activation function of the output layer. As such, the output of the CNN lies in the interval $(0, 1)$. Here, we use an indicator function to map the continuous value of each output to a discrete estimate of the corresponding transmitted signal:
(12) 
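A minimal sketch of such a postprocessing step for BPSK is given below. It assumes that outputs above a threshold of 0.5 are mapped to the symbol +1 and all other outputs to -1; the threshold, the tie-breaking rule, and the function names are assumptions for illustration rather than the exact rule of the paper.

```python
import math

def sigmoid(v):
    """Sigmoid activation: maps a real value into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

def postprocess(outputs, threshold=0.5):
    """Map each continuous CNN output to a BPSK symbol (+1 above the threshold, otherwise -1)."""
    return [1 if o > threshold else -1 for o in outputs]

cnn_outputs = [sigmoid(2.0), sigmoid(-1.0), sigmoid(0.3)]  # hypothetical output-layer values
estimates = postprocess(cnn_outputs)
```

For other modulation schemes the hard decision would need one output (or one group of outputs) per bit or per constellation point, which this two-level sketch does not cover.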