# Deep Learning Based Frequency-Selective Channel Estimation for Hybrid mmWave MIMO Systems

Millimeter wave (mmWave) massive multiple-input multiple-output (MIMO) systems typically employ hybrid mixed signal processing to avoid expensive hardware and high training overheads. However, the lack of fully digital beamforming at mmWave bands imposes additional challenges in channel estimation. Prior art on hybrid architectures has mainly focused on greedy optimization algorithms to estimate frequency-flat narrowband mmWave channels, even though in practice the large bandwidth associated with mmWave channels results in frequency-selective channels. In this paper, we consider a frequency-selective wideband mmWave system and propose two deep learning (DL) compressive sensing (CS) based algorithms for channel estimation. The proposed algorithms learn critical a priori information from training data to provide highly accurate channel estimates with low training overhead. In the first approach, a DL-CS based algorithm simultaneously estimates the channel supports in the frequency domain, which are then used for channel reconstruction. The second approach exploits the estimated supports to apply a low-complexity multi-resolution fine-tuning method to further enhance the estimation performance. Simulation results demonstrate that the proposed DL-based schemes significantly outperform conventional orthogonal matching pursuit (OMP) techniques in terms of the normalized mean-squared error (NMSE), computational complexity, and spectral efficiency, particularly in the low signal-to-noise ratio regime. When compared to OMP approaches that achieve an NMSE gap of 4-10 dB with respect to the Cramér-Rao Lower Bound (CRLB), the proposed algorithms reduce the CRLB gap to only 1-1.5 dB, while reducing complexity by two orders of magnitude.


## I Introduction

Millimeter wave (mmWave) communication has emerged as a key technology to fulfill beyond fifth-generation (B5G) network requirements, such as enhanced mobile broadband, massive connectivity, and ultra-reliable low-latency communications. The mmWave band offers an abundant frequency spectrum (30-300 GHz) at the cost of low penetration depth and high propagation losses. Fortunately, its short-wavelength mitigates these drawbacks by allowing the deployment of large antenna arrays into small form factor transceivers, paving the way for multiple-input multiple-output (MIMO) systems with high directivity gains [Pi2011InroMmwave, Heath2016OverviewMmwave, Khateeb2014covcap, Andrews20145g].

Hybrid MIMO structures have been introduced to operate at mmWave frequencies because an all-digital architecture, with a dedicated radio frequency (RF) chain for each antenna element, results in an expensive system architecture and high power consumption at these frequencies [Heath2016OverviewMmwave]. In these hybrid architectures, phase-only analog beamformers are employed to steer the beams using steering vectors of quantized angles. The down-converted signal is then processed by low-dimensional baseband beamformers, each of which is dedicated to a single RF chain [Khateeb2014mmwave, Heath2016shiftOrSwitches]. The number of RF chains is significantly reduced with this combination of high-dimensional phase-only analog and low-dimensional baseband digital beamformers [Heath2016shiftOrSwitches]. Moreover, optimal configuration of the digital/analog precoders and combiners requires instantaneous channel state information (CSI) to achieve spatial diversity and multiplexing gain [Khateeb2014Chanest]. However, acquiring mmWave CSI is challenging with a hybrid architecture for the following reasons [Khateeb2014mmwave]: 1) there is no direct access to the different antenna elements in the array, since the channel is seen through the analog combining network, which forms a compression stage for the received signal when the number of RF chains is much smaller than the number of antennas; 2) the large channel bandwidth yields high noise power and low received signal-to-noise ratio (SNR) before beamforming; and 3) the large size of channel matrices increases the complexity and overheads associated with traditional precoding and channel estimation algorithms. Therefore, low-complexity channel estimation for mmWave MIMO systems with hybrid architecture is necessary.

### I-A Related Work

Channel estimation techniques typically leverage the sparse nature of mmWave MIMO channels by formulating the estimation as a sparse recovery problem and apply compressive sensing (CS) methods to solve it. Compressive sensing is a general framework for estimation of sparse vectors from linear measurements [eldar2011CS]. The estimated supports of the sparse vectors using CS help identify the indices of Angle-of-Arrival (AoA) and Angle-of-Departure (AoD) pairs for each path in the mmWave channel, while the amplitudes of the non-zero coefficients in the sparse vectors represent the channel gains for each path. Therefore, these supports and amplitudes are key components to be estimated to obtain accurate CSI. Moreover, it has been shown that pilot training overhead can be reduced with compressive estimation, unlike the conventional approaches such as those based on least squares (LS) estimation [Heath2016shiftOrSwitches].

Several channel estimation methods based on CS tools that explore the mmWave channel sparsity have been investigated in the literature [Heath2016shiftOrSwitches, Gao2016CEfreqSelec, Heath2017TDCE, Heath2018CEMain, Ma2018CE]. A distributed grid matching pursuit (DGMP) channel estimation scheme is presented in [Gao2016CEfreqSelec], where the dominant entries of the line-of-sight (LoS) channel path are detected and updated iteratively. In [Heath2017TDCE], an orthogonal matching pursuit (OMP) channel estimation scheme to detect multiple channel paths support entries is also considered. Likewise, a simultaneous weighted orthogonal matching pursuit (SW-OMP) channel estimation scheme based on a weighted OMP method is developed in [Heath2018CEMain] for frequency-selective mmWave systems. A sparse reconstruction problem was formulated in [Heath2018CEMain] to estimate the channel independently for every subcarrier by exploiting common sparsity in the frequency domain. However, such optimization and CS-based channel estimation schemes detect the support indices of the mmWave channel sequentially and greedily, and hence are not globally optimal [Ma2018CE].

Alternatively, deep learning (DL) approaches and data-driven algorithms have recently received much attention as key enablers for beyond 5G networks. Traditionally, signal processing and numerical optimization techniques have been heavily used to address channel estimation at mmWave bands [Gao2016CEfreqSelec, Heath2017TDCE, Heath2018CEMain, Ma2018CE]. However, optimization algorithms often demand considerable computational complexity overhead, which creates a barrier between theoretical design/analysis and real-time processing requirements. Hence, prior data-set observations and deep neural network (DNN) models can be leveraged to learn the non-trivial mapping from compressed received pilots to channels. DNNs can be used to approximate the optimization problems by selecting the suitable set of parameters that minimize the approximation error. The use of DNNs is expected to substantially reduce computational complexity and processing overhead since it only requires several layers of simple operations such as matrix-vector multiplications. Moreover, several successful DL applications have been demonstrated in wireless communications problems such as channel estimation [Ye2018PwrDLCE, DOng2019DLDNNCE, He2018DLCE, M2019DLCE, Ma2020SparseDLCE, wei2019knowledge, Chun2019DLCEmassiveMIMO, Jin2019CellFreeDLCE, andrew2020MIMOCEDL, Bjornson2020CEbayes], analog beam selection [long2018DD], [Hodge2019RFmeta], and hybrid beamforming [long2018DD, Khateeb2018DLbeam, Huang2019DLHP, Elbir2019CNNHP, Elbir2020JASHP, elbir2019online]. Besides, DL-based techniques, when compared with other conventional optimization methods, have been shown [DOng2019DLDNNCE, Elbir2019CNNHP, Elbir2020JASHP, dorner2018DLair] to be more computationally efficient in searching for beamformers and more tolerant to imperfect channel inputs. In [He2018DLCE], a learned denoising-based approximate message passing (LDAMP) network is presented to estimate the channel of a mmWave communication system with a lens antenna array, where the noise term is detected and removed to estimate the channel. However, channel estimation for mmWave massive MIMO systems with hybrid architecture is not considered in [He2018DLCE].

Prior work on channel estimation for hybrid mmWave MIMO architectures [Khateeb2018DLbeam, Huang2019DLHP, Elbir2019CNNHP, M2019DLCE, Xu2019DLCEMultiuser, He2018DLCE, Kang2018DLCEEnergy, Ma2020SparseDLCE, wei2019knowledge, Chun2019DLCEmassiveMIMO, Jin2019CellFreeDLCE, andrew2020MIMOCEDL, Bjornson2020CEbayes, Asmaa2020CF, Asmaa2019CF] considers the narrowband flat-fading channel model for tractability, while practical mmWave channels exhibit wideband frequency-selective fading due to the very large bandwidth, short coherence time, and different delays of the multipath components [Heath2018CEMain, Heath2016FSF, emil2019sub6mmWave]. MmWave environments such as indoor and vehicular communications are highly variable with short coherence time [emil2019sub6mmWave], which requires channel estimation algorithms that are robust to the rapidly changing channel characteristics (the coherence time is within a few milliseconds [emil2019sub6mmWave]). Accordingly, this paper presents a combination of DL and CS methods to identify the indices of the AoA/AoD pairs and estimate the channel amplitudes for frequency-selective channel estimation in hybrid MIMO systems.

### I-B Contributions of the Paper

In this paper, we propose a frequency-selective channel estimation framework for mmWave MIMO systems with hybrid architecture. By considering the mmWave channel sparsity, the developed method aims at reaping the full advantages of both CS and DL methods. We consider the received pilot signal as an image, and then employ a denoising convolutional neural network (DnCNN) from [Zhang2017Dncnn] for channel amplitude estimation. Thereby, we treat image denoising as a plain discriminative learning problem, i.e., separating the noise from a noisy image by feed-forward convolutional neural networks (CNNs). The main motivations behind using CNNs are twofold. First, deep CNNs have been recognized to effectively extract image features [Zhang2017Dncnn]. Second, considerable advances have been achieved on regularization and learning methods for training CNNs, including the rectified linear unit (ReLU), batch normalization, and residual learning [he2016deep]. These methods can be adopted in CNNs to speed up the training process and improve the denoising performance. The main contributions of the paper can be summarized as follows:

1. We propose a deep learning compressed sensing channel estimation (DL-CS-CE) scheme for wideband mmWave massive MIMO systems. The DL-CS-CE algorithm exploits the information on the support coming from every subcarrier in the MIMO-OFDM system. It is executed in two steps: channel amplitude estimation through deep learning, and channel reconstruction. We train a DnCNN using real mmWave channel realizations obtained from Raymobtime (available at https://www.lasse.ufpa.br/raymobtime/). The correlation between the received signal vectors and the measurement matrix is fed into the trained DnCNN to predict the channel amplitudes. Using the obtained channel amplitudes, the indices of the dominant entries of the channel are obtained, based on which the channel can be reconstructed. Unlike the existing work of [Gao2016CEfreqSelec, Heath2017TDCE, Heath2018CEMain], which estimates the dominant channel entries sequentially, we estimate the dominant entries simultaneously, which reduces computational complexity and improves estimation performance.

2. Using the DL-CS-CE for support detection, we propose a refined DL-CS-CE algorithm that exploits the spatially common sparsity within the system bandwidth. A channel reconstruction with a low complexity multi-resolution fine-tuning approach is developed that further improves NMSE performance by enhancing the accuracy of the estimated AoAs/AoDs. The channel reconstruction is performed by consuming a very small amount of pilot training frames, which significantly reduces the training overhead and computational complexity.

3. Simulation results in the low SNR regime show that both proposed algorithms significantly outperform the frequency domain approach developed in [Heath2018CEMain]. Numerical results also show that using a reasonably small number of pilot training frames, approximately in the range of 60-100 frames, leads to substantially low channel estimation errors. The proposed algorithms are also compared with existing solutions by analyzing the trade-off between delivered performance and incurred computational complexity. Our analysis reveals that both proposed channel estimation methods achieve the desired performance at significantly lower complexity. The developed approaches are shown to attain an NMSE gap of 1-1.5 dB with respect to the Cramér-Rao Lower Bound (CRLB), compared to the 4-10 dB gap attained by the SW-OMP technique, while reducing the computational complexity by two orders of magnitude.

### I-C Notation and Paper Organization

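Bold upper case, bold lower case, and lower case letters correspond to matrices, vectors, and scalars, respectively. Scalar norms, vector norms, and Frobenius norms are denoted by $|\cdot|$, $\|\cdot\|_2$, and $\|\cdot\|_F$, respectively. We use $\mathcal{X}$ to denote a set. $\mathbf{I}_X$ denotes an $X \times X$ identity matrix. $\mathbb{E}[\cdot]$, $(\cdot)^T$, $\overline{(\cdot)}$, and $(\cdot)^*$ stand for expected value, transpose, complex conjugate, and Hermitian. $\mathbf{X}^{\dagger}$ stands for the Moore-Penrose pseudo-inverse of $\mathbf{X}$. $[\mathbf{x}]_i$ represents the $i$-th element of a vector $\mathbf{x}$. The $(i,j)$-th entry of a matrix $\mathbf{X}$ is denoted by $[\mathbf{X}]_{i,j}$. In addition, $[\mathbf{X}]_{:,j}$ and $[\mathbf{X}]_{:,\Omega}$ denote the $j$-th column vector of matrix $\mathbf{X}$ and the sub-matrix consisting of the columns of matrix $\mathbf{X}$ with indices in set $\Omega$. $X \bmod Y$ means $X$ modulo $Y$. $\mathcal{CN}(\boldsymbol{\mu}, \mathbf{C})$ refers to a circularly-symmetric complex Gaussian distribution with mean $\boldsymbol{\mu}$ and covariance matrix $\mathbf{C}$. The operations $\mathrm{vec}(\cdot)$, $\mathrm{vec2mat}(\cdot, [X, Y])$, $\mathrm{sub2ind}(\cdot)$, and $\mathrm{ind2sub}(\cdot)$ correspond to transforming a matrix into a vector, transforming a vector into a matrix for a defined size ($[X, Y]$), transforming the row and column subscripts of a matrix into their corresponding linear index, and transforming the linear index into its corresponding row and column subscripts for a matrix of a defined size ($[X, Y]$), respectively. $\mathbf{X} \otimes \mathbf{Y}$ is the Kronecker product of $\mathbf{X}$ and $\mathbf{Y}$. Key model-related notation is listed in Table I.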

The rest of the paper is organized as follows. The system model for the frequency-selective mmWave MIMO system is described in Section II. In Section III, the two proposed deep learning-based compressive sensing channel estimation schemes in the frequency domain are introduced. Moreover, complexity analysis in terms of convergence and computational analysis is presented in Section IV. Case studies with numerical results are simulated and analyzed based on the proposed schemes in Section V. Section VI concludes the paper.

## II System Model and Problem Formulation

This section first provides the system and channel models of frequency-selective hybrid mmWave transceivers. Then, it formulates a sparse recovery problem to estimate the sparse channel in the frequency domain.

### II-A System Model

As shown in Fig. 1, we consider an OFDM-based mmWave MIMO link employing a total of $K$ subcarriers to send $N_s$ data streams from a transmitter with $N_t$ antennas to a receiver with $N_r$ antennas. The system is based on a hybrid MIMO architecture, with $L_t$ and $L_r$ radio frequency (RF) chains at the transmitter and receiver sides. Following the notation of [Heath2018CEMain], we define a frequency-selective hybrid precoder $\mathbf{F}[k] = \mathbf{F}_{\mathrm{RF}}\mathbf{F}_{\mathrm{BB}}[k] \in \mathbb{C}^{N_t \times N_s}$, $k = 0, \ldots, K-1$, where $\mathbf{F}_{\mathrm{RF}}$ and $\mathbf{F}_{\mathrm{BB}}[k]$ are the analog and digital precoders, respectively. Although the analog precoder is considered to be frequency-flat, the digital precoder is different for every subcarrier. The RF precoder and combiner are deployed using a fully connected network of quantized phase shifters, as described in [Heath2016shiftOrSwitches]. During transmission, the transmitter (TX) first precodes $N_s$ data symbols $\mathbf{s}[k]$ at each subcarrier by applying the subcarrier-dependent baseband precoder $\mathbf{F}_{\mathrm{BB}}[k]$. The symbol blocks are then transformed into the time domain using $L_t$ parallel $K$-point inverse fast Fourier transforms (IFFTs). After adding the cyclic prefix (CP), the transmitter employs the subcarrier-independent RF precoder $\mathbf{F}_{\mathrm{RF}}$ to form the transmitted signal. The complex baseband signal at the $k$-th subcarrier can be expressed as

$$\mathbf{x}[k] = \mathbf{F}_{\mathrm{RF}}\mathbf{F}_{\mathrm{BB}}[k]\,\mathbf{s}[k], \tag{1}$$

where $\mathbf{s}[k]$ denotes the transmitted symbol sequence at the $k$-th subcarrier of size $N_s \times 1$.

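#### II-A1 Channel Model

We consider a frequency-selective MIMO channel between the transmitter and the receiver, with a delay tap length of $N_c$ in the time domain. The $d$-th delay tap of the channel is denoted by an $N_r \times N_t$ matrix $\mathbf{H}_d$, $d = 0, \ldots, N_c - 1$. Assuming a geometric channel model [Heath2018CEMain], $\mathbf{H}_d$ can be written as

$$\mathbf{H}_d = \sqrt{\frac{N_t N_r}{L \rho_L}} \sum_{\ell=1}^{L} \alpha_\ell \, p_{\mathrm{rc}}(dT_s - \tau_\ell)\, \mathbf{a}_R(\phi_\ell)\, \mathbf{a}_T^{*}(\theta_\ell), \tag{2}$$

where $\rho_L$ represents the path loss between the transmitter and the receiver; $L$ corresponds to the number of paths; $T_s$ denotes the sampling period; $p_{\mathrm{rc}}(\tau)$ is a filter that includes the effects of pulse-shaping and other lowpass filtering evaluated at $\tau$; $\alpha_\ell$ is the complex gain of the $\ell$-th path; $\tau_\ell$ is the delay of the $\ell$-th path; $\phi_\ell$ and $\theta_\ell$ are the AoA and AoD of the $\ell$-th path, respectively; and $\mathbf{a}_R(\phi_\ell)$ and $\mathbf{a}_T(\theta_\ell)$ are the array steering vectors for the receive and transmit antennas, respectively. Both the transmitter and the receiver are assumed to use Uniform Linear Arrays (ULAs) with half-wavelength separation. Such a ULA has steering vectors obeying the expressions

$$[\mathbf{a}_T(\theta_\ell)]_n = \sqrt{\frac{1}{N_t}}\, e^{jn\pi\cos(\theta_\ell)}, \quad n = 0, \ldots, N_t - 1,$$
$$[\mathbf{a}_R(\phi_\ell)]_m = \sqrt{\frac{1}{N_r}}\, e^{jm\pi\cos(\phi_\ell)}, \quad m = 0, \ldots, N_r - 1.$$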

The channel can be expressed more compactly in the following form:

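$$\mathbf{H}_d = \mathbf{A}_R \boldsymbol{\Delta}_d \mathbf{A}_T^{*}, \tag{3}$$

where $\boldsymbol{\Delta}_d \in \mathbb{C}^{L \times L}$ is diagonal with non-zero complex diagonal entries, and $\mathbf{A}_R \in \mathbb{C}^{N_r \times L}$ and $\mathbf{A}_T \in \mathbb{C}^{N_t \times L}$ contain the receive and transmit array steering vectors $\mathbf{a}_R(\phi_\ell)$ and $\mathbf{a}_T(\theta_\ell)$, respectively. The channel at subcarrier $k$ can be written in terms of the different delay taps as

$$\mathbf{H}[k] = \sum_{d=0}^{N_c-1} \mathbf{H}_d \, e^{-j\frac{2\pi k}{K}d} = \mathbf{A}_R \boldsymbol{\Delta}[k] \mathbf{A}_T^{*}, \tag{4}$$

where $\boldsymbol{\Delta}[k] \in \mathbb{C}^{L \times L}$ is diagonal with non-zero complex diagonal entries such that $\boldsymbol{\Delta}[k] = \sum_{d=0}^{N_c-1} \boldsymbol{\Delta}_d \, e^{-j\frac{2\pi k}{K}d}$, $k = 0, \ldots, K-1$.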
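The tap and subcarrier expressions in (2) and (4) can be sketched numerically as follows. This is a minimal illustration rather than the authors' simulator: the function names are hypothetical, delays are expressed in units of $T_s$, and `np.sinc` stands in for the pulse-shaping filter $p_{\mathrm{rc}}$, which the text does not specify further.

```python
import numpy as np

def ula_steering(n_ant, angle):
    """Unit-norm steering vector of a half-wavelength ULA."""
    n = np.arange(n_ant)
    return np.exp(1j * np.pi * n * np.cos(angle)) / np.sqrt(n_ant)

def delay_taps(n_t, n_r, n_c, gains, delays, aoas, aods, rho=1.0):
    """Eq. (2): taps H_d, d = 0..n_c-1, stacked as an (n_c, n_r, n_t) array.

    Delays are in units of T_s; np.sinc replaces p_rc (an assumption).
    """
    n_paths = len(gains)
    scale = np.sqrt(n_t * n_r / (n_paths * rho))
    d = np.arange(n_c)
    taps = np.zeros((n_c, n_r, n_t), dtype=complex)
    for alpha, tau, phi, theta in zip(gains, delays, aoas, aods):
        # a_R(phi) a_T(theta)^*: rank-one spatial signature of the path
        outer = np.outer(ula_steering(n_r, phi), ula_steering(n_t, theta).conj())
        pulse = np.sinc(d - tau)  # filter sampled at each delay tap
        taps += scale * alpha * pulse[:, None, None] * outer[None, :, :]
    return taps

def subcarrier_channels(taps, K):
    """Eq. (4): H[k] = sum_d H_d exp(-j 2 pi k d / K) for k = 0..K-1."""
    n_c = taps.shape[0]
    k = np.arange(K)[:, None]
    d = np.arange(n_c)[None, :]
    dft = np.exp(-2j * np.pi * k * d / K)  # (K, n_c) DFT weights
    return np.einsum("kd,drt->krt", dft, taps)

taps = delay_taps(n_t=8, n_r=4, n_c=4, gains=[1.0 + 1.0j, 0.5],
                  delays=[0.3, 1.7], aoas=[0.4, 1.2], aods=[2.0, 0.9])
H_freq = subcarrier_channels(taps, K=16)
print(H_freq.shape)  # (16, 4, 8)
```

Because (4) is a zero-padded $K$-point DFT along the tap index, the same result can be obtained with `np.fft.fft(taps, n=K, axis=0)`.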

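#### II-A2 Extended Virtual Channel Model

According to [Heath2016OverviewMmwave], we can further approximate the channel using the extended virtual channel model as

$$\mathbf{H}_d \approx \tilde{\mathbf{A}}_R \boldsymbol{\Delta}_d^{v} \tilde{\mathbf{A}}_T^{*}, \tag{5}$$

where $\boldsymbol{\Delta}_d^{v} \in \mathbb{C}^{G_r \times G_t}$ corresponds to a sparse matrix that contains the path gains in the non-zero elements. Moreover, the dictionary matrices $\tilde{\mathbf{A}}_T$ and $\tilde{\mathbf{A}}_R$ contain the transmitter and receiver array response vectors evaluated on a grid of size $G_r$ for the AoA and a grid of size $G_t$ for the AoD, i.e., $\{\tilde{\phi}_i\}_{i=1}^{G_r}$ and $\{\tilde{\theta}_j\}_{j=1}^{G_t}$, respectively:

$$\tilde{\mathbf{A}}_T = [\mathbf{a}_T(\tilde{\theta}_1) \;\ldots\; \mathbf{a}_T(\tilde{\theta}_{G_t})], \tag{6}$$
$$\tilde{\mathbf{A}}_R = [\mathbf{a}_R(\tilde{\phi}_1) \;\ldots\; \mathbf{a}_R(\tilde{\phi}_{G_r})]. \tag{7}$$

Since there are few scattering clusters in mmWave channels, the sparsity assumption for $\boldsymbol{\Delta}_d^{v}$ is commonly accepted. To help expose the sparse structure, we can express the channel at subcarrier $k$ in terms of the sparse matrices $\boldsymbol{\Delta}_d^{v}$ and the dictionaries as follows

$$\mathbf{H}[k] \approx \tilde{\mathbf{A}}_R \left( \sum_{d=0}^{N_c-1} \boldsymbol{\Delta}_d^{v} \, e^{-j\frac{2\pi k}{K}d} \right) \tilde{\mathbf{A}}_T^{*} = \tilde{\mathbf{A}}_R \boldsymbol{\Delta}^{v}[k] \tilde{\mathbf{A}}_T^{*}, \tag{8}$$

where $\boldsymbol{\Delta}^{v}[k] \in \mathbb{C}^{G_r \times G_t}$, $k = 0, \ldots, K-1$, is a complex sparse matrix containing the channel gains of the virtual channel.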
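A small sketch of the dictionary construction in (6) and (7) may help. It assumes a uniform angle grid on $[0, \pi)$, which is an illustrative choice and not necessarily the grid used in the paper; the helper names are hypothetical. For a path whose angles fall exactly on the grid, the virtual matrix has a single non-zero entry and the approximation in (5) becomes exact.

```python
import numpy as np

def ula_steering(n_ant, angle):
    """Unit-norm steering vector of a half-wavelength ULA."""
    n = np.arange(n_ant)
    return np.exp(1j * np.pi * n * np.cos(angle)) / np.sqrt(n_ant)

def dictionary(n_ant, grid_size):
    """Steering vectors on an assumed uniform angle grid in [0, pi), one per column."""
    grid = np.arange(grid_size) * np.pi / grid_size
    return np.stack([ula_steering(n_ant, a) for a in grid], axis=1), grid

n_t, n_r, g_t, g_r = 8, 4, 16, 8
A_T, theta_grid = dictionary(n_t, g_t)   # N_t x G_t, eq. (6)
A_R, phi_grid = dictionary(n_r, g_r)     # N_r x G_r, eq. (7)

# One on-grid path: AoA grid index 3, AoD grid index 5, complex gain alpha.
alpha = 0.7 - 0.2j
Delta_v = np.zeros((g_r, g_t), dtype=complex)
Delta_v[3, 5] = alpha

# Virtual-channel form versus the direct rank-one path expression.
H_virtual = A_R @ Delta_v @ A_T.conj().T
H_direct = alpha * np.outer(ula_steering(n_r, phi_grid[3]),
                            ula_steering(n_t, theta_grid[5]).conj())
print(np.allclose(H_virtual, H_direct))  # True
```

Off-grid angles would spread energy over neighboring grid cells, which is one motivation for refining the grid resolution after the support has been found.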

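#### II-A3 Signal Reception

Considering that the receiver (RX) applies a hybrid combiner $\mathbf{W}[k] = \mathbf{W}_{\mathrm{RF}}\mathbf{W}_{\mathrm{BB}}[k] \in \mathbb{C}^{N_r \times N_s}$, the received signal at subcarrier $k$ can be expressed as

$$\mathbf{y}[k] = \mathbf{W}_{\mathrm{BB}}^{*}[k]\mathbf{W}_{\mathrm{RF}}^{*}\mathbf{H}[k]\mathbf{F}_{\mathrm{RF}}\mathbf{F}_{\mathrm{BB}}[k]\mathbf{s}[k] + \mathbf{W}_{\mathrm{BB}}^{*}[k]\mathbf{W}_{\mathrm{RF}}^{*}\mathbf{n}[k], \tag{9}$$

where $\mathbf{n}[k] \sim \mathcal{CN}(\mathbf{0}, \sigma^2 \mathbf{I})$ corresponds to the circularly symmetric complex Gaussian distributed additive noise vector. The received signal model in (9) corresponds to the data transmission phase. As explained in Section III, during the channel acquisition phase, frequency-flat training precoders and combiners will be considered to reduce complexity.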

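### II-B Problem Formulation

During the training phase, the transmitter and receiver use a training precoder $\mathbf{F}_{\mathrm{tr}}^{(m)} \in \mathbb{C}^{N_t \times L_t}$ and a training combiner $\mathbf{W}_{\mathrm{tr}}^{(m)} \in \mathbb{C}^{N_r \times L_r}$ for the $m$-th pilot training frame, respectively. The precoders and combiners considered in this phase are frequency-flat to keep the complexity of the sparse recovery algorithms low. The transmitted symbols are assumed to satisfy $\mathbb{E}\{\mathbf{s}^{(m)}[k]\mathbf{s}^{(m)*}[k]\} = \frac{P}{N_s}\mathbf{I}_{N_s}$, where $P$ is the total transmitted power and $N_s = L_t$. The transmitted symbol $\mathbf{s}^{(m)}[k]$ is decomposed as $\mathbf{s}^{(m)}[k] = \mathbf{q}^{(m)} t^{(m)}[k]$, where $\mathbf{q}^{(m)} \in \mathbb{C}^{L_t \times 1}$ is a frequency-flat vector and $t^{(m)}[k]$ is a pilot symbol known at the receiver. This decomposition is used to reduce computational complexity since it allows simultaneous use of the $L_t$ spatial degrees of freedom coming from $L_t$ RF chains and enables channel estimation using a single subcarrier-independent measurement matrix. Moreover, each entry in $\mathbf{F}_{\mathrm{tr}}^{(m)}$ and in $\mathbf{W}_{\mathrm{tr}}^{(m)}$ is normalized such that its squared modulus is $\frac{1}{N_t}$ and $\frac{1}{N_r}$, respectively. Then, the received samples in the frequency domain for the $m$-th training frame can be expressed as

$$\mathbf{y}^{(m)}[k] = \mathbf{W}_{\mathrm{tr}}^{(m)*}\mathbf{H}[k]\mathbf{F}_{\mathrm{tr}}^{(m)}\mathbf{q}^{(m)} t^{(m)}[k] + \mathbf{n}_c^{(m)}[k], \tag{10}$$

where $\mathbf{H}[k] \in \mathbb{C}^{N_r \times N_t}$ denotes the frequency-domain MIMO channel response at the $k$-th subcarrier and $\mathbf{n}_c^{(m)}[k] = \mathbf{W}_{\mathrm{tr}}^{(m)*}\mathbf{n}^{(m)}[k] \in \mathbb{C}^{L_r \times 1}$ represents the frequency-domain combined noise vector received at the $k$-th subcarrier. The average received SNR is given by $\mathrm{SNR} = \frac{P}{\rho_L \sigma^2}$. Furthermore, the channel coherence time is assumed to be larger than the frame duration, so that the same channel can be considered for several consecutive frames.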

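#### II-B1 Measurement Matrix

In order to apply sparse reconstruction with a single subcarrier-independent measurement matrix, we first remove the effect of the scalar $t^{(m)}[k]$ by multiplying the received signal by $t^{(m)}[k]^{-1}$. Using the property $\mathrm{vec}\{\mathbf{A}\mathbf{X}\mathbf{C}\} = (\mathbf{C}^T \otimes \mathbf{A})\,\mathrm{vec}\{\mathbf{X}\}$, the vectorized received signal is given by

$$\mathrm{vec}\{\mathbf{y}^{(m)}[k]\} = \left(\mathbf{q}^{(m)T}\mathbf{F}_{\mathrm{tr}}^{(m)T} \otimes \mathbf{W}_{\mathrm{tr}}^{(m)*}\right)\mathrm{vec}\{\mathbf{H}[k]\} + \mathbf{n}_c^{(m)}[k]. \tag{11}$$

The vectorized channel matrix can be expressed as

$$\mathrm{vec}\{\mathbf{H}[k]\} = \left(\overline{\tilde{\mathbf{A}}}_T \otimes \tilde{\mathbf{A}}_R\right)\mathrm{vec}\{\boldsymbol{\Delta}^{v}[k]\}. \tag{12}$$

Furthermore, we define the measurement matrix $\boldsymbol{\Phi}^{(m)} \in \mathbb{C}^{L_r \times N_t N_r}$:

$$\boldsymbol{\Phi}^{(m)} = \left(\mathbf{q}^{(m)T}\mathbf{F}_{\mathrm{tr}}^{(m)T} \otimes \mathbf{W}_{\mathrm{tr}}^{(m)*}\right), \tag{13}$$

and the dictionary $\boldsymbol{\Psi} \in \mathbb{C}^{N_t N_r \times G_t G_r}$ as

$$\boldsymbol{\Psi} = \left(\overline{\tilde{\mathbf{A}}}_T \otimes \tilde{\mathbf{A}}_R\right). \tag{14}$$

Then, the vectorized received pilot signal at the $m$-th training symbol can be written as

$$\mathrm{vec}\{\mathbf{y}^{(m)}[k]\} = \boldsymbol{\Phi}^{(m)}\boldsymbol{\Psi}\mathbf{h}^{v}[k] + \mathbf{n}_c^{(m)}[k], \tag{15}$$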

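where $\mathbf{h}^{v}[k] = \mathrm{vec}\{\boldsymbol{\Delta}^{v}[k]\} \in \mathbb{C}^{G_r G_t \times 1}$ is the sparse vector containing the complex channel gains. Moreover, we use several training frames to get enough measurements and accurately reconstruct the sparse vector $\mathbf{h}^{v}[k]$, especially in the very-low SNR regime. Therefore, when the transmitter and receiver communicate during $M$ training steps using different pseudorandomly built precoders and combiners, (15) can be extended to $M$ received signals given by

$$\underbrace{\begin{bmatrix} \mathbf{y}^{(1)}[k] \\ \vdots \\ \mathbf{y}^{(M)}[k] \end{bmatrix}}_{\mathbf{y}[k]} = \underbrace{\begin{bmatrix} \boldsymbol{\Phi}^{(1)} \\ \vdots \\ \boldsymbol{\Phi}^{(M)} \end{bmatrix}}_{\boldsymbol{\Phi}} \boldsymbol{\Psi}\mathbf{h}^{v}[k] + \underbrace{\begin{bmatrix} \mathbf{n}_c^{(1)}[k] \\ \vdots \\ \mathbf{n}_c^{(M)}[k] \end{bmatrix}}_{\mathbf{n}_c[k]}. \tag{16}$$

Hence, the vector $\mathbf{h}^{v}[k]$ can be estimated by solving the sparse reconstruction problem as done in [Heath2018CEMain],

$$\min \|\mathbf{h}^{v}[k]\|_1 \quad \text{subject to} \quad \|\mathbf{y}[k] - \boldsymbol{\Phi}\boldsymbol{\Psi}\mathbf{h}^{v}[k]\|_2^2 < \epsilon, \tag{17}$$

where $\epsilon$ represents a tunable parameter defining the maximum error between the reconstructed channel and the received signal. In realistic scenarios, the sparsity (number of channel paths) is usually unknown; therefore, the choice of $\epsilon$ is critical to solve (17) and estimate the sparsity level. The choice of this parameter is explained in Section III-D.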
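The Kronecker structure of the measurement matrix can be checked numerically. The sketch below builds $\boldsymbol{\Phi}^{(m)}$ for a few training frames and verifies that the stacked matrix maps $\mathrm{vec}\{\mathbf{H}[k]\}$ to the noiseless received pilots. The 2-bit phase quantization, the random choice of $\mathbf{q}^{(m)}$, and all function names are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def vec(X):
    """Column-major vectorization, matching vec{.} in the text."""
    return X.reshape(-1, order="F")

def random_phase_matrix(n_rows, n_cols, bits=2):
    """Phase-only matrix with quantized phases and squared modulus 1/n_rows per entry."""
    levels = 2 ** bits
    phases = 2 * np.pi * rng.integers(0, levels, size=(n_rows, n_cols)) / levels
    return np.exp(1j * phases) / np.sqrt(n_rows)

n_t, n_r, l_t, l_r, n_frames = 8, 4, 2, 2, 5
H = (rng.standard_normal((n_r, n_t)) + 1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2)

Phi_blocks, y_blocks = [], []
for m in range(n_frames):
    F_tr = random_phase_matrix(n_t, l_t)
    W_tr = random_phase_matrix(n_r, l_r)
    # Hypothetical frequency-flat vector q^(m) with unit norm.
    q = np.exp(2j * np.pi * rng.random(l_t)) / np.sqrt(l_t)
    # Eq. (13): Phi^(m) = (q^T F_tr^T) kron W_tr^*, size L_r x (N_t N_r).
    Phi_blocks.append(np.kron((F_tr @ q)[None, :], W_tr.conj().T))
    # Noiseless received pilot after removing the pilot scalar, cf. eq. (10).
    y_blocks.append(W_tr.conj().T @ H @ F_tr @ q)

Phi = np.vstack(Phi_blocks)        # stacked measurement matrix of eq. (16)
y = np.concatenate(y_blocks)       # stacked noiseless measurements

print(Phi.shape)                       # (10, 32)
print(np.allclose(Phi @ vec(H), y))    # True
```

With only $M L_r = 10$ measurements for $N_t N_r = 32$ unknowns, the linear system is underdetermined, which is why the sparse dictionary representation in (12) is needed to recover the channel.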

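Interestingly, the matrices $\boldsymbol{\Delta}^{v}[k]$ in (8) exhibit the same sparse structure for all $k$, since the AoA and AoD do not change with frequency in the transmission bandwidth. This property can be leveraged when solving the compressed channel estimation problem defined in (17). Moreover, we denote the supports of the virtual channel matrices $\boldsymbol{\Delta}_d^{v}$ as $\mathrm{supp}\{\mathrm{vec}\{\boldsymbol{\Delta}_d^{v}\}\}$, $d = 0, \ldots, N_c - 1$. Then, knowing that $\boldsymbol{\Delta}^{v}[k] = \sum_{d=0}^{N_c-1} \boldsymbol{\Delta}_d^{v} e^{-j\frac{2\pi k}{K}d}$, with $\mathbf{h}^{v}[k] = \mathrm{vec}\{\boldsymbol{\Delta}^{v}[k]\}$, $k = 0, \ldots, K-1$, the supports of $\mathbf{h}^{v}[k]$ are defined as

$$\mathrm{supp}\{\mathbf{h}^{v}[k]\} = \bigcup_{d=0}^{N_c-1} \mathrm{supp}\{\mathrm{vec}\{\boldsymbol{\Delta}_d^{v}\}\}, \quad k = 0, \ldots, K-1, \tag{18}$$

where the union of the supports of the time-domain virtual channel matrices is due to the additive nature of the Fourier transform. Therefore, as shown in (18), where the union is independent of the subcarrier $k$, $\mathbf{h}^{v}[k]$ has the same supports for all $k$.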
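The common-support property in (18) can be illustrated with a short numerical experiment. The sketch draws sparse time-domain virtual matrices with different supports per tap (the sizes and the two-entries-per-tap choice are arbitrary assumptions) and checks that every subcarrier shares the union of those supports.

```python
import numpy as np

rng = np.random.default_rng(2)
g_r, g_t, n_c, K = 8, 16, 4, 16

# Sparse time-domain virtual matrices, each tap with its own random support.
taps = np.zeros((n_c, g_r, g_t), dtype=complex)
for d in range(n_c):
    idx = rng.choice(g_r * g_t, size=2, replace=False)
    rows, cols = np.unravel_index(idx, (g_r, g_t))
    taps[d, rows, cols] = rng.standard_normal(2) + 1j * rng.standard_normal(2)

# Frequency-domain virtual matrices: K-point DFT along the tap index.
freq = np.fft.fft(taps, n=K, axis=0)

union_support = np.any(taps != 0, axis=0)      # right-hand side of eq. (18)
freq_support = np.abs(freq) > 1e-12            # support at every subcarrier

same_for_all_k = all(np.array_equal(freq_support[k], union_support) for k in range(K))
print(same_for_all_k)
```

For generic complex gains the equality holds at every subcarrier; exact cancellation across taps at a given grid cell would require specially tuned gains and is not expected in practice.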

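#### II-B2 Correlation Matrix

To estimate the multi-path components of the channel, i.e., AoAs/AoDs and channel gains, we first need to compute the atom, which is defined as the vector that produces the largest sum-correlation with the received signals in the measurement matrix. The sum-correlation is considered because the support of the different sparse vectors is the same over the subcarriers. The correlation vector $\mathbf{c}[k]$ is given by

$$\mathbf{c}[k] = \boldsymbol{\Upsilon}^{*}\mathbf{y}[k], \tag{19}$$

where $\boldsymbol{\Upsilon} = \boldsymbol{\Phi}\boldsymbol{\Psi} \in \mathbb{C}^{M L_r \times G_t G_r}$ represents the equivalent measurement matrix, which is the same for all subcarriers, and $\mathbf{y}[k]$ is the received signal for a given $k$, $k = 0, \ldots, K-1$.

One can note that if there exists a correlation between noise components, the atom estimated from the projection in (19) might not be the correct one. In order to compensate for this error in estimation, we consider the noise covariance matrix when performing the correlation step. In particular, we consider two arbitrary (hybrid) combiners $\mathbf{W}_{\mathrm{tr}}^{(i)}$, $\mathbf{W}_{\mathrm{tr}}^{(j)}$ for two arbitrary training steps $i$, $j$ and a given subcarrier $k$. Hence, the combined noise at a given training step $i$ and subcarrier $k$ is represented as $\mathbf{n}_c^{(i)}[k] = \mathbf{W}_{\mathrm{tr}}^{(i)*}\mathbf{n}^{(i)}[k]$, with $\mathbf{n}^{(i)}[k] \sim \mathcal{CN}(\mathbf{0}, \sigma^2\mathbf{I}_{N_r})$, which results in a noise cross-covariance matrix given by $\mathbb{E}\{\mathbf{n}_c^{(i)}[k]\mathbf{n}_c^{(j)*}[k]\} = \sigma^2 \mathbf{W}_{\mathrm{tr}}^{(i)*}\mathbf{W}_{\mathrm{tr}}^{(j)}\delta[i-j]$. We can further write the noise covariance matrix of $\mathbf{n}_c[k]$, up to the scaling $\sigma^2$, as a block diagonal matrix $\mathbf{C}_w \in \mathbb{C}^{M L_r \times M L_r}$,

$$\mathbf{C}_w = \mathrm{blkdiag}\left\{\mathbf{W}_{\mathrm{tr}}^{(1)*}\mathbf{W}_{\mathrm{tr}}^{(1)}, \ldots, \mathbf{W}_{\mathrm{tr}}^{(M)*}\mathbf{W}_{\mathrm{tr}}^{(M)}\right\}. \tag{20}$$

Moreover, Cholesky factorization can be used to factorize $\mathbf{C}_w$ into $\mathbf{C}_w = \mathbf{D}_w^{*}\mathbf{D}_w$, where $\mathbf{D}_w$ is an upper triangular matrix. Then, by taking into consideration the noise covariance matrix, the correlation step is given by

$$\mathbf{c}[k] = \boldsymbol{\Upsilon}_w^{*}\mathbf{y}_w[k], \tag{21}$$

where $\boldsymbol{\Upsilon}_w$ represents the whitened measurement matrix given by $\boldsymbol{\Upsilon}_w = \mathbf{D}_w^{-*}\boldsymbol{\Phi}\boldsymbol{\Psi}$, and the whitened received signal is given by $\mathbf{y}_w[k] = \mathbf{D}_w^{-*}\mathbf{y}[k]$. Since $\mathbf{C}_w$ is block diagonal, the matrix $\mathbf{D}_w^{-1}$ is given by $\mathbf{D}_w^{-1} = \mathrm{blkdiag}\{(\mathbf{D}_w^{(1)})^{-1}, \ldots, (\mathbf{D}_w^{(M)})^{-1}\}$, where $(\mathbf{D}_w^{(m)})^{-1}$ can be considered as a frequency-flat baseband combiner used in the $m$-th training step. Therefore, by applying the whitened measurement matrix, the resulting correlation simultaneously whitens the spatial noise components and estimates a more accurate support index in the sparse vectors $\mathbf{h}^{v}[k]$.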
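The whitening step can be sketched with NumPy's Cholesky routine. Note that `np.linalg.cholesky` returns a lower triangular factor $\mathbf{L}$ with $\mathbf{C}_w = \mathbf{L}\mathbf{L}^{*}$, so the upper triangular factor in the text corresponds to $\mathbf{D}_w = \mathbf{L}^{*}$ and the whitening operator $\mathbf{D}_w^{-*}$ equals $\mathbf{L}^{-1}$. The combiners use continuous random phases here and the equivalent measurement matrix is a random stand-in; both are assumptions for illustration, as are the helper names.

```python
import numpy as np

rng = np.random.default_rng(3)

def crandn(*shape):
    """Standard circularly symmetric complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def block_diag(blocks):
    """Block diagonal matrix built from a list of square blocks."""
    n = sum(b.shape[0] for b in blocks)
    out = np.zeros((n, n), dtype=complex)
    start = 0
    for b in blocks:
        size = b.shape[0]
        out[start:start + size, start:start + size] = b
        start += size
    return out

n_r, l_r, n_frames, n_atoms = 8, 2, 4, 6
W_list = [np.exp(2j * np.pi * rng.random((n_r, l_r))) / np.sqrt(n_r)
          for _ in range(n_frames)]

C_w = block_diag([W.conj().T @ W for W in W_list])   # eq. (20)
L_chol = np.linalg.cholesky(C_w)                     # C_w = L L^*, L lower triangular
D_w = L_chol.conj().T                                # upper triangular, C_w = D_w^* D_w

Upsilon = crandn(n_frames * l_r, n_atoms)            # stand-in for Phi @ Psi
y = crandn(n_frames * l_r)                           # stand-in for y[k]

# Apply D_w^{-*} = L^{-1} through linear solves instead of explicit inversion.
Upsilon_w = np.linalg.solve(L_chol, Upsilon)
y_w = np.linalg.solve(L_chol, y)
c = Upsilon_w.conj().T @ y_w                         # eq. (21)
```

The whitened correlation equals $\boldsymbol{\Upsilon}^{*}\mathbf{C}_w^{-1}\mathbf{y}[k]$, so the factorization only needs to be computed once and can be reused for every subcarrier.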

## III Deep Learning and Compressive-Sensing Based Channel Estimation (DL-CS-CE)

To solve the CS channel estimation problem formulated above, this section proposes two DL-based algorithms. Both leverage the common support between the channel matrices for every subcarrier and provide different complexity-performance trade-offs. The former simultaneously estimates the support using an offline-trained DnCNN and then reconstructs the channel. The latter applies further fine-tuning to accurately estimate the AoAs and AoDs with higher-resolution dictionary matrices while keeping computational complexity low.

### III-A Offline Training and Online Deployment of DnCNN

Before delving into the proposed solutions’ details, let us first provide insights into the considered DnCNN architecture as well as its offline training and online deployment.

#### III-A1 DnCNN Architecture

Fig. 2 illustrates the network architecture of the DnCNN denoiser that consists of convolutional (Conv) layers. Each layer uses different filters. The first convolutional layer is followed by a rectified linear unit (ReLU). The succeeding convolutional layers are followed by batch-normalization (BN) and a ReLU. The final convolutional layer uses one separate filter to reconstruct the signal. Here, , and are the convolutional kernel dimensions, and is the number of filters in the layer.

We present three pseudo-color images of the noisy channel, residual noise, and estimated output channel in Fig. 2. The DnCNN considers the amplitude of the correlation matrix, i.e.,

$$\mathbf{C}_\alpha[k] = \mathrm{vec2mat}\left(|\mathbf{c}[k]|, [G_r, G_t]\right), \quad \forall k, \tag{22}$$

as input and produces residual noise as an output, rather than estimated channel amplitudes, where we define a matrix of channel amplitudes as

$$\mathbf{G}[k] = |\boldsymbol{\Delta}^{v}[k]| \in \mathbb{R}^{G_r \times G_t}, \quad \forall k. \tag{23}$$

The DnCNN aims to learn a mapping function to predict the latent clean image $\mathbf{G}[k]$ from the noisy observation $\mathbf{C}_\alpha[k]$. We adopt the residual learning formulation to train a residual mapping $\mathcal{R}(\mathbf{C}_\alpha[k]) \approx \mathbf{C}_\alpha[k] - \mathbf{G}[k]$, i.e., the residual noise, and then we have $\hat{\mathbf{G}}[k] = \mathbf{C}_\alpha[k] - \mathcal{R}(\mathbf{C}_\alpha[k])$. Instead of learning a mapping directly from a noisy image to a denoised image, learning the residual noise is beneficial [Zhang2017Dncnn, he2016deep]. Furthermore, the averaged mean squared error between the desired residual images and the ones estimated from the noisy input is adopted as the loss function to learn the trainable parameters $\Theta$ of the DnCNN. This loss function is given by

$$\ell(\Theta) = \frac{1}{2N}\sum_{i=1}^{N} \left\| \mathcal{R}(\mathbf{C}_\alpha[k]_i; \Theta) - \left(\mathbf{C}_\alpha[k]_i - \mathbf{G}[k]_i\right) \right\|_F^2, \tag{24}$$

where $\{(\mathbf{C}_\alpha[k]_i, \mathbf{G}[k]_i)\}_{i=1}^{N}$ represents $N$ noisy-clean training patch pairs. This method is also known as residual learning [he2016deep]; it leads the DnCNN to remove the highly structured clean image rather than the unstructured noise, so that the network output is the noise itself. Consequently, residual learning improves both the training time and the accuracy of a network. In this way, combining batch normalization and residual learning can accelerate training and improve the denoising performance. Besides, batch normalization has been shown to offer some merits for residual learning, such as alleviating the internal covariate shift problem [Zhang2017Dncnn, Jin2019CellFreeDLCE].
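The residual formulation and the loss in (24) can be written compactly. The sketch below is framework-agnostic NumPy with hypothetical function names; `residual_fn` stands in for the trained DnCNN, which is not implemented here, and the oracle used in the example exists only to show that a perfect residual predictor gives zero loss.

```python
import numpy as np

def residual_loss(pred_residual, noisy, clean):
    """Eq. (24): (1 / 2N) * sum_i || R(C_i) - (C_i - G_i) ||_F^2 over N patches.

    All arrays have shape (N, G_r, G_t).
    """
    target = noisy - clean                      # true residual noise per patch
    n_patches = noisy.shape[0]
    return np.sum(np.abs(pred_residual - target) ** 2) / (2.0 * n_patches)

def denoise(noisy, residual_fn):
    """Residual learning: amplitude estimate = input minus predicted noise."""
    return noisy - residual_fn(noisy)

rng = np.random.default_rng(4)
clean = rng.random((5, 8, 16))                              # stand-in for G[k]
noisy = clean + 0.1 * rng.standard_normal(clean.shape)      # stand-in for C_alpha[k]

# An oracle residual predictor (for illustration only) yields zero loss.
oracle = lambda c: c - clean
print(residual_loss(oracle(noisy), noisy, clean))           # 0.0

# A predictor that outputs no noise at all leaves the input unchanged.
zero_fn = lambda c: np.zeros_like(c)
baseline = residual_loss(zero_fn(noisy), noisy, clean)
```

In an actual training loop the same quantity would be minimized over the network parameters $\Theta$ with a gradient-based optimizer in a deep learning framework.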

#### III-A2 Offline Training of the DnCNN

During offline training of the DnCNN, the dataset of $\mathbf{C}_\alpha[k]$ and $\mathbf{G}[k]$ is generated based on the realistic Raymobtime dataset for the mmWave frequency-selective channel environment (Raymobtime is a collection of realistic datasets obtained by ray tracing in realistic 3D scenarios that consider mobility, time, frequency, and space; available at https://www.lasse.ufpa.br/raymobtime/). With the mmWave channel amplitude in (23) and the correlation of the received signals and the measurement matrix in (22), the training data of $\mathbf{C}_\alpha[k]$ and $\mathbf{G}[k]$ can be obtained. In particular, the process to obtain $\mathbf{C}_\alpha[k]$ and $\mathbf{G}[k]$ involves the following four steps: i) generating channel matrices based on the mmWave channel model from the Raymobtime dataset; ii) obtaining $\mathbf{G}[k]$ based on (23); iii) computing the whitened received signal vector $\mathbf{y}_w[k]$; and iv) acquiring the amplitudes of the correlation vector $\mathbf{c}[k]$ and transforming it into a matrix form as per (22).

#### III-A3 Online Deployment of the DnCNN

During the online deployment of the DL-CS-CE, we obtain the measured received signal $\mathbf{y}[k]$ from the realistic mmWave channel environments. We compute $\mathbf{C}_\alpha[k]$ based on (22), which is then fed to the offline-trained DnCNN. Then, the trained DnCNN predicts $\hat{\mathbf{G}}[k]$, from which we can estimate the supports of $\mathbf{h}^{v}[k]$. A noteworthy point is that we can feed the trained DnCNN the amplitudes of the correlation matrices $\mathbf{C}_\alpha[k]$ for only a subset of subcarriers $k \in \mathcal{K}$ to estimate the support of $\mathbf{h}^{v}[k]$, since, as shown in Section II-B1, the vectors $\mathbf{h}^{v}[k]$ have the same support for all $k$. In particular, the support can be estimated if a small number of subcarriers, $|\mathcal{K}| \ll K$, is used instead. This eliminates the need for computing $\mathbf{C}_\alpha[k]$ for all subcarriers and reduces the overall computational complexity at the cost of a negligible performance degradation. The subset is chosen by leveraging the triangle inequality, such that the selected signals are expected to exhibit the strongest channel response. Therefore, the $|\mathcal{K}|$ subcarriers having the largest $\ell_2$-norm will be exploited to derive an estimate of the support of the already defined sparse channel matrix $\boldsymbol{\Delta}^{v}[k]$, $k = 0, \ldots, K-1$.
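A minimal sketch of the subcarrier selection rule described above: rank the subcarriers by the $\ell_2$-norm of their (whitened) received vectors and keep the strongest ones. The function name and the parameter `k_p` (the number of retained subcarriers) are hypothetical, and the vectorized sort stands in for whatever iterative search the algorithm listing uses.

```python
import numpy as np

def strongest_subcarriers(y_w, k_p):
    """Indices of the k_p subcarriers whose received vectors have the largest l2-norm.

    y_w has shape (K, M * L_r): one stacked measurement vector per subcarrier.
    """
    norms = np.linalg.norm(y_w, axis=1)
    top = np.argsort(norms)[::-1][:k_p]   # descending by norm
    return np.sort(top)                   # report indices in increasing order

# Four subcarriers with norms 1, 5, 2, and 10.
y_w = np.array([[1.0, 0.0],
                [3.0, 4.0],
                [0.0, 2.0],
                [6.0, 8.0]], dtype=complex)
print(strongest_subcarriers(y_w, 2))  # [1 3]
```

Only the selected subcarriers then need a correlation vector and a DnCNN forward pass, which is where the complexity saving comes from.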

### III-B Algorithm 1: DL-CS-CE

The state-of-the-art sparse channel estimation schemes [Heath2018CEMain, and references therein] depend on greedy algorithms to detect the supports sequentially, which naturally yield suboptimal solutions. This motivated us to exploit the neural networks to estimate all supports simultaneously rather than sequentially. The algorithmic implementation of the proposed DL-CS-CE solution is presented in Algorithm 1. After initialization steps between lines 1-3 and the computation of the whitened equivalent observation matrix in line 4, DL-CS-CE is structured based on three main procedures:

• Estimation of the channel amplitudes by using an offline-trained DnCNN,

• Sorting the estimated channel amplitudes in descending order to select the supports of dominant entries,

• Reconstruction of the channel according to the selected indices,

which are explained in the sequel.

#### III-B1 Strongest Subcarriers Selection

This procedure is represented in lines 8-11 of Algorithm 1, where the algorithm iteratively finds a subset containing the strongest subcarriers which are expected to exhibit the strongest channel response as explained in Section III-A3.

#### III-B2 Amplitude Estimation

As depicted in Fig. 3, lines 13 and 14 of Algorithm 1 first compute the correlation vector $\mathbf{c}[k]$ as per (21) and then create the DnCNN input $\mathbf{C}_\alpha[k]$ by putting the correlation vectors into a matrix form as per (22), respectively. In line 15, the offline-trained DnCNN is used as the kernel of the channel amplitude estimation to obtain the DnCNN output $\hat{\mathbf{G}}[k]$ of size $G_r \times G_t$, which is the estimate of $\mathbf{G}[k]$ given in (23). It is worth noting that we only use a subset of the correlation matrices as an input to the DnCNN. In line 16, the output channel amplitude estimation matrix is then vectorized into the following vector form

$$\hat{\mathbf{g}}[k] = \mathrm{vec}(\hat{\mathbf{G}}[k]), \quad \forall k \in \mathcal{K}, \tag{25}$$

where the indices of the maximum amplitudes of $\hat{\mathbf{g}}[k]$ will be exploited for support detection.
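One way the vectorized amplitude estimates could be turned into an ordered list of candidate support indices is sketched below: sum the amplitudes over the selected subcarriers (the support is common to all of them), sort in descending order, and map each linear index back to its AoA/AoD grid position. The function name is hypothetical and the column-major index convention is assumed to match $\mathrm{vec}(\cdot)$.

```python
import numpy as np

def ordered_support(g_hat, g_r, g_t):
    """Rank grid cells by total estimated amplitude.

    g_hat: (num_subcarriers, G_r * G_t) array of column-major vectorized amplitudes.
    Returns the linear indices in descending order of summed amplitude, together
    with the matching AoA (row) and AoD (column) grid indices.
    """
    total = g_hat.sum(axis=0)                      # common support: sum over subcarriers
    order = np.argsort(-total, kind="stable")      # descending, ties keep index order
    aoa_idx, aod_idx = np.unravel_index(order, (g_r, g_t), order="F")
    return order, aoa_idx, aod_idx

g_r, g_t = 4, 3
G = np.zeros((g_r, g_t))
G[3, 1] = 2.0    # strongest path: AoA index 3, AoD index 1
G[0, 2] = 1.0    # second path: AoA index 0, AoD index 2
g_hat = np.stack([G.reshape(-1, order="F"), 0.5 * G.reshape(-1, order="F")])

order, aoa_idx, aod_idx = ordered_support(g_hat, g_r, g_t)
print(order[:2], aoa_idx[:2], aod_idx[:2])  # [7 8] [3 0] [1 2]
```

A reconstruction loop would then add these indices to the detected support one at a time until the residual error drops below the chosen threshold.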

#### III-B3 Multicarrier Channel Reconstruction

This procedure corresponds to the last stage of the block diagram in Fig. 3(b). It detects supports by iteratively updating the residual until the MSE falls below a predetermined threshold, $\epsilon$. After the initialization steps in lines 19 and 20, line 19 first sums the predicted amplitudes $\hat{\mathbf{g}}[k]$ over the subcarriers $k \in \mathcal{K}$, as the supports are the same for all $k$ [cf. Section II-B1]. Then, the IndexSortDescend function sorts the sum vector in descending order and returns the corresponding ordered index set. Thereafter, the while loop between lines 22 and 28 follows the steps below until the termination condition is satisfied:

Line 23 updates the detected support set by adding the next element of the ordered index set. Then, line 24 projects the input signal onto the subspace given by the detected support using Weighted Least-Squares (WLS)