Massive multiple-input multiple-output (m-MIMO) is a key enabling technology for 5G wireless communication systems due to its high spectral efficiency. Nevertheless, a necessary yet challenging prerequisite for m-MIMO is to obtain high-dimensional channel state information (CSI) at the transmitter. In frequency division duplex (FDD) systems, downlink CSI is acquired at the user equipment (UE) and then reported to the base station (BS) through a feedback channel. The large number of antennas in m-MIMO systems results in a high-dimensional channel matrix, which imposes a heavy burden on the feedback link.
The application of deep learning to wireless communication systems has been an active research topic in recent years, and neural networks (NNs) have been successfully deployed in CSI feedback, end-to-end communication [2], and channel estimation [3], [4]. Benefiting from the considerable progress of NNs in computer vision and wireless communication, a convolutional NN-based approach, namely CsiNet, was proposed in [1], exhibiting superior performance in CSI compression by treating the angular-delay domain channel matrix as a sparse 2D image. Tailoring the design of CsiNet to various scenarios, e.g., with high mobility and temporal correlation, a series of works then developed NN-based methods for m-MIMO CSI feedback. Existing works fall mainly into two categories. One category uses more sophisticated network architectures, including the joint convolutional residual network JC-ResNet [5], CsiNet+ [6], and the multi-resolution CRNet [7]. The other category exploits extra correlation information. Recurrent NN architectures were employed in [8] and [9] to exploit the temporal correlation of CSI in time-varying channels. In [10], the uplink channel was used as an auxiliary input of the NN to assist the reconstruction of compressed downlink CSI, which assumes reciprocity between the uplink and downlink channels. In [11], a cooperative recovery network, named CoCsiNet, was proposed to reduce the feedback overhead by exploiting the shared information of UEs in proximity. A new module, named Anci-block, was devised in [12] by exploiting the visualized characteristics of the angular-delay domain channel matrix.
Because the channel gains are complex-valued, existing methods typically stack the real and imaginary parts into a single real-valued input to fit the requirements of popular NN designs. However, it is natural to conjecture that some kind of similarity exists between the two parts, even though the real part is statistically independent of the imaginary part if the channel follows a circularly symmetric complex Gaussian distribution. In this article, we observe that the real and imaginary parts of the complex-valued channel matrix in fact share almost the same correlation, even though the correlations in the angular and delay domains are distinct. Based on this finding, we devise an efficient NN, namely ENet, for m-MIMO CSI feedback with a substantially reduced size. We adopt individual compression strategies for the angular and delay domains to exploit domain-specific correlations. Instead of stacking the real and imaginary parts together, as in existing NN architectures for CSI feedback, we propose a scheme of one-part training and two-part deployment to take advantage of the similarity in the correlations of the real and imaginary parts of CSI. In particular, in our proposed ENet, only the real part of the CSI matrix is used for network training, while both the real and imaginary parts are compressed and fed back using the same trained network. Although the network size is reduced by nearly an order of magnitude, ENet performs favorably compared with existing NN-based methods.
II System Model
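We consider the downlink of an FDD m-MIMO system, in which $N_t$ antennas are deployed at the BS and a single antenna is installed at the UE. Orthogonal frequency division multiplexing (OFDM) with $N_c$ subcarriers is used, where the received signal at the $n$th subcarrier is expressed as
$$y_n = \tilde{\mathbf{h}}_n^H \mathbf{v}_n x_n + z_n, \tag{1}$$
where $\tilde{\mathbf{h}}_n \in \mathbb{C}^{N_t \times 1}$ denotes the channel vector of the $n$th subcarrier, $\mathbf{v}_n \in \mathbb{C}^{N_t \times 1}$ is the precoding vector, $x_n$ is the transmit symbol, and $z_n$ is the additive noise.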
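The downlink channel matrix in the spatial-frequency domain is denoted by $\tilde{\mathbf{H}} = [\tilde{\mathbf{h}}_1, \ldots, \tilde{\mathbf{h}}_{N_c}]^H \in \mathbb{C}^{N_c \times N_t}$, with $N_c$ subcarriers and $N_t$ BS antennas, which requires a total feedback of $N_c N_t$ complex-valued scalars if no compression is adopted. This can be extremely large in m-MIMO systems, where a typical antenna array has a few hundred elements. In order to compress the CSI for feedback through a bandwidth-limited feedback channel, we transform $\tilde{\mathbf{H}}$ to the angular-delay domain, which explicitly exposes the channel sparsity [13]. By using a 2D discrete Fourier transform (DFT), the CSI in the angular-delay domain is expressed as
$$\bar{\mathbf{H}} = \mathbf{F}_d \tilde{\mathbf{H}} \mathbf{F}_a^H, \tag{2}$$
where $\mathbf{F}_d \in \mathbb{C}^{N_c \times N_c}$ and $\mathbf{F}_a \in \mathbb{C}^{N_t \times N_t}$ are DFT matrices. As $\bar{\mathbf{H}}$ contains non-negligible values only within a small delay duration, we focus on the first $N_a$ rows of $\bar{\mathbf{H}}$, denoted by $\mathbf{H} \in \mathbb{C}^{N_a \times N_t}$, for CSI compression and feedback.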
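The transformation and truncation described above can be sketched in a few lines of NumPy. The sizes (`n_sub`, `n_ant`, `n_keep`), the synthetic test channel, and the unitary scaling of the DFT matrices are illustrative assumptions, not values or conventions taken from this paper.

```python
import numpy as np

def dft_matrix(n):
    """Unitary n x n DFT matrix (the 1/sqrt(n) scaling is a convenience choice)."""
    return np.fft.fft(np.eye(n)) / np.sqrt(n)

def to_angular_delay(h_sf, n_keep):
    """Map a spatial-frequency channel (subcarriers x antennas) to the
    angular-delay domain and keep only the first n_keep delay rows."""
    n_sub, n_ant = h_sf.shape
    f_d = dft_matrix(n_sub)   # acts on the subcarrier (row) dimension
    f_a = dft_matrix(n_ant)   # acts on the antenna (column) dimension
    h_ad = f_d @ h_sf @ f_a.conj().T
    return h_ad, h_ad[:n_keep, :]

# Hypothetical sizes; the synthetic channel has energy only in the first 4 delay rows.
rng = np.random.default_rng(0)
n_sub, n_ant, n_keep = 64, 16, 8
h_true = np.zeros((n_sub, n_ant), dtype=complex)
h_true[:4, :] = rng.standard_normal((4, n_ant)) + 1j * rng.standard_normal((4, n_ant))

# Build the matching spatial-frequency channel by inverting the transform.
h_sf = dft_matrix(n_sub).conj().T @ h_true @ dft_matrix(n_ant)

h_ad, h_trunc = to_angular_delay(h_sf, n_keep)
kept_energy = np.linalg.norm(h_trunc) ** 2 / np.linalg.norm(h_ad) ** 2
print(h_trunc.shape, kept_energy)
```

In this toy setting the truncation discards no energy because the channel was constructed to be sparse in delay; for measured or simulated channels the retained fraction depends on the delay spread.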
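Further compression is still necessary because the amount of feedback, i.e., $N_a N_t$ complex-valued entries for $N_a$ retained delay rows and $N_t$ BS antennas, can still be very large when $N_t$ is large. Fortunately, there is more structure to exploit in designing an efficient m-MIMO CSI feedback architecture if we look deeper into the channel structure in the angular-delay domain. On the one hand, the correlation in the angular domain behaves differently from that in the delay domain. On the other hand, the real and imaginary parts of the CSI matrix are found to share similar correlations.
Based on the above two findings, we design an efficient NN architecture for the m-MIMO CSI feedback. The proposed ENet consists of an Encoder and a Decoder. The Encoder is responsible for generating the compressed representation of the truncated angular-delay domain CSI matrix $\mathbf{H}$ in terms of $\mathbf{H}_{re}$ and $\mathbf{H}_{im}$, where $\mathbf{H}_{re}$ and $\mathbf{H}_{im}$ are the real and imaginary parts of $\mathbf{H}$. The Encoder is denoted by
$$\mathbf{s}_{re} = f_{en}(\mathbf{H}_{re}), \quad \mathbf{s}_{im} = f_{en}(\mathbf{H}_{im}), \tag{3}$$
where $\mathbf{s}_{re}$ and $\mathbf{s}_{im}$ are $M$-dimensional codewords of the real and imaginary parts of the compressed CSI, and $f_{en}(\cdot)$ is the compression operation of the Encoder. Defining $N = N_a N_t$, the compression ratio is $\gamma = M/N$. On the other side, the Decoder is used to reconstruct $\mathbf{H}_{re}$ and $\mathbf{H}_{im}$ from the compressed codewords. It yields
$$\hat{\mathbf{H}}_{re} = f_{de}(\mathbf{s}_{re}), \quad \hat{\mathbf{H}}_{im} = f_{de}(\mathbf{s}_{im}), \tag{4}$$
where $f_{de}(\cdot)$ is the reconstruction process of the Decoder.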
III Channel Characteristics and ENet
In this section, we elaborate on the new architecture of the proposed ENet for the m-MIMO CSI feedback, which mainly exploits two findings on the inherent nature of the m-MIMO channel matrix: 1) the difference between the correlations in the angular and delay domains, and 2) the similarity in the correlations of the real and imaginary parts of CSI. Before introducing the details of ENet, we first characterize the correlation of the angular-delay domain CSI matrix in the following.
III-A Correlation Difference and Similarity
The channel correlations between the two ends of the communication link usually depend on the scatterers in the propagation paths. In the delay domain, a resolvable path typically consists of multiple unresolvable paths coming from the same scatterer. Thanks to the large number of antennas in m-MIMO systems, high resolution in the angular domain is available, which enables unresolvable paths in the delay domain to be resolved in the angular domain. Therefore, a strong correlation can be observed between adjacent angles. However, the resolvable paths in the delay domain most probably come from different scatterers, and a weak correlation is therefore observed in the delay domain. The correlation of the real part of the CSI matrix in the angular domain is defined as
where represents the interval in the angular domain. The correlation of the imaginary part of the CSI matrix in the angular domain, , is similarly defined. The correlation of the real part of the CSI matrix in the delay domain is defined as
where represents the interval in the delay domain. The correlation of the imaginary part of the CSI matrix in the delay domain, , is defined in a similar way. Without loss of generality, we set in the subsequent discussion, where .
On the other hand, there exists another feature of the statistics of the CSI that is elaborated in the following theorem, proved in Appendix A.
Theorem 1: For a channel with the probability density function satisfying (13) and (15), the correlation of the real part of the CSI values is equal to the correlation of the imaginary part, that is and , where .
To illustrate the correlation difference and similarity, we depict in Fig. 1 the correlation of the real and imaginary parts of 150,000 CSI samples generated from the COST 2100 indoor channel model [14]. From the figure, there is a strong correlation in the angular domain, whereas only a weak correlation exists in the delay domain. In addition, the correlations of the real and imaginary parts of the complex-valued CSI matrix are similar in the angular domain, and a similar phenomenon can be observed in the delay domain.
III-B The Proposed ENet
Existing autoencoder-based NNs for m-MIMO CSI feedback utilize convolutional layers in the encoder to extract features before a fully-connected layer. The disadvantage is that the compression is done by the fully-connected layer alone. A single fully-connected layer results in a sharp dimension reduction, leading to excessive loss of CSI information. Moreover, the parameters of the fully-connected layer constitute the majority of the entire network at low compression ratios, and their number remains large if no compression is done before the fully-connected layer. Taking the correlation difference into consideration, this disadvantage can be well addressed. From the perspective of information theory, the domain with a stronger correlation, i.e., the angular domain, can be compressed to a greater extent. Therefore, it is natural to adopt different compression strategies for the angular and delay domains. Our ENet compresses the CSI matrix in the angular domain before the fully-connected layer.
Stacking the real and imaginary parts into a single real-valued input is the method that most existing deep learning (DL)-based approaches adopt for complex-valued CSI feedback. However, the correlation similarity established in Theorem 1 makes it feasible to train the network on only the real part of the CSI matrix and then apply the trained network directly to the imaginary part. With the input and output sizes reduced to half of those for the stacked CSI, the total number of network parameters can be reduced by at least half, resulting in a lightweight network with better performance that is easier to train.
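The idea of training on one part and deploying on both can be illustrated with a deliberately simple stand-in for the NN: a linear encoder/decoder pair (PCA) fitted on real parts only. This is not ENet. The synthetic data with an exponential correlation profile, the sizes, and all function names below are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, n_samples = 32, 8, 5000   # hypothetical vector length, codeword size, sample count

# Real and imaginary parts are drawn independently but share one correlation matrix.
idx = np.arange(n)
corr = 0.9 ** np.abs(idx[:, None] - idx[None, :])
chol = np.linalg.cholesky(corr)
h_re = rng.standard_normal((n_samples, n)) @ chol.T
h_im = rng.standard_normal((n_samples, n)) @ chol.T

# "Training": fit the linear codec on the real part only.
cov_re = h_re.T @ h_re / n_samples
eigvals, eigvecs = np.linalg.eigh(cov_re)   # eigenvalues in ascending order
basis = eigvecs[:, -m:]                     # top-m principal directions

def encode(x):
    """Compress each length-n row to a length-m codeword."""
    return x @ basis

def decode(s):
    """Reconstruct length-n rows from length-m codewords."""
    return s @ basis.T

def nmse(x, x_hat):
    """Aggregate normalized squared error over all samples."""
    return np.sum((x - x_hat) ** 2) / np.sum(x ** 2)

# "Deployment": the same codec is applied to both parts.
nmse_re = nmse(h_re, decode(encode(h_re)))
nmse_im = nmse(h_im, decode(encode(h_im)))
print(f"NMSE real: {nmse_re:.3f}, NMSE imag: {nmse_im:.3f}")
```

Because both parts share the same second-order statistics in this toy model, a codec fitted on one part reconstructs the other almost equally well, which is the property the one-part training scheme relies on.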
By exploiting the similarity in the correlations of the real and imaginary parts of CSI and the correlation difference between the angular and delay domains, we propose ENet for m-MIMO CSI compression and feedback. Fig. 2 displays the architecture of ENet in detail. We use three-dimensional values, e.g., , to represent the depth, width, and height of the input tensor, respectively. The four-dimensional values, e.g., , denote the number, depth, width, and height of the convolution and deconvolution kernels, respectively. The two-dimensional values, , represent the convolution and deconvolution strides for the width and height of the input tensor, respectively. Since a stronger correlation exists in the angular domain, we use a convolution with stride 2 to compress the CSI in this dimension. We find that a convolution kernel of size 5 with this stride produces good performance. A larger-stride convolution is employed only in the first layer of the Encoder because the correlation distribution of the real and imaginary parts of the input CSI matrix is deterministic, whereas the correlation distribution may change after the first layer. Correspondingly, we use a deconvolution with stride 2 in the last layer of the Decoder to recover the CSI in the angular domain. For the other layers and in the delay domain, we use convolutions with stride 1 and kernels of size 3 to process the data stream.
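The effect of a stride-2, size-5 convolution along one axis can be checked with a small NumPy sketch. The 32 x 32 input, the zero padding of two samples per side, and the moving-average kernel are placeholders chosen for illustration; the paper's learned kernels and exact padding are not reproduced here.

```python
import numpy as np

def conv_out_len(n_in, kernel, stride, pad):
    """Output length of a 1D convolution with symmetric zero padding."""
    return (n_in + 2 * pad - kernel) // stride + 1

def strided_conv_last_axis(x, w, stride):
    """Correlate every row of x with kernel w along the last axis,
    zero-padding len(w) // 2 samples on both sides."""
    pad = len(w) // 2
    xp = np.pad(x, ((0, 0), (pad, pad)))
    n_out = conv_out_len(x.shape[1], len(w), stride, pad)
    out = np.empty((x.shape[0], n_out))
    for j in range(n_out):
        window = xp[:, j * stride : j * stride + len(w)]
        out[:, j] = window @ w
    return out

# Rows: delay taps, columns: angles (hypothetical 32 x 32 input).
x = np.arange(32 * 32, dtype=float).reshape(32, 32)
w = np.ones(5) / 5.0                       # placeholder kernel of size 5
y = strided_conv_last_axis(x, w, stride=2)
print(x.shape, "->", y.shape)
```

With these assumptions the angular dimension is halved while the delay dimension is untouched, whereas a size-3, stride-1 convolution with one sample of padding would preserve the length.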
ENet uses a symmetric architecture for CSI compression and recovery. After the first layer in the Encoder, the size of the CSI matrix in the angular domain is compressed from to . Then, two identical convolutional layers are placed in sequence to strengthen the expressive capability of the network. Experiments validate that these two additional layers improve the network performance. A convolution of size is then used to compress the feature size to . At the end of the Encoder, the feature maps are reshaped into a vector before a fully-connected layer, which is used to generate the compressed codeword, .
The Decoder in ENet serves as the inverse operation of the Encoder and reconstructs the original real and imaginary parts of the CSI matrix, i.e., $\mathbf{H}_{re}$ and $\mathbf{H}_{im}$, from the received codeword. Except for the last deconvolutional layer in the Decoder, we use batch normalization and the Leaky Rectified Linear Unit (Leaky ReLU) activation function for all convolutional and deconvolutional layers. Batch normalization is used to speed up the convergence of network training, and the Leaky ReLU function is chosen as
For the last layer in the Decoder, we use a sigmoid activation function to constrain the output values into [0, 1].
The correlation difference and similarity not only make ENet a reasonable data compression and recovery method, but also yield a lightweight architecture for the m-MIMO CSI feedback. With the help of the correlation similarity, the number of parameters of the fully-connected layer decreases from to , which is a significant reduction. With the correlation difference, the number of parameters drops further to .
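The scaling behind this reduction can be made concrete with a small calculation. The sizes `n` and `m` are hypothetical, biases are ignored, the comparison assumes the same compression ratio for the stacked and single-part designs, and the third case assumes that the tensor entering the fully-connected layer has half as many entries after the angular-domain compression; none of these numbers are taken from the paper.

```python
def fc_weights(n_in, n_out):
    """Number of weights in a fully-connected layer (biases ignored)."""
    return n_in * n_out

n, m = 1024, 64   # hypothetical: entries per real-valued part, codeword length per part

stacked = fc_weights(2 * n, 2 * m)     # real and imaginary parts stacked into one input
one_part = fc_weights(n, m)            # one part only, network reused for the other part
halved_input = fc_weights(n // 2, m)   # assumption: angular dimension halved beforehand

print(stacked, one_part, halved_input)
```

Halving both the input and the output of the layer cuts its weight count to a quarter, and halving the input once more cuts it to an eighth of the stacked design under these assumptions.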
Furthermore, we introduce a tunable parameter, , the number of kernels, to control the complexity of ENet. For performance-oriented applications, a large , e.g., , is recommended, while for complexity-limited applications, a small , e.g., , is preferred.
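In order to train ENet, we adopt the end-to-end method and use the mean-squared error (MSE) loss function
$$L = \frac{1}{T} \sum_{i=1}^{T} \left\| \hat{\mathbf{H}}_{re}^{(i)} - \mathbf{H}_{re}^{(i)} \right\|_2^2, \tag{8}$$
where $T$ is the total number of training samples, $\|\cdot\|_2$ is the Euclidean norm, and $\hat{\mathbf{H}}_{re}^{(i)}$ and $\mathbf{H}_{re}^{(i)}$ are the reconstructed and ground-truth real parts of the $i$th training sample. The network training objective is to minimize the MSE between the reconstructed real part and the corresponding ground-truth values.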
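A minimal NumPy version of such a loss is sketched below, assuming a batch layout of `(num_samples, rows, cols)`; the function name and the toy arrays are illustrative only.

```python
import numpy as np

def mse_loss(h_true, h_rec):
    """Mean over samples of the squared Euclidean norm of the reconstruction error.
    Both arrays have shape (num_samples, rows, cols)."""
    err = (h_rec - h_true).reshape(h_true.shape[0], -1)
    return np.mean(np.sum(err ** 2, axis=1))

# Toy check: every entry is off by 0.5, with 6 entries per sample.
h_true = np.zeros((4, 2, 3))
h_rec = np.full((4, 2, 3), 0.5)
loss = mse_loss(h_true, h_rec)
print(loss)
```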
IV Experimental Results
This section presents the performance evaluation of the proposed ENet. A total of 150,000 CSI samples generated with the COST 2100 indoor channel model [14] at 5.3 GHz are used to validate our method, and we divide them into training, validation, and test sets containing 100,000, 30,000, and 20,000 samples, respectively. In practical applications, the training data can be obtained either through computer simulations, by generating CSI samples according to 3GPP channel models, or by conducting offline channel measurements. antennas are placed at the BS as a uniform linear array (ULA), and subcarriers are assumed. After the angular-delay domain transformation, rows of in the delay domain are retained. and are normalized. The Adam optimizer is used for the network training with a batch size of 1,000.
We compare the network complexity of CsiNet [1], CRNet-cosine [7], and our proposed ENet with respect to the number of network parameters in Table I. The parameters of the convolutional, deconvolutional, and fully-connected layers account for the majority of the complexity of the NN, and thus we focus on these components. From Table I, ENet is a parameter-saving NN compared to both CsiNet and CRNet-cosine. For , ENet reduces the number of network parameters by over 80% at all compression ratios. For , 60% of the parameters are saved. Fewer parameters not only save memory, but also require fewer data samples for training and can alleviate overfitting.
We use the normalized MSE (NMSE), defined as
$$\mathrm{NMSE} = \mathbb{E}\left\{ \frac{\| \mathbf{H} - \hat{\mathbf{H}} \|_2^2}{\| \mathbf{H} \|_2^2} \right\}, \tag{9}$$
where $\mathbf{H}$ is the CSI matrix under test and $\hat{\mathbf{H}}$ is its reconstruction, to evaluate the CSI reconstruction performance. Fig. 3 compares the NMSE performance of CsiNet [1], CRNet-cosine [7], and ENet, where the performance of CsiNet and CRNet-cosine is tested under , and the performance of ENet is the average NMSE tested for both the real and imaginary parts. From the figure, ENet outperforms CsiNet at all compression ratios and CRNet-cosine at high compression ratios. This is due to the distinct compression strategies designed upon the correlation difference. For better illustration, we visualize the reconstructed real and imaginary parts in Fig. 4, where the strength of a pixel represents the magnitude of the channel values. For both the real and imaginary parts, ENet recovers the CSI more accurately than CsiNet at all compression ratios while preserving more subtle features than . We also summarize the NMSE performance tested on the real and imaginary parts in Table II. From the table, ENet achieves similar performance for both parts, which validates the effectiveness of our finding in Theorem 1 on the correlation similarity.
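The NMSE in decibels, as reported in such comparisons, can be computed as in the following sketch. The function name, the batch layout `(num_samples, rows, cols)`, and the toy data are assumptions for illustration.

```python
import numpy as np

def nmse_db(h_true, h_rec):
    """Average per-sample normalized squared error, expressed in dB.
    Both arrays have shape (num_samples, rows, cols); complex input is allowed."""
    axes = tuple(range(1, h_true.ndim))
    num = np.sum(np.abs(h_true - h_rec) ** 2, axis=axes)
    den = np.sum(np.abs(h_true) ** 2, axis=axes)
    return 10.0 * np.log10(np.mean(num / den))

# Toy check: a uniform 10% amplitude error gives a normalized error of 0.01.
h_true = np.ones((5, 4, 4))
h_rec = 0.9 * h_true
value = nmse_db(h_true, h_rec)
print(value)
```

Averaging this quantity over the real-part and imaginary-part test sets gives a single figure comparable to the one reported for ENet.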
Table II: Test set NMSE (dB)
V Conclusion
In this article, we have proposed a DL approach, named ENet, for m-MIMO CSI compression and feedback based on the angular-delay domain channel characteristics. We have shown that by utilizing the correlation difference between the angular and delay domains and the similarity between the correlations of the real and imaginary parts of CSI under some mild conditions, we can significantly reduce the size of the NN while still achieving desirable performance. Moreover, such correlation similarity between the real and imaginary parts, as demonstrated in Theorem 1, can be useful for other NN designs for complex-valued calculations.
Appendix A Proof of Theorem 1
For an angular-domain channel response with independent magnitude and phase , the correlation of the real part is calculated as
where the equality is due to the independence of the magnitude and the phase. Similarly, the correlation of the imaginary part equals
Let us consider the second term of (10). It follows
where is the probability density function. Assuming that channel phase follows
Defining , we rewrite (12) as
Assuming that channel phase further satisfies
where . In particular, for a channel phase that follows a uniform distribution, which holds in an ideal and popular channel model, (13) and (15) hold. Let . We further have
Therefore, the correlation of the real part is equal to the correlation of the imaginary part in the angular domain, if the mild conditions in (13) and (15) hold.
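The conclusion can be checked numerically for the uniform-phase case mentioned above. The sketch below draws pairs of neighboring channel entries whose magnitudes are correlated and independent of the phases; the particular magnitude model, the Gaussian phase offset between neighbors, and the sample count are assumptions chosen only to make the experiment concrete.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Two neighboring entries h1 = r1 * exp(j * theta1) and h2 = r2 * exp(j * theta2).
# Assumed model: correlated Rayleigh-like magnitudes, theta1 uniform on [0, 2*pi),
# theta2 equal to theta1 plus a small Gaussian offset, magnitudes independent of phases.
a = rng.standard_normal(n) + 1j * rng.standard_normal(n)
b = rng.standard_normal(n) + 1j * rng.standard_normal(n)
r1 = np.abs(a)
r2 = np.abs(a + 0.3 * b)
theta1 = rng.uniform(0.0, 2.0 * np.pi, n)
theta2 = theta1 + 0.2 * rng.standard_normal(n)

h1 = r1 * np.exp(1j * theta1)
h2 = r2 * np.exp(1j * theta2)

# Empirical correlations of the real parts and of the imaginary parts.
corr_re = np.mean(h1.real * h2.real)
corr_im = np.mean(h1.imag * h2.imag)
print(f"real: {corr_re:.3f}, imag: {corr_im:.3f}")
```

Up to sampling noise the two estimates coincide, because with a uniform phase the terms that distinguish the cosine product from the sine product average to zero.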
References
[1] C. Wen, W. Shih, and S. Jin, “Deep learning for massive MIMO CSI feedback,” IEEE Wireless Commun. Lett., vol. 7, no. 5, pp. 748–751, Oct. 2018.
[2] H. Ye, L. Liang, G. Y. Li, and B. Juang, “Deep learning-based end-to-end wireless communication systems with conditional GANs as unknown channels,” IEEE Trans. Wireless Commun., vol. 19, no. 5, pp. 3133–3143, May 2020.
[3] P. Dong, H. Zhang, G. Y. Li, I. S. Gaspar, and N. NaderiAlizadeh, “Deep CNN-based channel estimation for mmWave massive MIMO systems,” IEEE J. Sel. Topics Signal Process., vol. 13, no. 5, pp. 989–1000, Sep. 2019.
[4] H. Ye, G. Y. Li, and B. Juang, “Power of deep learning for channel estimation and signal detection in OFDM systems,” IEEE Wireless Commun. Lett., vol. 7, no. 1, pp. 114–117, Feb. 2018.
[5] C. Lu, W. Xu, S. Jin, and K. Wang, “Bit-level optimized neural network for multi-antenna channel quantization,” IEEE Wireless Commun. Lett., vol. 9, no. 1, pp. 87–90, Jan. 2020.
[6] J. Guo, C. Wen, S. Jin, and G. Y. Li, “Convolutional neural network-based multiple-rate compressive sensing for massive MIMO CSI feedback: Design, simulation, and analysis,” IEEE Trans. Wireless Commun., vol. 19, no. 4, pp. 2827–2840, Apr. 2020.
[7] Z. Lu, J. Wang, and J. Song, “Multi-resolution CSI feedback with deep learning in massive MIMO system,” in Proc. IEEE Int. Conf. Commun. (ICC), Dublin, Ireland, Jun. 2020, pp. 1–6.
[8] T. Wang, C. Wen, S. Jin, and G. Y. Li, “Deep learning-based CSI feedback approach for time-varying massive MIMO channels,” IEEE Wireless Commun. Lett., vol. 8, no. 2, pp. 416–419, Apr. 2019.
[9] C. Lu, W. Xu, H. Shen, J. Zhu, and K. Wang, “MIMO channel information feedback using deep recurrent network,” IEEE Commun. Lett., vol. 23, no. 1, pp. 188–191, Jan. 2019.
[10] Z. Liu, L. Zhang, and Z. Ding, “Exploiting bi-directional channel reciprocity in deep learning for low rate massive MIMO CSI feedback,” IEEE Wireless Commun. Lett., vol. 8, no. 3, pp. 889–892, Jun. 2019.
[11] J. Guo, X. Yang, C. Wen, S. Jin, and G. Y. Li, “DL-based CSI feedback and cooperative recovery in massive MIMO,” 2020. [Online]. Available: https://arxiv.org/abs/2003.03303
[12] Y. Sun, W. Xu, L. Fan, G. Y. Li, and G. K. Karagiannidis, “AnciNet: An efficient deep learning approach for feedback compression of estimated CSI in massive MIMO systems,” IEEE Wireless Commun. Lett., vol. 9, no. 12, pp. 2192–2196, Dec. 2020.
[13] A. M. Sayeed, “Deconstructing multiantenna fading channels,” IEEE Trans. Signal Process., vol. 50, no. 10, pp. 2563–2579, Oct. 2002.
[14] L. Liu et al., “The COST 2100 MIMO channel model,” IEEE Wireless Commun., vol. 19, no. 6, pp. 92–99, Dec. 2012.