I Introduction
Partially connected hybrid structures for millimeter wave (mmW) massive MIMO systems are drawing considerable attention from researchers because they combine the benefits of spatial multiplexing with low complexity, low power consumption and high energy efficiency [1], [2]. However, such systems can only accommodate a limited number of users due to restrictions on the number of RF chains [3]. In environments where the number of users is large compared to the number of RF chains, user selection is essential. In the existing literature, many model driven approaches have been proposed for user or antenna selection. The performance of optimization and model driven methods for user selection is evaluated in terms of either bit error rate (BER) or achievable data rate. Evolutionary algorithms for user selection are employed in [4] to find a suboptimal solution with reduced computational complexity compared to user selection based on exhaustive search. A greedy algorithm for user selection with downlink beamforming is applied in [5] to further reduce the computational complexity, but at the cost of a reduction in achievable rate. User selection based on matching theory is proposed in [6]. A method based on an adaptive Markov chain Monte Carlo (AMCMC) algorithm for multicell multiuser massive MIMO downlink systems is proposed in [7]. The authors in [8] utilized a generalized power iteration precoding (GPIP) algorithm to generate a joint solution for user selection, power allocation, and downlink precoding. However, all these algorithms are suboptimal in nature. The optimal solution to the user selection problem can be obtained by employing an exhaustive search (ES) algorithm, which enumerates all possible combinations of active users, but its complexity grows exponentially with the number of active users. Thus, the ES algorithm is not suitable for massive MIMO systems due to its high computational complexity.
Recently, data driven deep learning (DL) methods have shown great potential in dealing with the problems of channel estimation [9], [10] and RF precoder design [11]. Similarly, data driven methods have been applied to the transmit antenna selection problem to obtain the best antenna subset [11], [12]. Given appropriate datasets, classifiers such as k-nearest neighbors (KNN) and support vector machines (SVM) can be trained to map the available channel dataset into the selected antenna subset. The author in [12] has shown that the performance of KNN and SVM based classifiers exceeds that of conventional optimization driven methods. Compared to optimization driven methods, data driven models are also computationally efficient, which makes them convenient to implement in real-life applications. Note that, even though data driven methods have been widely applied to address multiple problems [9]-[12], no work has been done so far on their application to user selection. Therefore, inspired by the efficiency and great potential of data driven techniques, in this letter we apply DL based convolutional neural networks (CNNs) to the user selection problem in a massive MIMO wireless communication system. Our motivation can be explained as follows:
For the user selection problem, it is well known that the computational complexity of the ES algorithm grows exponentially with the number of active users. Hence, this algorithm is not appropriate for implementation in a real time environment.

Traditional user selection algorithms require many serial iterations which generally result in a high complexity. However, in a neural network (NN), the training phase is generally performed offline, and therefore the online deployment has very low complexity and is limited to a few matrix multiplications and additions.

The availability of a large dataset and the use of a huge number of iterations in the training phase enable CNNs to understand the complicated features of wireless channels.
I-A Novelty and Contribution
In this letter, we propose a DL-based user selection design approach and develop a CNN that is trained using channel realizations as the dataset and learns how to optimize the user selection to maximize the sum rate. The contributions can be summarized as follows:

New Design Approach: A CNN based technique is proposed to directly perform the user selection without the involvement of any iterative algorithm after the CNN has been trained. The dataset is obtained after performing exhaustive search on each channel realization to select a subset of the active users which maximizes the sum rate. It is pertinent to mention that both training and dataset generation are performed offline.

Robustness to imperfect CSI: The proposed algorithm ensures robustness to imperfect channel state information (CSI). Firstly, during offline training, the CNN learns how to optimize the user selection to approach the ideal sum rate using practical channel estimates. Secondly, during the online deployment, the CNN can itself adapt to imperfect CSI and achieve robust performance in the presence of channel estimation errors.
II System Model
We consider a massive MU-MIMO downlink system in which a transmitter equipped with multiple transmit antennas, data streams and RF chains serves single-antenna users. We assume a multiuser precoding case where the transmitter communicates with each user via only one stream. Hence, the number of data streams is equal to the number of users. Further, we assume that the maximum number of users that can be simultaneously served by the transmitter equals the number of RF chains. We assume that the number of available users exceeds the number of RF chains. Hence, we have to choose a subset of users whose size equals the number of RF chains. Our proposed CNN method will select this subset out of the available users, and afterwards we will perform hybrid precoding for the selected users. For the sake of simplicity, the number of data streams, the number of RF chains and the number of selected users are taken to be equal. On the downlink, the transmitter applies a baseband precoder followed by an RF precoder. The transmitted signal is represented as
(1) 
Here, the transmitted signal vector passes through the baseband precoder and then the RF precoder, both of which are evaluated using the algorithm proposed in [3]. We have adopted a narrowband block fading channel [13], where the nth user observes the received signal as
(2) 
Here, the received signal depends on the channel between the transmitter and the nth user. Since the mmW propagation environment has a limited number of scatterers, in this paper we follow the geometric Saleh-Valenzuela model to represent the low rank mmW channel [13]:
(3) 
The parameters of this model are the number of effective channel paths linked to the limited number of scatterers for each user, the path loss, the complex gain associated with the lth path, and the spatial signatures of the receiver and the transmitter. The spatial signatures depend on the elevation and azimuth angles of arrival (AoA) of the lth path at the receiver and on the elevation and azimuth angles of departure (AoD) of the lth path at the transmitter, respectively. For uniform planar arrays (UPAs), the spatial signatures are computed as given in [1].
The aim is to select a subset of the available single-antenna users so that the sum rate is maximized. The received signal-to-interference-plus-noise ratio (SINR) for the kth user after precoding is given as
(4) 
Here, the SINR depends on the noise power and on the channel between the transmitter and the kth selected user. This channel is the kth row of the reduced channel matrix, which consists of the rows of the full channel matrix that correspond to the users selected by our proposed CNN architecture. The sum rate of the system after user selection is expressed as
(5) 
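For reference, the signal model of this section can be written compactly as below. This is a sketch in assumed notation rather than a transcription of (1), (2), (4) and (5): the symbols $\mathbf{F}_{\mathrm{RF}}$, $\mathbf{F}_{\mathrm{BB}}$, $\mathbf{f}_k$, $\mathbf{s}$, $\mathbf{h}_k$, $n_k$, $K$ and $\sigma^2$ are illustrative choices, unit-power symbols are assumed, and any transmit power scaling is taken to be absorbed into the precoders.

```latex
% Assumed notation (illustrative only):
%   K        number of selected users (= data streams = RF chains)
%   F_RF     RF precoder, F_BB = [f_1, ..., f_K] baseband precoder
%   s        K x 1 symbol vector with E[s s^H] = I
%   h_k      k-th row of the selected channel matrix (1 x N_t)
%   n_k      noise at user k with power sigma^2
\begin{align*}
\mathbf{x} &= \mathbf{F}_{\mathrm{RF}} \mathbf{F}_{\mathrm{BB}} \mathbf{s}, \\
y_k &= \mathbf{h}_k \mathbf{F}_{\mathrm{RF}} \mathbf{F}_{\mathrm{BB}} \mathbf{s} + n_k, \\
\mathrm{SINR}_k &= \frac{\left| \mathbf{h}_k \mathbf{F}_{\mathrm{RF}} \mathbf{f}_k \right|^2}
                       {\sum_{j \neq k} \left| \mathbf{h}_k \mathbf{F}_{\mathrm{RF}} \mathbf{f}_j \right|^2 + \sigma^2}, \\
R &= \sum_{k=1}^{K} \log_2 \left( 1 + \mathrm{SINR}_k \right).
\end{align*}
```

In this form, the numerator of the SINR collects the power of the stream intended for user k, and the sum in the denominator collects the leakage from the streams of the other selected users.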
III CNN Based User Selection
In this section, we first introduce the dataset and the label generation which will serve as the input to the CNN model for training. Then, we will describe the CNN architecture in detail.
III-A Data Normalization
A channel matrix realization is a data sample. Since we have assumed single-antenna users, each row of the channel matrix represents the channel of the respective user. Since CNNs do not accept complex entries as input, the complex entries of the channel matrix are divided into real and imaginary parts, which are concatenated into a single real-valued matrix.
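The real and imaginary split described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `split_complex_channel` is hypothetical, and placing the imaginary parts to the right of the real parts (rather than below them) is an assumption, since the concatenation axis is not stated in the text.

```python
from typing import List


def split_complex_channel(H: List[List[complex]]) -> List[List[float]]:
    """Map a K x N complex channel matrix to a K x 2N real matrix.

    Each row (one user's channel) becomes [Re(h_1..h_N), Im(h_1..h_N)].
    Concatenating along the column axis is an assumption of this sketch.
    """
    return [[z.real for z in row] + [z.imag for z in row] for row in H]


# Toy example: 2 users, 3 transmit antennas.
H = [[1 + 2j, 0 - 1j, 3 + 0j],
     [0.5 + 0.5j, -1 + 1j, 2 - 2j]]
X = split_complex_channel(H)
```

The resulting matrix keeps one row per user, so the row structure that identifies each user is preserved at the CNN input.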
III-B Label Generation
To select a subset of users from the available users, the number of possible combinations is given by the corresponding binomial coefficient. Every combination is represented as a class pattern, so the total number of class labels equals the number of combinations. There is a one-to-one correspondence between each set of selected users and a class label. For example, the first combination of selected users is given label 1, the second combination is given label 2, and so on.
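The one-to-one mapping between user subsets and class labels can be sketched as follows. The helper name `build_label_map` is hypothetical, and the lexicographic ordering of the subsets is an assumption of this sketch; any fixed ordering would serve equally well as long as it is used consistently for training and prediction.

```python
from itertools import combinations
from math import comb


def build_label_map(num_users: int, num_selected: int):
    """Return (label -> user subset) and (user subset -> label) dictionaries.

    Labels start at 1, as in the text. Subsets are enumerated in
    lexicographic order, which is an assumption of this sketch.
    """
    subsets = list(combinations(range(num_users), num_selected))
    label_to_subset = {i + 1: s for i, s in enumerate(subsets)}
    subset_to_label = {s: i for i, s in label_to_subset.items()}
    return label_to_subset, subset_to_label


# Selecting 6 out of 10 users, as in one of the simulated scenarios.
label_to_subset, subset_to_label = build_label_map(10, 6)
num_classes = len(label_to_subset)
expected_classes = comb(10, 6)  # binomial coefficient "10 choose 6"
```

For the two scenarios considered in the simulations, this gives 210 classes when selecting 6 out of 10 users and 120 classes when selecting 3 out of 10.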
In order to generate a dataset for the training of our CNN, an exhaustive search is performed on each channel realization. The search algorithm examines all possible combinations generated and selects the combination of users that maximizes the sum rate. Hence, each channel realization is labeled with a class and given as input for the training of the CNN.
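The labeling step can be sketched as below. The function names are hypothetical, and `sum_rate_mrt` is only a stand-in: it uses simple conjugate (matched-filter) beamforming with equal power per user, not the hybrid precoder of [3], so that the example stays self-contained. Any other sum-rate function can be passed in through `rate_fn`.

```python
import math
from itertools import combinations


def sum_rate_mrt(H_sel, noise_power=1.0):
    """Sum rate in bit/s/Hz under conjugate beamforming (illustrative stand-in).

    H_sel holds one row of complex channel coefficients per selected user.
    Total transmit power is 1, split equally among the users. Channels are
    assumed to be nonzero.
    """
    K = len(H_sel)
    beams = []
    for h in H_sel:
        norm = math.sqrt(sum(abs(x) ** 2 for x in h))
        beams.append([x.conjugate() / norm for x in h])
    rate = 0.0
    for k, h in enumerate(H_sel):
        # Received power at user k from each beam.
        gains = [abs(sum(a * b for a, b in zip(h, f))) ** 2 / K for f in beams]
        signal = gains[k]
        interference = sum(gains) - signal
        rate += math.log2(1.0 + signal / (interference + noise_power))
    return rate


def exhaustive_label(H, num_selected, rate_fn=sum_rate_mrt):
    """Return (label, subset, rate) of the subset that maximizes rate_fn.

    Labels start at 1 and follow the lexicographic order of the subsets.
    """
    best_label, best_subset, best_rate = None, None, -math.inf
    subsets = combinations(range(len(H)), num_selected)
    for label, subset in enumerate(subsets, start=1):
        rate = rate_fn([H[i] for i in subset])
        if rate > best_rate:
            best_label, best_subset, best_rate = label, subset, rate
    return best_label, best_subset, best_rate


# Toy example: users 0 and 2 have strong, mutually orthogonal channels.
H = [[2 + 0j, 0j],
     [0.1 + 0j, 0.1 + 0j],
     [0j, 2 + 0j],
     [0.1 + 0j, 0.1 + 0j]]
label, subset, rate = exhaustive_label(H, num_selected=2)
```

In the toy example, the search picks the two strong orthogonal users, which corresponds to the second subset in lexicographic order. The loop over all subsets is what makes dataset generation expensive, which is why it is carried out offline.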
III-C Deep CNN Architecture
We have adopted a LeNet architecture for our CNN, as shown in Fig. 2. There are 2 convolutional layers, 2 pooling layers, 1 fully connected layer and 1 softmax output layer. The input of the CNN is the normalized channel matrix. The first convolutional layer filters the normalized input channel matrix with 16 kernels. Then the first pooling layer normalizes and pools its input. The max-pooling kernels have a stride of 2. The second convolutional layer filters the response with 32 kernels, and the second pooling layer pools the resulting response. The fifth layer is a dense fully-connected layer with 1024 fully-connected kernels, which accelerates the convergence. Then a dropout layer, which randomly sets the outputs of hidden neurons to zero, is added after the fully-connected layer in order to avoid overfitting. The output of our CNN is the softmax layer, which produces a class label. There is a one-to-one correspondence between a class label and a user selection. For the convolutional and fully-connected layers, the rectified linear unit (ReLU) is used as the nonlinear activation function. The cross entropy is employed as the loss function with a gradient descent optimizer.
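The layer sequence described above can be sketched with the Keras API of TensorFlow as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the 3x3 convolution kernels, the "same" padding, the 2x2 pooling window, the dropout rate, the learning rate and the input layout (users x 2·antennas x 1) are all hypothetical choices. Only the layer order, the 16 and 32 kernels, the stride of 2, the 1024-unit dense layer, ReLU, softmax, cross entropy and gradient descent come from the text.

```python
def pooled_size(n: int, num_pools: int = 2) -> int:
    """Spatial size after `num_pools` 2x2 max-pooling layers with stride 2."""
    for _ in range(num_pools):
        n = (n - 2) // 2 + 1
    return n


def build_user_selection_cnn(input_shape, num_classes, kernel_size=(3, 3),
                             dropout_rate=0.5, learning_rate=0.01):
    """Build a LeNet-style classifier; all hyperparameter defaults are assumed."""
    # Imported lazily so that pooled_size() is usable without TensorFlow.
    import tensorflow as tf

    layers = tf.keras.layers
    model = tf.keras.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(16, kernel_size, padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=2, strides=2),
        layers.Conv2D(32, kernel_size, padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=2, strides=2),
        layers.Flatten(),
        layers.Dense(1024, activation="relu"),
        layers.Dropout(dropout_rate),
        layers.Dense(num_classes, activation="softmax"),
    ])
    # Sparse cross entropy expects integer labels 0..num_classes-1, so the
    # 1-based class labels of the text must be shifted by one before training.
    model.compile(
        optimizer=tf.keras.optimizers.SGD(learning_rate=learning_rate),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


# Feature-map size for a 10 x 288 input (10 users, 2 x 144 real values each).
rows_after_pooling = pooled_size(10)
cols_after_pooling = pooled_size(288)
```

A model for the 6-out-of-10 scenario would then be created with `build_user_selection_cnn((10, 288, 1), 210)`. The helper `pooled_size` is a quick way to check that the chosen input layout survives two pooling stages with a nonzero feature map.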
IV Simulations and Results
Python 3.6 with TensorFlow is used for CNN training and prediction, whereas the comparison with model driven algorithms is performed in MATLAB. The dataset for the simulations is generated from 100,000 independent channel realizations. The number of transmit antennas is set to 144. An exhaustive search method is used to generate the class label for each channel realization. The CNN is trained for 200 epochs with a batch size of 100. Fig. 3 shows the result obtained by selecting 6 out of 10 active users and Fig. 4 shows the result obtained by selecting 3 out of 10 active users. The results confirm the advantage of our CNN based algorithm over conventional iterative algorithms. It can be seen that the CNN based algorithm achieves a higher sum rate than the iterative binary particle swarm optimization (BPSO) and greedy algorithms. The greedy algorithm has the worst performance due to the fact that it cannot select the "best" set of users from the available users. Although our proposed CNN approach only marginally outperforms the BPSO algorithm in terms of sum rate, it achieves a large gain in terms of reduction in computational complexity, as discussed in Section IV-A. Even though the proposed CNN algorithm has a performance gap with respect to the exhaustive search algorithm, the latter is not suitable due to its high computational complexity.
IV-1 Performance Analysis with Imperfect CSI
Next we evaluate the case of uncertainties affecting the channel model. The estimated channel matrix with imperfect CSI can be modeled as [2]
$$\hat{\mathbf{H}} = \xi \mathbf{H} + \sqrt{1-\xi^{2}}\,\mathbf{E} \qquad (6)$$
where $\mathbf{H}$ is the actual channel matrix, $\xi \in [0,1]$ is the CSI accuracy parameter and $\mathbf{E}$ is an error matrix with i.i.d. entries following $\mathcal{CN}(0,1)$. Fig. 5 shows the effect of imperfect CSI on selecting 6 out of 10 active users with $\xi$ = 0.9 and 0.7. It is evident that the proposed system is not overly sensitive to CSI accuracy. It can also be seen that the achievable rate of the proposed system is quite close to that of the perfect CSI scenario even at the lower CSI accuracy.
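A channel estimate with a given CSI accuracy can be generated as sketched below. The sketch assumes the commonly used form of this model from [2], in which the estimate is the accuracy parameter times the true channel plus the square root of one minus the squared accuracy parameter times a unit-variance complex Gaussian error matrix. The function name `imperfect_csi` and the parameter name `xi` are illustrative.

```python
import math
import random


def imperfect_csi(H, xi, rng=None):
    """Return H_hat = xi * H + sqrt(1 - xi^2) * E, with E_ij ~ CN(0, 1) i.i.d.

    H is a list of rows of complex channel coefficients and xi in [0, 1]
    is the CSI accuracy (xi = 1 means perfect CSI).
    """
    if not 0.0 <= xi <= 1.0:
        raise ValueError("xi must lie in [0, 1]")
    if rng is None:
        rng = random.Random()
    scale = math.sqrt(1.0 - xi ** 2)
    std = math.sqrt(0.5)  # real and imaginary parts each have variance 1/2
    return [
        [xi * h + scale * complex(rng.gauss(0.0, std), rng.gauss(0.0, std))
         for h in row]
        for row in H
    ]


# Toy example: 2 users, 2 antennas, CSI accuracy 0.9 with a fixed seed.
H = [[1 + 1j, 2 - 1j],
     [0.5j, -1 + 0j]]
H_hat = imperfect_csi(H, xi=0.9, rng=random.Random(0))
```

Feeding such estimates to the CNN instead of the true channels, both during training and at deployment, is one way to reproduce the kind of experiment reported for the two accuracy values.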
IV-A Complexity Analysis
In this section, we perform a complexity analysis of the proposed CNN method and compare it with that of conventional model driven methods. We consider 144 transmit antennas and 10 available users in total. For our proposed CNN, the complexity of the offline training stage is normally not counted [14], and only that of the online deployment stage is counted. With the parameters mentioned in Section III-C, the total number of operations for the CNN is around 6 million.
The exhaustive search algorithm needs to compute one complex determinant per candidate user combination to evaluate the sum rate, and each determinant requires O() complex operations to compute the hybrid beamforming matrices (the baseband and RF precoders) [14]. So, for 144 transmit antennas, 10 available users and 6 selected users, the total number of operations is around 156 million. The BPSO based algorithm requires a number of determinants equal to the product of the population size and the total number of iterations of the BPSO algorithm. Both variables are set to 10 during the simulations, so the total number of operations is around 74 million for the BPSO algorithm. The greedy algorithm needs to compute O() complex operations for the total number of users, so the total number of operations is around 7.5 million for the greedy algorithm.
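The quoted totals can be cross-checked with a few lines of arithmetic. The per-determinant cost used here, the squared product of the number of transmit antennas and the number of selected users, is an assumption of this sketch: it is chosen because it reproduces all three quoted totals, not because it is stated in the text. The variable names are likewise illustrative.

```python
from math import comb

num_antennas = 144   # transmit antennas
num_users = 10       # available users
num_selected = 6     # selected users
population = 10      # BPSO population size
iterations = 10      # BPSO iterations

# Assumed cost per determinant: (144 * 6)^2 = 746,496 operations.
ops_per_det = (num_antennas * num_selected) ** 2

# Exhaustive search: one determinant per user combination (210 combinations).
es_ops = comb(num_users, num_selected) * ops_per_det

# BPSO: one determinant per particle and iteration (100 determinants).
bpso_ops = population * iterations * ops_per_det

# Greedy: cost growing linearly with the total number of users.
greedy_ops = num_users * ops_per_det

print(f"exhaustive search: {es_ops / 1e6:.1f} million operations")
print(f"BPSO:              {bpso_ops / 1e6:.1f} million operations")
print(f"greedy:            {greedy_ops / 1e6:.1f} million operations")
```

Under this assumption the three counts come out at about 156.8, 74.6 and 7.5 million operations, in line with the figures quoted above and to be compared with roughly 6 million operations for the CNN.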
It can be seen that the proposed CNN has competitive computational complexity when compared with traditional model-based algorithms. In addition, the main operations of any CNN based algorithm only involve large-scale matrix multiplications and additions, which can be effectively accelerated by using graphics processing units. Conversely, most traditional model-based algorithms involve serial iterations where the optimization in the next iteration depends on the result of the previous iteration, which is not suitable for parallel computing.
V Conclusions
We have proposed a CNN based design approach for user selection involving large antenna arrays. It is shown that the proposed CNN based approach is more efficient in terms of sum rate than model driven methods and performs close to the optimal exhaustive search based user selection. In addition, the computational complexity of data driven methods is acceptable and calculations only involve matrix multiplications and additions in the online phase.
References
[1] O. El Ayach, et al., "Spatially sparse precoding in millimeter wave MIMO systems," IEEE Trans. Wireless Commun., vol. 13, no. 3, pp. 1499-1513, Mar. 2014.
[2] X. Gao, et al., "Energy-efficient hybrid analog and digital precoding for mmWave MIMO systems with large antenna arrays," IEEE J. Sel. Areas Commun., vol. 34, no. 4, pp. 998-1009, Apr. 2016.
[3] A. Alkhateeb, et al., "Limited Feedback Hybrid Precoding for Multi-User Millimeter Wave Systems," IEEE Transactions on Wireless Communications, vol. 14, no. 11, pp. 6481-6494, Nov. 2015.
[4] M. Naeem and D. C. Lee, "A Joint Antenna and User Selection Scheme for Multiuser MIMO System," Applied Soft Computing, vol. 23, pp. 366-374, Oct. 2014.
[5] G. Dimic, et al., "On downlink beamforming with greedy user selection: Performance analysis and a simple new algorithm," IEEE Trans. Signal Process., vol. 53, pp. 3857-3868, Oct. 2005.
[6] J. Cui, et al., "User Selection and Power Allocation for mmWave-NOMA Networks," 2017 IEEE Global Communications Conference, Singapore, pp. 1-6.
[7] Maimaiti, et al., "A low-complexity algorithm for the joint antenna selection and user scheduling in multicell multiuser downlink massive MIMO systems," J Wireless Com Network, 2019.
[8] J. Choi, N. Lee, S. Hong and G. Caire, "Joint User Selection, Power Allocation, and Precoding Design With Imperfect CSIT for Multi-Cell MU-MIMO Downlink Systems," IEEE Transactions on Wireless Communications, vol. 19, no. 1, pp. 162-176, Jan. 2020.
[9] H. Ye, G. Y. Li, and B. Juang, "Power of deep learning for channel estimation and signal detection in OFDM systems," IEEE Wireless Commun. Lett., vol. 7, no. 1, pp. 114-117, Feb. 2018.
[10] H. He, C. Wen, S. Jin and G. Y. Li, "Deep learning-based channel estimation for beamspace mmWave massive MIMO systems," IEEE Wireless Commun. Lett., vol. 7, no. 5, pp. 852-855, Oct. 2018.
[11] A. M. Elbir and K. V. Mishra, "Joint Antenna Selection and Hybrid Beamformer Design Using Unquantized and Quantized Deep Learning Networks," IEEE Transactions on Wireless Communications, vol. 19, no. 3, pp. 1677-1688, Mar. 2020.
[12] J. Joung, "Machine Learning-Based Antenna Selection in Wireless Communications," IEEE Communications Letters, vol. 20, no. 11, pp. 2241-2244, Nov. 2016.
[13] A. Alkhateeb, et al., "Channel estimation and hybrid precoding for millimeter wave cellular systems," IEEE Journal of Selected Topics in Signal Processing, vol. 8, no. 5, pp. 831-846, Oct. 2014.
[14] X. Gao, et al., "ComNet: combination of deep learning and expert knowledge in OFDM receivers," IEEE Commun. Lett., vol. 22, no. 12, pp. 2627-2630, Dec. 2018.
[15] Huang, et al., "DFT codebook-based hybrid precoding for multiuser mmWave massive MIMO systems," EURASIP J. Adv. Signal Process., 2020.