I Introduction
With the recent release of the 5G new radio (NR) standard, early 5G deployment has already started in various countries including South Korea, Canada, and China. 5G NR has a multitude of advantages over long-term evolution (LTE)/LTE-Advanced technology, e.g., higher data rates (0.1 Gbps), low latency (1 to 10 msec), higher mobility (500 km/h), and support for $10^6$ devices per sq. km. The intriguing use cases of 5G NR, such as ultra-reliable low latency communication (uRLLC), enhanced mobile broadband (eMBB), and massive machine-to-machine communication (mMTC), leverage three disruptive technologies, i.e., millimeter-wave (mmWave) communication, large-scale antenna arrays (i.e., massive MIMO), and ultra-dense deployment of access points.
Despite the aforementioned advancements, the global mobile traffic volume is expected to grow from 7.462 EB/month in 2010 to 5016 EB/month in 2030 [1]. Thus, the launch of sixth generation (6G) wireless networks is inevitable. 6G networks will see coexisting RF and mmWave deployments [2] as well as coexisting RF and visible light communication (VLC) deployments [3, 4]. In addition, higher frequencies in the terahertz (THz) band [0.1-10 THz] will be central to ubiquitous wireless communications in 6G. THz frequencies promise ample spectrum, data rates above one hundred gigabits per second (Gbps), massive connectivity, denser networks, and highly secure transmissions. Multiple leading 6G initiatives probe THz communications, including the “6Genesis Flagship Program (6GFP)”, the European Commission’s H2020 ICT-09 THz Project Cluster, and the “Broadband Communications and New Networks” in China. In the US, THz technology was identified in 2014 by the US Defense Advanced Research Projects Agency (DARPA) as one of the four major research areas that could impact society more than the internet. Similarly, the US National Science Foundation and the Semiconductor Research Consortium (SRC) also identify THz as one of the four essential components of the next IT revolution.
The THz spectrum lies between the mmWave and the far-infrared (IR) bands and has long been the least investigated part of the electromagnetic spectrum. However, recent advancements in THz signal generation, modulation, and radiation methods are closing the so-called THz gap. Nevertheless, channel propagation at THz frequency bands is susceptible to molecular absorption, blockages, and atmospheric gaseous losses due to oxygen molecule and water vapor absorption. On the other hand, the conventional RF spectrum is characterized by strong transmission powers and wider coverage; however, the spectrum is limited and extremely congested.
Evidently, THz networks have reduced coverage but plenty of spectrum, so there exists a tradeoff based on users’ channel quality and available spectrum. To overcome the tradeoffs between different frequencies, opportunistic spectrum selection mechanisms should be designed for a network where RF BSs and THz BSs coexist. The first work that considered a coexisting RF and dense THz network was presented in [5]. In a coexisting network, due to the stronger signal power from RF BSs, it is highly likely that the user will be biased towards RF SBSs, even though the THz BSs can provide very large transmission bandwidth yielding very high data rates and ultra-low latencies. As such, new traffic offloading and user clustering schemes will be crucial, so that users can be offloaded to different BSs and the resource utilization can be improved by balancing the traffic load among BSs.
In this paper, we propose two user equipment (UE) association algorithms based on unsupervised learning in a coexisting RF/THz network. The algorithms capture the network heterogeneity observed at RF and THz frequencies. They cluster UEs to every base station (BS) such that the traffic load across the network is balanced, i.e., by minimizing the standard deviation of the network traffic load. Numerical results show that the proposed algorithms outperform the classical algorithms in terms of data rate, traffic load balancing, and user fairness. Unlike typical unsupervised clustering algorithms (e.g., k-means, k-medoids, etc.) that search for appropriate cluster center locations, our algorithms identify the appropriate UEs to be associated to a certain BS such that the overall network load standard deviation (STD) is minimized subject to rate constraints. We use the standard deviation as a measure to identify how many BSs are overloaded or underloaded and need to redistribute UEs among them. The standard deviation depends on the distribution of UEs at every BS, so when the distribution changes, the standard deviation changes as well. We consider every BS as an independent cluster and associate UEs to each BS. STD minimization enables traffic load balancing among BSs and reduces the deviation of the load at every BS from the mean value.
II State-of-the-art: User Association
To date, several user association methods have been explored for 5G heterogeneous networks (HetNets) with the objective of traffic load balancing (a survey is provided in [6]). Some popular methods include cell-range expansion (CRE) [7] and resource-aware UE association algorithms [8].
In [7], the authors combined UE association algorithms using SBSs with traffic offloading using CRE. The drawback of CRE is that the MBS acts as a strong interferer, so strategies based on almost-blank subframes are required. In [8], a resource-aware user association scheme was proposed. In [9], the authors proposed a new user association algorithm that considers the highly directional mmWave links, network interference, and their vulnerability to small channel variations. The proposed algorithm accounts for the dependency between the network interference structure and user association, adjusting the interference according to the association under max-min fairness. Also, in [10], the authors considered the spectrum heterogeneity of mmWave frequency bands by introducing two access schemes: the single-band and multi-band access schemes. For the first access scheme, the authors developed an iterative algorithm based on Lagrangian dual decomposition and the Newton-Raphson method for joint user association and power allocation. For the multi-band access scheme, a Markov approximation framework is used to develop a near-optimum user association algorithm. The results revealed that different users can only access one band simultaneously. In [11], the authors solved a mixed-integer optimization problem to maximize the network throughput of time-variant mmWave networks, and suggested that distributed association techniques will solve the problems of wireless channel variations (due to obstacles) and client mobility.
For THz-only networks, [12] introduced a user association algorithm, based on the grey wolf optimizer, that maximizes the total throughput while taking into account the directivity and position of the BSs’ and UEs’ antennas as well as the minimum rate requirements. The proposed framework proved to be more efficient than the commonly used particle swarm optimizer (PSO) approach, and its only required control parameters are the population size and the number of iterations.
In [13], the authors devised an algorithm to increase the load of lightly loaded small cell BSs. They solved a logarithmic utility maximization problem considering multi-association to BSs, where equal resource allocation converges to a near-optimal solution. In [14], the authors employed a selective method of UE association, where the MBS coverage is divided into center and edge regions and SBSs are only active at the edges. In [15], the authors derived the connection probability and the average ergodic capacity for two types of multi-connectivity, namely closest line-of-sight access point and reactive connectivity. An important aspect of their model is that blockage is taken into account in the user association problem. However, the model ignores the fact that many BSs can be present in the same area.
III Network Model
The conventional RF macro base stations (MBSs) and THz base stations (TBSs) are randomly deployed, and each is equipped with multiple antennas serving multiple orthogonal streams. The users are also randomly deployed. Following Eq. 2 in [16], the data rate of a user associated to a given tier is determined by the bandwidth available at that tier and by the user's SINR, and it includes a factor that represents the massive MIMO gain at the user. The massive MIMO regime refers to the case where the number of BS antennas is much larger than the number of streams. In what follows, we describe the channel propagation and signal-to-interference-plus-noise ratio (SINR) at THz and RF frequencies.
III-1 RF Channel and SINR Model
The RF channel experiences both channel fading and path loss. Thus, the received signal power at the typical user is modeled in terms of the exponentially distributed (Rayleigh fading) channel power with unit mean from the tagged SBS, the path-loss exponent, and the distance of the considered user to the serving SBS, together with a constant that depends on the RF carrier frequency and the speed of light. Based on this, the SINR of a typical user on the RF channel is calculated from the transmit power of the SBSs and the thermal noise at the receiver.
III-2 THz Channel and SINR Model
Since the molecular absorption loss is high at THz frequencies, the impact of multipath fading and NLOS transmission is negligible. Thus, we model the LOS channel power between users and TBSs in terms of the molecular absorption coefficient, the distance between the transmitter and receiver, the frequency at which the THz devices operate, and the speed of light. The LOS SINR of the typical user associated to its desired TBS is calculated from the transmit power of the TBS and the noise power, which comprises the thermal noise and the noise resulting from molecular absorption.
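The two channel models described above can be sketched numerically. The snippet below is a minimal illustration under stated assumptions, not the exact expressions of the paper: the frequency-dependent constant is taken to be the free-space factor (c/(4πf))², THz antenna gains are set to one, and interference and noise are passed in as plain numbers. All function and parameter names are hypothetical.

```python
import math

C = 3.0e8  # speed of light in m/s


def friis_constant(freq_hz):
    """Assumed frequency-dependent factor (c / (4*pi*f))^2."""
    return (C / (4.0 * math.pi * freq_hz)) ** 2


def rf_rx_power(p_tx, freq_hz, fading, dist_m, alpha):
    """RF received power: transmit power x constant x fading x distance^(-alpha).

    `fading` is one sample of the unit-mean exponential (Rayleigh) channel power.
    """
    return p_tx * friis_constant(freq_hz) * fading * dist_m ** (-alpha)


def thz_rx_power(p_tx, freq_hz, dist_m, k_abs):
    """THz LOS received power: spreading loss 1/d^2 times absorption exp(-k*d).

    Antenna gains are assumed to be one in this sketch.
    """
    return p_tx * friis_constant(freq_hz) * math.exp(-k_abs * dist_m) / dist_m ** 2


def sinr(signal, interference, noise):
    """Signal-to-interference-plus-noise ratio in linear scale."""
    return signal / (interference + noise)


# Example with the simulation frequencies, path-loss exponent, and absorption
# coefficient; transmit power, distances, and noise are placeholder values.
rf = rf_rx_power(p_tx=1.0, freq_hz=300e6, fading=1.0, dist_m=200.0, alpha=3.0)
thz = thz_rx_power(p_tx=1.0, freq_hz=300e9, dist_m=20.0, k_abs=0.0016)
print(sinr(rf, 0.0, 1e-13), sinr(thz, 0.0, 1e-13))
```

Because the constant scales with the inverse square of the frequency, the THz link starts from a far smaller received power than the RF link at the same distance, which is the coverage side of the tradeoff discussed in the introduction.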
IV Proposed Algorithms
In this section, we propose two unsupervised clustering algorithms, referred to as the Least Standard Deviation-based clustering and the mean traffic load-based clustering algorithms, which differ from conventional unsupervised clustering algorithms. For example, K-means clustering uses the square of the distance from a centroid to minimize the clustering error. Other modified K-means methods depend on the standard deviation, where they search for the location of the cluster having the maximum standard deviation from a centroid [17, 18].
IV-1 Least Standard Deviation User Clustering Algorithm
In our algorithm, a binary matrix is generated by choosing an acceptable SINR level for UEs. A UE whose SINR from a BS falls below that level cannot connect to the BS and is discarded (logical value 0), whereas a UE with an acceptable SINR from a BS takes the logical value 1. We define the number of possible BSs per UE and start from the UE with the fewest possible BSs; a UE with only one possible BS is assigned to it directly. If a UE has more than one possible BS, the BS with the fewest possibilities is considered first and the UE attempts to connect to the least loaded BS. For the other UEs in the network, the least loaded BS is considered, and the number of UEs associated to every BS is then updated. The load STD of the network is calculated and checked against a chosen threshold. The procedure repeats for the next UE with the fewest available BSs until the algorithm converges and the network load STD is below the threshold. The procedure is detailed in Algorithm 1 and in the steps below:
(i) Initialize the number of BSs, the number of UEs, the SINR threshold, and the load per BS vector, and set the standard deviation threshold to 1.
(ii) Calculate the SINR matrix and generate the binary matrix.
(iii) Compute the number of possible BSs per UE and the number of UEs per BS.
(iv) Start from the UE with the smallest number of possible BSs.
(v) If the UE has only one possible BS, associate the UE to that BS; otherwise, attempt to connect the UE to the BS with the smallest load.
(vi) Calculate the final load per BS and the STD.
(vii) If the STD is below the threshold, the optimum load per BS is obtained; otherwise, repeat the steps until the STD falls below the threshold.
IV-2 Redistribution of BSs Load (RBL)-based Clustering
We propose a non-parametric unsupervised learning algorithm. Our algorithm forms BS clusters based on UEs’ calculated SINR levels and not according to Euclidean distance, as signal strength is the main concern. First, UEs are associated to BS clusters based on the maximum SINR value. Our algorithm learns from the load per BS and defines the BS status (some BSs might be overloaded and some underloaded). Re-association of UEs is then carried out, where overloaded BS clusters release some of their UEs (the UEs with the strongest signals are chosen) and donate them to BSs with less load. Since the UEs with the strongest signals are chosen, signal quality is not affected when load balancing is carried out. Our algorithm is real-time, in that the load can be redistributed instantly from one cluster to another. The mean value of UEs per BS varies for every tier, as every BS has a certain capacity to associate UEs (due to the different transmission bandwidths available in RF and THz). The mean value is defined as the maximum number of UEs that can be associated to a BS (considering UEs’ traffic demand) divided by the number of BSs in a tier.
The procedure is detailed in Algorithm 2 and in the steps below:
(i) Initialize the number of BSs, the number of UEs, the mean value of UEs per BS, and the number of UEs per BS.
(ii) Calculate the SINR matrix and associate UEs to BSs based on max-SINR.
(iii) Calculate the mean value per tier and the number of UEs per BS. If the number of UEs at a BS exceeds the mean value, the BS status is "overloaded".
(iv) Sort the UEs associated to "overloaded" BSs based on highest SINR. If the number of UEs at a BS is below the mean value, the BS status is "accepting".
(v) Sort the accepting BSs based on highest SINR value. Move the UE on top of the "overloaded" BS list to the BS on top of the "accepting" BSs list. Repeat for the rest of the UEs of the first "overloaded" BS, and then for the rest of the "overloaded" BSs.
(vi) If a BS is neither overloaded nor accepting, finalize the UEs associated with that BS. For moving or newly added UEs, calculate the new SINR matrix and repeat.
Repeat for the other tier (THz) and generate the final load distribution per BS.
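The two procedures of this section can be sketched in a few lines. This is a simplified illustration under assumptions, not Algorithms 1 and 2 themselves: it handles a single tier, runs one greedy pass instead of iterating until the STD threshold is met, breaks ties by BS index, and does not recompute SINR after a move. The names `lstd_associate`, `rbl_associate`, `sinr_min`, and `mean_load` are hypothetical.

```python
from statistics import pstdev


def lstd_associate(sinr, sinr_min):
    """Least-STD style pass: UEs with the fewest feasible BSs are served first,
    each one joining its least loaded feasible BS."""
    n_ue, n_bs = len(sinr), len(sinr[0])
    # Binary feasibility: BS b is possible for UE u if SINR meets the threshold.
    feasible = [[b for b in range(n_bs) if sinr[u][b] >= sinr_min]
                for u in range(n_ue)]
    load = [0] * n_bs
    assoc = [None] * n_ue
    for u in sorted(range(n_ue), key=lambda i: len(feasible[i])):
        if not feasible[u]:
            continue  # no BS meets the SINR threshold; UE stays unassociated
        b = min(feasible[u], key=lambda j: load[j])
        assoc[u] = b
        load[b] += 1
    return assoc, load, pstdev(load)


def rbl_associate(sinr, mean_load):
    """RBL style pass: max-SINR association, then overloaded BSs hand their
    strongest UEs to accepting BSs (those below the mean load)."""
    n_ue, n_bs = len(sinr), len(sinr[0])
    assoc = [max(range(n_bs), key=lambda j: sinr[u][j]) for u in range(n_ue)]
    load = [assoc.count(b) for b in range(n_bs)]
    for b in range(n_bs):
        # UEs currently on BS b, strongest signal first.
        ues = sorted((u for u in range(n_ue) if assoc[u] == b),
                     key=lambda i: sinr[i][b], reverse=True)
        for u in ues:
            if load[b] <= mean_load:
                break  # BS b is no longer overloaded
            accepting = [t for t in range(n_bs) if load[t] < mean_load]
            if not accepting:
                break  # nobody can take more UEs
            t = max(accepting, key=lambda j: sinr[u][j])
            assoc[u] = t
            load[b] -= 1
            load[t] += 1
    return assoc, load


# Toy SINR matrix: 4 UEs (rows) by 2 BSs (columns), linear scale.
toy = [[5.0, 1.0], [4.0, 2.0], [3.0, 0.6], [2.0, 0.1]]
print(lstd_associate(toy, sinr_min=0.5))
print(rbl_associate(toy, mean_load=2))
```

In this toy case max-SINR alone would place all four UEs on the first BS, whereas both sketches end with two UEs per BS. The difference is in which UEs move: the first sketch decides by feasibility and load, the second moves the UEs with the strongest signals.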
V Numerical Results and Discussions
MATLAB simulations are conducted to analyze the performance of our proposed algorithms. The values of the simulation parameters are shown below:

RF frequency = 300 MHz and THz frequency = 300 GHz

Total number of TBSs = 76

Total number of MBSs = 10

No. of RF BS antennas = 1000

No. of THz BS antennas = 200

Working area =

Min allowed distance between RF BSs = 400 m

Min allowed distance between THz BSs = 100 m

Path loss exponent = 3

Molecular absorption coefficient = 0.0016 m$^{-1}$

SINR threshold = 0.5

Standard deviation threshold = 1
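A random deployment that respects the minimum inter-BS distances listed above could be generated as follows. This is only a sketch under assumptions: it uses simple rejection sampling in a square area, the side length of 3000 m is a placeholder because the working area is not given here, and the function name `deploy` is hypothetical.

```python
import math
import random


def deploy(n, side_m, min_dist_m, rng, max_tries=100000):
    """Place n points uniformly in a square of side side_m, rejecting any
    candidate closer than min_dist_m to an already placed point."""
    points = []
    tries = 0
    while len(points) < n:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("could not place all points; area too small")
        p = (rng.uniform(0.0, side_m), rng.uniform(0.0, side_m))
        if all(math.dist(p, q) >= min_dist_m for q in points):
            points.append(p)
    return points


rng = random.Random(0)           # fixed seed for repeatability
SIDE_M = 3000.0                  # assumed placeholder, not a value from the text
mbs = deploy(10, SIDE_M, 400.0, rng)   # 10 MBSs, at least 400 m apart
tbs = deploy(76, SIDE_M, 100.0, rng)   # 76 TBSs, at least 100 m apart
print(len(mbs), len(tbs))
```

Rejection sampling is adequate here because the excluded discs cover only a small part of the assumed area; for much denser layouts a different point process would be needed.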
Fig. 1 shows the standard deviation of the first proposed algorithm as it converges with iterations. As more UEs need to be associated in the network, the objective function takes slightly more time to converge. Next, Fig. 2 shows the data rate of various algorithms compared with our proposed algorithms. For the max-SINR scheme, most UEs select the MBSs, so fewer resources are available to them. The SINR-based scheme provides a slight improvement in data rate. The CRE and rate-based schemes provide a significant improvement over the max-SINR scheme, as more UEs are offloaded from MBSs to TBSs due to the biasing factor. As a result, more resources are available for MBS UEs. Finally, Fig. 3 shows Jain’s index of our proposed algorithms compared to the max-SINR scheme. It is interesting to note that, for our proposed algorithms, fairness improves as more UEs are associated in the network. As more UEs are available in the network, BSs get more opportunities to associate UEs and the network load is better balanced, so fairness improves. LSTD and RBL achieve similar performance, with LSTD yielding an improvement over RBL (LSTD=0.96 and RBL=0.89 for 500 UEs).
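For reference, the two metrics discussed above can be computed as follows. This is a minimal sketch using the standard definition of Jain's index, (Σx)² / (n Σx²), and the population standard deviation; whether the index is applied to per-UE rates or per-BS loads is an assumption left to the caller, and the function names are hypothetical.

```python
from statistics import pstdev


def jain_index(values):
    """Jain's fairness index: 1.0 when all values are equal, 1/n when a
    single entry holds everything."""
    total = sum(values)
    squares = sum(v * v for v in values)
    if squares == 0:
        return 0.0  # convention chosen here for an all-zero input
    return total * total / (len(values) * squares)


def load_std(loads):
    """Population standard deviation of the per-BS load."""
    return pstdev(loads)


balanced = [2, 2, 2, 2]
skewed = [4, 0, 0, 0]
print(jain_index(balanced), load_std(balanced))
print(jain_index(skewed), load_std(skewed))
```

A perfectly balanced load gives an index of one and zero standard deviation, which is why minimizing the load STD and improving fairness move together in the results.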
VI Conclusion and Future Research Directions
In this paper, we analyzed the performance of a massive MIMO-enabled coexisting RF and THz network. We noted that conventional user association schemes may not be well-suited to 6G coexisting RF and THz networks. Subsequently, we proposed two user association algorithms using tools from unsupervised learning. Several unique challenges, however, still have to be addressed to achieve the full potential of THz communications. For instance, THz transmissions incur very high propagation losses, which significantly limit the communication distances. Hence, while THz frequencies can provide low-latency communication in aerial, satellite, and vehicular networks, the propagation losses can hinder the gains. Moreover, the coexistence of mmWave, sub-6 GHz, and optical wireless communications and networking is not yet fully understood. Finally, reconfigurable intelligent surfaces, ultra-massive MIMO configurations, and integrated access and backhaul can boost the gains of THz communications.
References
 [1] A. Yastrebova, R. Kirichek, Y. Koucheryavy, A. Borodin, and A. Koucheryavy, “Future networks 2030: architecture & requirements,” in 2018 10th International Congress on Ultra Modern Telecommunications and Control Systems and Workshops (ICUMT), 2018, pp. 1–8.
 [2] H. Ibrahim, H. Tabassum, and U. T. Nguyen, “The meta distributions of the SIR/SNR and data rate in coexisting sub-6GHz and millimeter-wave cellular networks,” arXiv preprint arXiv:1905.12002, 2019.
 [3] H. Tabassum and E. Hossain, “Coverage and rate analysis for coexisting RF/VLC downlink cellular networks,” IEEE Transactions on Wireless Communications, vol. 17, no. 4, pp. 2588–2601, 2018.
 [4] H. Elgala, M.-S. Alouini, H. Haas, M. Rahaim, H. Tabassum, and T. Watanabe, “Introduction to the special section on coexisting radio and optical wireless deployments (CROWD),” IEEE Transactions on Cognitive Communications and Networking, vol. 5, no. 4, pp. 1178–1181, 2019.
 [5] J. Sayehvand and H. Tabassum, “Interference and coverage analysis in coexisting RF and dense terahertz wireless networks,” IEEE Wireless Communications Letters, 2020.
 [6] D. Liu, L. Wang, Y. Chen, M. Elkashlan, K.-K. Wong, R. Schober, and L. Hanzo, “User association in 5G networks: A survey and an outlook,” IEEE Communications Surveys & Tutorials, vol. 18, no. 2, pp. 1018–1044, 2016.
 [7] F. Muhammad, Z. H. Abbas, and F. Y. Li, “Cell association with load balancing in non-uniform heterogeneous cellular networks: coverage probability and rate analysis,” IEEE Transactions on Vehicular Technology, vol. 66, no. 6, pp. 5241–5255, Jun. 2016.
 [8] E. Hossain, M. Rasti, H. Tabassum, and A. Abdelnasser, “Evolution toward 5G multi-tier cellular wireless networks: An interference management perspective,” IEEE Wireless Communications, vol. 21, no. 3, pp. 118–127, 2014.
 [9] A. Alizadeh and M. Vu, “Load balancing user association in millimeter wave MIMO networks,” IEEE Transactions on Wireless Communications, 2019.
 [10] R. Liu, Q. Chen, G. Yu, and G. Y. Li, “Joint user association and resource allocation for multi-band millimeter-wave heterogeneous networks,” IEEE Transactions on Communications, vol. 67, no. 12, pp. 8502–8516, 2019.
 [11] Y. Xu, H. S. Ghadikolaei, and C. Fischione, “Adaptive distributed association in time-variant millimeter wave networks,” IEEE Transactions on Wireless Communications, vol. 18, no. 1, pp. 459–472, 2018.
 [12] A. A. Boulogeorgos, S. K. Goudos, and A. Alexiou, “Users association in ultra dense THz networks,” in 2018 IEEE 19th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), Aug. 2018, pp. 1–5.
 [13] Q. Ye, B. Rong, Y. Chen, M. Al-Shalash, C. Caramanis, and J. G. Andrews, “User association for load balancing in heterogeneous cellular networks,” IEEE Transactions on Wireless Communications, vol. 12, no. 6, pp. 2706–2716, Jun. 2013.
 [14] Z. H. Abbas, F. Muhammad, and L. Jiao, “Analysis of load balancing and interference management in heterogeneous cellular networks,” IEEE Access, vol. 5, pp. 14690–14705, Jul. 2017.
 [15] A. Shafie, N. Yang, and C. Han, “Multi-connectivity for indoor terahertz communication with self and dynamic blockage,” 2020.
 [16] H. Tabassum, A. H. Sakr, and E. Hossain, “Analysis of massive MIMO-enabled downlink wireless backhauling for full-duplex small cells,” IEEE Transactions on Communications, vol. 64, no. 6, pp. 2354–2369, 2016.
 [17] K. Thangavel and D. A. Kumar, “A combined standard deviation based data clustering algorithm,” Journal of Modern Applied Statistical Methods, vol. 5, no. 1, pp. 1–9, 2006.
 [18] D. T. Kuttiyannan and A. K. D, “A new clustering technique using standard deviation,” in National Seminar on Recent Developments in Concrete Mathematics, Mar. 2002, pp. 1–8.