The demand for cellular data is experiencing an unprecedented increase. The next-generation wireless cellular network is estimated to support a multi-fold increase in wireless data traffic by 2030. To cope with this exponential increase in demand, there has been growing interest in network densification for cellular systems as a means to improve spectrum efficiency and cellular network capacity.
The need for additional base stations (BSs) is more pronounced in cellular hotspot areas that exhibit a steep surge in data demand during temporary events, such as concerts and football games. To satisfy such temporary surges in traffic, the use of an unmanned aerial vehicle (UAV) as an aerial BS can be a more flexible and cost-effective approach than a traditional ground BS. A mobile UAV can intelligently change its position, which makes it well suited to providing on-demand wireless service to ground users, thus overcoming coverage holes and alleviating congestion.
In order to deploy UAVs in a timely and flexible manner, network operators must be able to predict potential hotspots and congestion events a priori. To this end, there is a need to apply machine learning (ML) techniques to analyze demand patterns. The ability of ML to exploit big data analytics enables a comprehensive prediction of a network’s traffic amount and data distribution. By using such predictions, aerial UAV BSs can be optimally deployed to the target area beforehand, thus providing an on-demand, delay-free and power-efficient wireless service to ground users.
The use of UAVs as cellular BSs has been addressed in [6, 4, 7, 8, 9, 10]. Meanwhile, in  and , the authors studied the use of UAVs as flying BSs to provide energy-efficient service to wireless users. Moreover, the work in  and  focus on using UAVs as relays, and the work in  studies an energy-efficient trajectory design. However, most of the existing works assume a time-invariant wireless network, or a given distribution of cellular users. To properly analyze an on-demand deployment of UAV BSs, the temporal and spatial patterns of the cellular traffic data must be predicted so as to optimally deploy UAVs to satisfy a time-varying data demand.
There are existing works, such as [9, 11], and , that apply ML techniques to optimize UAV deployment. In , a neural model is formulated to study the mapping of UAVs to hotspot areas. The authors in studied trajectory optimization using neural networks, while a segmented regression approach is proposed in for UAV channel modeling, based on the terrain topology. However, none of these works demonstrates the benefit of applying ML to deploy UAVs on-demand and improve power efficiency and network performance. In order to analyze the data traffic of cellular networks, the authors in studied a BS sleeping strategy for minimizing power consumption. However, the authors focused only on a low-traffic cellular network, an approach that does not scale to the more practical, congested scenarios.
The main contribution of this paper is a novel machine learning framework that enables operators to predict congestions and hotspot events, and subsequently, deploy temporary UAV BSs to provide aerial wireless service to mobile users, while minimizing the UAV power needed for downlink communications and mobility. We consider a heterogeneous cellular network, in which ground BSs can offload the wireless service to aerial UAVs when the predicted data demand of mobile users exceeds the network capacity. To guarantee a no-delay wireless service, a Gaussian mixture model (GMM) is introduced based on a weighted expectation maximization (WEM) algorithm  to predict the cellular data traffic. Then, the optimal deployment of UAVs is studied to minimize the power needed for UAV transmission and mobility, given the predicted traffic. To this end, we first study the division of service areas, based on a fairness principle. Then, we derive the optimal UAV locations that can minimize the total power consumption of the network. To the best of our knowledge, this is the first work that leverages ML to predictively deploy UAVs as aerial BSs. Simulation results show that the proposed approach can reduce the required downlink transmit power and improve the power efficiency by over , compared with an optimal deployment of UAVs with no ML prediction.
II System Model and Problem Formulation
Consider a time-variant heterogeneous cellular network that serves a group of cellular users distributed in a geographical area . The cellular network consists of a set of UAVs and a set of BSs. Each user can receive data from both ground BSs and UAVs. Initially, a traditional BS will be chosen to serve the wireless users. However, if the downlink of the ground cellular system is overloaded due to heavy traffic, the ground BS will request the deployment of UAV BSs to offload some of its users.
Ground BSs and UAVs employ different frequency bands for downlink communications. Each UAV is equipped with directional antennas that enable beamforming. Therefore, interference among UAVs is negligible. Furthermore, each UAV adopts a frequency division multiple access (FDMA) technique and assigns a dedicated channel to one of its downlink users. Hereinafter, we use the notion of an aerial cell to indicate the service area of each UAV, and aerial cellular users to indicate users that are served by UAV cellular BSs.
Each UAV has a limited energy resource, which must be used efficiently for joint communications and mobility. To this end, the UAVs should intelligently change their positions to meet the users’ required data rates, as well as to minimize their transmission power. However, given that the cellular network is time-variant, the cellular traffic demand will change over time, which complicates efficient deployment. To guarantee timely aerial service without having UAVs move continuously, the network operator can use ML techniques to predict its network’s data demand and then request the deployment of UAV BSs to the predicted hotspot areas before the congestion occurs.
II-A Air-to-ground channel model
Given a typical ground receiver located at and a UAV located at , the path loss of the downlink communication from UAV to the receiver will be :
where is the distance between the ground receiver and UAV , is the carrier frequency, is the speed of light, and is the average additional loss over the free-space propagation loss, which depends on the environment. If the wireless link between UAV and a ground user is line-of-sight (LOS), ; otherwise, the non-line-of-sight (NLOS) link has an additional loss of . The NLOS link will experience a high path loss due to shadowing and reflection. The probability of existence of LOS links between UAV and the ground user will then be :
where and are constant values which depend on the environment, and is the elevation angle of UAV with respect to the receiver. Then, the probability of having an NLOS link is .
Consequently, the average path loss from UAV to the ground receiver at in the linear scale can be given as
Therefore, the downlink capacity that UAV can provide to a mobile user located at will be:
where is the transmission bandwidth of UAV , is the transmission power, is the antenna gain of UAV , and is the average noise power spectral density. For tractability, we assume a perfect beam alignment between the UAV and the mobile receiver, and each UAV has the same antenna gain. Therefore, , which is a constant for all and . Assume that the total available bandwidth of UAV is and the number of mobile users associated with UAV is , then the downlink bandwidth of each channel will be .
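The channel model described above can be illustrated numerically. The following is a minimal sketch, assuming the widely used sigmoid form of the LOS probability and a free-space loss plus an environment-dependent additional loss; all function and parameter names are hypothetical and are not taken from the paper.

```python
import math

C_LIGHT = 3.0e8  # speed of light in m/s


def elevation_deg(uav_xyh, user_xy):
    """Elevation angle (degrees) of the UAV as seen from a ground user."""
    ground = math.hypot(uav_xyh[0] - user_xy[0], uav_xyh[1] - user_xy[1])
    return math.degrees(math.atan2(uav_xyh[2], ground))


def p_los(theta_deg, a, b):
    """Assumed sigmoid LOS probability; a and b are environment constants."""
    return 1.0 / (1.0 + a * math.exp(-b * (theta_deg - a)))


def avg_path_loss_linear(uav_xyh, user_xy, fc_hz, eta_los_db, eta_nlos_db, a, b):
    """Average path loss (linear scale): LOS and NLOS losses mixed by probability."""
    dx = uav_xyh[0] - user_xy[0]
    dy = uav_xyh[1] - user_xy[1]
    d = math.sqrt(dx * dx + dy * dy + uav_xyh[2] ** 2)
    # Free-space path loss in dB for carrier frequency fc_hz and distance d.
    fspl_db = 20.0 * math.log10(4.0 * math.pi * fc_hz * d / C_LIGHT)
    prob = p_los(elevation_deg(uav_xyh, user_xy), a, b)
    loss_los = 10.0 ** ((fspl_db + eta_los_db) / 10.0)
    loss_nlos = 10.0 ** ((fspl_db + eta_nlos_db) / 10.0)
    return prob * loss_los + (1.0 - prob) * loss_nlos


def capacity_bps(bandwidth_hz, tx_power_w, gain_linear, path_loss_linear, n0_w_per_hz):
    """Shannon capacity of one dedicated downlink channel."""
    snr = tx_power_w * gain_linear / (path_loss_linear * bandwidth_hz * n0_w_per_hz)
    return bandwidth_hz * math.log2(1.0 + snr)
```

With FDMA, `bandwidth_hz` would be the per-user share of the UAV's total bandwidth, i.e. the total bandwidth divided by the number of users in the aerial cell.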
The number of aerial users that are served by UAV within its aerial cell is given by:
where is the total number of aerial users, is the service area of UAV , and is the distribution of aerial users. In order to provide a universal wireless service, the aerial cells of all UAVs should fully cover the geographical area without overlap. That is, , and for and , . However, note that, the value of and the user distribution will change, according to the offloading requests from ground BSs.
II-B Cellular traffic analysis
To provide an on-demand service, network operators need to change the UAVs’ locations frequently, according to the offload requests from ground BSs, to satisfy the instant traffic demand. However, such continuous movement will consume excessive power. To efficiently deploy UAVs while guaranteeing a no-delay wireless service, a dataset of the cellular traffic history can be exploited by the network operator for traffic prediction. This dataset, represented by a matrix , records discrete data during each time period for days:
where is a discrete set of time, and the unit of is hour. The first item represents the number of aerial users that are offloaded from a BS at to a UAV during a time interval from to , and the second item denotes the amount of cellular traffic that a UAV needs to provide for the aerial users from a BS at during the period from to .
Let be the total number of aerial users, be the total amount of aerial cellular traffic, be the spatial distributions of aerial users, and be the spatial distribution of aerial data traffic in . Without a comprehensive analysis of , the values of , , and will change over time, based on the offloading requests of ground BSs, which causes a frequent movement of UAVs to meet the instant traffic demand, and excessive power consumed on mobility.
Therefore, our goal is to develop a centralized ML approach to predict and based on , and and based on , such that at the beginning of each period , network operators can optimally deploy UAVs to minimize the power consumptions, while during each interval the locations of UAVs remain fixed.
II-C Data rate requirement
Given the predicted information on the total amount of aerial cellular traffic , and the distribution of aerial cellular traffic , the average data rate requirement within a service area of UAV can be given by
Since the communication capacity of UAV should be greater than or equal to the rate demand of all users in its aerial cell , we formulate the data rate requirement as follows,
Note that, the values of and in (10) will depend on the output of the cellular traffic analysis.
Consequently, the total transmit power of all UAVs needed to satisfy the data demand of all aerial users in will be:
Without loss of generality, we assume that the maximum transmission power of UAVs is sufficient to meet the data demand of aerial users. Meanwhile, the total power for each UAV to move from its current location to the new location will be:
where is the rate of energy consumption a UAV needs to move by one meter.
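To make the two cost terms concrete, the sketch below inverts the Shannon capacity to obtain the smallest transmit power that meets a given rate, and computes the movement cost as a per-meter rate times the travelled distance. It is an illustration under these assumptions only; the function names and argument conventions are hypothetical.

```python
import math


def required_tx_power_w(rate_bps, bandwidth_hz, path_loss_linear,
                        n0_w_per_hz, gain_linear=1.0):
    """Smallest transmit power whose Shannon capacity reaches rate_bps."""
    snr_needed = 2.0 ** (rate_bps / bandwidth_hz) - 1.0
    return snr_needed * path_loss_linear * bandwidth_hz * n0_w_per_hz / gain_linear


def mobility_energy_j(old_xy, new_xy, joules_per_meter):
    """Energy spent moving in a straight line between two locations."""
    return joules_per_meter * math.dist(old_xy, new_xy)
```

The exponential dependence on `rate_bps / bandwidth_hz` is why a small per-user bandwidth share can make the required transmit power grow quickly.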
Then, the second objective is to jointly find the optimal location and the partition of the service area for each UAV , such that the total power used for downlink transmissions and mobility can be minimized, i.e.,
where is the available power of UAV , and is a constant for all . The first constraint represents a fairness principle, whereby the ratio of the data traffic offloaded to each UAV equals to the ratio of the available power of each UAV. The second and third constraints guarantee that the service areas of all UAVs fully cover without overlap.
Note that, without an ML analysis, the function , as well as , will change, based on the offloading requests of ground BSs. Thus, the network operator needs to reorganize the aerial cellular system to meet the instant traffic demand frequently. However, with the predicted information of cellular traffic, the optimal problem (13) is fixed within each period . Therefore, at the beginning of each interval, UAVs are deployed according to the solution of (13), and within the period, the location and aerial cell of each UAV remain unchanged.
III Proposed Prediction and UAV Deployment Framework
Next, we propose a novel approach to address the aforementioned problems. First, a centralized ML approach will be proposed to predict the values of , , and for each time interval . With the prediction information, the power minimization problem in (13) will be solved to optimally deploy each UAV.
III-A Cellular traffic prediction
To obtain a robust and practical analysis, we use the real City Cellular Traffic Map dataset , which records the time, the location of each BS, the number of mobile users, and the total amount of data that each BS serves during each hour, from Aug. 19 to Aug. 26, 2012, in a medium-sized city in China. (Our approach can accommodate other datasets without loss of generality.) We assume that the maximum number of mobile users that each BS can serve within one hour is a fixed number of , and the maximum amount of cellular data is a constant for all BSs. Thus, a new dataset is generated to capture the traffic of the aerial cellular network as , in which is the number of aerial users from hour to , and is the amount of aerial cellular traffic. For notational simplicity, hereinafter, we use to denote the aerial traffic dataset, instead of . Since and have an analogous data structure, a similar approach will be applied to analyze and . For simplicity, we restrict the following discussion to . Therefore, the objective is to use ML to formulate the temporal and spatial pattern of .
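One plausible way to derive the aerial dataset from the raw records is to treat the aerial load of a BS in a given hour as the excess over its fixed capacity. The sketch below illustrates this reading; the rule itself and the function name are assumptions, since the text only states that the per-BS limits are fixed.

```python
def aerial_traffic(users, traffic, max_users, max_traffic):
    """Split one BS-hour record into the part assumed to be offloaded to UAVs."""
    # Assumption: only the load above the BS capacity is offloaded.
    aerial_users = max(users - max_users, 0)
    aerial_data = max(traffic - max_traffic, 0.0)
    return aerial_users, aerial_data
```

For example, a BS that sees 120 users in an hour with a limit of 100 would offload 20 users, while a BS below its limits contributes nothing to the aerial dataset.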
There are three key assumptions in the following ML analysis. First, due to the periodicity of human activity, the cellular traffic presents a repetitive daily pattern . Based on this observation, we assume that the total cellular traffic during a specific hour of different days follows the same distribution. Therefore, we divide the dataset into subsets, by merging the data of the same hour from different days. Second, we assume that the traffic amount between each hour of one day is independent. Therefore, given the sub-datasets, independent models will be built to study the pattern of each objective value of each hour. Furthermore, we assume that the temporal feature of is independent from the spatial distribution. As a result, two separate models will be studied to identify the temporal feature and the spatial feature of for each hour.
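The first assumption amounts to pooling records that share the same hour of day across all recorded days. A minimal sketch of this grouping is given below; the record layout `(day, hour, bs_xy, value)` is a hypothetical convention chosen for illustration.

```python
from collections import defaultdict


def split_by_hour(records):
    """Group (day, hour, bs_xy, value) records into one sub-dataset per hour of day."""
    subsets = defaultdict(list)
    for day, hour, bs_xy, value in records:
        subsets[hour % 24].append((day, bs_xy, value))
    return dict(subsets)
```

Each resulting sub-dataset is then modeled independently, in line with the second assumption.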
The model to capture the temporal and spatial characteristics of relies on a GMM, which assumes that the data distribution can be modeled by the sum of multiple Gaussians with different weights as , where is a general data point, is the probability distributed at, is the number of individual Gaussian models in GMM, and , is the mixing coefficient for each Gaussian.
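For reference, the standard form of a Gaussian mixture density is given below. The symbol names ($\boldsymbol{x}$ for a data point, $K$ for the number of components, $\pi_k$, $\boldsymbol{\mu}_k$ and $\boldsymbol{\Sigma}_k$ for the mixing coefficient, mean and covariance of component $k$) are assumptions chosen for this illustration.

```latex
p(\boldsymbol{x}) = \sum_{k=1}^{K} \pi_k \,
  \mathcal{N}\!\left(\boldsymbol{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k\right),
\qquad
\sum_{k=1}^{K} \pi_k = 1, \quad 0 \le \pi_k \le 1 .
```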
III-A1 Spatial distribution model
First, we study the modeling approach of the spatial feature of . Given a time , the data distribution of the cellular traffic from to can be calculated by
Then, a dataset is formed by all the distribution of days for the specific hour , and we seek to build a GMM to capture a pattern of data distribution for time as
is the location vector. To find the parameters of, , , and , for a given , and , first, a classification approach based on a weighted K-means method is used to group the data into clusters, and the weight is the data amount at . Then, the WEM algorithm will be used to find the optimal parameters of GMM. The convergence of the WEM iterative approach can be evaluated by the log likelihood function as
whose value increases with the number of iterations. Our detailed approach is summarized in Algorithm 1.
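The iterative fitting step can be sketched as follows. This is an illustration of one common weighted EM variant, in which each data point contributes to the log-likelihood in proportion to its weight (here, the data amount at a BS location); it may differ in detail from the cited WEM algorithm. The weighted K-means initialization is replaced by a simpler weight-proportional random draw, and all names are hypothetical.

```python
import numpy as np


def gaussian_pdf(X, mean, cov):
    """Multivariate normal density evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mean
    inv = np.linalg.inv(cov)
    expo = -0.5 * np.einsum("ni,ij,nj->n", diff, inv, diff)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(expo) / norm


def weighted_em_gmm(X, w, K, n_iter=50, seed=0, reg=1e-6):
    """Fit a K-component GMM in which point n counts w[n] times in the likelihood."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    w = np.asarray(w, dtype=float)
    # Simplified initialization (stand-in for weighted K-means):
    # draw K distinct points with probability proportional to their weight.
    means = X[rng.choice(N, size=K, replace=False, p=w / w.sum())].copy()
    covs = np.array([np.cov(X.T, aweights=w) + reg * np.eye(d) for _ in range(K)])
    pis = np.full(K, 1.0 / K)
    history = []  # weighted log-likelihood before each update
    for _ in range(n_iter):
        # E-step: responsibilities r[n, k] of component k for point n.
        dens = np.stack(
            [pis[k] * gaussian_pdf(X, means[k], covs[k]) for k in range(K)], axis=1
        )
        total = dens.sum(axis=1)
        history.append(float(np.sum(w * np.log(total))))
        r = dens / total[:, None]
        # M-step: every sufficient statistic is scaled by the point weight.
        wr = w[:, None] * r
        Nk = wr.sum(axis=0)
        pis = Nk / Nk.sum()
        means = (wr.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - means[k]
            covs[k] = (wr[:, k, None] * diff).T @ diff / Nk[k] + reg * np.eye(d)
    return pis, means, covs, history
```

The returned `history` can be inspected to check convergence; setting all weights to one recovers the ordinary EM algorithm for a GMM.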
III-A2 Temporal distribution model
Given a time , the total aerial traffic amount in the system from to can be calculated by . By gathering the data of days, we have a dataset . The GMM that captures the temporal pattern of is . The approach to modeling the temporal distribution for , is similar to Algorithm 1. However, neither the K-means nor the EM algorithm weights the data points. As a result, by ignoring all used in Table 1 and setting its value to one, Algorithm 1 can be applied to find the temporal pattern . The predicted data amount can then be estimated from the cumulative distribution function (CDF) of the Gaussian mixture with a threshold. For example, with a threshold of , the predicted traffic amount over the aerial networks can be given by . The ML analysis of the temporal feature and the spatial feature of can follow the approach of Algorithm 1.
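The CDF-threshold rule can be illustrated for a one-dimensional Gaussian mixture: the predicted amount is the value at which the mixture CDF reaches the chosen threshold. The sketch below finds this value by bisection; the function names and the search bracket of ten standard deviations are assumptions made for illustration.

```python
import math


def gmm_cdf(x, pis, mus, sigmas):
    """CDF of a 1-D Gaussian mixture at x."""
    return sum(
        p * 0.5 * (1.0 + math.erf((x - m) / (s * math.sqrt(2.0))))
        for p, m, s in zip(pis, mus, sigmas)
    )


def gmm_quantile(q, pis, mus, sigmas, tol=1e-9):
    """Value x at which the mixture CDF reaches q, found by bisection."""
    lo = min(m - 10.0 * s for m, s in zip(mus, sigmas))
    hi = max(m + 10.0 * s for m, s in zip(mus, sigmas))
    while hi - lo > tol * max(1.0, abs(hi)):
        mid = 0.5 * (lo + hi)
        if gmm_cdf(mid, pis, mus, sigmas) < q:
            lo = mid
        else:
            hi = mid
    return hi
```

A higher threshold yields a more conservative (larger) traffic prediction, at the cost of provisioning more UAV capacity than is needed on a typical day.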
III-B On-demand, optimal UAV deployment
In order to optimally deploy UAVs to minimize the total power, problem (13) is formulated, which jointly considers the aerial cell partition and the UAVs’ locations. With the prediction information, network operators only need to move UAVs at the beginning of each time interval, according to the solution of (13). However, solving (13) is challenging due to the mutual dependence between and with and . For tractability, we solve (13) in two sequential steps. First, given the current location of each UAV , we seek to find the optimal partition of the service area for each UAV, that minimizes the power for transmissions. Then, for each UAV , given its fixed service area , the optimal location is derived to minimize the required power for downlink communications and mobility.
III-B1 Optimal partition of service areas
Given the current location of each UAV , we aim to find the best partition of service areas , such that the total power for downlink communications of all UAVs is minimized. The optimal partition problem can be formulated as follows,
To solve this problem, we use our previously developed gradient-based method in [19, Theorem 1, Algorithm 1].
III-B2 Optimal locations
Given the optimal partition of the service area , the power minimization problem can be reduced into subproblems for each UAV as
Based on [4, Theorem 1], we focus on two scenarios in the following discussions. One is a high-altitude UAV, where , and the other is the low-altitude UAV, where . In scenario one, the value of in (2) is approximately , thus, and . Then, can be rewritten as
where is a coefficient that does not depend on , and . It is obvious that is a convex function with respect to and . By setting the first partial derivatives to be zero, we have the optimal locations for UAV that minimize the transmission power as
Although the objective function is convex with respect to and , deriving a closed-form solution of (18), which minimizes both the transmit and mobility power for each UAV, is challenging. However, it is easy to find the optimal solution of (18) based on a gradient-based algorithm. Using a similar approach, we can find the optimal location for scenario two.
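As an illustration of a gradient-based solution for the high-altitude case, consider a simplified objective in which the transmit power is a weighted sum of squared horizontal distances to the demand points of the aerial cell, and the mobility cost is proportional to the distance moved. The sketch below uses a proximal gradient iteration, which is a substitute technique chosen here because the distance term is not differentiable at the current location. The per-point coefficients `coeffs` (lumping rate demand and channel constants) and all names are hypothetical, and the constant altitude term is omitted because it does not affect the horizontal location.

```python
import numpy as np


def optimal_location(users_xy, coeffs, current_xy, move_cost, n_iter=500):
    """Minimize sum_n c_n * ||p - u_n||^2 + move_cost * ||p - p0|| over p."""
    users_xy = np.asarray(users_xy, dtype=float)
    coeffs = np.asarray(coeffs, dtype=float)
    p0 = np.asarray(current_xy, dtype=float)
    c_sum = coeffs.sum()
    step = 1.0 / (2.0 * c_sum)  # inverse Lipschitz constant of the smooth part
    p = p0.copy()
    for _ in range(n_iter):
        # Gradient of the quadratic (transmit power) term.
        grad = 2.0 * (c_sum * p - coeffs @ users_xy)
        # Proximal step for the mobility term: shrink the displacement from p0.
        v = p - step * grad - p0
        norm = np.linalg.norm(v)
        shrink = max(0.0, 1.0 - step * move_cost / norm) if norm > 0 else 0.0
        p = p0 + shrink * v
    return p
```

With `move_cost = 0` the iteration returns the weighted centroid of the demand points, which matches the closed-form flavor of the transmit-power-only solution; with a very large `move_cost` the UAV stays at its current location.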
IV Simulation Results and Analysis
For simulations, we consider a UAV cellular network operating in a GHz frequency band for downlink communications. The total available bandwidth for each UAV is MHz. The noise power spectral density is set to dBm/Hz. For each UAV, the antenna gain is dB, and the rate of energy consumption for moving is Joules per meter. For ML, we use of the dataset to train the model, and the remaining data is used to evaluate the performance.
Fig. 1 shows the total and average communication power per UAV required to satisfy the users’ data demands for two scenarios: the proposed approach and a solution with no ML predictions. In each case, the proposed optimal partition of service areas and the optimal location deployment are employed. Fig. 1 shows that, as the number of UAVs increases, both the total required power and the average communication power decrease. When more UAVs are available, each aerial BS can serve a smaller coverage area, yielding a lower average path loss. Therefore, the needed total transmit power decreases, given a fixed amount of the total cellular traffic. As the total required transmit power decreases, the average power reduces accordingly. Fig. 1 further shows that, compared with the solution without ML predictions, the proposed approach yields a significant reduction in power consumption. The power reduction varies from to , as the number of UAVs increases from to .
Fig. 2 shows the power efficiency, defined as the average percentage of the transmit power out of total power . As the number of UAVs increases, the power efficiencies in both scenarios decrease. Here, we note that UAV mobility will often require more power than wireless transmission. With more UAVs deployed, the network operator is more likely to send a UAV to meet an instantaneous communication demand in a relatively distant hotspot area, which increases the power consumed for mobility. Also, as shown in Fig. 1, more UAVs require less communication power , which further reduces the power efficiency. Moreover, Fig. 2 shows that, compared with the solution without ML, the proposed method can improve the power efficiency of UAV communication by up to .
Fig. 3 shows the required transmit power as a function of the total bandwidth, assuming nine UAVs. As the available bandwidth increases, the transmit power decreases. However, a wider bandwidth results in a higher noise power, which limits the reduction of transmit power, especially when the bandwidth is greater than MHz. For such a noise-sensitive system, lowering the spectrum efficiency further cannot save additional power.
In this paper, we have proposed a novel approach for predictive deployment of UAV aerial BSs to provide an on-demand wireless service to the cellular users. We have formulated a power minimization problem to optimize the partition of the service area of each UAV, while minimizing the UAV power needed for downlink communications and mobility. In order to predict hotspots, a novel ML framework based on GMM and WEM has been developed. The results have shown that the proposed ML approach can reduce the required downlink transmit power, and improve the average power efficiency by over , compared with an optimal deployment of UAVs with no ML prediction.
- [1] S. Rangan, T. S. Rappaport, and E. Erkip, “Millimeter-wave cellular wireless networks: Potentials and challenges,” Proceedings of the IEEE, vol. 102, no. 3, pp. 366–385, Feb 2014.
- [2] N. Bhushan, J. Li, D. Malladi, R. Gilmore, D. Brenner, A. Damnjanovic, R. Sukhavasi, C. Patel, and S. Geirhofer, “Network densification: the dominant theme for wireless evolution into 5G,” IEEE Communications Magazine, vol. 52, no. 2, pp. 82–89, Feb 2014.
- [3] M. Mozaffari, W. Saad, M. Bennis, Y.-H. Nam, and M. Debbah, “A tutorial on UAVs for wireless networks: Applications, challenges, and open problems,” arXiv preprint arXiv:1803.00680, 2018.
- [4] M. Mozaffari, W. Saad, M. Bennis, and M. Debbah, “Optimal transport theory for power-efficient deployment of unmanned aerial vehicles,” in Proc. of IEEE International Conference on Communications (ICC), Kuala Lumpur, Malaysia, May 2016, pp. 1–6.
- [5] M. Chen, U. Challita, W. Saad, C. Yin, and M. Debbah, “Machine learning for wireless networks with artificial intelligence: A tutorial on neural networks,” arXiv preprint arXiv:1710.02913, 2017.
- [6] M. Mozaffari, W. Saad, M. Bennis, and M. Debbah, “Efficient deployment of multiple unmanned aerial vehicles for optimal wireless coverage,” IEEE Communications Letters, vol. 20, no. 8, pp. 1647–1650, Jun 2016.
- [7] Z. Becvar, M. Vondra, P. Mach, J. Plachy, and D. Gesbert, “Performance of mobile networks with UAVs: Can flying base stations substitute ultra-dense small cells?” in Proc. of 23rd European Wireless Conference, Dresden, Germany, May 2017, pp. 1–7.
- [8] K. Li, W. Ni, X. Wang, R. P. Liu, S. S. Kanhere, and S. Jha, “Energy-efficient cooperative relaying for unmanned aerial vehicles,” IEEE Transactions on Mobile Computing, vol. 15, no. 6, pp. 1377–1386, Aug 2016.
- [9] V. Sharma, M. Bennis, and R. Kumar, “UAV-assisted heterogeneous networks for capacity enhancement,” IEEE Communications Letters, vol. 20, no. 6, pp. 1207–1210, 2016.
- [10] Y. Zeng and R. Zhang, “Energy-efficient UAV communication with trajectory optimization,” IEEE Transactions on Wireless Communications, vol. 16, no. 6, pp. 3747–3760, Mar 2017.
- [11] J. F. Horn, E. M. Schmidt, B. R. Geiger, and M. P. DeAngelo, “Neural network-based trajectory optimization for unmanned aerial vehicles,” Journal of Guidance, Control, and Dynamics, vol. 35, no. 2, pp. 548–562, Mar-Apr 2012.
- [12] J. Chen, U. Yatnalli, and D. Gesbert, “Learning radio maps for UAV-aided wireless networks: A segmented regression approach,” in Proc. of IEEE International Conference on Communications (ICC), Paris, France, July 2017, pp. 1–6.
- [13] S. Zhang, S. Zhao, M. Yuan, J. Zeng, J. Yao, M. R. Lyu, and I. King, “Traffic prediction based power saving in cellular networks: A machine learning method,” in Proc. of the 25th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Redondo Beach, CA, USA, Nov 2017, p. 29.
- [14] I. D. Gebru, X. Alameda-Pineda, F. Forbes, and R. Horaud, “EM algorithms for weighted-data clustering with application to audio-visual scene analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 12, pp. 2402–2415, Jan 2016.
- [15] A. Al-Hourani, S. Kandeepan, and S. Lardner, “Optimal LAP altitude for maximum coverage,” IEEE Wireless Communications Letters, vol. 3, no. 6, pp. 569–572, July 2014.
- [16] “City cellular traffic map,” https://github.com/caesar0301/city-cellular-traffic-map, accessed: 2016-10-05.
- [17] U. Paul, A. P. Subramanian, M. M. Buddhikot, and S. R. Das, “Understanding traffic dynamics in cellular data networks,” in Proc. of IEEE International Conference on Computer Communications (INFOCOM), Shanghai, China, Apr 2011, pp. 882–890.
- [18] C. M. Bishop, Pattern Recognition and Machine Learning. Springer-Verlag New York, 2016.
- [19] M. Mozaffari, W. Saad, M. Bennis, and M. Debbah, “Wireless communication using unmanned aerial vehicles (UAVs): Optimal transport theory for hover time optimization,” IEEE Transactions on Wireless Communications, vol. 16, no. 12, pp. 8052–8066, Apr 2017.