I Introduction
Traditionally, wireless communications are mainly designed with fixed terrestrial infrastructure such as ground base stations (BSs), access points, and relays. To meet the ever-increasing and highly diversified traffic demand cost-effectively, there has been fast-growing interest in providing wireless connectivity from the sky, by using various aerial communication platforms such as balloons (Project Loon, available online at https://x.company/loon/), helikites [1], and unmanned aerial vehicles (UAVs) [2], [3]. In particular, thanks to their fast deployment and controllable mobility, UAV-enabled/aided wireless communications have emerged as an appealing technology for many practical applications, such as terrestrial BS offloading [4], emergency response and public safety [5], Internet-of-Things (IoT) communications [6], [7], massive machine-type communications [8], etc.
Extensive research efforts have recently been devoted to UAV-enabled communications. In [2], a general overview of UAV-enabled wireless communications is given, where three typical use cases are envisioned, namely UAV-aided ubiquitous coverage, UAV-aided relaying, and UAV-aided information dissemination and data collection. By employing UAVs as quasi-stationary aerial BSs, the UAV placement problem in two-dimensional (2D) or three-dimensional (3D) space has been extensively studied by exploiting the unique UAV-ground channel characteristics [9, 10, 11, 12, 13, 14, 15]. Moreover, another important line of work focuses on UAV trajectory optimization [16, 17, 18, 19, 20], which fully exploits the additional design degrees of freedom introduced by the UAV mobility for communication performance enhancement.
Despite all the promising benefits, UAV-enabled communications also face new challenges. In particular, due to the practical size, weight, and power (SWAP) constraints, UAVs usually have very limited endurance, i.e., flight duration. For example, most rotary-wing UAVs on the market have a maximum endurance of about 30 minutes (DJI Phantom 4 Specs, available online at https://www.droneworld.com/djiphantom4specs/). This severely hinders the practical implementation of UAV-enabled communications. In particular, as the existing designs for UAV-enabled communications are mostly based on conventional “connection-centric” communication, where a communication link between the UAV and the ground node (GN) needs to be maintained for information transmission, a service interruption is caused when the UAV needs to be recalled for battery charging or swap. Some initial attempts have been made to prolong the UAV endurance or maximize the communication throughput given the limited energy, e.g., via energy-efficient trajectory designs [17]. However, the fundamental UAV endurance problem remains unresolved.
In this paper, we propose a novel scheme to overcome the UAV endurance issue, by utilizing the promising technique of proactive caching at the GNs. Specifically, we focus on content-centric UAV-enabled wireless communication systems, where a UAV is dispatched to serve a group of GNs with random and asynchronous file requests, i.e., the same content may be requested by different GNs at different times. Note that “content-centric” communications have many practical applications nowadays, such as video-on-demand (VoD) streaming and software download [21]. As shown in Fig. 1, the proposed scheme operates in a periodic manner, with each period consisting of two phases, namely the file caching phase and the file retrieval phase. In the file caching phase, the UAV proactively transmits each of the files from a given set of interest to a subset of selected GNs that cooperatively cache all the files. Next, in the file retrieval phase, each GN that has a file request can retrieve the file either directly from its own local cache or from its nearest neighbor that has cached the file via device-to-device (D2D) communications [22]. For the proposed scheme, the UAV is only involved in the file caching phase. Thus, the required UAV operation time for each period depends only on how fast it can transmit the files to the selected caching GNs, and not on the random file request pattern of the GNs. This offers a promising solution to the issue of limited endurance for UAV-enabled communications. For instance, after completing the file caching, the UAV could return to the depot for battery charging or swap without causing any service interruption, thanks to the proactive file caching and D2D file sharing by the GNs.
It is worth noting that the proposed scheme is fundamentally different from the UAV caching technique studied in [23], where the files are cached at the UAVs (instead of at GNs) based on the predicted content request distribution and mobility pattern of the users. In that work, the UAV needs to remain in the air for the entire period, thus leaving the endurance issue unaddressed.
Note that caching has recently received significant interest in terrestrial cellular systems. By pre-loading popular contents during off-peak hours into cellular BSs, mobile users, or even dedicated helper nodes [24, 25, 26, 27, 28, 29, 30, 31, 32, 33], caching offers a promising approach to alleviate the backhaul congestion issue and reduce latency in cellular networks. In this paper, caching at the GNs is exploited as a new means of overcoming the endurance issue for the emerging UAV-enabled communications, which, to the best of our knowledge, has not been reported in prior work.
Different from caching in cellular networks, the file caching phase of the proposed scheme takes place over the UAV-to-GN wireless links, whose quality critically depends on their distances and thus on the UAV trajectory over time. Therefore, the caching policy, which includes the decisions on which GNs are selected for caching and which files are cached at each of them, should be jointly designed with the UAV trajectory and the UAV file transmission scheduling. Furthermore, there exists a fundamental tradeoff between the file caching cost, which is defined as the total time required for the UAV to complete the transmission of all files to the selected caching GNs, and the file retrieval cost, which is the average time required for serving one file request [34]. Note that the file caching cost is ignored in [34]. However, due to the limited endurance of the UAV, the file caching strategy should be carefully designed to minimize the caching cost as well. Intuitively, with the given finite storage capacity at each caching GN, the file retrieval cost in general decreases if more files are cached at the GNs. However, this incurs a higher file caching cost, since more time is required for the UAV to complete the file transmissions to the designated caching GNs. Furthermore, from the perspective of reducing the file retrieval cost, it is desirable to cache each file at GNs that are well separated geographically, so that a GN is more likely to find the requested file at a nearby caching GN if the file is not in its local cache. However, this also increases the file caching cost, since the UAV needs to travel a longer distance in order to transmit the same file to those more separated caching GNs. In this paper, we investigate this new fundamental tradeoff in detail, by jointly designing the caching strategy, UAV trajectory, and file scheduling.
Note that the problem we consider is fundamentally different from that studied in [35] where the UAV trajectory is optimized for broadcasting one common file to all the GNs. The file caching strategy and the file scheduling at the UAV, which are essential design parameters for our proposed scheme, are irrelevant to the simple broadcasting problem considered in [35].
The main contributions of this paper are summarized as follows.

Firstly, we propose a novel scheme for UAV-enabled communication by utilizing proactive caching at the GNs to overcome the endurance issue. Furthermore, to characterize the fundamental tradeoff between file caching and file retrieval costs, we formulate an optimization problem to minimize the weighted sum of the two costs via jointly optimizing the caching policy, UAV trajectory, and communication scheduling.

Secondly, as the formulated optimization problem is NP-hard, we propose an efficient greedy-based approach to find high-quality approximate solutions for it. The proposed algorithm starts from the case with no file cached, and then sequentially adds the cached file that leads to the maximum reduction in the weighted sum cost. To minimize the file caching cost for any given caching policy, the UAV trajectory is designed by optimizing the waypoints based on the concept of virtual base station (VBS) placement, together with linear programming (LP) for the speed optimization.

Thirdly, to further reduce the complexity of the proposed greedy solution, we propose an efficient approximation for estimating the file caching cost directly, without the need to solve the trajectory optimization problem at each iteration. With this low-complexity scheme, the UAV trajectory and communication scheduling only need to be optimized once after determining the caching policy.

Lastly, extensive numerical results are provided to validate the performance of the proposed scheme and illustrate the tradeoff between the file caching and retrieval costs. Furthermore, as compared to the benchmark schemes with separate caching policy and UAV trajectory designs, it is shown that the proposed scheme with joint caching and trajectory optimization achieves significant performance gains in terms of file caching and retrieval costs.
The rest of this paper is organized as follows. Section II presents the system model and proposes the novel scheme for UAV-enabled communication with caching. The problem formulation is also given in this section. In Section III, a greedy-based solution is proposed for the formulated problem. Furthermore, a low-complexity scheme is proposed to further reduce the complexity of the proposed greedy solution by estimating the file caching cost directly. Section IV provides the numerical results, and finally we conclude this paper in Section V.
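Notations: In this paper, scalars are denoted by italic letters. Boldface lower-case and upper-case letters denote vectors and matrices, respectively. $\mathbb{R}^{M\times 1}$ denotes the space of $M$-dimensional real-valued vectors. $\mathbb{Z}_{+}$ represents the set of non-negative integers. For a vector $\mathbf{a}$, $\|\mathbf{a}\|$ represents its Euclidean norm. $\log_2(\cdot)$ denotes the logarithm with base 2. $\mathbb{E}[\cdot]$ denotes the statistical expectation and $\Pr(\cdot)$ represents the probability. For a time-dependent function $\mathbf{x}(t)$, $\dot{\mathbf{x}}(t)$ denotes the first-order derivative with respect to time $t$. For a set $\mathcal{S}$, $|\mathcal{S}|$ denotes its cardinality. For two sets $\mathcal{S}_1$ and $\mathcal{S}_2$, $\mathcal{S}_1\subseteq\mathcal{S}_2$ denotes that $\mathcal{S}_1$ is a subset of $\mathcal{S}_2$. $\mathcal{S}_1\cup\mathcal{S}_2$, $\mathcal{S}_1\cap\mathcal{S}_2$, and $\mathcal{S}_1\setminus\mathcal{S}_2$ denote the union, intersection, and set difference of $\mathcal{S}_1$ and $\mathcal{S}_2$, respectively.

II System Model and Problem Formulation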
We consider a UAV-enabled wireless communication system, where a UAV is dispatched to serve a group of $K$ GNs. The horizontal location of GN $k$ is denoted as $\mathbf{w}_k\in\mathbb{R}^{2\times 1}$, $k=1,\ldots,K$. Different from the existing literature that mostly considers the traditional “connection-centric” UAV-enabled communications [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20], we consider “content-centric” UAV-enabled communications. Specifically, we assume that within each period of duration $T_0$ seconds, the GNs are interested in the same set of $N$ files, which are denoted as $\mathcal{F}=\{f_1,\ldots,f_N\}$. Note that in practice, $T_0$ depends on how fast the file library needs to be updated, which is usually at a relatively large time scale (say one day). We denote by $P_{k,n}$ the probability that GN $k$ requests file $f_n$, where $0\le P_{k,n}\le 1$ and $\sum_{n=1}^{N}P_{k,n}=1$. For the special case when all the GNs have the same interest in the files, the file request probability can be represented by the file popularity, such as the Zipf distribution given by

$$P_{k,n}=P_n=\frac{n^{-\kappa}}{\sum_{i=1}^{N} i^{-\kappa}},\quad \forall k, \tag{1}$$

where $\kappa\ge 0$ represents the skewness of the distribution and usually takes values in the range reported in [36]. For the special case when $\kappa=0$, all the files are of equal popularity. As $\kappa$ increases, the popularity of different files becomes more uneven.

In practice, different GNs may request the same file at different times. A straightforward way to satisfy such asynchronous file requests on demand is via direct UAV-GN transmission. However, such a scheme requires the UAV to remain in the air at all times, similar to the conventional terrestrial BSs at fixed locations on the ground. Since UAVs have limited endurance, this scheme is practically infeasible. To overcome this issue, we propose a novel scheme for UAV-enabled wireless communication based on proactive caching by the GNs.
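As a quick numerical illustration of the Zipf popularity model in (1), the following Python sketch computes the request probabilities, assuming that the file with popularity rank $n$ is requested with probability proportional to $n$ raised to the negative skewness. The function name `zipf_popularity` and the example values (4 files, skewness 1.0) are illustrative assumptions and are not taken from the paper.

```python
def zipf_popularity(num_files, skew):
    """Request probability of each file under a Zipf law (rank 1 = most popular)."""
    # Unnormalized weight of the file with popularity rank n is n^(-skew).
    weights = [rank ** (-skew) for rank in range(1, num_files + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative values only: 4 files, skewness 1.0 versus skewness 0.
probs = zipf_popularity(4, 1.0)
uniform = zipf_popularity(4, 0.0)
```

With zero skewness the probabilities are uniform, while a larger skewness concentrates the requests on the top-ranked files.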
II-A Benchmark Scheme: Direct UAV-GN File Transmission Without Caching
We first consider a benchmark scheme with direct UAV-GN file transmission without caching, where the UAV resembles a conventional ground BS, but hovers above a certain location (e.g., the geometric center of the locations of all GNs) or flies along some optimized trajectory to directly serve the file requests on demand. As illustrated in Fig. 2, the file requests (including the requesting GN and the index of the requested file) from the GNs are put in a request queue at the UAV in the order of their generation times. All such requests are sequentially served by the UAV via direct file transmissions. In practice, the GNs may issue file requests at any time. Thus, the UAV needs to remain in the air throughout the mission operation time (say several hours or even a day). This is difficult to implement in practice due to the limited UAV on-board energy and hence endurance. Moreover, such a scheme is also quite inefficient when the file requests from the GNs are highly sporadic, in which case the UAV needs to remain in the air even when there is temporarily no file request. Therefore, in the following, we propose a new scheme for UAV-enabled communication based on the promising technique of proactive caching at the GNs.
II-B Proposed Scheme with Proactive Caching
With the proposed scheme, each GN is assumed to have a cache with a storage capacity of $Q$ files. As illustrated in Fig. 1, the proposed scheme consists of two phases: the file caching phase and the file retrieval phase, explained as follows.

File caching phase: The file caching phase occurs at the beginning of each operation period, during which the UAV selects a subset of the GNs to proactively cache the files. To ensure a non-zero probability of successful file retrieval for all files, each of the files should be cached by at least one of the selected caching GNs. Furthermore, due to the limited storage capacity at each GN and the file caching cost incurred by transmitting the files from the UAV to their designated caching GNs, the files to be cached at each caching GN should be carefully optimized, jointly with the UAV trajectory and the file transmission scheduling from the UAV to the caching GNs.

File retrieval phase: With all the files cooperatively cached by the selected caching GNs, the file retrieval phase then only involves the D2D communication among the GNs. In this case, each file request could be served by considering two possible scenarios. If the requested file is already cached locally at the requesting GN itself, it can then be simply retrieved from the local cache; otherwise, the file requesting GN will broadcast the file request, and those caching GNs who have cached the requested file will respond. We assume that the GN will then download the file from the nearest GN that has cached this file via D2D communication.
Note that with the proposed scheme, the UAV is only involved in the file caching phase. Thus, its operation time/cost in each period depends only on how fast the UAV can complete the transmission of all files to their respective caching GNs, and is independent of the random file request pattern of the GNs. Hence, regardless of the period duration, the proposed scheme only requires the UAV’s endurance to be greater than the duration of the file caching phase, since the UAV can then replenish its energy during the file retrieval phase.
In the following, the file caching and retrieval phases are modeled in detail.
II-B1 File Caching
We denote by $S$ the size in bits of each of the $N$ files (for simplicity, we assume that all files have the same size $S$; the results of this paper can be extended to the case of unequal file sizes with only minor modification). Furthermore, we assume that each file is divided into $W=S/S_p$ packets, with $S_p$ denoting the packet size in bits. We further assume that a packet-level erasure correction code such as the fountain code [37] or random linear code [38] is applied to each file, so that a file can be recovered from any $(1+\epsilon)W$ coded packets, where $\epsilon$ is the coding overhead.
For the file caching phase, the UAV needs to determine the file caching policy, which includes the subset of the GNs for file caching as well as the subset of the files to be cached at each selected caching GN. This can be mathematically represented by the binary indicator variables

$$c_{k,n}=\begin{cases}1, & \text{if file } f_n \text{ is cached at GN } k,\\ 0, & \text{otherwise,}\end{cases}\quad k=1,\ldots,K,\ n=1,\ldots,N, \tag{2}$$

where $K$ is the number of GNs. The caching policy is collected in the matrix $\mathbf{C}=[c_{k,n}]\in\{0,1\}^{K\times N}$.
Due to the storage limitation, the total number of files cached at each GN should not exceed its storage capacity of $Q$ files, i.e.,

$$\sum_{n=1}^{N} c_{k,n}\le Q,\quad \forall k. \tag{3}$$

Furthermore, since each file should be cached by at least one GN to ensure a non-zero successful file retrieval probability, we have

$$\sum_{k=1}^{K} c_{k,n}\ge 1,\quad \forall n. \tag{4}$$
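The two requirements on the caching policy, namely the per-GN storage limit in (3) and the coverage of every file in (4), can be checked mechanically. The following Python sketch does so under the assumption that the policy is stored as a list of 0/1 rows, one row per GN and one column per file; the function name `is_valid_policy` and the toy policies are hypothetical.

```python
def is_valid_policy(policy, capacity):
    """Check a 0/1 caching policy, where policy[k][n] == 1 if GN k caches file n."""
    num_files = len(policy[0])
    # Storage constraint: no GN stores more than `capacity` files.
    within_capacity = all(sum(row) <= capacity for row in policy)
    # Coverage constraint: every file is cached by at least one GN.
    all_files_cached = all(
        any(row[n] for row in policy) for n in range(num_files)
    )
    return within_capacity and all_files_cached


# Toy examples with 2 GNs, 2 files, and a capacity of 1 file per GN.
ok = is_valid_policy([[1, 0], [0, 1]], capacity=1)
overfull = is_valid_policy([[1, 1], [0, 0]], capacity=1)
uncovered = is_valid_policy([[1, 0], [0, 0]], capacity=1)
```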
With the file caching policy determined, the UAV needs to transmit the files to their designated caching GNs following a certain trajectory and file transmission schedule. Assume that the UAV flies at a constant altitude of $H$ meters. Furthermore, denote by $\mathbf{q}(t)\in\mathbb{R}^{2\times 1}$, $0\le t\le T$, the UAV’s flying trajectory projected on the ground, where $T$ is the total time required for the UAV to complete the file caching transmission. Let $V_{\max}$ denote the maximum UAV speed in meters per second (m/s). We then have the constraint $\|\dot{\mathbf{q}}(t)\|\le V_{\max}$, $0\le t\le T$. With $\mathbf{w}_k$ denoting the horizontal location of GN $k$, the time-dependent distance between the UAV and the GNs can then be written as

$$d_k(t)=\sqrt{H^2+\|\mathbf{q}(t)-\mathbf{w}_k\|^2},\quad 0\le t\le T,\ \forall k. \tag{5}$$

For ease of exposition, the time horizon $T$ is discretized into $M$ equal time slots, i.e., $T=M\delta$, with $\delta$ denoting the elemental slot length such that the distances between the UAV and the GNs are approximately constant within each slot. As a rule of thumb, we may choose $\delta$ such that $V_{\max}\delta\ll H$. Then the UAV trajectory can be approximated by the length-$M$ sequence $\{\mathbf{q}[m]\}_{m=1}^{M}$, where $\mathbf{q}[m]$ denotes the UAV’s horizontal location at the $m$th time slot. Denote by $\Delta_{\max}\triangleq V_{\max}\delta$ the maximum traveling distance of the UAV within one slot duration $\delta$. The UAV speed constraint can then be discretized as

$$\|\mathbf{q}[m+1]-\mathbf{q}[m]\|\le \Delta_{\max},\quad m=1,\ldots,M-1. \tag{6}$$

The distance between the UAV and GN $k$ at slot $m$ can be written as

$$d_k[m]=\sqrt{H^2+r_k^2[m]},\quad \forall k,m, \tag{7}$$

where $r_k[m]\triangleq\|\mathbf{q}[m]-\mathbf{w}_k\|$ is the horizontal distance.
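To make the discretized speed constraint in (6) concrete, the sketch below checks whether consecutive waypoints of a candidate trajectory are no farther apart than the maximum speed times the slot length. The function name `satisfies_speed_limit`, the small numerical tolerance, and the example waypoints are illustrative assumptions.

```python
import math


def satisfies_speed_limit(waypoints, max_speed, slot_length):
    """Return True if every per-slot displacement is at most max_speed * slot_length."""
    max_step = max_speed * slot_length
    # Compare each waypoint with its successor; 1e-9 absorbs rounding noise.
    return all(
        math.dist(a, b) <= max_step + 1e-9
        for a, b in zip(waypoints, waypoints[1:])
    )


# Illustrative trajectory: each hop covers exactly 5 m in a 1 s slot.
path = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
feasible = satisfies_speed_limit(path, max_speed=5.0, slot_length=1.0)
too_fast = satisfies_speed_limit(path, max_speed=4.0, slot_length=1.0)
```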
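Preliminary channel measurement results show that the UAV-to-ground channel typically consists of strong line-of-sight (LoS) links [39]. Therefore, for simplicity, in this paper we assume that the channel between the UAV and each GN is dominated by the LoS link. As a result, the channel power gain from the UAV to GN $k$ at slot $m$ can be modeled as

$$\beta_k[m]=\frac{\beta_0}{d_k^2[m]}=\frac{\beta_0}{H^2+\|\mathbf{q}[m]-\mathbf{w}_k\|^2}, \tag{8}$$

where $\beta_0$ denotes the UAV-to-ground channel power gain at the reference distance of $1$ m, $H$ is the UAV altitude, and $\mathbf{q}[m]$ and $\mathbf{w}_k$ are the horizontal locations of the UAV at slot $m$ and of GN $k$, respectively.
Denote by $P$ the transmission power of the UAV. The received signal-to-noise ratio (SNR) at GN $k$ in the $m$th time slot is given by

$$\gamma_k[m]=\frac{P\beta_k[m]}{\sigma^2}=\frac{\gamma_0}{H^2+\|\mathbf{q}[m]-\mathbf{w}_k\|^2}, \tag{9}$$

where $\sigma^2$ is the additive white Gaussian noise (AWGN) power and $\gamma_0\triangleq P\beta_0/\sigma^2$ is the SNR at the reference distance of $1$ m.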
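We assume that the UAV’s transmission rate throughout the file caching phase is fixed, and is denoted as $R$ in bits per second (b/s). As a result, the time required to complete one packet transmission is $T_p=S_p/R$, with $S_p$ denoting the packet size in bits. With the slot duration fixed as $\delta$, the total number of packets that can be transmitted by the UAV within each time slot is $L=\delta/T_p$. For convenience, we assume that $L$ is an integer. For each time slot $m$, define $l_n[m]$ as the number of packets that are transmitted by the UAV for file $f_n$. At each time slot $m$, the total number of packets transmitted by the UAV cannot be larger than $L$. Therefore, we should have

$$\sum_{n=1}^{N} l_n[m]\le L,\quad \forall m. \tag{10}$$

With the UAV’s transmission rate fixed as $R$, a packet sent by the UAV at slot $m$ can be successfully received by GN $k$ if and only if $\gamma_k[m]\ge\gamma_{\mathrm{th}}$, where $\gamma_{\mathrm{th}}\triangleq\Gamma\left(2^{R/B}-1\right)$, with $B$ denoting the channel bandwidth and $\Gamma$ denoting the SNR gap between a practical modulation and coding scheme and the theoretical Gaussian signaling. By using (9), this condition is equivalent to requiring that the horizontal distance between the UAV and GN $k$ be no greater than a certain threshold, i.e., $\|\mathbf{q}[m]-\mathbf{w}_k\|\le D$, where

$$D\triangleq\sqrt{\frac{\gamma_0}{\gamma_{\mathrm{th}}}-H^2}. \tag{11}$$

It is not difficult to see that the UAV transmission rate should be chosen such that $\gamma_0/\gamma_{\mathrm{th}}\ge H^2$, i.e., $R\le B\log_2\left(1+\frac{\gamma_0}{\Gamma H^2}\right)$, so that $D$ is well defined. For each GN $k$, define $\mathcal{M}_k$ as the subset of all time slots in which the horizontal distance between the UAV and GN $k$ is no greater than $D$, i.e.,

$$\mathcal{M}_k\triangleq\left\{m:\ \|\mathbf{q}[m]-\mathbf{w}_k\|\le D,\ 1\le m\le M\right\}. \tag{12}$$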
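The relation between the fixed transmission rate, the horizontal distance threshold in (11), and the contacting time slots in (12) can be illustrated numerically. The Python sketch below assumes the LoS model with a reference SNR at 1 m and an SNR threshold equal to the SNR gap times (2^(rate/bandwidth) − 1); the function names `coverage_radius` and `contact_slots` and all numerical values are hypothetical.

```python
import math


def coverage_radius(rate_bps, bandwidth_hz, gamma0, altitude_m, snr_gap=1.0):
    """Largest horizontal UAV-GN distance at which a packet is decodable, or None."""
    # SNR needed to support the fixed rate, including the SNR gap.
    snr_threshold = snr_gap * (2.0 ** (rate_bps / bandwidth_hz) - 1.0)
    margin = gamma0 / snr_threshold - altitude_m ** 2
    if margin < 0:
        return None  # the rate is too high even when hovering directly above the GN
    return math.sqrt(margin)


def contact_slots(trajectory, gn_location, radius):
    """Indices of the slots in which the UAV is within `radius` of the GN."""
    return [m for m, q in enumerate(trajectory)
            if math.dist(q, gn_location) <= radius]


# Illustrative numbers: 1 Mb/s over 1 MHz, reference SNR 2e4, altitude 100 m.
radius = coverage_radius(1e6, 1e6, gamma0=2e4, altitude_m=100.0)
slots = contact_slots([(0, 0), (50, 0), (150, 0), (200, 0)], (100, 0), 60.0)
```

Returning `None` for an infeasible rate mirrors the requirement that the rate be chosen small enough for the threshold distance to exist.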
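We refer to $\mathcal{M}_k$ as the set of contacting time slots of GN $k$ with the UAV. Hence, the total number of coded packets that can be successfully received by GN $k$ for file $f_n$ is given by $\sum_{m\in\mathcal{M}_k} l_n[m]$, where $l_n[m]$ is the number of packets of file $f_n$ sent by the UAV in slot $m$. If GN $k$ is selected to cache file $f_n$, i.e., $c_{k,n}=1$, it needs to receive a total of $(1+\epsilon)W$ coded packets to recover $f_n$, where $W$ is the number of packets per file and $\epsilon$ is the coding overhead. Thus, the UAV trajectory and the file transmission scheduling should satisfy the following constraint,

$$\sum_{m\in\mathcal{M}_k} l_n[m]\ge c_{k,n}(1+\epsilon)W,\quad \forall k,n. \tag{13}$$

In this paper, we define the file caching cost as the total time required for the UAV to complete the dedicated file transmissions to the selected caching GNs, i.e.,

$$C_c(\mathbf{C})\triangleq M\delta. \tag{14}$$

Specifically, for any given file caching policy $\mathbf{C}=[c_{k,n}]$, $C_c(\mathbf{C})$ is the required file caching time such that there exist a feasible UAV trajectory $\{\mathbf{q}[m]\}$ and file transmission scheduling $\{l_n[m]\}$ that satisfy the constraints (6), (10), and (13).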
II-B2 File Retrieval
Next, we consider the file retrieval phase. After the file caching phase with the policy specified by $\mathbf{C}=[c_{k,n}]$, for each file $f_n$, denote by $\mathcal{K}_n$ the set of GNs that have cached file $f_n$, i.e., $\mathcal{K}_n\triangleq\{k: c_{k,n}=1\}$. In the file retrieval phase, when a GN $k$ requests any file $f_n$, there are two possible scenarios. In the first scenario, file $f_n$ is already cached by GN $k$ itself, i.e., $k\in\mathcal{K}_n$, and the file can be retrieved directly from its own local cache. In this case, the cost for file retrieval is essentially zero. Otherwise, when $k\notin\mathcal{K}_n$, GN $k$ will retrieve file $f_n$ from its nearest peer that has cached $f_n$ via D2D communication, which requires additional time/delay due to D2D transmissions and thus incurs a non-zero cost. For any pair of $k$ and $n$ such that $k\notin\mathcal{K}_n$, let the file retrieval distance for GN $k$ to retrieve $f_n$ be denoted as $d_{k,n}$. We have

$$d_{k,n}=\min_{j\in\mathcal{K}_n} d_{kj}, \tag{15}$$

where $d_{kj}\triangleq\|\mathbf{w}_k-\mathbf{w}_j\|$ is the distance between GN $k$ and GN $j$.
As a result, the average channel power gain for GN $k$ to retrieve file $f_n$ can be modeled as

$$\bar{\beta}_{k,n}=\tilde{\beta}_0\, d_{k,n}^{-\alpha}, \tag{16}$$

where $\tilde{\beta}_0$ denotes the channel power gain at the reference distance of $1$ m, and $\alpha$ is the path loss exponent for the D2D channels between GNs.
Denote by $R_d$ in b/s the transmission rate of each GN for the D2D file sharing phase. Therefore, the time required to complete one packet transmission is $T_d=S_p/R_d$, where $S_p$ is the packet size in bits. Different from the UAV-to-GN channels that are typically dominated by LoS links, the terrestrial channels between different GNs are usually subject to additional fading with random variations. We assume quasi-static fading channels, where the instantaneous channel coefficients between the GNs remain unchanged for each packet duration $T_d$, and may vary across different packets. Therefore, the instantaneous channel power gain for GN $k$ to retrieve the $i$th packet of $f_n$ can be modeled as

$$h_{k,n}[i]=\bar{\beta}_{k,n}\,\zeta_{k,n}[i], \tag{17}$$

where $\bar{\beta}_{k,n}$ is the distance-dependent path loss component given by (16), and $\zeta_{k,n}[i]$ is a random variable with $\mathbb{E}[\zeta_{k,n}[i]]=1$ accounting for the fading component of the terrestrial channels, which are assumed to be independently and identically distributed (i.i.d.). Without loss of generality, denote by $\bar{F}(\cdot)$ the complementary cumulative distribution function (ccdf) of the fading channel power, i.e., $\bar{F}(z)\triangleq\Pr\left(\zeta_{k,n}[i]\ge z\right)$.
Denote by $P_d$ the D2D transmit power for the file sharing phase. Then, the instantaneous SNR for GN $k$ to download the $i$th packet of file $f_n$ is given by

$$\gamma_{k,n}[i]=\frac{P_d\,h_{k,n}[i]}{\sigma^2}=\tilde{\gamma}_0\, d_{k,n}^{-\alpha}\,\zeta_{k,n}[i], \tag{18}$$

where $\sigma^2$ is the noise power, $d_{k,n}$ is the file retrieval distance, $\alpha$ is the D2D path loss exponent, and $\tilde{\gamma}_0$ is the average SNR at the reference distance of $1$ m.
With the GNs’ transmission rate fixed as $R_d$, a packet sent by the nearest caching GN of file $f_n$ to the requesting GN $k$ can be successfully received if the instantaneous SNR $\gamma_{k,n}[i]$ is no smaller than the threshold $\tilde{\gamma}_{\mathrm{th}}\triangleq\Gamma\left(2^{R_d/B_d}-1\right)$, where $B_d$ denotes the channel bandwidth for the D2D communications and $\Gamma$ is the SNR gap. Hence, the probability for GN $k$ to successfully receive the $i$th packet of file $f_n$ is given by

$$p_{k,n}=\Pr\left(\gamma_{k,n}[i]\ge\tilde{\gamma}_{\mathrm{th}}\right)=\bar{F}\!\left(\frac{\tilde{\gamma}_{\mathrm{th}}\,d_{k,n}^{\alpha}}{\tilde{\gamma}_0}\right). \tag{19}$$

Note that $\bar{F}(\cdot)$ is a decreasing function and hence the successful packet reception probability decreases with the file retrieval distance $d_{k,n}$, as expected.
Recall that to recover each file, the GN should successfully receive at least $(1+\epsilon)W$ coded packets, where $W$ is the number of packets per file and $\epsilon$ is the coding overhead. For each given pair of GN $k$ and file $f_n$, we define the file retrieval cost $C_{k,n}$ as the expected number of required packet transmissions so that on average $(1+\epsilon)W$ packets are received by the file-requesting GN $k$. If $k\in\mathcal{K}_n$, where $\mathcal{K}_n$ is the set of GNs caching $f_n$, then $f_n$ is already available in the local cache, no packet transmission is needed, and hence we have $C_{k,n}=0$. Otherwise, GN $k$ will retrieve file $f_n$ from its nearest caching GN, with each transmitted packet having success probability $p_{k,n}$. Therefore, we set $C_{k,n}=(1+\epsilon)W/p_{k,n}$, which measures the average number of transmitted packets (or the transmission time normalized to one D2D packet duration) to ensure the successful decoding of file $f_n$. In summary, with the caching policy $\mathbf{C}$, the file retrieval cost for each given pair of GN $k$ and file $f_n$ is

$$C_{k,n}=\begin{cases}0, & \text{if } k\in\mathcal{K}_n,\\[4pt] \dfrac{(1+\epsilon)W}{p_{k,n}}, & \text{otherwise.}\end{cases} \tag{20}$$

Note that if a file is not cached at any GN, the retrieval distance is infinite and the packet success probability is zero, so the resulting retrieval cost of this file is infinity for any GN according to (20). This implies the constraint that all the files should be cached. In practical implementation, such a constraint can be easily removed by defining a finite cost value for not caching a file.
We define the average file retrieval cost as the average time required for serving one file request, where the average is taken over all the GNs and all files based on their popularity, i.e.,

$$\bar{C}_r(\mathbf{C})\triangleq\frac{1}{K}\sum_{k=1}^{K}\sum_{n=1}^{N}P_{k,n}\,C_{k,n}, \tag{21}$$

where $P_{k,n}$ is the probability of GN $k$ requesting file $f_n$ as defined in (1). Note that $\bar{C}_r$ can also be interpreted as the average delay for a typical GN to retrieve a file given a constant transmission rate between the GNs. As more file requests are served via D2D communication, more files will be available at the GNs and hence the file retrieval cost may gradually decrease over time. The cost defined in (21) corresponds to the worst-case file retrieval cost at the beginning of the file retrieval phase, by only considering the file availability at the GNs after proactive caching enabled by the UAV. If the GNs can adjust their transmission rates based on the communication distance, the file retrieval cost can also be defined as the average transmission time required for retrieving a file at the maximal achievable communication rate, which is clearly an increasing function of the distance between the requesting GN and its nearest caching GN with the file.
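The retrieval cost model above can be sketched numerically as follows. This is a minimal illustration under stated assumptions rather than the paper's evaluation code: the fading power is assumed to be unit-mean exponential (Rayleigh fading), so that the ccdf evaluated at z equals exp(−z); all GNs are weighted equally when averaging; and the function names, argument names, and numbers are hypothetical.

```python
import math


def packet_success_prob(distance, gamma0, snr_threshold, path_loss_exp):
    """Success probability of one D2D packet, assuming Rayleigh fading (ccdf exp(-z))."""
    return math.exp(-snr_threshold * distance ** path_loss_exp / gamma0)


def average_retrieval_cost(policy, request_prob, locations, packets_needed,
                           gamma0, snr_threshold, path_loss_exp):
    """Request-weighted expected number of packet transmissions per file request."""
    num_gns, num_files = len(policy), len(policy[0])
    total = 0.0
    for n in range(num_files):
        holders = [j for j in range(num_gns) if policy[j][n]]
        for k in range(num_gns):
            if policy[k][n]:
                continue  # local cache hit: zero retrieval cost
            if not holders:
                return math.inf  # an uncached file can never be retrieved
            # Retrieve from the nearest GN that has cached the file.
            d = min(math.dist(locations[k], locations[j]) for j in holders)
            p = packet_success_prob(d, gamma0, snr_threshold, path_loss_exp)
            total += request_prob[k][n] * packets_needed / p
    return total / num_gns


# Toy example: 2 GNs 10 m apart, 1 file cached at GN 0 only, 100 packets needed.
cost = average_retrieval_cost([[1], [0]], [[1.0], [1.0]], [(0, 0), (10, 0)],
                              packets_needed=100, gamma0=100.0,
                              snr_threshold=1.0, path_loss_exp=2.0)
```

In the toy example the packet success probability is exp(−1), so GN 1 needs 100e transmissions on average and the mean over the two GNs is 50e.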
II-C Problem Formulation
Based on the above discussions, it can be seen that both the file caching cost defined in (14) and the average file retrieval cost defined in (21) are closely coupled with the file caching policy $\mathbf{C}$. Intuitively, the file retrieval cost decreases if more files are cached, due to the higher probability of finding the requested file in the local cache as well as the decreased retrieval distance on average when D2D file transmission is needed, as can be inferred from (15), (19), (20) and (21). However, this is usually achieved with increased file caching cost, since more time is required for the UAV to complete the file transmissions to the designated caching GNs. Moreover, to reduce the file retrieval cost, each file should be cached at GNs that are as widely separated as possible given the same number of caching GNs, so as to reduce the maximum file retrieval distance when D2D file transmission is needed. However, this will also increase the file caching cost since the UAV needs to transmit the same file to more distant GNs by traveling a longer distance. Therefore, there exists a fundamental tradeoff between file caching and file retrieval costs for the proposed scheme.
To characterize this new tradeoff, we define a weighted sum cost as

$$C_{\theta}\triangleq\theta\,\bar{C}_r+(1-\theta)\,C_c, \tag{22}$$

where $0\le\theta\le 1$ is the weighting factor to balance between the average file retrieval cost $\bar{C}_r$ and the file caching cost $C_c$.
The complete tradeoff between $\bar{C}_r$ and $C_c$ can be obtained by minimizing $C_{\theta}$ for different $\theta$ values, via jointly optimizing the design variables including the file caching policy $\mathbf{C}$, UAV operation time $M\delta$, UAV trajectory $\{\mathbf{q}[m]\}$, and file transmission scheduling $\{l_n[m]\}$. For any given $\theta$, the problem can be formulated as

$$\begin{aligned}
\text{P1:}\quad \min_{\mathbf{C},\,M,\,\{\mathbf{q}[m]\},\,\{l_n[m]\}}\quad & \theta\,\bar{C}_r(\mathbf{C})+(1-\theta)\,M\delta && \text{(23a)}\\
\text{s.t.}\quad & c_{k,n}\in\{0,1\},\quad \forall k,n, && \text{(23b)}\\
& l_n[m]\in\mathbb{Z}_{+},\quad \forall n,m, && \text{(23c)}\\
& \sum_{n=1}^{N} c_{k,n}\le Q,\quad \forall k, && \text{(23d)}\\
& \sum_{k=1}^{K} c_{k,n}\ge 1,\quad \forall n, && \text{(23e)}\\
& \|\mathbf{q}[m+1]-\mathbf{q}[m]\|\le V_{\max}\delta,\quad m=1,\ldots,M-1, && \text{(23f)}\\
& \sum_{m\in\mathcal{M}_k} l_n[m]\ge c_{k,n}(1+\epsilon)W,\quad \forall k,n, && \text{(23g)}\\
& \sum_{n=1}^{N} l_n[m]\le L,\quad \forall m, && \text{(23h)}
\end{aligned}$$

where (23b) and (23c) specify the domains of the caching indicators and the packet scheduling variables, respectively; (23d) corresponds to the caching capacity constraint at each GN; (23e) ensures that each file is cached by at least one GN; (23f) corresponds to the maximum speed constraint of the UAV; (23g) ensures that all the caching GNs receive a sufficient number of packets for their designated caching files from the UAV; and (23h) specifies the maximum number of packets that can be transmitted by the UAV during each time slot.
III Proposed Solution
Before solving the general problem P1, we first consider its two extreme cases with $\theta=1$ and $\theta=0$, respectively, to gain important insights. Then the general problem P1 with arbitrary $\theta$ values is solved with a greedy algorithm.
III-A Minimizing File Retrieval Cost with $\theta=1$
When $\theta=1$, P1 reduces to minimizing the file retrieval cost $\bar{C}_r$, while ignoring the corresponding file caching cost. It follows from (21) that for any given file caching policy $\mathbf{C}$, $\bar{C}_r$ is independent of the UAV trajectory $\{\mathbf{q}[m]\}$ and file transmission scheduling $\{l_n[m]\}$. Furthermore, by ignoring the constant terms in the cost function, P1 reduces to

$$\begin{aligned}
\text{P1(a):}\quad \min_{\mathbf{C}}\quad & \sum_{k=1}^{K}\sum_{n=1}^{N}P_{k,n}\,C_{k,n} && \text{(24a)}\\
\text{s.t.}\quad & \text{(23b), (23d), (23e).} && \text{(24b)}
\end{aligned}$$

Note that the optimization variable $\mathbf{C}$ affects the cost function of P1(a) implicitly via $C_{k,n}$ given in (20). In the ideal case when each GN has sufficiently large storage capacity, i.e., $Q\ge N$, it is not difficult to see that the optimal solution to P1(a) is $c_{k,n}=1,\ \forall k,n$, i.e., each GN will cache all the files. In this case, for each GN, all the files can be directly retrieved from the local cache once requested. It then follows from (20) that the optimal value of P1(a) is $0$. For the general case with $Q<N$, problem P1(a) is difficult to solve optimally. In [34], a related problem is studied to optimize the file caching policy to minimize the average file retrieval distance, which has been proven to be NP-hard. Since the file retrieval cost $C_{k,n}$ is a deterministic and increasing function of the file retrieval distance $d_{k,n}$, it can be shown that P1(a) is also NP-hard by following a proof similar to that in [34].
Therefore, by following a globally-greedy heuristic approach similar to that in [34], an efficient approximate solution can be found for problem P1(a) with $Q<N$. Specifically, at each step, the incremental reduction of the file retrieval cost for placing file $f_n$ at GN $k$ is calculated for all the GN-file pairs. Then, the GN-file pair that leads to the largest cost reduction is selected as a new cache placement. This process iterates until the storage of all the GNs is filled. Note that if a file is not cached by any GN, the retrieval cost is infinity. Therefore, with the globally-greedy approach that maximizes the cost reduction at each iteration, the constraint (23e) is automatically satisfied, as long as the problem is feasible with $KQ\ge N$.
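The globally-greedy placement described above can be sketched as follows. This is a simplified illustration, not the algorithm of [34] verbatim: the per-request cost is supplied as a hypothetical function `cost_of_distance` of the retrieval distance, and an uncached file is charged a large finite `miss_penalty` instead of an infinite cost, so that cost differences stay well defined. The loop stops when every cache is full or no placement lowers the cost.

```python
import math


def placement_cost(placement, locations, request_prob, cost_of_distance, miss_penalty):
    """Request-weighted retrieval cost for a set of (gn, file) cache placements."""
    num_gns, num_files = len(request_prob), len(request_prob[0])
    total = 0.0
    for n in range(num_files):
        holders = [k for k in range(num_gns) if (k, n) in placement]
        for k in range(num_gns):
            if (k, n) in placement:
                continue  # served from the local cache at zero cost
            if holders:
                d = min(math.dist(locations[k], locations[j]) for j in holders)
                total += request_prob[k][n] * cost_of_distance(d)
            else:
                total += request_prob[k][n] * miss_penalty
    return total


def greedy_placement(locations, request_prob, capacity, cost_of_distance,
                     miss_penalty=1e9):
    """Repeatedly add the GN-file pair that gives the largest cost reduction."""
    num_gns, num_files = len(request_prob), len(request_prob[0])
    placement, load = set(), [0] * num_gns
    args = (locations, request_prob, cost_of_distance, miss_penalty)
    current = placement_cost(placement, *args)
    while True:
        best_pair, best_cost = None, current
        for k in range(num_gns):
            if load[k] >= capacity:
                continue  # the cache of GN k is already full
            for n in range(num_files):
                if (k, n) in placement:
                    continue
                trial = placement_cost(placement | {(k, n)}, *args)
                if trial < best_cost:
                    best_pair, best_cost = (k, n), trial
        if best_pair is None:
            return placement
        placement.add(best_pair)
        load[best_pair[0]] += 1
        current = best_cost


# Toy example: 3 GNs on a line, 2 equally popular files, 1 file per cache.
gn_locations = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)]
placement = greedy_placement(gn_locations, [[0.5, 0.5]] * 3, capacity=1,
                             cost_of_distance=lambda d: d)
```

In this toy instance the first file goes to the middle GN, which minimizes the distance to both neighbors, and the second file is then replicated at the two outer GNs.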
III-B Minimizing File Caching Cost with $\theta=0$
When $\theta=0$, we have $C_{\theta}=C_c$ and P1 reduces to minimizing the UAV file caching cost while ensuring that each file is cached by at least one GN, and ignoring the file retrieval cost. By discarding the constant terms, P1 reduces to

$$\begin{aligned}
\text{P1(b):}\quad \min_{\mathbf{C},\,M,\,\{\mathbf{q}[m]\},\,\{l_n[m]\}}\quad & M && \text{(25a)}\\
\text{s.t.}\quad & \text{(23b) to (23h).} && \text{(25b)}
\end{aligned}$$

For the ideal case when each GN has sufficiently large storage, i.e., $Q\ge N$, one GN is sufficient to cache all the files. In this case, it is not difficult to see that the optimal solution to P1(b) is to cache all the files in one single arbitrarily selected GN $k'$, i.e., $c_{k',n}=1,\ \forall n$, and $c_{k,n}=0,\ \forall k\ne k',\ \forall n$. As such, the UAV only needs to hover above the selected GN $k'$, i.e., $\mathbf{q}[m]=\mathbf{w}_{k'},\ \forall m$. In this case, the minimum caching time is determined by the transmission time for sending all the files to the single selected GN, i.e., $N(1+\epsilon)W\,T_p$, where $T_p$ is the UAV packet duration.
For the general case with $Q<N$, problem P1(b) is difficult to solve optimally. A heuristic solution to problem P1(b) can be obtained from the proposed solution for solving P1 with general $\theta$ values in the next subsection.
III-C General Case with Arbitrary $\theta$
Due to the implicit relationship between the file caching and retrieval costs with respect to the design variables, including the caching policy $\mathbf{C}$, the UAV trajectory $\{\mathbf{q}[m]\}$, and the file scheduling $\{l_n[m]\}$, problem P1 for arbitrary $\theta$ values is even more challenging to solve than the special cases discussed in the preceding subsections. Therefore, finding the optimal solution to P1 is difficult in general. In this subsection, we propose a greedy algorithm for solving P1 approximately by jointly minimizing the file caching and retrieval costs with weighting factor $\theta$. The main idea of the proposed greedy approach is as follows: instead of optimizing over all possible caching policies $\mathbf{C}$, we start from the case where no file is cached, i.e., $c_{k,n}=0,\ \forall k,n$, and at each iteration select the best GN-file pair that leads to the maximum reduction of the weighted sum cost $C_{\theta}$ in P1. This process continues until the storage of all the GNs is filled or $C_{\theta}$ cannot be further reduced.
For notational convenience, with $K$ GNs and $N$ files to be cached, we define a set $\mathcal{U}$ containing all the GN-file pairs, i.e., $\mathcal{U}\triangleq\{(k,n):\ 1\le k\le K,\ 1\le n\le N\}$. The cardinality of $\mathcal{U}$ is thus $|\mathcal{U}|=KN$. Furthermore, for any particular file caching policy $\mathbf{C}$, we denote by $\mathcal{S}\subseteq\mathcal{U}$ the subset of $\mathcal{U}$ containing all the selected GN-file pairs $(k,n)$ such that file $f_n$ is cached at GN $k$. At each iteration, a GN-file pair $(k,n)$ is a candidate pair that could be selected for caching in the next step only if $(k,n)\notin\mathcal{S}$ and the storage of GN $k$ has not been filled yet, i.e., $\sum_{n'=1}^{N}c_{k,n'}<Q$. Therefore, for any file caching policy $\mathcal{S}$, we further denote by $\bar{\mathcal{S}}$ the set of candidate GN-file pairs, considering the current caching status specified by $\mathcal{S}$. Clearly, we have $\mathcal{S}\cap\bar{\mathcal{S}}=\emptyset$ and $\mathcal{S}\cup\bar{\mathcal{S}}\subseteq\mathcal{U}$ for any particular file caching policy $\mathcal{S}$.
With the proposed greedy approach, we start with an empty set $\mathcal{S}=\emptyset$, i.e., no file is cached, and at each step, select the best element in the candidate set $\bar{\mathcal{S}}$ to move to $\mathcal{S}$. For notational convenience, we denote by $C_c(\mathcal{S})$ the file caching cost for the UAV to transmit all the files to the designated caching GNs as specified in $\mathcal{S}$, and denote by $\bar{C}_r(\mathcal{S})$ the average file retrieval cost based on the file placement corresponding to $\mathcal{S}$, as defined in (21). The detailed procedures of the proposed greedy algorithm are described as follows:

At each step with the existing file placement , we find the best GN-file pair, denoted as , in the candidate set such that the corresponding cost reduction with this new file caching is maximized. Specifically, denote the new caching policy after moving from to by , i.e., . Then, the corresponding cost reduction of problem P1 is
(26)
The first term on the right-hand side (RHS) of (26) represents the reduction of the file retrieval cost with the additional caching , while the second term corresponds to the associated increase of the file caching cost. Therefore, at each greedy step, the following optimization problem is solved
P2:  (27a)
s.t.,  (27b)
(27c)
(27d)
Note that, compared to the original problem P1, the file caching placement in P2 is optimized in a greedy manner, where only one additional cache in the candidate set is added to the current caching policy . It is worth noting that the storage constraint (23d) is guaranteed by the definition of , while the constraint (23e) is not necessarily satisfied by the solution to P2 in the intermediate steps. However, it is not difficult to see that after the greedy algorithm converges, the constraint (23e) is also guaranteed, since the file retrieval cost is infinite if any file is not cached at all.

Denote the obtained solution for the GN-file pair to P2 as . The subsets and are updated as and , respectively. Furthermore, if the storage at GN is used up after caching , no more files can be cached at GN in the following steps. Therefore, the set of candidate GN-file pairs for subsequent greedy steps is updated as
(28)
Note that in the first case of (28), all GN-file pairs containing GN are excluded from , whereas in the second case, only the pair is excluded.

Repeat the two steps above until there is no remaining feasible GN-file pair, i.e., , or the incremental cost reduction becomes negligible.
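The bookkeeping of the three steps above (selecting the best candidate pair, pruning the candidate set as in (28), and terminating) can be sketched in Python as follows. This is only an illustrative sketch: `reduction` is a hypothetical callable standing in for the weighted cost reduction in (26), which in the actual scheme requires solving P2, and `eps` is an assumed termination threshold.

```python
def greedy_cache_selection(num_gns, num_files, storage, reduction, eps=1e-6):
    """Greedy selection of GN-file pairs.

    storage[k]             : number of files GN k can cache
    reduction(selected, p) : hypothetical callable returning the cost
                             reduction of adding pair p = (gn, file)
    """
    selected = set()
    used = [0] * num_gns
    # Candidate pairs: not yet selected, and the GN still has free storage.
    candidates = {(k, n) for k in range(num_gns) for n in range(num_files)
                  if storage[k] > 0}
    while candidates:
        # Evaluate every candidate and keep the one with the largest reduction.
        gain, pair = max((reduction(selected, p), p) for p in sorted(candidates))
        if gain <= eps:
            break  # incremental cost reduction is negligible
        selected.add(pair)
        k = pair[0]
        used[k] += 1
        if used[k] >= storage[k]:
            # Storage of GN k is used up: drop all of its remaining pairs.
            candidates = {p for p in candidates if p[0] != k}
        else:
            # Otherwise only the selected pair leaves the candidate set.
            candidates.discard(pair)
    return selected


def toy_reduction(selected, pair):
    # Assumed toy model: the first copy of a file saves 10 in retrieval cost,
    # later copies save 0.5, and every placement costs 1 in caching time.
    already_cached = any(n == pair[1] for _, n in selected)
    return (0.5 if already_cached else 10.0) - 1.0


chosen = greedy_cache_selection(2, 2, [1, 2], toy_reduction)
```

In this toy run the loop stops as soon as every remaining candidate would increase the total cost, even though GN 0 still has free storage.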
The remaining task for the proposed greedy scheme is then to solve P2. For any chosen candidate GN-file pair , P2 reduces to a joint UAV trajectory and transmission scheduling optimization problem formulated as
P3:  (29a)  
s.t.,  (29b) 
Therefore, P2 can be solved by simply comparing the optimal values of P3, each corresponding to one feasible in . Note that with and given, the reduction of the file retrieval cost is a constant value that can be directly calculated from (21). Hence, solving P3 is equivalent to minimizing the file caching cost via optimizing the UAV trajectory and file transmission scheduling given the caching placement , which is formulated as
P3(a):  (30a)  
s.t.,  (30b)  
(30c)  
(30d) 
It is worth noting that P3(a) is still a challenging problem since the set of contacting time slots for each GN is intricately related to the UAV trajectory , as given in (12). For the special case when each file is cached by only one GN, P3(a) reduces to a UAV-enabled unicast communication problem. On the other hand, when all the files are placed at all the GNs, i.e., , the problem reduces to the UAV-enabled multicast problem, which has been studied in [35]. Note that both the UAV unicast and multicast problems are NP-hard in general. Hence, it is generally challenging to find the optimal UAV trajectory and transmission scheduling solution to problem P3(a). By following a similar approach to that in [35], we propose an efficient approximate solution to P3(a) by first designing the UAV flying path via the concept of virtual base station (VBS) placement and convex optimization, and then finding the optimal UAV speed and file transmission scheduling along the given path by solving a linear programming (LP) problem.
III-C1 Design of UAV Flying Path
First, to satisfy the constraint (30c) of P3(a), the UAV trajectory needs to be designed such that all the caching GNs, denoted as , can effectively communicate with the UAV, i.e., for each caching GN there exists at least one location along the UAV path whose distance to that GN is no greater than the required threshold . Given the UAV coverage radius , this resembles the classical traveling salesman problem with neighborhoods (TSPN). In the following, we provide an efficient path design for solving the TSPN based on the VBS placement and convex optimization techniques.
Specifically, given the locations of the caching GNs and the UAV coverage range , we first solve the VBS placement problem, which aims to find a minimum number of VBSs and their respective locations so that each caching GN is covered by at least one VBS. Several efficient algorithms in the literature can be applied to solve the VBS placement problem; in this paper, we adopt the spiral placement algorithm proposed in [12]. Let be the minimum number of VBSs obtained by applying the spiral placement algorithm, and denote their locations as . An illustrative example of the VBS placement solution is shown in Fig. 3.
With the locations of the VBSs found, a standard TSP algorithm can be applied to find the visiting order of the VBSs, which gives one feasible UAV path. However, the path can be further improved since it may not be necessary for the UAV to visit exactly the VBS locations [35]. To this end, a convex optimization technique is further applied to optimize the set of waypoints. Specifically, with the VBSs and their visiting order determined, the caching GNs in are essentially partitioned into ordered clusters , where denotes the subset of caching GNs that must be covered by the th VBS. Then, the starting and ending points of the UAV flying path intersecting with the th cluster can be further optimized by solving a convex optimization problem, as described in [35]. By connecting the starting and ending points sequentially according to the order in which the clusters are visited, we obtain the complete UAV flying path, along which each caching GN is able to communicate with the UAV. An example of the obtained UAV flying path is shown in Fig. 3.
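To make the two-stage path construction more concrete, the following Python sketch covers the caching GNs with disks and then orders the disk centers. It deliberately uses simpler stand-ins than the methods named above: a greedy disk cover restricted to GN locations replaces the spiral placement algorithm of [12], a nearest-neighbor tour replaces a standard TSP solver, and the convex waypoint refinement of [35] is omitted. All function and parameter names are hypothetical.

```python
import math


def cover_and_order(gns, radius):
    """Cover GNs with disks of the given radius, then order the disk centers.

    gns    : list of (x, y) horizontal GN locations
    radius : assumed UAV coverage radius
    Returns the ordered disk centers and the GN indices covered by each.
    """
    uncovered = set(range(len(gns)))
    centers, clusters = [], []
    while uncovered:
        # Greedy stand-in for VBS placement: choose the GN location whose
        # disk covers the most still-uncovered GNs.
        best = max(sorted(uncovered),
                   key=lambda c: sum(math.dist(gns[c], gns[i]) <= radius
                                     for i in uncovered))
        members = sorted(i for i in uncovered
                         if math.dist(gns[best], gns[i]) <= radius)
        centers.append(gns[best])
        clusters.append(members)
        uncovered -= set(members)
    # Nearest-neighbor stand-in for the TSP visiting order.
    order = [0]
    remaining = set(range(1, len(centers)))
    while remaining:
        last = centers[order[-1]]
        nxt = min(sorted(remaining), key=lambda c: math.dist(last, centers[c]))
        order.append(nxt)
        remaining.remove(nxt)
    return [centers[i] for i in order], [clusters[i] for i in order]


waypoints, groups = cover_and_order(
    [(0, 0), (1, 0), (10, 0), (11, 0), (5, 5)], radius=1.5)
```

The resulting waypoints form a feasible but generally longer path than one obtained with optimized placement and refined entry and exit points for each cluster.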
III-C2 UAV Speed and File Scheduling Optimization
With the UAV flying path determined, the locations along the path where each GN is covered by the UAV can be obtained based on (12). As a result, finding the UAV trajectory reduces to finding its instantaneous speed along the determined path, for which P3(a) can be cast as an LP problem. Specifically, we discretize the obtained UAV flying path into segments, such that within each segment, the set of caching GNs that are in contact with the UAV remains unchanged. The length of the th segment is denoted by . The total UAV traveling distance is thus . We further denote by the time for the UAV to fly through the th segment. The total file caching cost, i.e., the total UAV caching time, is thus . The speed constraint in (30b) is then equivalent to
(31) 
Since the duration for the UAV sending one packet is , the total number of packets that can be sent when it flies through the th segment is . Then, the file transmission scheduling parameters can be equivalently transformed to , where refers to the number of coded packets sent by the UAV for file during the th path segment. Corresponding to (30d), we have the following constraints on the total number of transmitted packets per segment,
(32) 
For each GN , define as the subset of the segments such that the GN can successfully receive the packet sent from the UAV, i.e., their horizontal distance being no greater than . Then, the file recovery constraint in (30c) can be recast as
(33) 
Therefore, with any given UAV flying path, problem P3(a) reduces to the following LP problem, which can be efficiently solved by standard convex optimization techniques.
P4:  (34a)  
s.t.,  (34b)  
(34c)  
(34d) 
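As a small worked instance of P4, consider a single file cached at a single GN. Under the assumptions that the only speed constraint is a maximum speed (so the UAV may slow down or hover), that packet counts are treated as continuous, and that the constraints take the form described above (per-segment time at least the segment length divided by the maximum speed, and enough packet durations accumulated over the covered segments), the LP has a closed-form solution. The Python sketch below computes it; all names are hypothetical.

```python
def min_segment_times(seg_len, covered, v_max, packet_time, num_packets):
    """Closed-form solution of a single-file, single-GN instance of the LP.

    seg_len     : lengths of the path segments
    covered     : indices of segments in which the GN can receive packets
    v_max       : assumed maximum UAV speed
    packet_time : duration of one packet transmission
    num_packets : number of packets the GN needs to recover the file
    Returns (total caching time, per-segment times).
    """
    if not covered:
        raise ValueError("the GN is never covered, so the instance is infeasible")
    # Flying every segment at maximum speed gives the smallest possible times.
    times = [d / v_max for d in seg_len]
    in_range = sum(times[m] for m in covered)
    # If the contact time is too short, spend the shortfall inside coverage;
    # any covered segment works because only the sum matters.
    shortfall = max(0.0, num_packets * packet_time - in_range)
    times[min(covered)] += shortfall
    return sum(times), times


total_time, per_segment = min_segment_times(
    seg_len=[100, 50, 100], covered={1}, v_max=50.0,
    packet_time=0.5, num_packets=10)
```

With several files and GNs the covered sets overlap and the shortfalls interact, which is why the general problem is handed to an LP solver instead.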
The pseudocode for solving problem P3(a) is summarized in Algorithm 1.
With problem P3(a) solved, the pseudocode of the overall greedy algorithm proposed for problem P1 is summarized in Algorithm 2. Since one GN-file pair is selected at each iteration and each GN can cache at most files, the total number of iterations of the proposed greedy algorithm is upper-bounded by , i.e., when all the files are placed at each GN or when all the GN storages are used up. Note that the actual number of iterations may be much smaller than this upper bound since the proposed approach may terminate when the incremental cost reduction is insignificant, i.e., , where is a small positive parameter that controls the algorithm termination.
III-D Complexity Reduction with File Caching Cost Estimation
Note that at each iteration of Algorithm 2, problem P3(a) needs to be solved once for each candidate GN-file pair. In this subsection, the complexity of Algorithm 2 is further reduced by estimating the file caching cost directly, without solving the optimization problem P3(a) as an intermediate step. Specifically, for caching file at GN , the file caching cost is estimated by assuming that the UAV flies to the location above GN and hovers there until file is completely transmitted. Then, the estimated total file caching cost is the sum of the UAV flying time and the transmission time. Due to the broadcast nature of wireless communications, all the neighbors of GN with distance no greater than the UAV coverage, denoted as , can overhear the file . Hence, if file also needs to be cached at any neighbor of GN in subsequent iterations, no additional caching cost is incurred thanks to the overhearing.
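The estimate described above can be sketched in Python as follows, with the flight distance measured from the nearest previously visited GN as formalized in the remainder of this subsection. The names are hypothetical, and the `start` location used before any GN has been visited is an assumption of this sketch, not something specified in the text.

```python
import math


def estimate_extra_cost(k, n, gns, visited, overheard, start, v_max, file_time):
    """Estimated increase in caching time for placing file n at GN k."""
    if overheard[k][n]:
        return 0.0  # GN k already overheard file n: no additional cost
    # Fly from the closest previously visited GN (or the assumed start point).
    origins = [gns[j] for j in sorted(visited)] or [start]
    fly_time = min(math.dist(gns[k], p) for p in origins) / v_max
    return fly_time + file_time  # flying time plus hovering transmission time


def commit_placement(k, n, gns, visited, overheard, radius):
    """Record that the UAV hovers above GN k and broadcasts file n."""
    visited.add(k)
    for j, pos in enumerate(gns):
        if math.dist(gns[k], pos) <= radius:
            overheard[j][n] = True  # GN k and its in-range neighbors receive n


gn_xy = [(0, 0), (3, 4), (100, 0)]
seen = set()
heard = [[False] for _ in gn_xy]          # one file in this toy example
first = estimate_extra_cost(0, 0, gn_xy, seen, heard, (0, 0), 5.0, 2.0)
commit_placement(0, 0, gn_xy, seen, heard, radius=6.0)
neighbor = estimate_extra_cost(1, 0, gn_xy, seen, heard, (0, 0), 5.0, 2.0)
far = estimate_extra_cost(2, 0, gn_xy, seen, heard, (0, 0), 5.0, 2.0)
```

Keeping the estimate free of side effects lets the greedy step evaluate every candidate pair and update the visited set and overhearing status only for the pair that is finally selected.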
For notational convenience, we denote the set of GNs that have been visited by the UAV as . We further denote the overhearing status of all the GNs and all the files by the binary indication function as
(35) 
At the beginning of the proposed greedy approach, we have and . At each iteration, we need to calculate the increase of file caching cost if we move the GN-file pair from the candidate GN-file pair set to the selected GN-file pair set , denoted as , as part of the total cost reduction in (26), where . If GN has overheard file , i.e., , this file placement will not incur additional caching cost, i.e., . Otherwise, we assume that the UAV needs to visit GN . The minimum distance for the UAV to reach GN from the existing visiting points is approximated as
(36) 
where is the distance between GN and GN . Hence, the total additional UAV time required for the UAV to cache file at GN is given by