The rapid proliferation of mobile devices has led to unprecedented growth in wireless traffic demands. A typical approach to dealing with such demand is to densify the network. For example, macrocells and femtocells are deployed to enhance the capacity and attain a good quality of service (QoS) by bringing the network closer to the user. Recently, it has been shown that only a small portion of multimedia content is highly demanded by most of the users. This small portion forms the majority of requests that come from different users at different times, which is referred to as asynchronous content reuse .
Caching the most popular content at various locations of the network edge has been proposed to avoid serving all requests from the core network through highly congested backhaul links [3, 4, 5]. From the caching perspective, there are three main types of networks, namely, caching on femtocells in small cell networks, caching on remote radio heads (RRHs) in cloud radio access networks (RANs), and caching on mobile devices [6, 7, 8]. One approach to overcome the limitations of the finite-capacity backhaul links in cloud RANs, where low-energy base stations (BSs) are deployed over a small geographical area and are connected to the cloud, is to introduce local storage caches at the BSs, in which the popular files are stored locally in order to reduce the load of the backhaul links . For small cell networks, caching the most popular content at the network edge (the small BSs) is a promising solution to reduce the traffic and the energy consumption over the finite-capacity backhaul links .
In this article, we focus solely on device caching. The architecture of device caching exploits the large storage available in modern smartphones to cache multimedia files that might frequently be requested by the users. The users’ devices exchange multimedia content stored on their local storage with nearby devices . Since the distance between the requesting user and the caching user (a user who stores the file) will be small in most cases, device-to-device (D2D) communication is commonly used for content transmission . In this context, Golrezaei et al.  proposed a novel architecture to improve the throughput of video transmission in cellular networks based on the caching of popular video files in base-station-controlled D2D communication. The analysis of this network is based on the subdivision of a macrocell into small virtual clusters, such that one D2D link can be active within each cluster. Random caching is considered, where each user caches files at random and independently, according to a caching distribution.
Different cooperation strategies in D2D networks are proposed in the literature. As an example, in , the authors proposed a cooperative D2D communications framework in order to combat the problem of congestion in crowded communication environments. The authors allowed a D2D transmitter to act as an in-band relay for a cellular link and at the same time transmit its data by employing superposition coding in the downlink. It is shown that cooperation between the cellular link and D2D transmitter helps increase the number of connections per unit area with the same spectrum usage. In the area of D2D caching, the authors in  proposed an opportunistic cooperation strategy for D2D transmission by exploiting the caching capability at the users to control the interference among D2D links. The authors considered an overlay inband D2D communication, divided the D2D users into clusters, and assigned different frequency bands to cooperative and non-cooperative D2D links. The cluster size and bandwidth allocation are further optimized to maximize the network throughput.
The analysis of wireless caching networks from the resource allocation perspective is widely discussed in the literature. For instance, in , the authors showed how distributed caching and collaboration between users and femtocells (helpers) can significantly improve throughput without suffering from the backhaul bottleneck problem common to femtocells. The authors also investigated the role of collaboration among users, a process that can be interpreted as the mobile devices also playing the role of helpers. This approach allowed an improvement in the video throughput without the deployment of any additional infrastructure. Due to the dependence between content cache placement and resource allocation in wireless networks, the joint problem of caching and resource allocation is studied in many works. As an example, Zhang et al. in  proposed a single-hop D2D-assisted wireless caching network, where popular files are randomly and independently cached in the memory of end users. The joint D2D link scheduling and power allocation problem is formulated to maximize the system throughput. Following a similar approach, Chen et al. in  studied the joint optimization of cache content placement and scheduling policies to maximize the so-called offloading probability. The successful offloading probability is defined as the probability that a user can obtain the desired file in the local cache or via a D2D link with a data rate larger than a given threshold. The authors obtained the optimal scheduling factor for a random scheduling policy that controls interference in a distributed manner and proposed a low-complexity solution to compute the caching distribution.
Motivated by the remarks from the above discussion, i.e., backhaul links being highly congested, the geometric distribution of the users as groups in clusters, and the small memory sizes of a group of users colocated in the same cluster, we propose a novel D2D caching architecture with inter-cluster cooperation. We propose a system in which a user in a given cluster can search for its requested files either in the local cluster or in any of the remote clusters. We show that allowing inter-cluster collaboration via cellular communication achieves both user and system performance gains. From the user perspective, the average delay per request is reduced when downloading files from a remote cluster instead of serving files from the core network. From the system perspective, the heavy burden on backhaul links is alleviated by decreasing the number of requests that are served directly from the core network. From a resource allocation perspective, similar to the work performed in [11, 12], we analyze the network average delay and throughput per user request for the proposed inter-cluster cooperative caching system under different caching schemes and show how the network performance is significantly improved. To the best of our knowledge, no prior work in the literature has addressed the performance analysis of D2D caching networks with inter-cluster cooperation.
The main contributions of this article are summarized as follows:
We study a D2D caching system with inter-cluster cooperation from a queueing theory perspective. We formulate the network average delay minimization problem in terms of cache placement. The delay minimization problem is then shown to be non-convex, and it can be reduced to a well-known 0-1 knapsack problem, which is NP-hard.
A closed-form expression of the network average delay is derived under the policy of caching popular files. Moreover, a locally optimal greedy caching algorithm is proposed whose delay is within a factor of the global optimum. Results show that the delay can be significantly reduced by allowing D2D caching with inter-cluster cooperation.
We derive a closed-form expression for the average throughput per request for the proposed inter-cluster cooperation scheme. Moreover, we conduct an asymptotic analysis of the average sum throughput when the content library size grows to infinity. The result of the scaling analysis shows that the upper bound for the network average sum throughput decreases when the library size increases asymptotically, and the rate of this decrease is controlled by the popularity of files.
The rest of the paper is organized as follows. The system model is presented in Section II. In Section III, we formulate the problem and perform the delay analysis of the system. In Section IV, the content caching schemes are studied. Section V provides the throughput analysis. Finally, we discuss the simulation and analytical results in Section VI and conclude the paper in Section VII.
II System Model
II-A Network Model
In this subsection, we describe our proposed D2D caching network with inter-cluster cooperation. Fig. 1 illustrates the system layout. A cellular network consists of a small base station (SBS) and a set of users placed uniformly in the cell. The cell is divided into a set of equally sized clusters . For mathematical convenience, we assume that the number of users per cluster is users, as in  and the references therein. Users in the same cluster can communicate directly using low-power, high-rate D2D communication in a frequency band dedicated to D2D transmission.
Each user requests a file from a file library independently and identically, according to a given request probability mass function. It is assumed that each user can cache up to files, and for the caching problem to be non-trivial, it is assumed that . From the cluster perspective, we assume to have a cluster’s virtual cache center (VCC) formed by the union of devices’ storage in the same cluster, which caches up to files, i.e., .
We assume that the D2D communication does not interfere with communication between the BS and users. We also assume that all D2D links share the same time-frequency transmission resource within one cell. Multiple transmissions on those resources are possible since the distance between requesting users and users with the stored file will typically be small. Furthermore, there should be no interference by other transmissions on an active D2D link. To achieve this, the cell is divided into smaller areas, which we denote as clusters. To avoid intra-cluster interference, only one such communication per cluster is allowed. (Footnote 1: We adopt a simplified PHY-layer model in this work. Users in the same cluster are assumed to be served in a round-robin manner.)
We define three modes of operation according to how a request for content is served:
Local cluster mode ( mode): Requests are served from the local cluster. Files are downloaded from nearby users via a single-hop D2D communication. In this mode, we neglect self-caching, i.e., the event when a user finds the requested file in its internal cache with zero delay. Within each cluster, the BS can help devices find their requested content by broadcasting signals containing the content replication ratio.
Remote cluster mode ( mode): Requests are served from any of the remote clusters via inter-cluster cooperation. The BS fetches the requested content from a remote cluster, then delivers it to the requesting user by acting as a relay in a two-hop cellular transmission. The BS assists in content dissemination in the “remote cluster mode” by relaying the content between different clusters.
Backhaul mode ( mode): Requests are served directly from the backhaul. The BS obtains the requested file from the core network via the backhaul link and then transmits it to the requesting user.
In each cluster, we assume that the stream of user requests is served sequentially on a first-in first-out (FIFO) basis. The BS receives all requests and works as a coordinator to establish the file transfer between the requesting user (a user who requests the file) and the serving node (another user who caches the file or a caching server in the core network). The BS keeps track of which devices can communicate with each other and which files are cached on each device. Such BS-controlled D2D communication is more efficient and more acceptable to spectrum owners if the communication occurs in a licensed band as compared to traditional uncoordinated peer-to-peer communications . To serve a request for file in cluster , first, the BS searches the VCC of cluster . If the file is cached, it will be delivered from the local VCC ( mode). We assume that the BS has all the information about cached content in all clusters, such that all file requests are sent to the BS, and the BS then replies with the address of the caching user from whom the file will be retrieved.
If a file is not cached locally in cluster but cached in any of the remote clusters, it will be fetched from a randomly chosen cooperative cluster ( mode), instead of downloading it from the backhaul. Unlike multi-hop D2D cooperative caching discussed in , in our work cooperating clusters are assumed to exchange cached files using a two-hop cellular communication link through the BS, such that the D2D band is dedicated only to the intra-cluster communication. Hence, all the inter-cluster communication is performed in a centralized manner through the BS. Finally, if the requested file has not been cached in any cluster in the cell, it can be downloaded from the core network via the backhaul link ( mode). The selection of the three modes of operation is conducted in a prioritized order from the local cluster, from the remote cluster, or finally from the core network through the backhaul link as a last resort.
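The prioritized selection among the three modes can be illustrated with a minimal sketch. It assumes that the cache state known to the BS is held as a binary cluster-by-file matrix; the function name `select_mode` and the mode labels are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def select_mode(placement: Sequence[Sequence[int]], cluster: int, file_idx: int) -> str:
    """Pick the serving mode for one request, in the prioritized order of the text.

    placement[k][i] == 1 means file i is cached in the VCC of cluster k
    (an assumed representation of the BS's knowledge of cached content).
    """
    # 1) Local cluster mode: the file is in the requesting cluster's VCC.
    if placement[cluster][file_idx]:
        return "local"
    # 2) Remote cluster mode: some other cluster caches the file; the BS relays it.
    if any(row[file_idx] for k, row in enumerate(placement) if k != cluster):
        return "remote"
    # 3) Backhaul mode: last resort, fetch from the core network.
    return "backhaul"


# Hypothetical example: 2 clusters, 4 files.
placement = [
    [1, 1, 0, 0],  # cluster 0 caches files 0 and 1
    [0, 1, 1, 0],  # cluster 1 caches files 1 and 2
]
modes = [select_mode(placement, 0, i) for i in range(4)]
```

In this example, a user in cluster 0 obtains files 0 and 1 locally, file 2 from the remote cluster through the BS, and file 3 over the backhaul.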
Serving files sequentially according to the above three modes is based on the assumption that the BS has a capacity limited wired backhauling, such that the average delay per request is decreased when allowing inter-cluster cooperation. Otherwise, if the backhaul is not a bottleneck, e.g., optical fiber or millimeter wave backhaul links are available, requests for files not cached in the local cluster are served directly from the core network through the high capacity backhaul link. The analysis in this paper relies on a well-known grid-based clustering model , i.e., no specific underlying physical model or parameters are assumed. Therefore, the obtained design/results, e.g., design of caching scheme and the performance of the greedy algorithm, can be applied to similar scenarios with three prioritized paths (modes) for file downloading. For example, on-board users, such as on a plane or a ship, can obtain requested files from neighboring users via Bluetooth (local cluster mode), from a remote user through an access point  acting as a relay (remote cluster mode), or finally from the backhaul, which is the least preferred option. As another example, in the case of connecting users through unmanned aerial vehicles (UAVs) , serving files can be prioritized as follows. A file is received from a neighboring user via D2D communication (local cluster mode), from a remote user through the UAV acting as a relay (remote cluster mode), or from the backhaul to the core network through the UAV as a last resort.
II-B Content Placement and Traffic Characteristics
We use a binary matrix with to denote the cache placement in all clusters, where indicates that content is cached in cluster . Fig. 2 shows the assumed users’ traffic model in a cluster , modeled as a multiclass processor sharing queue (MPSQ) with arrival rate , and three serving processors representing the three transmission modes. According to the MPSQ definition , each transmission mode is represented by an M/M/1 queue with Poisson arrival rate and exponential service rate. A graphical interpretation of the content cache placement is shown in Fig. 3. The content caching policy is defined by a bipartite graph , where edges denote that content is cached in the VCC of cluster .
If a user in cluster requests a locally cached file (i.e., ), it will be served by the local cluster mode with an average rate . However, if the requested file is not cached locally and cached in any of the remote clusters, i.e., when and , it will be served by the remote cluster mode.
We denote the rate for the remote cluster mode by , accounting for the average sum transmission rate between the cooperating clusters through the BS. Accordingly, is shared between clusters simultaneously served by the remote cluster mode. Finally, requests for files that are not cached in the entire cell, i.e., when , are served via the backhaul mode with an average sum rate . We assume that , such that the part of the cellular rate allocated to the users served by the backhaul mode is neglected for the delay analysis. is assumed to be the effective rate from the core network to the user using BS.
Due to traffic congestion in the core network and the transmission delay between cooperating clusters, we assume that the aggregate transmission rates for the above three modes are ordered such that . We also assume that the content size is exponentially distributed with mean bits. Hence, the corresponding request service times of the three transmission modes also follow an exponential distribution with means sec, sec, and sec, respectively.
III Problem Formulation
In this section, we characterize the network average delay on a per request basis from the global network perspective. Specifically, we study the request arrival rate and the traffic dynamics from a queuing theory perspective and get a closed form expression for the network average delay.
III-A File Popularity Distribution
We assume that the popularity distribution of files in all clusters follows a Zipf’s distribution with skewness order. However, it is assumed that the content may vary across clusters. This is inspired by the fact that, for instance, users in a library may be interested in an entirely different set of files from the users in a sports center. Our assumption for the popularity distribution is extended from , where the authors explained that the scaling of popular files is sublinear with the number of users. (Footnote 2: The number of popular files increases with the number of users with a rate slower than the linear polynomial rate, e.g., the logarithmic rate.)
To illustrate, if user 1 and user 2 are interested in a set of files with size , then the first files of user 2 are common with user 1 and are the new ones. User 3, in turn, shares files with users 1 and 2, and has new files, etc. The union of all demanded (popular) files by users is log . Hence, the library size increases sublinearly with the number of users. In this work, we assume that the scaling of the library size is sublinear with the number of clusters. The cell is divided into clusters with a small number of users per cluster, such that users in the same cluster are assumed to request files according to the same file popularity distribution function (i.e., users in the same cluster are interested in the same set of popular files).
The probability that a file is requested in cluster , with highly demanded files in each cluster, follows a Zipf distribution written as ,
where and , is the order of the most popular file in the th cluster, and is the indicator function. When , we get for the first cluster, which is the Zipf’s distribution with the most popular file . For example, if , then for the second cluster, which is the Zipf’s distribution with the most popular file ; also is the most popular file in the third cluster, and so on.
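As a rough illustration of a per-cluster Zipf popularity profile, the sketch below builds a Zipf probability mass function whose most popular file can sit at an arbitrary index. The cyclic wrap-around of ranks, the zero-based indexing, and the names `zipf_pmf` and `first_rank` are assumptions made only for this example; they are not the exact indexing rule of the expression above.

```python
def zipf_pmf(num_files: int, beta: float, first_rank: int = 0) -> list:
    """Zipf pmf over num_files files with skewness beta.

    The file at index first_rank is the most popular one; lower ranks are
    assigned to the following indices, wrapping around cyclically
    (the wrap-around is an illustrative assumption).
    """
    weights = [(rank + 1) ** (-beta) for rank in range(num_files)]
    total = sum(weights)
    pmf = [0.0] * num_files
    for rank, weight in enumerate(weights):
        pmf[(first_rank + rank) % num_files] = weight / total
    return pmf


# Hypothetical example: two clusters whose most popular files differ.
pmf_cluster_a = zipf_pmf(10, beta=1.0, first_rank=0)
pmf_cluster_b = zipf_pmf(10, beta=1.0, first_rank=3)
```

Both profiles share the same skewness, but the second cluster's requests concentrate on a shifted set of files, which is the situation that makes cached content differ (yet overlap) across clusters.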
III-B Arrival and Service Rates
The arrival rates for the three communication modes , , and in a cluster are denoted respectively by , , and while the corresponding service rates are represented by , , and . For the local cluster mode, we have
where is the probability that the requested file is cached locally in cluster . The corresponding service rate is . For the remote cluster mode, the request arrival rate is defined as
where min equals one only if the content is cached in at least one of the remote clusters. Hence, min is the probability that the requested file is cached in any of the remote clusters given that it is not cached in the local cluster . The corresponding service rate is , where represents the number of cooperating clusters simultaneously served by the remote cluster mode, i.e., the number of clusters which share the cellular rate.
Finally, for the backhaul mode, the request arrival rate is written as
where is the probability that the requested file is not cached entirely in the cell, so this content could be downloaded only from the core network. The corresponding service rate is , where is defined as the number of clusters simultaneously served via the backhaul mode.
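The split of a cluster's request stream into the three per-mode arrival rates can be sketched as follows. The sketch assumes a binary placement matrix and a per-cluster popularity vector; the function name `mode_arrival_rates` and its argument names are hypothetical.

```python
def mode_arrival_rates(placement, popularity, cluster, arrival_rate):
    """Return (local, remote, backhaul) arrival rates for one cluster.

    placement[k][i] is 1 if file i is cached in cluster k, else 0.
    popularity[i] is the request probability of file i in this cluster.
    arrival_rate is the total request arrival rate of the cluster.
    """
    p_local = p_remote = p_backhaul = 0.0
    for i, p in enumerate(popularity):
        x_local = placement[cluster][i]
        # min(1, .) equals one only if at least one remote cluster caches file i.
        x_remote = min(1, sum(row[i] for k, row in enumerate(placement) if k != cluster))
        p_local += p * x_local
        p_remote += p * (1 - x_local) * x_remote
        p_backhaul += p * (1 - x_local) * (1 - x_remote)
    return (arrival_rate * p_local, arrival_rate * p_remote, arrival_rate * p_backhaul)


# Hypothetical example: 2 clusters, 3 files.
placement = [[1, 0, 0], [0, 1, 0]]
popularity = [0.5, 0.3, 0.2]
lam_local, lam_remote, lam_backhaul = mode_arrival_rates(placement, popularity, 0, 2.0)
```

The three rates always add up to the cluster's total arrival rate, since every request is served by exactly one mode.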
The traffic intensity of a queue is defined as the ratio of mean service time to mean inter-arrival time. We introduce as a metric of the traffic intensity at cluster as
Similar to , we consider as the stability condition, otherwise, the overall delay will be infinite. The traffic intensity at any cluster is simultaneously related to the request arrival rate and the transmission rates of the three serving modes.
III-C Network Average Delay
In , it is proven that the mean queue size for an MPSQ with arrival rate [sec] and traffic intensity , is
where and are respectively the arrival and service rates of a service group . Given the fact that the average delay equals the mean queue size divided by the arrival rate, substituting the above expression to calculate the average delay per request in a cluster yields
Based on the analysis of the delay in a single cluster, we derive the network weighted average delay per request as
where denotes the overall user request arrival rate in the cell. We observe from (6) that the cluster per request delay , and correspondingly the network average delay , depend on the arrival rates of the three transmission modes, which are in turn functions of the content caching scheme. Because of the limited caching capacity on mobile devices, we would like to optimize the cache placement in each cluster to minimize the network weighted average delay per request. The delay optimization problem is then formulated as
where (9) and (10) are the constraints that the maximum cache size is files per cluster, and the file is either cached entirely or is not cached, i.e., no partial caching is allowed. The objective function in (8) is not a convex function of the cache placement elements . Moreover, this problem can be reduced to a well-known knapsack problem, which is already proven to be NP-hard in .
In this case, the caching problem is trivial, i.e., there are no caching constraints. For any cluster , and . The optimal solution is obtained when all the files are cached in each cluster. All the requests are served internally from the local cluster via D2D communication.
In the next section, we analyze the network average delay under several caching policies. We further reformulate the optimization problem in (8) as a well-known structure that has a locally optimal solution within a factor of the global optimum.
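The delay computation of this section can be sketched numerically. The sketch assumes the standard processor-sharing result that the mean number of requests in the queue is ρ/(1 − ρ), with ρ the sum of the per-mode ratios of arrival rate to service rate, and applies Little's law to obtain the delay; the function names are hypothetical, and service rates are passed in directly rather than derived from the transmission rates.

```python
import math


def cluster_delay(arrival_rates, service_rates):
    """Average delay per request in one cluster (seconds).

    arrival_rates and service_rates hold one entry per transmission mode.
    Assumes mean queue size rho / (1 - rho); returns inf if the queue is unstable.
    """
    rho = sum(lam / mu for lam, mu in zip(arrival_rates, service_rates))
    total_rate = sum(arrival_rates)
    if total_rate == 0:
        return 0.0
    if rho >= 1.0:
        return math.inf
    mean_queue_size = rho / (1.0 - rho)
    return mean_queue_size / total_rate  # Little's law


def network_delay(clusters):
    """Arrival-rate-weighted average of the per-cluster delays.

    clusters is a list of (arrival_rates, service_rates) pairs.
    """
    total_rate = sum(sum(lams) for lams, _ in clusters)
    if total_rate == 0:
        return 0.0
    weighted = sum(sum(lams) * cluster_delay(lams, mus) for lams, mus in clusters)
    return weighted / total_rate


# Hypothetical example: two identical clusters with two active modes each.
example = [([1.0, 1.0], [4.0, 4.0]), ([1.0, 1.0], [4.0, 4.0])]
avg_delay = network_delay(example)
```

With a single mode, the expression collapses to the familiar M/M/1 delay 1/(μ − λ), which gives a quick sanity check of the sketch.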
IV Proposed Caching Schemes
IV-A Caching Popular Files (CPF)
In each cluster, the most popular files for the users in the cluster are cached without repetition. Since popular files are different among clusters (but overlapped), applying CPF might end up replicating the same file in many clusters . We assume that the request arrival rate is equal for all clusters.
IV-A1 Arrival Rate for D2D Communication
The arrival rate of the D2D communication mode is given by
where is the probability that the requested file is cached in the local cluster , and is the most popular file index for cluster . As an example, for the first cluster, .
IV-A2 Arrival Rate for Inter-cluster Communication
The arrival rate of the inter-cluster communication mode is given by
where is defined as max. To explain, the inner summation represents the probability that the requested file is cached in a remote cluster , where the cached files in the th cluster are indexed from to . is defined such that a cached file in the remote clusters is counted only once when calculating . The outer summation is the sum over all clusters except the local cluster .
To compute the service rate of the remote cluster mode, , we first need to obtain the number of cooperating clusters since they share the cellular rate. As introduced in Section III,
is a random variable representing the number of clusters served by the cellular communication whose mean is given by
IV-A3 Arrival Rate for Backhaul Communication
The arrival rate of the backhaul communication mode is now calculated as
is then obtained to calculate the backhaul service rate . As alluded to in the definition of , is a random variable representing the number of clusters served via the backhaul link whose mean is given by
Obviously, we have . From (11), (12), and (IV-A3), the network average delay can be calculated directly from (7). The CPF scheme is computationally straightforward if the most popular content is known. Additionally, the CPF scheme is easy to implement independently, since it is executed at the per-cluster level regardless of the caching status of other clusters, unlike the greedy algorithm proposed in the next subsection. However, it achieves high performance only if the popularity exponent is large enough, i.e., when the content popularity distribution is skewed, since a small portion of content is then highly demanded and can be cached entirely in each cluster.
IV-B Greedy Caching Algorithm (GCA)
In this subsection, we introduce a computationally efficient caching algorithm. We prove that the minimization problem in (8) can be reformulated as a minimization of a supermodular function subject to uniform partition matroid constraints. This structure has a greedy solution which has been proven to be locally optimal within a factor of the optimum [24, 25, 26].
We start with the definition of supermodular and matroid functions, then we introduce and prove some relevant lemmas.
IV-B1 Supermodular Functions
Let be a finite ground set. The power set of the set is the set of all subsets of , including the empty set and itself. A set function , defined on the powerset of as : , is supermodular if for any and we have 
To illustrate, let denote the marginal value of an element with respect to a subset . Then, is supermodular if for all and for all , we have , i.e., the marginal value of the included set is lower than the marginal value of the including set .
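The marginal-value characterization can be checked by brute force on a small ground set. The sketch below is only an illustration of the definition; the names `powerset` and `is_supermodular` are hypothetical, and the exhaustive search is practical only for a handful of elements.

```python
from itertools import combinations


def powerset(ground):
    """All subsets of the ground set, as frozensets."""
    items = list(ground)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def is_supermodular(f, ground, tol=1e-12):
    """Check that marginal values never decrease as the set grows.

    For every A subset of B and every element e outside B, require
    f(A + e) - f(A) <= f(B + e) - f(B).
    """
    subsets = powerset(ground)
    for a in subsets:
        for b in subsets:
            if not a <= b:
                continue
            for e in set(ground) - b:
                gain_a = f(a | {e}) - f(a)
                gain_b = f(b | {e}) - f(b)
                if gain_a > gain_b + tol:
                    return False
    return True


ground = {1, 2, 3}
convex_in_size = is_supermodular(lambda s: len(s) ** 2, ground)     # growing gains
concave_in_size = is_supermodular(lambda s: len(s) ** 0.5, ground)  # shrinking gains
```

A set function that is convex in the cardinality, such as the squared set size, passes the check, while a concave one, such as the square root of the set size, fails it because its marginal gains shrink.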
IV-B2 Matroid Functions
Matroids are combinatorial structures that generalize the concept of linear independence in matrices . A matroid is defined on a finite ground set and a collection of subsets of said to be independent. The family of these independent sets is denoted by or . It is common to refer to a matroid by listing its ground set and its family of independent sets, i.e., . For to be a matroid, must satisfy these three conditions:
is a nonempty set.
is downward closed; i.e., if and , then .
If and are two independent sets of and has more elements than , then such that .
One special case is a partition matroid in which the ground set is partitioned into disjoint sets , where
for some given integers . One special case of the partition matroid is the uniform partition matroid in which .
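For the caching problem, independence in a uniform partition matroid amounts to a per-cluster cache capacity check. The sketch below assumes that ground-set elements are (file, cluster) pairs and that each part of the partition collects the pairs of one cluster; the function name `is_independent` is hypothetical.

```python
def is_independent(selection, capacity):
    """Return True if no cluster holds more than `capacity` files.

    selection is an iterable of (file_index, cluster_index) pairs, i.e. a subset
    of the ground set; each cluster defines one part of the partition.
    """
    counts = {}
    for _file_idx, cluster in selection:
        counts[cluster] = counts.get(cluster, 0) + 1
        if counts[cluster] > capacity:
            return False
    return True


# Hypothetical example with a capacity of two files per cluster.
feasible = is_independent({(0, 0), (1, 0), (0, 1)}, capacity=2)
infeasible = is_independent({(0, 0), (1, 0), (2, 0)}, capacity=2)
```

Removing a pair from a feasible selection keeps it feasible, which mirrors the downward-closed property listed above.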
Please see Appendix A for the proof. ∎
Lemma 2: The objective function in equation (8) is a monotone non-increasing supermodular function.
Proof: Please see Appendix B. ∎
The greedy solution for this problem structure has been proven to be locally optimal within a factor of the optimum [24, 25, 26]. The greedy caching algorithm for the proposed D2D caching system with inter-cluster cooperation is illustrated in Algorithm 1, where is an element denoting the placement of file into the VCC of cluster . We first define the attributes of the system in the first line of the algorithm’s pseudocode. We then initialize the cache memory of all clusters to zero. We set the number of iterations to be , which means that at each iteration, we cache one file in one cluster, resulting in caching different files in clusters after iterations. In each iteration, all combinations of caching a file in a cluster are tried, and the network service delay is calculated. A file is chosen to be cached in the -th cluster, which achieves the highest reduction in the network service delay.
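The greedy procedure described above can be sketched as follows. This is a simplified illustration rather than a reproduction of Algorithm 1: it assumes equal request arrival rates in all clusters, fixed per-mode service rates (the sharing of the cellular and backhaul rates among clusters is ignored), and a mean queue size of ρ/(1 − ρ) per cluster; all function and parameter names are hypothetical.

```python
import math


def network_delay(placement, popularity, lam, mu):
    """Network average delay for a placement.

    popularity[k][i]: request probability of file i in cluster k.
    lam: per-cluster arrival rate; mu: (local, remote, backhaul) service rates.
    """
    num_clusters = len(placement)
    total_queue = 0.0
    for k in range(num_clusters):
        rates = [0.0, 0.0, 0.0]
        for i, p in enumerate(popularity[k]):
            if placement[k][i]:
                mode = 0  # local cluster mode
            elif any(placement[j][i] for j in range(num_clusters) if j != k):
                mode = 1  # remote cluster mode
            else:
                mode = 2  # backhaul mode
            rates[mode] += lam * p
        rho = sum(r / m for r, m in zip(rates, mu))
        if rho >= 1.0:
            return math.inf
        total_queue += rho / (1.0 - rho)  # equals lam_k * D_k by Little's law
    return total_queue / (lam * num_clusters)


def greedy_caching(popularity, cache_size, lam, mu):
    """Fill the caches one (cluster, file) placement at a time."""
    num_clusters, num_files = len(popularity), len(popularity[0])
    placement = [[0] * num_files for _ in range(num_clusters)]
    for _ in range(num_clusters * cache_size):
        best = None
        for k in range(num_clusters):
            if sum(placement[k]) >= cache_size:
                continue  # this cluster's VCC is full
            for i in range(num_files):
                if placement[k][i]:
                    continue
                placement[k][i] = 1  # tentatively cache file i in cluster k
                delay = network_delay(placement, popularity, lam, mu)
                placement[k][i] = 0
                if best is None or delay < best[0]:
                    best = (delay, k, i)
        if best is None:
            break
        placement[best[1]][best[2]] = 1
    return placement


# Hypothetical example: 2 clusters, 4 files, one cached file per cluster.
pop = [[0.5, 0.25, 0.15, 0.10]] * 2
result = greedy_caching(pop, cache_size=1, lam=1.0, mu=(10.0, 5.0, 2.0))
```

In this small example the two clusters end up caching different files, so that each can fetch the other's file through the BS instead of the backhaul, whereas CPF would replicate the single most popular file in both clusters.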
The greedy algorithm is run at the BS level, and the BS then instructs the clusters’ devices to cache the files according to the output of this algorithm. The deterministic caching approach (both CPF and GCA) can only be realized if the devices stay at the same locations for many hours. Otherwise, performance obtained with the deterministic caching strategy serves as a useful upper bound for more realistic schemes . As examples of the greedy algorithm, the authors in  showed that the problem of optimal joint caching and routing can be formulated as maximization of a monotone submodular function subject to matroid constraints, and hence can be solved by the greedy algorithm. Also, the authors in  showed that the delay minimization problem can be formulated as a maximization of a submodular function under matroid constraints, which can be solved by the greedy algorithm.
V Throughput Analysis
We have analyzed the per request average delay from the network perspective under different caching schemes. In this section, we conduct the per request throughput and throughput scaling analysis. We first characterize the per request throughput from the queuing theory perspective based on the analytical results of previous sections, then study the scaling of the average sum throughput when the number of files asymptotically goes to infinity.
V-A Per-Request Throughput Analysis
In this subsection, we first formulate a condition on the traffic demand for the network to be stable, then we study the throughput per request from the cluster perspective. As introduced in Section II-B, the content size is assumed to have an exponential distribution with mean [bits]. For a cluster whose users’ traffic is modeled as an MPSQ with three serving processors (transmission modes), the number of users’ requests in the queue that matches the -th transmission mode is denoted by , where . Denote
as the vector counting the numbers of users’ requests in the queue for each transmission mode.
The process describing the number of users’ requests served by the three serving processors (transmission modes) is then a continuous-time Markov process . This process has a discrete state space and admits the following generator :
where designates the vector of having coordinate 1 at position j and 0 elsewhere, and . The first term of the above generator, , accounts for the arrival of a request that matches the th transmission mode while the second term, , accounts for serving a request by the th transmission mode.
Let be the vector counting the number of users’ requests of each transmission mode at the steady state, and let be the total number of requests in the queue at the steady state. The average traffic demand [bps] of each transmission mode in the th cluster is defined as 
and the total traffic demand per cluster is then given by
We now obtain the cluster critical traffic demand, beyond which the MPSQ is no longer stable. The constraint (5) that limits the traffic intensity from the above to one can be rewritten as
by multiplying both sides by and rearranging the terms, we get
where [bps] is the critical traffic demand per cluster, beyond which the MPSQ loses its stability.
Lemma 3: The steady-state distribution of the total number of users’ requests in the MPSQ modeling the users’ traffic follows a geometric distribution with parameter .
Proof: This result can be deduced from  and the references therein, and the proof is omitted in this paper to avoid repetition. ∎
As a direct result from Lemma 3, the mean number of total users’ requests in the MPSQ at the steady state is given by
At the steady state, the queue throughput is equal to the traffic demand . Hence, the average throughput per request is defined as the ratio of the given queue throughput and the average number of users’ requests, i.e.,
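The ratio above can be evaluated with a short sketch. It assumes that the traffic demand equals the mean file size times the total request arrival rate, and that the mean number of requests is ρ/(1 − ρ), as implied by the geometric steady-state distribution; the function and argument names are hypothetical.

```python
def per_request_throughput(arrival_rates, service_rates, mean_file_bits):
    """Average throughput per request (bits per second) for one cluster.

    arrival_rates and service_rates hold one entry per transmission mode;
    mean_file_bits is the mean content size in bits.
    """
    rho = sum(lam / mu for lam, mu in zip(arrival_rates, service_rates))
    if not 0.0 < rho < 1.0:
        raise ValueError("traffic intensity must lie strictly between 0 and 1")
    demand = mean_file_bits * sum(arrival_rates)  # traffic demand in bps
    mean_requests = rho / (1.0 - rho)             # mean of the geometric law
    return demand / mean_requests


# Hypothetical single-mode example: 1 request/s, service rate 2 requests/s, 8-bit files.
throughput = per_request_throughput([1.0], [2.0], 8.0)
```

In the single-mode case the result equals the gap between the critical traffic demand (16 bps here) and the actual demand (8 bps), which offers a simple consistency check.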
V-B Throughput Scaling Analysis
We conduct the scaling analysis of the average sum throughput when the number of files grows asymptotically to infinity, i.e., . We first define the outage probability for our proposed D2D cooperative caching system and then compare it with a clustered D2D caching system without inter-cluster cooperation . The obtained formula of the outage probability is further approximated and then exploited in the throughput scaling analysis.
In the following, we shall implicitly ignore the non-integer effects when they are irrelevant for the scaling laws. For example, recalling that the network has node density and it is divided into clusters, the number of users per cluster after integer rounding is denoted as . Next, we conduct the analysis for the CPF scheme. Since the backhaul rate is considered much smaller than the rate of cellular and D2D communications, we assume that the throughput from the backhaul communication is negligible as compared to the cellular and D2D throughput.
V-B1 Outage Probability
For a reference clustered D2D caching network without inter-cluster cooperation , the probability of no outage is defined as the probability that a randomly chosen user can download a requested file from nearby users in the same cluster . Conversely, a user is said to be in outage when its requested file is not cached within the allowed transmission range (i.e., not cached by a neighboring user in the same cluster). In our cooperative clustered model, a user is said to be in outage when the requested file is stored neither in the local cluster nor in any of the remote clusters. We denote this outage probability as , which also represents the fraction of users who are in outage relative to the total number of users; the probability of no outage is then denoted as .
As stated before, the number of users per cluster, denoted as , equals . In addition, the probability of no outage, , can be calculated by determining the probability that a randomly chosen user in cluster is served via the local cluster or the remote cluster modes. The probability of no outage is therefore expressed as the sum of two terms: the first term corresponds to the probability of serving requests from the local cluster, and the second term is the probability of being served from a remote cluster. From (11) and (12), and under the assumption of the CPF scheme, the probability of no outage is given by
Due to the symmetry between clusters in terms of the cache content, cluster cache size, and the probability of being served from a remote cluster, we continue with the assumption that the user is being served from the first cluster (i.e., ) and the remote clusters (the potential cooperating clusters) are from to .
We now aim at deriving an approximated version of (V-B1) by replacing the summations with approximated integrals from , and then the obtained result is used later in the throughput scaling analysis. We have two approximations from ,
where max, and , represent respectively the probability of no outage for a non-cooperative system and the improvement (increase) in the probability of no outage due to the inter-cluster cooperation. In Fig. 4, we plot the outage probability of our proposed system with inter-cluster cooperation compared to a reference system without inter-cluster cooperation. We note that as the number of users per cluster increases, the outage probability correspondingly decreases. That is attributed to the fact that the probability of obtaining the requested files from the local cluster increases with the number of users per cluster.
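The two-term structure of the no-outage probability can be illustrated with a short numerical sketch. It assumes a Zipf popularity law, a local cluster that holds the `local_size` most popular files, and remote clusters that together hold the next `remote_size` most popular files. This placement, the function names, and all numbers are assumptions made for illustration; the exact expressions are those obtained from (11) and (12).

```python
def zipf_pmf(num_files: int, exponent: float) -> list[float]:
    """Zipf request probabilities for files ranked 1..num_files."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_files + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def no_outage_terms(pmf: list[float], local_size: int, remote_size: int) -> tuple[float, float]:
    """Return (local term, remote term) of the no-outage probability.

    Assumed placement: the local cluster caches the top `local_size` files,
    and the remote clusters jointly cache the following `remote_size` files.
    """
    local = sum(pmf[:local_size])
    remote = sum(pmf[local_size:local_size + remote_size])
    return local, remote


# Hypothetical setup: 1000 files, exponent 0.8, 50 local and 200 remote files.
pmf = zipf_pmf(1000, 0.8)
local, remote = no_outage_terms(pmf, 50, 200)
print(f"outage without cooperation: {1.0 - local:.3f}")
print(f"outage with cooperation:    {1.0 - local - remote:.3f}")
```

The second term is the improvement due to cooperation, so the cooperative outage probability is never larger than the non-cooperative one in this model.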
V-B2 Throughput Scaling Analysis
We now express the network average sum throughput, denoted as (bps), as a function of the system parameters, namely, number of users, library size, and popularity exponent. Based on the assumed interference model, only one D2D link can be active at any time in each cluster. Whenever there is an active D2D link within a cluster, we say the cluster is good. (Footnote 3: In this article, and different from , we neglect the inter-cluster interference. We assume that any cluster can be active whenever there is a scheduled D2D link, regardless of the activity of all other clusters. This assumption makes the calculated throughput an upper bound for the actual throughput.) We also assume that a D2D link is scheduled in any cluster whenever the opportunity arises, i.e., if the user to be served in a cluster requests a file not cached locally, the request is served via the appropriate transmission mode (remote cluster or backhaul modes) while another D2D link is scheduled inside the cluster. In addition to the D2D throughput, there is the cellular throughput from the remote cluster mode. Hence, the instantaneous throughput (bps) can be written as
and the average sum throughput is obtained from
where is the number of active D2D links, is the expected number of active D2D links, which is approximately the expected number of good clusters, and is the probability of occurrence of cooperation between clusters. For notational simplicity, we henceforth substitute by (bps) and by (bps), where .
where the above inequality holds because is a probability and cannot be greater than one. In particular, is tight with its upper bound for a large number of clusters and relatively uniform popularity distribution (i.e., not skewed). In the sequel, we calculate the expected number of good clusters .
Up to now, the cell is divided into virtual clusters, each of them with uniformly distributed users. As mentioned before, a cluster is good if at least one user requests a file that can be served from the locally cached content via D2D communication. Conversely, a cluster is not good if no user in the cluster can have its request served from the locally cached content, which occurs with probability , where is the probability that a randomly chosen user in any cluster cannot obtain a requested file from nearby users in the same cluster. The probability of having a good cluster is then . Therefore, we have the following
where changes from to when changes from to . These ranges of and are interesting for the scaling analysis since they are reasonable in practice . In the following, we conduct the scaling analysis for the regime in which changes sublinearly with , i.e., for some constant , and . We analyze the scaling of the upper bound for when asymptotically grows to infinity. Substituting into (V-B2) yields
where (a) follows by using , then we have
where (b) follows from , then we have
This result shows that:
As the library size increases, the upper bound for decreases, since the probability of having active D2D links (good clusters) decreases.
As increases, corresponding to the decrease of the popularity exponent , the upper bound for vanishes more rapidly with the library size .
The upper bound for scales linearly with the number of users . (Footnote 4: We use the standard Landau notation: denotes and denotes , where , , and are real constants .)
The average sum throughput is plotted against the number of users per cluster in Fig. 5, for different values of . We observe that there is an optimal value of at which the throughput is maximized. Initially, the throughput increases with the cluster size , because the outage probability decreases owing to the larger cache size per cluster. For larger cluster sizes, however, the throughput starts to decrease owing to the smaller number of clusters.
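The tradeoff behind the optimal cluster size can be reproduced with a small sketch of the expected number of good clusters, which drives the D2D part of the sum throughput. The sketch assumes independent requests, a Zipf popularity law, and a cluster cache that holds the most popular files with a capacity of `files_per_user` files per user. The cellular term is ignored, and every name and number below is a hypothetical choice.

```python
def zipf_pmf(num_files: int, exponent: float) -> list[float]:
    """Zipf request probabilities for files ranked 1..num_files."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_files + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def expected_good_clusters(num_users: int, users_per_cluster: int,
                           files_per_user: int, pmf: list[float]) -> float:
    """Expected number of clusters with at least one locally servable request."""
    # Assumed CPF-style placement: the cluster caches the most popular files.
    cluster_cache = min(users_per_cluster * files_per_user, len(pmf))
    p_miss = max(0.0, 1.0 - sum(pmf[:cluster_cache]))
    # A cluster is not good only if every one of its users misses locally.
    p_good = 1.0 - p_miss ** users_per_cluster
    num_clusters = num_users / users_per_cluster  # integer effects ignored
    return num_clusters * p_good


pmf = zipf_pmf(1000, 0.6)
best_k = max(range(1, 51), key=lambda k: expected_good_clusters(1000, k, 2, pmf))
print("cluster size maximizing the expected number of good clusters:", best_k)
```

Small clusters are numerous but rarely good, while large clusters are almost always good but few, which is why the maximum in this sketch falls at an intermediate cluster size.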
VI Numerical Results
In this section, we evaluate the performance of our proposed inter-cluster cooperative architecture using simulation and analytical results. Results are obtained with the following parameters: requests/sec, files, files, Mbits, clusters, users, files, and files. Mbps and Mbps as in . For a typical D2D communication system with transmission power of 20 dBm, transmission range of 10 m, and free space path loss model as in , we have Mbps.
In Fig. 6, we verify the accuracy of the analytical results of the network average delay under the CPF with inter-cluster cooperation. The theoretical and simulated results for the network average delay under the CPF scheme are plotted together, and they are consistent. We see that the network average delay is significantly improved by increasing the cluster cache size . Moreover, as increases, the average delay decreases. This is attributed to the fact that a small portion of content forms most of the requests that can be cached locally in each cluster and delivered via high data rate D2D communication.
In the following, we evaluate and compare the performance of various caching schemes. In Fig. 7, our proposed inter-cluster cooperative caching system is compared with a D2D caching system without cooperation under the CPF scheme. For a D2D caching system without cooperation, requests for files that are not cached in the local cluster are downloaded directly from the core network. For the sake of concise comparison, we define the delay reduction gain as
Fig. 7 shows that, for a small cluster cache size, the delay reduction (gain) of our proposed inter-cluster cooperative caching is higher than 45% with respect to a D2D caching system without inter-cluster cooperation and greater than 80% if the cluster cache size is large.
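For concreteness, the gain can be computed as in the following sketch, which assumes that the delay reduction gain is the relative decrease of the average delay with respect to the system without inter-cluster cooperation. This definition is an assumption consistent with the percentages quoted above, and the delay values are hypothetical.

```python
def delay_reduction_gain(delay_without_coop: float, delay_with_coop: float) -> float:
    """Relative delay reduction (assumed definition): 1 - D_coop / D_no_coop."""
    if delay_without_coop <= 0.0:
        raise ValueError("the reference delay must be positive")
    return (delay_without_coop - delay_with_coop) / delay_without_coop


# Hypothetical average delays in seconds.
gain = delay_reduction_gain(delay_without_coop=2.0, delay_with_coop=1.0)
print(f"delay reduction gain: {100.0 * gain:.0f}%")
```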
To show the energy-delay reduction gain tradeoff among the devices, in Fig. 8, we plot the per-cluster energy consumption during the local and remote cluster modes and the gain attained from inter-cluster cooperation against the cluster cache size . dBm, and dBm denote respectively the transmission power in the local cluster and remote cluster modes. In each transmission mode, the energy per request is the transmission power times the transmission duration. The transmission duration is given by the ratio of the file size to the transmission rate. We see that the consumed energy during the local cluster transmission, i.e., D2D communication, monotonically increases with the cluster cache size . As increases, more requests are served via the local cluster mode . For the consumed energy during the remote cluster transmission, we see that it initially increases with and then decreases, and the same behavior is observed for the delay-reduction gain. This can be interpreted as follows. When increases, the number of requests served from the remote clusters increases since the remote clusters’ VCCs increase. When becomes much larger, the local cluster cache becomes sufficiently large to serve most of the requests, as opposed to being served by the remote cluster mode.
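The energy model described above (transmission power times transmission duration, with the duration equal to the file size divided by the rate) can be sketched as follows. The function names and the numeric values are hypothetical and are not the parameters used for Fig. 8.

```python
def dbm_to_watts(power_dbm: float) -> float:
    """Convert a power level from dBm to watts."""
    return 10.0 ** ((power_dbm - 30.0) / 10.0)


def energy_per_request(power_dbm: float, file_size_bits: float, rate_bps: float) -> float:
    """Energy in joules: transmission power times (file size / transmission rate)."""
    duration_s = file_size_bits / rate_bps
    return dbm_to_watts(power_dbm) * duration_s


# Hypothetical example: a 4 Mbit file sent at 20 dBm over a 10 Mbps link.
print(energy_per_request(power_dbm=20.0, file_size_bits=4e6, rate_bps=10e6))
```

For a fixed file size, a mode with a higher rate spends less time transmitting, so its energy per request can be lower even when its transmission power is higher.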
For comparison purposes, Fig. 9 shows the average delay for the proposed caching schemes and random caching against various system parameters. Fig. 9(a) shows the network average delay plotted against the request arrival rate for three content placement techniques, namely, GCA, CPF, and random caching (RC). (Footnote 5: Here, we adopt different transmission rates from  and  to provide insights on the difference between the caching schemes; otherwise, the GCA is far superior to the other schemes, with negligible delay.) In RC, the content stored in the clusters is randomly chosen from the file library. The most popular files are cached in the CPF scheme, and the GCA works as illustrated in Algorithm 1. We see that the average delay for all content caching strategies increases with since a larger request rate increases the probability of a longer waiting time for each request. It is also observed that the GCA, which is locally optimal, achieves significant performance gains over the CPF and RC solutions for the above setup. Fig. 9(b) shows that the GCA is superior to the CPF only for small values of the popularity exponent . If the popularity exponent is high enough, CPF and GCA achieve the same performance. When increases, the Zipf distribution becomes more skewed. This implies that only a smaller portion of the files is highly demanded by the devices. The lower the number of files requested by the devices, the higher the probability of having such files cached in the clusters’ VCCs. If all these files are cached locally in each cluster, the global minimum solution for the delay minimization problem is attained. This explains why, when increases, the CPF and GCA solutions converge to the global optimal solution. We also note that the CPF and RC schemes roughly achieve the same delay when . This stems from the fact that with , all files have equal popularity, and correspondingly, CPF is equivalent to RC.
Moreover, RC fails to reduce the delay as increases, since caching files at random results in a low probability of serving the requested files from local clusters.
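The equivalence of CPF and RC under equal popularity can be checked with a simplified single-cache sketch. It assumes a Zipf popularity law and an RC policy that stores a uniformly random subset of distinct files, so that each file is cached with probability `cache_size / num_files`. Inter-cluster cooperation and the queueing delay are not modeled, and all names and values are hypothetical.

```python
def zipf_pmf(num_files: int, exponent: float) -> list[float]:
    """Zipf request probabilities for files ranked 1..num_files."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_files + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def cpf_hit_probability(pmf: list[float], cache_size: int) -> float:
    """CPF: the cache holds the cache_size most popular files."""
    return sum(pmf[:cache_size])


def rc_hit_probability(pmf: list[float], cache_size: int) -> float:
    """RC (assumed uniform random subset): every file is cached w.p. cache_size / num_files."""
    inclusion = cache_size / len(pmf)
    return sum(p * inclusion for p in pmf)


for exponent in (0.0, 0.5, 1.0, 1.5):
    pmf = zipf_pmf(500, exponent)
    print(exponent, round(cpf_hit_probability(pmf, 50), 3), round(rc_hit_probability(pmf, 50), 3))
```

In this simplified model the two policies coincide when the exponent is zero, and only CPF benefits from a more skewed popularity profile.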
Next, we turn our attention to the throughput results in Fig. 10. Fig. 10(a) plots the throughput per request as a function of the popularity exponent for the three caching schemes. It is shown that the per request throughput monotonically increases with for the CPF and GCA schemes, and shows a slight decrease for the RC scheme. When increases for the GCA and CPF, the locally stored files form most of the users’ requests that can be delivered via high rate D2D communication. Conversely, for the RC scheme, which caches the files uniformly at random, the probability of having the requested files cached in the local clusters slightly decreases when the popularity of files becomes skewed (higher ). Due to the resulting lower probability of serving the requests from the local clusters, the throughput per request, in turn, slightly decreases owing to the lower probability of activating D2D links. In Fig. 10(b), the throughput per request is plotted against the cluster cache size for the three caching schemes. It is noticed that for all the caching schemes, the per request throughput is improved with the cluster cache size, and the GCA achieves the highest throughput. This can be explained by the fact that, with large cluster cache size, there is a high opportunity of exchanging cached content via the local cluster mode that exploits the high rate of the D2D communication.
VII Conclusion
In this work, we propose a novel D2D caching architecture to reduce the network average delay. We study a cellular network consisting of one SBS and a set of users. The cell is divided into a set of equally-sized virtual clusters, where the users in the same cluster exchange cache content via D2D communication, while the users in different clusters cooperate by exchanging their cache content via cellular transmission. We formulate the delay minimization problem in terms of the content cache placement. However, the problem is NP-hard and obtaining the optimal solution is computationally hard. We then propose two content caching policies, namely, caching popular files and greedy caching. By reformulating the delay minimization problem as a minimization of a non-increasing supermodular function subject to uniform partition matroid constraints, we show that it could be solved using the proposed GCA scheme within a factor of the optimum. Moreover, we conduct the throughput analysis to investigate the behavior of the average throughput per request under different caching schemes. We study the scaling behavior of the average sum throughput when the library size asymptotically grows to infinity and show that the network average sum throughput decreases with the library size increase, and the rate of this decrease is controlled by the popularity exponent. We verify our analytical results by means of extensive simulations and the results show that the network average delay could be reduced by 45%-80% by allowing inter-cluster cooperation.
Appendix A Proof of Lemma 1
We define the ground set that describes the cache placement elements in all clusters as
where is an element denoting the placement of file into the VCC of cluster . This ground set can be partitioned into disjoint subsets , where is the set of all files that might be placed in the VCC of cluster .
Let us express the cache placement by the adjacency matrix