The explosive growth of mobile data traffic is putting a heavy burden on backhaul links, causing delays in content downloads. However, a large portion of mobile traffic is due to duplicate downloads of a few popular contents. Caching has been considered an effective way to reduce the backhaul burden by storing contents on edge devices. This enables mobile users to download requested contents from nearby devices instead of from the core network. In modeling such scenarios, most research efforts have focused on the downloading cost. However, storing a content may be subject to a cost as well. A storage cost may be due to flash rental cost incurred by cloud service providers, or to flash damage caused by writing a content to the memory device. In both cases, the storage cost typically depends on the duration of storage, hereafter referred to as the retention time. Intuitively, with a longer retention time, requested contents can be obtained from the cache with higher probability, thus avoiding the cost of downloading from the network. But a longer retention time also results in a higher storage cost. Therefore, what to cache and for how long are both key aspects of optimal caching.
Few works on optimal caching have considered the impact of storage cost. Works such as [3, 4, 5] considered only the downloading cost. The study in  proposed an approximation algorithm with a performance guarantee for multicast-aware proactive caching. The authors in  considered cost-optimal caching with user mobility. They presented an extension in  by providing a linear lower bound of the objective function. In these works, storage cost was neglected. The study in  chose to represent storage cost using a random variable. Later, the work in  suggested that the storage cost can be better modeled by an increasing linear or convex function. Another limitation of  is that the retention time is fixed. Later, this assumption was relaxed in  and the retention time was treated as an optimization variable in a time-slotted system. We remark that in , a user is associated with only one cache. A generalization to a multiple-path routing model with retention-aware caching was studied in .
In mobility scenarios, contents are often cached at mobile devices, which can exchange requested contents when they move into the communication range of each other. Making good use of mobility information between mobile users can significantly improve the caching efficiency [4, 10, 11]. However, jointly considering downloading and storage costs in the presence of user mobility calls for further research. To this end, our main contributions are as follows.
We formulate a Proactive Retention-Aware Caching Optimization (PRACO) problem with user mobility in a time-slotted system, taking into account both the downloading and storage costs.
This problem is a combinatorial optimization problem by nature. However, we provide mathematical analysis that enables time-efficient computation of the global optimum, namely,
we first prove that for any content, the optimal caching decisions over the time slots can be derived, given the initial number of mobile users caching the content;
the above analysis is then embedded into a dynamic programming algorithm and we prove it is both globally optimal and of polynomial-time complexity.
The numerical results show significant improvements in comparison to two conventional algorithms, namely, popular caching and random caching.
II System Model and Problem Formulation
II-A System Model
We consider a vehicular network scenario consisting of a content server that holds all the contents, a number of vehicles, and roadside units (RSUs) providing signal coverage for the vehicles. Denote by the set of vehicles that are interested in requesting contents, referred to as requesters, whose index set is represented by . Denote by the set of vehicles that we call helpers. Each helper is equipped with a cache of size , from which it can supply the requesters with contents and thereby mitigate backhaul congestion. We consider a library of contents, whose index set is . All contents have the same size, which is normalized to one. In addition, each content is either fully stored or not stored at all at a helper. Figure 1 shows the system scenario.
The event that vehicles move into the transmission range of each other is called a contact, during which communication between them can occur. We consider that the contact between any two vehicles follows a Poisson distribution. The Poisson distribution can characterize the mobility pattern of vehicles, because analysis of real-world vehicle mobility traces shows that the tail of the inter-contact time distribution is well described by an exponential distribution, see . Here, we assume a homogeneous contact rate for all the vehicles, denoted by . This assumption is in fact common [13, 14]. As a consequence, it is not necessary to explicitly consider the content cached by each helper, as there is no difference between the helpers from the perspective of the requesters. Thus, the caching performance is fully determined by the number of helpers for each content. Moreover, a content is obviously cached by no more than helpers. Therefore, in modeling cache capacity, it is sufficient to require that the total number of cached contents over all the helpers does not exceed .
We consider a time-slotted system where each time slot is of duration . Note that the duration of a slot here differs from that in LTE; in the performance evaluation, it is on the order of hours. The time period subject to optimization consists of time slots. In each slot, all the requesters are active and ask for some content. Moreover, no requester becomes a helper in later slots, or vice versa. Each requester has its own content request probabilities, whose distribution does not depend on the time slot. The probability that content is requested by requester is denoted by , with . The contents, if cached by the helpers, are fetched at the beginning of the time period. When a requester asks for a content in a slot, the requester first tries to collect the content from the helpers it encounters. If the requester has not obtained the content by the end of this slot, it downloads the content from the server. In the latter case, a downloading cost is incurred.
II-B Cost Model
Denote by our caching decision, which is a matrix for the contents and the slots. The entry at location , i.e., , denotes the number of helpers storing content in slot , . The caching optimization process, applied at the beginning of the time period, determines the number of helpers for each content as well as the retention time. The latter is represented by the number of helpers over the time slots, and this number either stays the same or decreases from one time slot to the next.
Downloading a content from the server results in a downloading cost. Also, caching a content in a helper incurs a storage cost due to storage rental cost and flash memory damage. As in  and , we neglect the cost for the helpers to fill their caches at the beginning of the time period. In addition, for the requesters, the downloading cost from helpers is negligible in comparison to that from the server. Therefore, the total cost consists of the downloading cost from the server for the requesters and the storage cost for the helpers.
Denote by the storage cost due to storing a content in a helper’s cache in slot . A longer retention time needs a higher threshold voltage, which results in higher memory damage and consequently a higher storage cost; for a more detailed discussion, see . Motivated by this, we assume that is an increasing function.
When content is requested by requester in slot , the probability that the requester has to download the content from the server is denoted by . If does not meet any helper having within the slot, the only way of obtaining is to download from the server. As the contacts between the users follow a Poisson distribution, is given by:
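As a worked step, the no-contact probability can be derived as follows. The symbols are assumptions introduced here for illustration only, since the original notation is not reproduced in this text: $\lambda$ for the homogeneous contact rate, $\delta$ for the slot duration, $x_{ct}$ for the number of helpers storing content $c$ in slot $t$, and $q_{ct}$ for the probability of having to download from the server. Under the Poisson contact model, and assuming that contacts with different helpers are independent, one would expect:

```latex
% Assumed notation (not taken from the source): \lambda, \delta, x_{ct}, q_{ct}
\begin{align}
\Pr[\text{no contact with one given helper in a slot}] &= e^{-\lambda \delta}, \\
q_{ct} = \Pr[\text{no contact with any of the } x_{ct} \text{ helpers}]
  &= \left(e^{-\lambda \delta}\right)^{x_{ct}} = e^{-\lambda \delta x_{ct}}.
\end{align}
```

For instance, with $\lambda \delta = 0.5$ and two helpers, this gives $e^{-1} \approx 0.37$. Because the contact rate is homogeneous, the expression does not depend on the particular requester.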
Thus, the total cost, denoted by , reads as:
where is the weighting factor of the two cost types.
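To make the cost structure concrete, the following is a minimal Python sketch of a total cost of this kind. It is an illustration under stated assumptions, not the paper's exact expression: the names total_cost, download_probability, lam, delta, alpha and storage_cost are hypothetical, the downloading cost per content is normalized to one, and the weighting factor is assumed to enter as a convex combination of the two cost types.

```python
import math


def download_probability(x, lam, delta):
    """Probability of meeting none of x helpers within one slot.

    Assumed form: exp(-lam * delta * x), i.e. Poisson contacts with a
    homogeneous rate lam and slot duration delta.
    """
    return math.exp(-lam * delta * x)


def total_cost(x, p, lam, delta, alpha, storage_cost):
    """Weighted sum of storage and downloading cost (assumed form).

    x[c][t]         number of helpers storing content c in slot t (0-based)
    p[r][c]         probability that requester r asks for content c
    storage_cost(t) per-helper storage cost in slot t (1-based), increasing in t
    alpha           assumed weighting factor in [0, 1]
    """
    cost = 0.0
    for c, row in enumerate(x):
        # Expected number of requests for content c in any slot.
        demand = sum(p_r[c] for p_r in p)
        for t, helpers in enumerate(row, start=1):
            storage = helpers * storage_cost(t)
            download = demand * download_probability(helpers, lam, delta)
            cost += alpha * storage + (1 - alpha) * download
    return cost


if __name__ == "__main__":
    p = [[0.7, 0.3], [0.6, 0.4]]   # two requesters, two contents
    x = [[2, 1, 0], [1, 1, 0]]     # non-increasing helper counts over 3 slots
    print(total_cost(x, p, lam=0.5, delta=1.0, alpha=0.5,
                     storage_cost=lambda t: 0.1 * t))
```

With no caching at all, every request is served by the server, so the cost in this sketch reduces to (1 - alpha) times the number of slots times the number of requesters.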
II-C Problem Formulation
The proactive retention-aware caching optimization (PRACO) problem can be formulated as (2).
III Problem Analysis
We first prove that, for each content, the optimal number of helpers caching the content decreases over the time slots. Next, we present an algorithm which, for each possible initial number of helpers of a content, computes the optimal number of helpers for this content over time. We then prove the algorithm’s optimality and its polynomial-time complexity.
For any content c and time slot , if helpers minimize the total cost for , then for any , the total cost of using helpers is higher than that of using helpers.
The total cost for content and time slot is:
As the minimum of occurs for , we have:
for . Also, is an increasing function. Thus:
for By rearranging the terms of (6), we have for and . ∎
In the algorithm, for each content , a vector of size is used to store the optimal cost for all possible initial numbers of helpers. Line considers all possible initial numbers of helpers in the range . Lines 6 and 7 compute the optimal values of , denoted by respectively, for given . The computation is of complexity . For any content and any possible initial number of helpers , the optimal cost over all the slots is saved in . The overall complexity of Algorithm 1 is , given computed.
Note that Lines 6 and 7 are greedy by construction. Namely, for time slot , Line 7 determines the number of helpers that minimizes the cost of that specific time slot. Even though this is intuitive, it is not obvious that the greedy choice is globally optimal for the given initial number of helpers. The optimality analysis is formalized in Lemma 2.
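The greedy step described above can be sketched as follows. This is an illustrative Python sketch rather than a transcription of Algorithm 1: the per-slot cost form and the names slot_cost, greedy_schedule, brute_force, demand, lam, delta, alpha and storage_cost are assumptions made here. The exhaustive search is included only as a sanity check of the greedy choice on small instances.

```python
import math
from itertools import product


def slot_cost(x, t, demand, lam, delta, alpha, storage_cost):
    """Assumed cost of keeping x helpers for one content in slot t."""
    return (alpha * x * storage_cost(t)
            + (1 - alpha) * demand * math.exp(-lam * delta * x))


def greedy_schedule(x1, T, **kw):
    """Given x1 helpers in slot 1, choose for each later slot the number of
    helpers in 0..(previous number) that minimizes that slot's cost."""
    xs = [x1]
    for t in range(2, T + 1):
        xs.append(min(range(xs[-1] + 1),
                      key=lambda x: slot_cost(x, t, **kw)))
    total = sum(slot_cost(x, t, **kw) for t, x in enumerate(xs, start=1))
    return xs, total


def brute_force(x1, T, **kw):
    """Best cost over all non-increasing sequences that start at x1."""
    best = math.inf
    for tail in product(range(x1 + 1), repeat=T - 1):
        xs = (x1,) + tail
        if all(a >= b for a, b in zip(xs, xs[1:])):
            cost = sum(slot_cost(x, t, **kw)
                       for t, x in enumerate(xs, start=1))
            best = min(best, cost)
    return best


if __name__ == "__main__":
    params = dict(demand=2.0, lam=0.8, delta=1.0, alpha=0.5,
                  storage_cost=lambda t: 0.05 * t)
    for x1 in range(5):
        xs, total = greedy_schedule(x1, T=4, **params)
        print(x1, xs, round(total, 4), round(brute_force(x1, 4, **params), 4))
```

Under the assumed cost form, the per-slot cost is convex in the number of helpers and the storage term grows with the slot index, which is why the slot-by-slot choice agrees with the exhaustive search in this sketch.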
For any , consider , the numbers of helpers over the time slots returned by Algorithm when is given. Consider another sequence that differs from the first sequence and offers a lower total cost. First consider the case that for some , and remains smaller than the values of the second sequence in consecutive time slots until time slot where . That is, the second sequence has elements all being greater than , whereas for slot , . Consider changing all of , , …, to in sequence two, while keeping the values of all other time slots of this sequence. The updated sequence is feasible because . Thus, monotonicity remains for the updated sequence. The update reduces the cost of the second sequence by Lemma , hence a contradiction. A special case is , for which does not exist. However, the same update and conclusion apply. One case remains, namely that there is no time slot with , yet sequence two is different from sequence one. In other words, , …, . Let , , be the first time slot with strict inequality, i.e., . Such a time slot must exist because otherwise the two sequences coincide. Consider increasing to . Sequence two remains feasible in terms of being monotonically decreasing, because after setting to as . The cost of , due to the update, becomes lower because when is considered by the algorithm, is the optimum. Therefore, in this case the second sequence cannot be better either. Hence the result. ∎
By Algorithm 1, , , can be computed if , , is given. Consequently, solving PRACO is equivalent to finding the optimal values of , . We drop the second subscript and use as optimization variables for the initial numbers of helpers, and reformulate PRACO as follows. The cost of , i.e., is from Algorithm 1. Constraint (7b) models the cache capacity.
IV The Overall Algorithm and Optimality
IV-A Dynamic Programming
We use dynamic programming (DP) to obtain the optimal values of , . Denote by the cost of optimal caching of the first contents with a total cache capacity of units. Thus, by definition, is the overall optimal cost. The values of , satisfy a recursion, as formalized in the lemma below.
The value of can be derived from the recursive formula shown in (9) for , with:
We use induction. For , the result is obvious for any . Suppose that is the optimal value for some , with in any range of interest. By (9), we have:
For , the initial number of helpers must be one of the values in . For any value of , is the optimal cost for content (Lemma 2), and the corresponding cache capacity for contents up to is . For the latter, is optimal. These, together with the -operator give the optimum for . ∎
IV-B Algorithm Description and Optimality
Algorithm 2 describes the DP approach. The input parameters consist of , , , and . Here, is from the output of Algorithm 1. Apart from as defined earlier, is used to store the optimal caching solution. Lines - compute and for , whereas Lines and compute and for . Finally, is mapped to optimal values of , denoted by , using Lines -.
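The structure of such a DP can be sketched in Python as below. This is a sketch under assumptions, not a line-by-line rendering of Algorithm 2: the function name dp_allocate and the table names best and choice are hypothetical, and cost[c][k] stands for the per-content cost of starting with k helpers, which is assumed to be precomputed (for example, by a procedure in the spirit of Algorithm 1).

```python
def dp_allocate(cost, capacity):
    """Choose initial helper counts per content under a total capacity.

    cost[c][k]  precomputed cost of content c when it starts with k helpers
                (k = 0 .. number of helpers); assumed given
    capacity    total number of cache units over all helpers
    Returns (optimal total cost, list of initial helper counts).
    """
    n = len(cost)
    inf = float("inf")
    # best[c][s]: optimal cost of the first c contents using at most s units.
    best = [[inf] * (capacity + 1) for _ in range(n + 1)]
    choice = [[0] * (capacity + 1) for _ in range(n + 1)]
    for s in range(capacity + 1):
        best[0][s] = 0.0
    for c in range(1, n + 1):
        max_k = len(cost[c - 1]) - 1
        for s in range(capacity + 1):
            for k in range(min(max_k, s) + 1):
                candidate = best[c - 1][s - k] + cost[c - 1][k]
                if candidate < best[c][s]:
                    best[c][s] = candidate
                    choice[c][s] = k
    # Backtrack from the full capacity to recover the per-content decisions.
    alloc = [0] * n
    s = capacity
    for c in range(n, 0, -1):
        alloc[c - 1] = choice[c][s]
        s -= alloc[c - 1]
    return best[n][capacity], alloc


if __name__ == "__main__":
    # Hypothetical cost table: 3 contents, up to 2 helpers each.
    cost = [[5.0, 3.0, 2.5], [4.0, 1.0, 0.9], [2.0, 1.9, 1.8]]
    print(dp_allocate(cost, capacity=2))
```

The three nested loops run over contents, capacity values and candidate helper counts, so the running time of this sketch is proportional to the product of these three quantities.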
Algorithm 2 delivers the global optimum of PRACO in polynomial time.
The optimality follows from Lemma 2 and the recursion, whose correctness is established in Lemma 3. As for time complexity, the steps in Algorithm 2 together require a complexity of . However, a prerequisite is that the -values are given. Computing these values with Algorithm 1, given computed, has complexity . Hence the overall complexity is . Finally, note that even though is not a parameter of the input size, its value is bounded by , because otherwise the capacity constraint is redundant and the problem decomposes by content (and can be solved without DP). Hence the complexity is , which is polynomial in the input size. ∎
V Performance Evaluation
We compare the DP algorithm to two conventional caching algorithms, namely random caching  and popular caching . Both algorithms consider contents for caching one by one. In the former, the contents are considered in random order, weighted by their request probabilities, so that a content with a higher request probability is more likely to be selected for caching. In the latter, popular contents, i.e., contents with higher request probabilities, are considered first. For the content under consideration, the caching decision is the number of helpers that gives the minimum total cost.
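One possible reading of the two baselines is sketched below in Python. The details are assumptions made for illustration: the names sequential_caching, popular_order and random_order are hypothetical, cost[c][k] is an assumed precomputed per-content cost for k initial helpers, and each content is assumed to take its cost-minimizing number of helpers among those that still fit in the remaining capacity.

```python
import random


def sequential_caching(order, cost, capacity):
    """Process contents in the given order; each takes the helper count
    with minimum cost among those fitting in the remaining capacity."""
    alloc = [0] * len(cost)
    remaining = capacity
    for c in order:
        k_max = min(len(cost[c]) - 1, remaining)
        k = min(range(k_max + 1), key=lambda k: cost[c][k])
        alloc[c] = k
        remaining -= k
    total = sum(cost[c][alloc[c]] for c in range(len(cost)))
    return total, alloc


def popular_order(popularity):
    """Contents sorted from most to least popular."""
    return sorted(range(len(popularity)), key=lambda c: -popularity[c])


def random_order(popularity, rng):
    """Random order, drawn without replacement and weighted by popularity."""
    remaining = list(range(len(popularity)))
    order = []
    while remaining:
        weights = [popularity[c] for c in remaining]
        c = rng.choices(remaining, weights=weights, k=1)[0]
        order.append(c)
        remaining.remove(c)
    return order


if __name__ == "__main__":
    # Hypothetical cost table and popularity values.
    cost = [[5.0, 3.0, 2.5], [4.0, 1.0, 0.9], [2.0, 1.9, 1.8]]
    popularity = [0.5, 0.3, 0.2]
    print(sequential_caching(popular_order(popularity), cost, capacity=2))
    rng = random.Random(0)
    print(sequential_caching(random_order(popularity, rng), cost, capacity=2))
```

In this toy instance, the popularity-ordered baseline spends the whole capacity on the first content, which illustrates how a one-by-one procedure can use the capacity less effectively than a joint optimization over all contents.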
We use a Zipf distribution with shape parameter to characterize the content request probability for any requester. Thus, . Same as , the time period is set to hours, and the duration of each time slot () is hour. The storage cost is simulated using .
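For reference, request probabilities of the standard Zipf form can be generated as in the Python sketch below. The function name zipf_probabilities and the parameter name gamma are hypothetical, and the standard normalized power-law form is assumed here; the exact expression used in the evaluation is not reproduced in this text.

```python
def zipf_probabilities(num_contents, gamma):
    """Standard Zipf law: the probability of the content with popularity
    rank i is proportional to i ** (-gamma), for i = 1 .. num_contents."""
    weights = [rank ** (-gamma) for rank in range(1, num_contents + 1)]
    norm = sum(weights)
    return [w / norm for w in weights]


if __name__ == "__main__":
    probs = zipf_probabilities(5, gamma=0.8)
    print([round(q, 3) for q in probs])
```

A larger shape parameter concentrates the probability mass on the top-ranked contents, while a value of zero yields a uniform distribution.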
Figures 3-5 provide the results and show the impacts of parameters , , , and on the cost, respectively. It can be seen that the cost decreases with respect to all the mentioned parameters. This is expected. For example, when increases, the requesters have more opportunities to meet helpers, leading to a lower cost. The same conclusion can be made for cheaper storage (small ), and higher capacity (larger ). For parameter , a higher value means more variation in the contents’ request probabilities, so it is easier for the algorithms to identify caching solutions in which the helpers are more likely to store the requested contents.
The DP algorithm outperforms the two conventional caching algorithms. In Figures 2-4, the improvement is significant when and increase and decreases. For example, when increases from to , the DP algorithm outperforms the popular caching algorithm by to , and outperforms the random caching algorithm by to . This is because the DP algorithm uses the storage capacity of helpers optimally in comparison to the conventional algorithms.
Recall that small means low storage cost. When , which is a fairly large value in this context, the optimal strategy tends not to store contents; it is preferable to download from the server. Hence cache optimization is less relevant and the algorithms are similar in performance. When decreases, the difference between the performance of the DP algorithm and that of the other algorithms becomes apparent, as the DP algorithm uses the storage capacity optimally while the conventional algorithms are not able to accomplish this.
This paper has studied a proactive retention-aware caching problem, considering user mobility, storage cost, and cache size. We have provided analysis and algorithm development, proving that the global optimum can be obtained in polynomial time. Simulation results have demonstrated significant improvements by the proposed algorithm in comparison to two conventional caching algorithms. In future work, we will consider a more general system scenario including non-homogeneous contact rates, helpers with different cache sizes, and contents with different sizes. The problem then becomes more challenging, and new solution approaches need to be developed.
[1] X. Wang, M. Chen, T. Taleb, A. Ksentini, and V. Leung, “Cache in the air: Exploiting content caching and delivery techniques for 5G systems,” IEEE Commun. Mag., vol. 52, no. 2, pp. 131–139, Feb. 2014.
[2] S. Shukla and A. A. Abouzeid, “Optimal device-aware caching,” IEEE Trans. Mobile Comput., vol. 16, no. 7, pp. 1994–2007, Jul. 2017.
[3] K. Poularakis, G. Iosifidis, V. Sourlas, and L. Tassiulas, “Exploiting caching and multicast for 5G wireless networks,” IEEE Trans. Wireless Commun., vol. 15, no. 4, pp. 2995–3007, Apr. 2016.
[4] T. Deng, G. Ahani, P. Fan, and D. Yuan, “Cost-optimal caching for D2D networks with presence of user mobility,” in Proc. IEEE Globecom, 2017, pp. 1–6.
[5] T. Deng, G. Ahani, P. Fan, and D. Yuan, “Cost-optimal caching for D2D networks with user mobility: Modeling, analysis, and computational approaches,” IEEE Trans. Wireless Commun., vol. 17, no. 5, pp. 3082–3094, May 2018.
[6] N. Abedini and S. Shakkottai, “Content caching and scheduling in wireless networks with elastic and inelastic traffic,” IEEE/ACM Trans. Netw., vol. 22, no. 3, pp. 864–874, Jun. 2014.
[7] B. Schroeder, R. Lagisetty, and A. Merchant, “Flash reliability in production: The expected and the unexpected,” in Proc. Usenix FAST, 2016, pp. 67–80.
[8] S. Shukla and A. Abouzeid, “Proactive retention-aware caching,” in Proc. IEEE Infocom, 2017, pp. 1–9.
[9] S. Shukla, O. Bhardwaj, A. Abouzeid, T. Salonidis, and T. He, “Hold’em caching: Proactive retention-aware caching with multi-path routing for wireless edge networks,” in Proc. ACM Mobihoc, 2017, pp. 1–10.
[10] W. Wang, X. Peng, J. Zhang, and K. Letaief, “Mobility-aware caching for content-centric wireless networks: Modeling and methodology,” IEEE Commun. Mag., vol. 54, no. 8, pp. 77–83, Aug. 2016.
[11] R. Wang, J. Zhang, and K. Letaief, “Mobility-aware caching in D2D networks,” IEEE Trans. Wireless Commun., vol. 16, no. 8, pp. 5001–5015, Aug. 2017.
[12] H. Zhu, L. Fu, G. Xue, Y. Zhu, M. Li, and L. Ni, “Recognizing exponential inter-contact time in VANETs,” in Proc. IEEE Infocom, 2010.
[13] T. Spyropoulos, K. Psounis, and C. Raghavendra, “Efficient routing in intermittently connected mobile networks: The multiple-copy case,” IEEE/ACM Trans. Netw., vol. 16, no. 1, pp. 77–90, Feb. 2008.
[14] X. Zhang, G. Neglia, J. Kurose, and D. Towsley, “Performance modeling of epidemic routing,” Comput. Netw., vol. 51, no. 10, pp. 2867–2891, Jul. 2007.
[15] B. Błaszczyszyn and A. Giovanidis, “Optimal geographic caching in cellular networks,” in Proc. IEEE ICC, 2015, pp. 3358–3363.
[16] H. Ahlehagh and S. Dey, “Video-aware scheduling and caching in the radio access network,” IEEE/ACM Trans. Netw., vol. 22, no. 5, pp. 1444–1462, Oct. 2014.