While the amount of data traffic is growing exponentially, it has been recognized that a major portion of this traffic originates from duplicated downloads of a few popular contents. These duplicated downloads congest the backhaul links and thereby lower the quality of service. Because increasing the capacity of backhaul links is costly, the links should be used more effectively. A promising technique is to store the popular contents at the edge of the network, for example in BSs with caching capability [2, 3, 4]. This technique improves the efficiency of communication systems by providing the contents of interest from the BSs instead of from the core network. In fact, the measurement studies in [5, 6] showed up to of traffic reduction in 3G and 4G networks via caching techniques.
Optimal content caching depends heavily on two main factors, namely the number of requests for the contents and the delivery deadlines of these requests. The number of requests for a content, referred to as the popularity of the content, may vary over time. Therefore, the contents of the cache need to be updated accordingly. An update incurs a downloading cost because contents have to be fetched from the server to the BS cache. It is commonly assumed that a content request needs to be served as soon as it is made. We extend the problem setup and investigate a scenario in which a user can put a deadline on the delivery time of the requested content. To the best of our knowledge, the joint impact of delivery deadline and content downloading cost in content caching has not been studied in the literature. In order to close this gap, we study content caching over time in a BS with limited caching capacity. We address optimal scheduling of cache updates, taking into account the downloading cost, subject to delivery deadline and cache capacity constraints.
I-B Related Works
Content caching has been studied in various system scenarios in the context of wireless communication networks. We provide a review with emphasis on the recent developments. We refer the reader to  for a comprehensive survey.
studied content caching in BSs when the probability distributions of contents are known. In the objective was to minimize the expected downloading time of contents. In [8, 9] collaborative content caching among BSs was considered with the objectives of minimizing an operational cost and the average downloading delay, respectively. In  decentralized content caching was studied in the presence of multihop communications. In  the user’s hit probability was maximized.
The studies in [12, 13, 14, 15, 16] enhanced the system models in the works mentioned above to take into account the impact of user mobility in content caching of BSs. The works in [12, 13] took into account the movement of users where the trajectories of users are known. In , content caching at both BSs and user equipments was investigated with the objective of minimizing the energy consumption. The works in [15, 16] further improved the system model in  and considered caching on mobile users such that they can obtain their contents of interest from each other via device-to-device communications.
In contrast to the aforementioned works, the studies in [17, 18, 19, 20, 21, 22, 23, 24] investigated content caching in BSs when the popularity distributions of contents are unknown. The work in  determined the popularity of a content based on the previously stored contents. The work in  computed popularity of a content using a big dataset, and proposed an optimal content caching algorithm to minimize the delivery time of contents. In , the authors estimated popularity of contents via local interest for the content and then proposed a caching algorithm to maximize the hit rate. In an online algorithm is proposed to estimate the popularity of contents based on the incoming requests. The works in [21, 22, 23, 24] proposed learning based methods to estimate the popularity of contents.
In all works mentioned so far, the popularity distribution of contents is invariant over time. The studies in [25, 26, 27, 28, 29] relaxed this assumption and considered content caching with time-varying popularities. In [25, 26] caching of contents of uniform size was studied; however, the cost of cache updating was neglected. In , the authors used stochastic geometry to estimate the hit rate of caching. In  content caching with updates in D2D networks was considered. In  collaborative caching was studied, where the cost for updates is accounted for. However, the authors assumed that a requested content needs to be served instantly after the request is made. This may not hold in circumstances where a requester can wait for the content to be delivered until a certain time point, that is, a deadline.
The works just mentioned above are the most related studies to our work in the sense that they have also considered cache updating along time. However, the main effort in these investigations was devoted to estimating the popularity distributions of contents rather than designing effective content caching algorithms. Moreover, in the studied system models, the cost of performing updates is neglected or the deadlines of content requests are not considered. Therefore, we aim to complement the above works and devote our effort to designing an effective content caching algorithm where the deadline constraints and the cost of cache updating are considered jointly.
I-C Our Contributions
We investigate scheduling of content caching in a BS with limited caching capacity in a time-slotted system under delivery deadline and cache capacity constraints. Our main contribution lies in the joint consideration of the time-varying popularity of contents and the deadlines of requested contents. Our objective is to optimally schedule the updates across the time slots so as to minimize the total cost of obtaining the requested contents by users. The main contributions of this work are summarized as follows:
We formally prove the NP-hardness of the problem based on a reduction from the Partition problem.
We provide a mathematical problem formulation. Specifically, the problem is formulated as an integer linear program (ILP), taking into account the size of contents, capacity of the cache, deadlines of requests, and costs of content downloading and cache updating.
Based on a mathematical reformulation of the problem, we develop an effective solution approach based on a repeated column generation algorithm (RCGA). RCGA repeatedly alternates between two algorithms, namely a column generation algorithm (CGA) and a problem-tailored rounding algorithm (TRA). TRA is specially designed to construct integer solutions from the fractional solutions of CGA. Moreover, RCGA provides an effective lower bound (LB) on the global optimum, so that the LB can be used to measure the effectiveness of any suboptimal algorithm.
We propose two greedy algorithms based on existing algorithms in the literature. Even though these algorithms cannot provide high-quality solutions, they are of interest because of their low complexity and consequently fast solutions for large-scale problem instances.
Finally, we conduct extensive simulations to verify the effectiveness of RCGA and the greedy algorithms by comparing them to the LB. Simulation results show that the solutions obtained from RCGA and the greedy algorithms are within and of global optimum, respectively.
II System Scenario and Complexity Analysis
II-A System Scenario
The system scenario consists of a content server, a base station (BS), users within the coverage of the BS, and contents. The set of users is denoted by . The server has all the contents, and the BS is equipped with a cache of size . Denote by the set of contents. Denote by the size of content . The system scenario is shown in Figure 1.
We consider a time-slotted system in which a time period is divided into time slots. Denote by the set of time slots with . At the beginning of each time slot, the contents of the cache are subject to updates. Namely, some stored contents may be removed from the cache and some new contents may be added to the cache by downloading from the server.
The popularity of a content is determined by the number of requests for the content. In our model, user , requests at most contents within the time slots based on its interest. The set of requests for user is denoted by . A time slot is long enough to complete the downloading process of the requests from the BS or the server. We assume that the time of making each request is known or can be predicted by using a prediction model . In addition, each request has a deadline before which the requested content must be delivered to the user. For user and its -th request, the requested content, the time slot of request, and the deadline of request, are denoted by , , and , respectively.
A content may become available or unavailable in the cache from a time slot to another due to caching updates. A content is either downloaded from the cache if the content is available in the cache between the time slot of the request made and its deadline, or, otherwise from the server. Denote by and the costs for downloading one unit of data from the server and from the cache, respectively. Intuitively, to encourage downloading from the cache. The time duration for downloading data from the server to the BS is neglected as the backhaul capacity is significantly higher than that of wireless access. The problem of optimally scheduling content caching subject to deadline of requests is abbreviated to SCCD. The objective is to minimize the total cost of content downloading.
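As an illustration of the serving rule described above, the following minimal sketch computes the downloading cost of one content for a given caching schedule. The names `served_from_cache`, `downloading_cost`, `cost_server`, and `cost_cache` are hypothetical and not taken from the paper; slots are assumed to be numbered from 1, and a request is represented as a pair (request slot, deadline slot).

```python
from typing import List, Tuple

# A request is (request slot, deadline slot), 1-indexed and inclusive (assumed encoding).
Request = Tuple[int, int]


def served_from_cache(cached: List[bool], request: Request) -> bool:
    """Return True if the content is cached in at least one slot of the request window."""
    start, deadline = request
    return any(cached[t - 1] for t in range(start, deadline + 1))


def downloading_cost(
    cached: List[bool],
    requests: List[Request],
    size: float,
    cost_server: float,
    cost_cache: float,
) -> float:
    """Sum the per-request cost: cache price on a hit, server price otherwise."""
    total = 0.0
    for request in requests:
        unit = cost_cache if served_from_cache(cached, request) else cost_server
        total += unit * size
    return total


if __name__ == "__main__":
    schedule = [False, True, False]  # content cached in slot 2 only
    print(downloading_cost(schedule, [(1, 2), (3, 3)], size=2, cost_server=5, cost_cache=1))
```

In this example the first request can wait until slot 2 and is served from the cache, whereas the second request must be served in slot 3 and therefore falls back to the server.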
II-B Complexity Analysis
In this section, we formally prove the NP-hardness of the problem based on a reduction from the Partition problem.
SCCD is NP-hard.
The proof is based on a polynomial-time reduction from the Partition problem that is NP-complete . Consider a Partition problem with a set of integers. The task is to determine whether it is possible to partition into two subsets and with equal sum.
We construct a reduction from the Partition problem as follows. We set , for , , and . In this case, there is no updating cost and we only have downloading cost. The time slots of requests and deadlines for all requests are set to , i.e., . Denote by the number of users requesting content in this slot. We set for , , and . If content is cached, , the users can download content from the cache, thus the downloading cost for content is . Otherwise, the users have to download content from the server, giving rise to the downloading cost of . That is, if the cache stores content , it will obtain gain. By this construction, the total gain that can be achieved is upper-bounded by . Now the question is whether we can achieve this gain. Solving the defined instance of SCCD will answer this question and also the Partition problem. Namely, after solving this instance of SCCD, if a total gain of is achieved, then the answer to the Partition problem is yes, and the contents in and outside the cache correspond to the two subsets and , respectively. Otherwise, the answer to the Partition problem is no. Hence the conclusion. ∎
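The idea of the reduction can be illustrated with a small sketch. It assumes one plausible instantiation of the construction, which is not spelled out here: a single time slot, content sizes equal to the Partition integers, a cache capacity equal to half of their sum, and a gain of caching a content that equals its size. The function names are hypothetical, and the exhaustive search is used only for readability.

```python
from itertools import combinations
from typing import List


def max_cache_gain(sizes: List[int], gains: List[int], capacity: int) -> int:
    """Best total gain of any set of contents that fits in the cache (exhaustive search)."""
    best = 0
    indices = range(len(sizes))
    for r in range(len(sizes) + 1):
        for subset in combinations(indices, r):
            if sum(sizes[i] for i in subset) <= capacity:
                best = max(best, sum(gains[i] for i in subset))
    return best


def partition_via_caching(numbers: List[int]) -> bool:
    """Answer Partition by solving the single-slot caching instance built from it."""
    total = sum(numbers)
    if total % 2 == 1:
        return False  # an odd total can never be split into two equal sums
    half = total // 2
    # Assumed construction: size = gain = the integer, capacity = half of the sum.
    return max_cache_gain(numbers, numbers, half) == half


if __name__ == "__main__":
    print(partition_via_caching([1, 5, 11, 5]))
    print(partition_via_caching([1, 2, 5]))
```

Under these assumptions the gain is upper-bounded by the cache capacity, and the bound is attained exactly when the cached contents and the remaining contents form two subsets of equal sum.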
III Integer Linear Programming Formulation
III-A Cost Model
Denote by a binary optimization variable which equals one if and only if the -th request of user is downloaded in time slot from the cache. The downloading cost for user to obtain the content requested in the -th request, denoted by , is expressed as:
where the first term indicates that if the content is downloaded before its deadline from the cache, the downloading cost is . Otherwise, it is downloaded from the server with cost . The downloading cost for completing all requests of user , denoted by , is:
Thus, the downloading cost for completing all requests for all users, denoted by , is expressed as:
Placement of contents in the cache incurs an updating cost. The updating cost over the time slots, denoted by , is expressed as:
is a binary variable which equals one if and only if the cache does not store content in slot , but stores the content in slot , and is the cost for downloading content from the server to the cache.
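The updating cost of one content can be sketched as follows: a load from the server is counted whenever the content is absent in one slot and present in the next. The sketch assumes that the cache is empty before the first slot and that each load has a fixed cost `load_cost`; both the assumption and the names are illustrative and not taken from the paper.

```python
from typing import List


def count_loads(cached: List[bool]) -> int:
    """Count the slots in which the content enters the cache (0 -> 1 transitions)."""
    loads = 0
    previous = False  # assumption: the cache is empty before the first slot
    for current in cached:
        if current and not previous:
            loads += 1
        previous = current
    return loads


def updating_cost(cached: List[bool], load_cost: float) -> float:
    """Updating cost of one content: number of loads times the cost of one load."""
    return count_loads(cached) * load_cost


if __name__ == "__main__":
    # The content is loaded in slot 1, evicted in slot 3, and loaded again in slot 4.
    print(updating_cost([True, True, False, True], load_cost=3))
```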
III-B Problem Formulation
In general, as the popularity of contents changes over time, storing popular contents in each time slot will reduce the downloading cost, but it significantly increases the updating cost. On the other hand, if the stored contents remain unchanged over the time slots, the updating cost is low, but the downloading cost will be high. Based on this, our optimization problem is to minimize the total cost consisting of the downloading and the updating costs, denoted by , by optimizing decisions in terms of caching the contents over the time slots. Denote by an matrix of optimization variables for contents and time slots:
where is a binary variable that takes value one if and only if content is stored in slot . SCCD can be formulated as an integer linear program (ILP), as shown in (5).
Constraints (5b) indicate that the total amount of cache space used for storing the contents is less than or equal to the cache capacity in each time slot. Constraints (5c), (5d), (5e), and (5f) together ensure that is one if and only if the cache does not store content in time slot , but stores the content in time slot . Constraints (5g) state that can take value one only if , i.e., content is stored in the cache in time slot . Constraints (5h) say that request from user is met in at most one of the time slots between the time slot of request and its deadline.
ILP (5) can be solved by an off-the-shelf integer programming algorithm from optimization packages. However, for large-scale problem instances, solving the problem requires significant computational effort. Therefore, we develop a column generation algorithm and a rounding mechanism, presented in Section V, to obtain near-optimal solutions of SCCD.
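To make the trade-off between downloading and updating costs concrete, the following sketch solves a tiny instance by exhaustive enumeration rather than by an ILP solver. It is only meant for instances with a handful of contents and slots. All names are hypothetical; the sketch assumes an initially empty cache, a fixed load cost per content, and requests given as (request slot, deadline slot) pairs with slots numbered from 1.

```python
from itertools import product
from typing import List, Sequence, Tuple

Request = Tuple[int, int]  # (request slot, deadline slot), 1-indexed, inclusive


def content_cost(seq: Sequence[int], requests: List[Request], size: float,
                 c_server: float, c_cache: float, load_cost: float) -> float:
    """Updating plus downloading cost of one content under caching sequence `seq`."""
    cost = 0.0
    previous = 0  # assumption: cache empty before slot 1
    for x in seq:
        if x and not previous:
            cost += load_cost
        previous = x
    for start, deadline in requests:
        hit = any(seq[t - 1] for t in range(start, deadline + 1))
        cost += (c_cache if hit else c_server) * size
    return cost


def brute_force_sccd(sizes: List[float], requests: List[List[Request]], capacity: float,
                     horizon: int, c_server: float, c_cache: float,
                     load_costs: List[float]):
    """Enumerate every caching plan, keep the cheapest one that respects the capacity."""
    n = len(sizes)
    sequences = list(product((0, 1), repeat=horizon))
    best_cost, best_plan = float("inf"), None
    for plan in product(sequences, repeat=n):
        fits = all(sum(sizes[f] * plan[f][t] for f in range(n)) <= capacity
                   for t in range(horizon))
        if not fits:
            continue
        cost = sum(content_cost(plan[f], requests[f], sizes[f], c_server, c_cache,
                                load_costs[f]) for f in range(n))
        if cost < best_cost:
            best_cost, best_plan = cost, plan
    return best_cost, best_plan


if __name__ == "__main__":
    print(brute_force_sccd(sizes=[1, 1], requests=[[(1, 1)], [(2, 2)]], capacity=1,
                           horizon=2, c_server=10, c_cache=1, load_costs=[2, 2]))
```

The number of plans grows as two to the power of the number of contents times the number of slots, which is why an exhaustive approach of this kind is useful only as a reference for very small instances.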
IV Problem Reformulation
In this section, we provide a reformulation of SCCD that enables a column generation algorithm (CGA). We will see in Section VII that the algorithm achieves near-to-optimal solutions.
We define sequence to represent the caching solution of content over the time slots. As for , in total possible sequences exist for content . However, as will be clear later on, the algorithm needs to deal with only a small subset of the candidate sequences. Denote by a set, with . Denote by a binary variable where if and only if the -th sequence of content is selected, otherwise zero. Exactly one of them is used in the solution of the problem, thus . For any given sequence, the total cost of the sequence can be calculated as the sequence contains known caching decisions. The total cost for content with respect to the -th sequence is denoted by and is expressed in (6). Denote by constants , , and the values of , , and with respect to the -th sequence, respectively. Note that given the values of the value of can be determined.
Based on the above notion, SCCD is reformulated as (7). Constraints (7b) formulate cache capacity over the time slots. These constraints have the same meaning as constraints (5b). Constraints (7c) say that exactly one column has to be selected for each content. In formulation (7) the deadline and updating constraints (i.e., constraints (5c)-(5h)) are not present, as they are embedded in the columns. As can be seen, both (5) and (7) are valid optimization formulations of SCCD. However, they differ in structure.
V Algorithm Design
In this section, we present our solution approach. We first consider the continuous version of formulation (7) and apply column generation to derive its global optimum. Since this is a relaxation, its optimum is a lower bound on the global optimum of SCCD. Next, if the solution obtained from the column generation algorithm (CGA) is fractional, we use a tailored rounding algorithm (TRA) to obtain integer solutions. Using TRA, some of the caching decisions are fixed, and CGA is used again to re-solve the new problem subject to these decisions. This continues until an integral solution is obtained. We refer to our algorithm as the repeated column generation algorithm (RCGA).
V-A Column Generation Algorithm
For some structured linear programming problems, CGA can reduce the computational complexity for solving large-scale instances . The main advantage of column generation is that the optimal solution can be obtained without considering the set of all possible columns, the number of which is typically exponential. In column generation, the problem under consideration is decomposed into a so-called master problem (MP) and a subproblem (SP). The algorithm iterates between a restricted MP (RMP) and the SP. The idea is to start with a very limited set of columns. The algorithm solves the SP to generate one or more new columns that improve the objective function of the RMP. This process is repeated until no improving column exists. A column in SCCD is defined as a value assignment of sequence .
V-A1 MP and RMP
MP is the continuous version of formulation (7). CGA starts with a small subset for any content . This leads to a so-called restricted version of the MP problem referred to as RMP, which is expressed in (8). Denote by the cardinality of .
The SP uses the dual optimal solution to generate new columns. Denote by the optimal solution of . Denote by , and the corresponding optimal dual variables of constraints (8b) and (8c), respectively. Here, and and . After obtaining , checking if is the optimum of MP can be determined by finding a column with the minimum reduced cost for each content . If all these values are nonnegative, then the current solution is optimal. Otherwise, we add the columns with negative reduced costs to their respective sets.
Given , the reduced cost of content for column is . Here, is expression (6) in which and are replaced with their counterparts of optimization variables . To find the column with minimum reduced cost for content , we need to solve subproblem SP, shown in (9). Denote by the optimal solution of SP, i.e., . If the reduced cost of is negative, we add to . Note that term is a constant and thus dropped from the objective function.
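The pricing test can be sketched as follows. The sketch assumes the standard linear programming form of the reduced cost of a column: the column cost, minus the capacity duals weighted by the cache space the column occupies in each slot, minus the dual of the constraint that selects one column per content. The function and parameter names are hypothetical, and the sign convention of the duals is an assumption.

```python
from typing import Sequence


def reduced_cost(column_cost: float, sequence: Sequence[int], size: float,
                 capacity_duals: Sequence[float], convexity_dual: float) -> float:
    """Reduced cost of a caching sequence for one content.

    `capacity_duals[t]` is the dual of the capacity constraint of slot t + 1 and
    `convexity_dual` is the dual of the "one column per content" constraint.
    """
    priced_space = sum(pi * size * x for pi, x in zip(capacity_duals, sequence))
    return column_cost - priced_space - convexity_dual


def is_improving(column_cost: float, sequence: Sequence[int], size: float,
                 capacity_duals: Sequence[float], convexity_dual: float,
                 tol: float = 1e-9) -> bool:
    """A column is worth adding to the RMP only if its reduced cost is negative."""
    return reduced_cost(column_cost, sequence, size, capacity_duals, convexity_dual) < -tol


if __name__ == "__main__":
    print(reduced_cost(10.0, [1, 0, 1], 2.0, [-1.0, -0.5, 0.0], 13.0))
```

In the usage example the column occupies two units of cache in slots 1 and 3; only slot 1 has a nonzero dual, so the occupied space is priced at 2 and the column has a negative reduced cost.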
Even though this is an ILP, we show that it can be solved in polynomial time by mapping the SP to a shortest path problem.
V-B SP as a Shortest Path Problem
For SP, we construct an acyclic directed graph in which finding the shortest path from the defined source to the destination is equivalent to solving the subproblem. Denote by the total downloading cost for content when all requests over all time slots are served from the server, i.e., . Denote by the updating cost when the content is not stored in the previous time slot, but is stored in the current slot. Denote by the cost related to the dual optimal solution in time slot . Denote by the cost from the requests made for content in time slot with deadline greater than or equal to time slot , i.e.,:
The graph is shown in Figure 2. We first introduce the vertices and then the arcs. Two vertices and are defined to represent the source and destination, respectively. is a vertex representing . For time slot , in total vertices are defined, represented by and , . Vertex represents decision and vertices , , represent decision for the following scenarios. Vertex indicates that the content has not been stored in the cache in slots , i.e., for . Vertex , , indicates the content has been in the cache in time slot , but not in the subsequent time slots until time slot , i.e., and for . These vertices are defined to trace the most recent time slot in which the content was in the cache. This tracing makes it possible to define the cost of each arc with respect to the deadlines.
Now, we introduce the arcs and their weights. There is an arc from to with weight . For time slot , there are two outgoing arcs from , one to with weight and the other to with weight zero. Consider time slot , for vertex there are incoming arcs such that one comes from with weight , and the others come from for with weight , respectively. Selecting vertex in the path means that no request has been served in time slots as for , hence the third term in the weight is defined to serve all requests that are made in time slots with deadline later than or equal to time slot . For each vertex , , there is one incoming arc from with weight zero. For vertex the arc comes from with weight zero. There are arcs from vertices and to all having weight zero.
For each content , SP can be solved in polynomial time as a shortest path problem.
We show that the optimal solution of the subproblem can be obtained from the shortest path of the graph defined above. Assume the optimal solution of SP, i.e., and are given. The path is constructed as follows. One of the following three scenarios may happen in time slot . If , the vertex on the path is . If , the next vertex is . If and for , the next vertex is . By construction of the graph, this path from to gives the same objective function value of SP as and .
Conversely, assume the shortest path is given. For time slot , if the path contains one of the vertices for , we set . Otherwise, the path contains vertex , and we set . As soon as the values of for and are known, the values of for , , and can be easily determined. The value of is set to the first time slot in which the request can be served. By the construction of the graph, this solution gives the same objective function value as the shortest path. To clarify why this is correct, we give an example. Assume that the shortest path is given which has length . Then, we set for and , , and for all requests that can be served in time slot 2. With these settings of the variables, the objective function has the same value as the length of the shortest path, as shown in (10). Based on the rationale illustrated in the example, it is straightforward to conclude the correctness in general.
Finally, the shortest path problem can be solved in polynomial time . Hence, the conclusion. ∎
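The shortest path argument can be mirrored by a compact dynamic program over the most recent cached slot. This is a sketch under stated assumptions rather than the exact graph of Figure 2: each state is the last slot in which the content was cached (0 meaning never), a load is paid whenever the content enters the cache, and `slot_price[t - 1]` stands for a nonnegative charge that the dual solution places on occupying the cache in slot `t`. All names are hypothetical.

```python
from typing import List, Tuple

Request = Tuple[int, int]  # (request slot, deadline slot), 1-indexed, inclusive


def pricing_dp(horizon: int, requests: List[Request], size: float, c_server: float,
               c_cache: float, load_cost: float, slot_price: List[float]):
    """Return (minimum cost, caching sequence) for one content."""
    gain = (c_server - c_cache) * size          # saving of one cache hit
    base = c_server * size * len(requests)      # cost if everything comes from the server

    def saving(last: int, t: int) -> float:
        # Requests made after slot `last` that can still be served in slot t.
        return gain * sum(1 for o, d in requests if last < o <= t <= d)

    # best[t]: cheapest cost change of a path whose most recent cached slot is t.
    best = [0.0] * (horizon + 1)
    pred = [0] * (horizon + 1)
    for t in range(1, horizon + 1):
        options = []
        for last in range(t):
            # No load is needed only if the content was cached in slot t - 1.
            load = 0.0 if (last == t - 1 and last >= 1) else load_cost
            options.append((best[last] + load + slot_price[t - 1] - saving(last, t), last))
        best[t], pred[t] = min(options)

    end = min(range(horizon + 1), key=lambda t: best[t])
    sequence = [0] * horizon
    t = end
    while t >= 1:                                # walk back along the chain of cached slots
        sequence[t - 1] = 1
        t = pred[t]
    return base + best[end], sequence


if __name__ == "__main__":
    print(pricing_dp(4, [(1, 2), (2, 2), (4, 4)], size=1, c_server=5, c_cache=1,
                     load_cost=3, slot_price=[1, 1, 1, 1]))
```

Each request is credited to the first cached slot at or after its request slot, which is why tracking only the most recent cached slot suffices. The double loop over slots, together with the scan over the requests, gives a running time that is polynomial in the number of slots and requests.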
V-C Rounding Algorithm
As the solution obtained from the RMP (i.e., ) may be fractional, we need a mechanism to obtain a feasible integer solution. One straightforward way is to round the fractional elements of . However, this way of rounding has some limitations. First, the solution may easily become infeasible. Second, even if the solution is feasible, it may be far from the global optimum. Third, when an element of , say , becomes fixed in value, the caching decisions of content over all time slots are made, and consequently there is no opportunity to improve the solution of content .
In order to overcome the above limitations, we make a rounding decision for one content and one time slot at a time. More specifically, the caching decision of content in time slot is made based on the value of , and is the sum of those elements of such that the corresponding columns store content in time slot , that is, . In fact, the value of can be viewed as an indicator of how probable it is to store content in time slot at optimum. In the following we prove a relationship between and and then base our algorithm on this result.
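The aggregation described above can be sketched in a few lines: for one content, the value associated with a time slot is the total weight of the selected columns that store the content in that slot. The names `slot_weights`, `columns`, and `y` are hypothetical.

```python
from typing import List, Sequence


def slot_weights(columns: Sequence[Sequence[int]], y: Sequence[float]) -> List[float]:
    """For each slot, sum the weights of the columns that cache the content in that slot."""
    horizon = len(columns[0])
    return [sum(y_k * column[t] for y_k, column in zip(y, columns))
            for t in range(horizon)]


if __name__ == "__main__":
    # Two columns of one content with fractional weights 0.25 and 0.75.
    print(slot_weights([(1, 0, 1), (1, 1, 0)], [0.25, 0.75]))
```

In the usage example both columns store the content in the first slot, so the first value equals one even though both column weights are fractional; the other two slots receive fractional values.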
Theorem 3. For any content and , is binary if and only if every element of is binary, where .
For necessity, for any content , if is binary for any , , it is obvious that all elements of are binary. Now, we prove the sufficiency. For any content , assume that every element in is binary. Assume that is larger than zero for . As element is either zero or one, the value of for must be either all zero or all one. Otherwise, as , one of the elements of will become fractional. This means that all columns corresponding to for must be the same. Having two columns with the same values violates the fact that the sequences of any two columns differ in at least one element. Therefore, for any content , if is binary for any , then is binary for any . Hence the proof. ∎
A family of rounding algorithms can be derived based on how the caching decisions of the contents are made. We make the decisions gradually. First, for content and time slot , if then the decision is to store this content in this time slot, i.e., . Next, we find the fractional element of that is closest to zero or one, and round the value, giving the caching decision of the corresponding content and time slot. Next, the CGA is applied subject to the rounded values to obtain the new . This process is repeated until a feasible integer solution is obtained. Note that a caching decision for a content and time slot remains in force in all subsequent iterations. An important observation is that the SP with the given caching decisions can still be solved via shortest path. If , we simply remove vertices , for and , and the arcs connected to these vertices from the graph. If , we remove vertex and its connected arcs.
TRA is presented in Algorithm 2. Symbol is used when a value is assigned to a programming variable and symbol is used when an optimization variable is fixed to take a value. The details of TRA are as follows. First, in Line , is calculated. For each and , if has value one, then TRA fixes in SP by Line 2. In addition, as is fixed to one, the columns in that have value zero in time slot cannot be used anymore and are discarded. To achieve this, we fix , , if . This is done by Line 3.
Second, as long as is not an integer solution, then by Theorem 3 at least one element of must be fractional. The fractional value of being nearest to zero and its corresponding time slot and content are calculated by Lines 4-5, and these are denoted by , , and respectively. Likewise, the fractional value of being nearest to one and its corresponding time slot and content are calculated by Lines 6-7, and these are denoted by , , and respectively. If is less than , TRA fixes the value of time slot to zero by Line 9. Furthermore, those columns not compatible with the decision are discarded from . This is done by Line 10. Otherwise, TRA checks whether there is enough spare space to store content . If yes, then the value of time slot is fixed to one in SP by Line 12, and the columns with value zero in time slot are discarded from by Line 13. If no, the value of time slot is fixed to zero by Line 15 and the columns with value one in time slot are discarded from by Line 16.
Note that the above operations may lead to discarding all columns of a content such that the RMP becomes infeasible. To avoid this, an auxiliary column for each content is added such that the column has value one in the time slots that are fixed to one so far, and zero in the other time slots. This is accomplished by Line 18.
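One rounding step of this kind can be sketched as follows. The sketch assumes that the comparison is between the distance to zero of the smallest fractional value and the distance to one of the largest fractional value, and that the spare space of a slot is the capacity minus the sizes of the contents already fixed to one in that slot. These assumptions and all names are hypothetical; the authoritative description is Algorithm 2.

```python
from typing import List, Optional, Set, Tuple


def rounding_decision(z: List[List[float]], sizes: List[float], capacity: float,
                      fixed_one: Set[Tuple[int, int]],
                      eps: float = 1e-9) -> Optional[Tuple[int, int, int]]:
    """Return (content, slot, value) for the next decision to fix, or None if z is integral.

    `z[f][t]` is the aggregated value of content f in slot t (0-indexed here) and
    `fixed_one` holds the (content, slot) pairs already fixed to one.
    """
    fractional = [(f, t, v) for f, row in enumerate(z) for t, v in enumerate(row)
                  if eps < v < 1 - eps]
    if not fractional:
        return None
    f_low, t_low, low = min(fractional, key=lambda entry: entry[2])
    f_high, t_high, high = max(fractional, key=lambda entry: entry[2])
    if low < 1 - high:
        return (f_low, t_low, 0)            # closest to zero: do not cache
    used = sum(sizes[f] for (f, t) in fixed_one if t == t_high)
    if used + sizes[f_high] <= capacity:
        return (f_high, t_high, 1)          # closest to one and it fits: cache it
    return (f_high, t_high, 0)              # no spare space left in that slot


if __name__ == "__main__":
    z = [[1.0, 0.3], [0.0, 0.9]]
    print(rounding_decision(z, sizes=[2, 2], capacity=3, fixed_one={(0, 0)}))
```

In the usage example the value 0.9 is closer to one than 0.3 is to zero, and the second slot still has room for the second content, so that content is fixed to be cached in the second slot.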
V-D Framework of RCGA
Note that as none of the variables in the SPs or RMP is fixed when CGA is applied for the first time (i.e., in the first iteration of Algorithm 3), the cost from CGA provides a lower bound on the global optimum of SCCD. This lower bound can be used to measure the effectiveness of the final solution from Algorithm 3 or the solution obtained from any other suboptimal algorithm. The RCGA framework is shown in Algorithm 3. The number of iterations needed to obtain a feasible solution is bounded by . The reason is that each time TRA is used, the caching decision of at least one content in one time slot is made, and as there are contents and time slots, Algorithm 3 terminates in at most iterations.
VI Greedy Algorithms
In this section, we consider computationally cheap algorithms. We propose two greedy algorithms that deal with one time slot at a time. These algorithms are developed based on two conventional caching algorithms in the literature, i.e., popularity-based caching (PBC)  and random-based caching (RBC) . In PBC, a content is chosen as a candidate to be stored in the cache based on how frequently it is requested. In RBC, the candidate content is chosen randomly with a probability proportional to its popularity. That is, the more frequently a content is requested, the more likely it is to be selected as a candidate content. Popularity of content in time slot is modeled by the total number of requests that must be satisfied in this time slot, namely, all requests with deadline . Denote by the set of these requests for content in time slot . Denote by