Caching with Time Domain Buffer Sharing

by Wei Chen et al.

In this paper, storage efficient caching based on time domain buffer sharing is considered. The caching policy allows a user to determine whether and how long it should cache a content item according to the prediction of its random request time, also referred to as the request delay information (RDI). In particular, the aim is to maximize the caching gain for communications while limiting its storage cost. To achieve this goal, a queueing theoretic model for caching with infinite buffers is first formulated, in which Little's law is adopted to obtain the tradeoff between the hit ratio and the average buffer consumption. When there exist multiple content classes with different RDIs, the storage efficiency is further optimized by carefully allocating the storage cost. For more practical finite-buffer caching, a G/GI/L/0 queue model is formulated, in which a diffusion approximation and the Erlang B formula are adopted to determine the buffer overflow probability and the corresponding hit ratio. The optimal hit ratio is shown to be limited by the demand probability and buffer size for large and small buffers, respectively. In practice, a user may exploit probabilistic caching with random maximum caching time and arithmetic caching without any need for content arrival statistics to efficiently harvest content files in air.






I Introduction

The explosive growth of mobile multimedia and social networking applications has stimulated a correspondingly explosive increase in bandwidth demand in the emerging fifth generation of mobile systems (5G). However, standard on-demand transmission infrastructures can hardly cope with the dramatic increase in mobile data traffic, owing to the scarcity of wireless resources such as radio spectrum, constrained average or peak power, and limited base station sites. Moreover, the efficiencies of many radio resources have already approached their theoretical limits after being heavily exploited over the past decades.

Caching holds the promise of trading cheap storage resources for substantial throughput gains in content-centric networks. By allowing a user to cache popular content items before they are requested, the peak data rate can be significantly reduced. Therefore, caching has become a key solution in the 5G era for meeting stringent requirements on data rate, energy efficiency, and Quality-of-Service (QoS) [1]-[3]. To fully exploit the caching gain, context-awareness is enabled in paradigm-shifting 5G network architectures, where social networking [4], big data analytics [5], recommendation systems [6], and natural language processing [7] are adopted for popularity prediction. The synergy between communication, caching, and computing is thus regarded as a cornerstone of versatile 5G-grade deployments [8], [9].

The applications of caching in 5G motivate an extensive study of the communication-storage tradeoff, which characterizes a fundamental limit of caching. Maddah-Ali and Niesen first revealed the tradeoffs between the delivery rate and memory size for uniform demands [10] and nonuniform demands [11]. The rate-memory tradeoff of coded caching for multilevel popularity was presented in [12]. Aiming to characterize the memory-rate tradeoff better, Sengupta and Tandon proposed a tighter lower bound on the worst-case delivery rate in [13]. To further improve the rate-memory tradeoff, Amiri and Gündüz conceived an enhanced coded caching scheme in which content items are partitioned into smaller chunks [14]. When storage resources are shared, how to efficiently allocate them among multiple users becomes a critical issue. Optimal storage allocations were proposed for wireless cloud caching systems by Hong and Choi [15], and for heterogeneous caching networks by Vu, Chatzinotas, and Ottersten [16]. In wireless device-to-device networks, caching is also expected to bring a dramatic capacity gain. Given a cache size constraint, capacity upper and lower bounds were revealed by Ji, Caire, and Molisch in [17]. More recently, scaling laws for popularity-aware caching with limited buffer size were found by Qiu and Cao in [18].

The demand probability is shown to play a key role in the rate-memory tradeoff. In most existing works, it is assumed to be time-invariant. In practice, however, content popularity may not remain unchanged. In particular, a content item may become more popular, e.g., due to information propagation in social networks, but will eventually expire at the end of its lifetime. As a result, time-varying content popularity models were proposed in [19], [20], and [21]. In our previous work [21], we observed that most data is requested only once. Based upon this observation, we defined the request delay information (RDI) as the probability density function (p.d.f.) of a user's request time/delay for a content item. In practice, the RDI can be estimated from a content item's labels or keywords and then applied to predict the user's request time for the item. The communication-storage tradeoffs of caching with various RDI were revealed in [21] and [22] for unicast and multicast transmissions, respectively.

The time-varying popularity allows a user to remove a cached content item from its buffer when the item becomes outdated or less popular. (In this paper, the two terms buffer and memory are used interchangeably.) Once a content item is removed, the occupied buffer space can be released in order to cache other content items. As such, the user may reuse its buffer in the time domain. When time domain buffer sharing is enabled, recent works [23] and [24] show that the storage cost is also an increasing function of the content caching time. In general, however, how to efficiently share the buffer in the time domain remains an open but fundamental problem.

In this paper, we are interested in storage efficient caching based upon time domain buffer sharing. Our aim is to maximize the caching gain for communications while limiting the storage cost. In particular, we shall determine whether and for how long a user should cache a content item, according to its RDI. We first adopt the infinite-buffer assumption, which allows us to formulate an infinite-buffer queue model. In this queueing-theoretic model, Little's law [25] is adopted to bridge the storage cost and the maximum caching time, thereby giving the communication-storage tradeoff. To strike the optimal tradeoff between the hit ratio and the average buffer consumption, we present two storage efficient caching policies for the homogeneous and heterogeneous RDI cases, in which all content items have the same and different RDI, respectively. With homogeneous RDI, we conceive a probabilistic caching policy in which the random maximum caching time obeys a certain distribution. With heterogeneous RDI, we allocate the storage cost among different content classes to maximize the overall storage efficiency. The storage allocation is formulated as a convex optimization, the optimal solution of which has a simple structural form. The solution also leads to a decentralized arithmetic caching policy that requires no global knowledge of the content arrival process. It allows a user to efficiently harvest popular content items in air and decide how long a content item should be cached, simply based on the user's local RDI prediction.

The communication-storage tradeoff under the infinite-buffer assumption lays a theoretical foundation for storage efficient caching with a finite buffer, which is more practical in general. In this scenario, the hit ratio is jointly determined by that of infinite-buffer caching and by the blocking probability due to buffer overflow. To obtain the blocking probability, we formulate a queue model and then apply a diffusion approximation. By this means, we show that the hit ratio maximization is approximately equivalent to a one-dimensional quasi-convex optimization whose variable is the mean caching time. Two approximate but analytical solutions are presented for large and small buffers, in which the hit ratios are limited by the demand probability and the buffer size, respectively. For decentralized implementation without statistics of content arrivals, arithmetic caching with a finite buffer is also conceived.

The rest of this paper is organized as follows. Section II presents the system model. Based upon the infinite buffer assumption, Sections III and IV investigate the storage efficient caching with homogeneous and heterogeneous RDIs respectively. In Section V, the more practical finite-buffer caching is presented. Finally, numerical results and conclusions are given in Sections VI and VII, respectively.

II System Model

Fig. 1: Caching with time domain buffer sharing.

Consider a caching system, as shown in Fig. 1. The th () content item transmitted by the base station (BS) is denoted by , which consists of bits. (Throughout this paper, the terms content file and content item are used interchangeably.) The transmission of is accomplished at time . From the receiver's perspective, the content arrival process is defined as , with content interarrival time given by . Assuming that the limit exists, we denote the content arrival rate by .

The user may ask for content file after a random delay from 's being sent. If will never be requested by the user, then we set . The p.d.f. of is denoted by , also referred to as the RDI of . We categorize content items into classes according to their statistical RDI. Let , , denote the th class, in which all content items have the same statistical RDI denoted by , i.e., . If , the content flow has homogeneous RDI. In this case, the class index is dropped. Otherwise, the content flow has heterogeneous RDI. Moreover, the cumulative distribution function (c.d.f.) of is denoted by . In this context, the probability that a content item of class will not be requested after transmission is given by . In other words, its demand probability is . Furthermore, the minimum and maximum possible request delays are denoted by and , respectively. Let stand for the conditional expectation of given , i.e., or .

We assume that the limit exists for all , which represents the average number of bits per content item of class . Let denote the probability that a content item belongs to class . The average number of bits per content item is then obtained by (bits). The overall transmission rate of the base station is given by (bits/second). Furthermore, the probability that a bit belongs to class is determined by .

Next, we present the performance metrics for the communication gain and storage cost of caching. In particular, the caching gain for communications is characterized by the effective throughput, defined as the average number of bits that the user reads from its buffer per unit time. Since the effective throughput increases linearly with the hit ratio, the hit ratio serves as an alternative performance metric for the caching gain. Let denote the effective throughput contributed by content items of flow . The sum effective throughput, or the overall caching reward, is then obtained by . The storage cost is characterized by two different performance metrics, applicable in the infinite- and finite-buffer scenarios, respectively. When the user is equipped with an infinite buffer, the storage cost is the average buffer consumption, defined as the average number of bits cached in the receiver buffer. Let denote the average number of cached bits from flow . Then the overall storage cost is obtained by . When the user is equipped with a finite buffer, the storage cost is simply the buffer size.

Finally, our aim is to efficiently share or reuse the receiver buffer in the time domain. To this end, we shall carefully decide whether and for how long the user should cache a content item. Intuitively, a content file should be removed from the buffer once it has been requested and read, because in practice a BS is seldom asked to transmit the same content file to a user twice. (If the user expects to read a content file again, the file can be stored locally; such storage cost is beyond the scope of this paper.) On the other hand, we should carefully limit the caching time of each content item in order to share the buffer efficiently in the time domain. To achieve this, let denote the maximum caching time for content file . In other words, must be removed from the user's buffer after being cached for time , even if it has not been read.
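As a toy illustration of this caching rule, the following sketch estimates the hit ratio by Monte Carlo simulation. The exponential request delay, the demand probability, and all identifiers (`p_demand`, `mean_delay`, `tau`) are illustrative assumptions rather than the paper's notation.

```python
import random

def simulate_hit_ratio(n_items, p_demand, mean_delay, tau, seed=1):
    """Monte Carlo sketch of the caching rule: an item counts as a hit if it
    is requested (with probability p_demand) within tau seconds of being
    cached; otherwise it is evicted unread. Exponential delays are assumed."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_items):
        if rng.random() < p_demand:                    # item will be requested
            delay = rng.expovariate(1.0 / mean_delay)  # random request delay
            if delay <= tau:                           # request beats eviction
                hits += 1
    return hits / n_items
```

Under these assumptions, the hit ratio approaches `p_demand * (1 - exp(-tau / mean_delay))` as the number of simulated items grows.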

A simple caching policy is static caching, in which all content items of flow have the same maximum caching time, i.e., for all . A generalized caching policy is probabilistic caching, in which can be a non-negative random variable. Let denote the p.d.f. of the maximum caching time of content class . When , probabilistic caching reduces to static caching with deterministic maximum caching time . In other words, static caching is a special case of probabilistic caching. In this work, our main goal is to find the optimal maximum caching time, or its probability density function, that maximizes the effective throughput under the constraint of the storage cost.

III Caching with Infinite Buffer and Homogeneous RDI

In this section, we focus on caching with an infinite buffer and homogeneous RDI, thereby dropping the content class index . Based on the effective throughput and storage cost analysis of static caching policies, we present a normalized rate-cost function, which characterizes the communication-storage tradeoff. A probabilistic caching policy is further conceived to achieve the optimal rate-cost tradeoff.

III-A Effective Throughput and Storage Cost of Static Caching

In this subsection, we consider a static caching policy with a fixed maximum caching time . The effective throughput and the storage cost are derived as two increasing functions of .

Lemma 1

The effective throughput is given by


Given the c.d.f. of the request delay, , and the maximum caching time , the hit ratio is obtained by


Since the effective throughput is equal to , the lemma follows.

Fig. 2: A queueing-theoretic model of infinite buffer.

Next, we analyze the storage cost by formulating a queueing-theoretic model as shown in Fig. 2, in which a content file arrives after being transmitted and departs when it is removed. Although it is clearly not a first-in-first-out (FIFO) queue, Little’s law [25] is still applicable, which gives the following lemma.

Lemma 2

The average buffer consumption is given by


The average number of content files cached in the buffer is equal to the average queue length . According to Little’s law, we have , in which denotes the random caching time of a content file. Consider a content file with request delay . We have if this file is requested before its maximum caching time, i.e., . Otherwise, we have . Therefore, its caching time is given by


the expectation of which is then determined by


Recalling that the probability that a content item will never be requested is , we get Eq. (3).
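The mean caching time in the proof above, and the resulting Little's-law occupancy, can be sketched numerically. The exponential request delay and the parameter names are illustrative assumptions used only to make the sketch concrete.

```python
import math

def mean_caching_time(p_demand, mean_delay, tau):
    """E[S] for the caching time S in Lemma 2's proof: S = min(T, tau) when
    the item is requested (probability p_demand, with an assumed exponential
    delay T), and S = tau when the item is never requested."""
    e_min = mean_delay * (1.0 - math.exp(-tau / mean_delay))  # E[min(T, tau)]
    return p_demand * e_min + (1.0 - p_demand) * tau

def avg_buffer_occupancy(arrival_rate, p_demand, mean_delay, tau):
    """Little's law: average number of cached items = arrival rate * E[S]."""
    return arrival_rate * mean_caching_time(p_demand, mean_delay, tau)
```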

Lemmas 1 and 2 imply that both the effective throughput and storage cost monotonically increase with the maximum caching time . As a result, there exists a fundamental tradeoff between them, as shown in the following theorem.

Theorem 1

The storage cost is an increasing function of the effective throughput given by


where , which is the inverse function of when is continuous.


Since Eq. (1) can be rewritten as , the maximum caching time is given by


By substituting Eq. (7) and into Eq. (3), we obtain Eq. (6).

III-B Normalized Rate-Cost Function of Static Caching

In this subsection, we normalize both the storage cost and the effective throughput by the BS’s throughput given by . More specifically, the normalized storage cost is defined as . From Eqs. (3) and (5), we see that the normalized storage cost is equal to the mean caching time, i.e., . From Eqs. (1) and (2), we also see that the normalized effective throughput is equal to the hit ratio defined by Eq. (2), i.e., . From Eq. (6), we may write the normalized rate-cost function as


the domain of which is . From Eq. (3), one can see that the codomain of is if and ; otherwise, we have . Since , the storage efficiency is independent of the BS's transmission rate and depends only on the statistical RDI . As a result, the tradeoff between the effective throughput and the storage cost is equivalent to the tradeoff between the hit ratio and the mean caching time.

Since relies only on , it can be regarded as a transform of , which we shall refer to as the cost- transform (CP transform for brevity), denoted by or . It is easy to check that for any . Hence, any is equivalent to its normalized form given by , on which we shall focus in the remainder of this paper. Tables I and II present the CP transforms of five typical p.d.f.s, as well as some useful properties of the CP transform for calculating or bounding easily based on the CP transform of a standard p.d.f. From Table I, we see that when . This is not surprising because the exponential distribution is memoryless. In the following, we show that this condition is not only sufficient but also necessary.

Corollary 1

if and only if .


Our proof relies on the observation that and . Therefore, if and only if . The solution to this first-order linear ordinary differential equation is . Recalling the boundary value conditions and , we have and , which completes our proof.
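Corollary 1 can be checked numerically. Under illustrative assumptions (demand probability 1, and either an exponential or a uniform request delay; the function names and parameters are hypothetical), the mean caching time grows linearly in the hit ratio only in the exponential case:

```python
import math

def exp_hit_cost(mean_delay, tau):
    """Exponential request delay: hit ratio 1 - exp(-tau/mean) and mean
    caching time E[min(T, tau)] share the same factor, so cost is linear
    in the hit ratio (memorylessness)."""
    h = 1.0 - math.exp(-tau / mean_delay)
    return h, mean_delay * h

def unif_hit_cost(b, tau):
    """Uniform request delay on [0, b]: hit ratio tau/b and mean caching
    time E[min(T, tau)] = tau - tau^2/(2b), valid for tau <= b."""
    return tau / b, tau - tau * tau / (2.0 * b)
```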


TABLE I: CP transforms of typical p.d.f.s
Distributions p.d.f. CP transforms
Exponential ,
Uniform ,
Triangular ,
Pareto ,
Arcsine ,


TABLE II: Key Properties of the CP Transform
Operators CP transforms
Time scaling,
Time shift,
Density scaling,
Rate shift,
Stochastic order relation ,
Linear combination

III-C Probabilistic Caching with Homogeneous RDI

In this subsection, we shall show that the rate-cost tradeoff can be further optimized by the probabilistic caching policies, due to the following observation.

Lemma 3

If two rate-cost pairs and are achievable, then the rate-cost pair given by is also achievable for .


The desired rate-cost pair is achieved by randomly choosing the maximum caching time for each content file, i.e., with probability and with probability .

By applying probabilistic caching, any linear combination of the achievable rate-cost pairs in the rate-cost curve characterized by Eq. (8) is also achievable. Furthermore, since there is no need to cache any content item when , the storage cost is . In summary, the following theorem holds.

Theorem 2

The optimal rate-cost function is the lower convex envelope of and given by

According to Theorem 2, the probabilistic caching policy brings a storage efficiency gain if and only if , or more specifically, is non-convex or . To shed some new light on , we are interested in the concavity or convexity of . Suppose that is differentiable for . Then Faà di Bruno's formula gives (Eq. (9) can also be obtained by applying Example 29 in Chapter 5 of [26])


where . Eq. (9) implies that the concavity and convexity of is determined by . Once becomes either concave or convex, the optimal rate-cost function can be significantly simplified, as shown in the following theorem.

Theorem 3

The optimal rate-cost function is given by


where .


When , is a concave function satisfying . Since , the lower convex envelope of is a straight line joining the points and , the slope of which is . When , is concave. Moreover, we have because . As a result, the lower convex envelope of is itself.

Recalling that , we may obtain a sufficient but not necessary condition for the concavity or convexity of , which further simplifies Eq. (10) to be


Clearly, if is a nondecreasing function on the interval .
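The lower convex envelope in Theorem 2 can be computed numerically from sampled points of the static rate-cost curve together with the origin, using a standard lower-hull sweep; the time-sharing mixture of Lemma 3 then achieves every point on the envelope. This is a generic sketch, not code from the paper:

```python
def lower_convex_envelope(points):
    """Lower convex hull of (hit_ratio, cost) samples, origin included.
    A point lying on or above the chord between its neighbours is dropped:
    time-sharing between the chord's endpoints achieves the same hit ratio
    at no greater cost (Lemma 3)."""
    pts = sorted(set(points) | {(0.0, 0.0)})
    hull = []
    for x, y in pts:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # pop the middle point if it lies on or above the new chord
            if (x2 - x1) * (y - y1) <= (x - x1) * (y2 - y1):
                hull.pop()
            else:
                break
        hull.append((x, y))
    return hull
```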

IV Caching with Infinite Buffer and Heterogeneous RDI

In this section, we focus on the infinite-buffer scenario with heterogeneous RDI, where there exist multiple content classes , . To distinguish them, a flow index is attached to the notation developed in the previous section. In this case, how to efficiently allocate storage resources among the various content classes becomes a critical issue.

IV-A Joint Rate-Cost Allocation

Our aim is to minimize the overall storage cost given a target normalized effective throughput, denoted by , which is feasible when . Since denotes the probability that a cached bit belongs to class , the normalized effective throughput and storage cost are expressed as and , respectively. Moreover, we have the effective throughput and the average buffer consumption . Therefore, the joint rate-cost allocation is equivalent to the joint normalized rate-cost allocation formulated as


Problem (12) is a convex optimization problem that can be solved with low complexity, because the rate-cost functions are convex for . Let denote the inverse function of the derivative of the normalized rate-cost function of class , i.e., . (If , we have ; if , then .) A structural result for problem (12) is presented in the following theorem.

Theorem 4

The optimal solution to (12) is given by


where is a positive number satisfying . (A careful reader may notice that may not be invertible. In this case, can be any solution that satisfies and .)


It is easy to check that Eq. (13) holds by using the method of Lagrange multipliers.

Since monotonically increases with , we may adopt a binary search algorithm to find with low complexity. When is concave for all , problem (12) reduces to a linear program given by


the optimal solution of which is given by


where is the index of with the th largest and .
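The binary search just mentioned can be sketched as follows. Everything here is an illustrative assumption: we take quadratic per-class cost functions C_k(h) = a_k h^2, so marginal-cost equalization gives per-class hit ratios h_k = min(nu / (2 a_k), p_k), with the clipping at the demand probability p_k mirroring the structural result of Theorem 4, and the multiplier nu found by bisection so that the weighted hit ratio meets the target.

```python
def allocate_hit_ratios(a, q, p_max, target, iters=60):
    """Bisection on the multiplier nu. a[k]: assumed cost curvature of class
    k; q[k]: probability a bit belongs to class k; p_max[k]: class-k demand
    probability (an upper bound on its hit ratio)."""
    def mixture(nu):
        return sum(qk * min(nu / (2.0 * ak), pk)
                   for ak, qk, pk in zip(a, q, p_max))
    lo, hi = 0.0, max(2.0 * ak * pk for ak, pk in zip(a, p_max))
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mixture(mid) < target:
            lo = mid
        else:
            hi = mid
    nu = 0.5 * (lo + hi)
    return [min(nu / (2.0 * ak), pk) for ak, pk in zip(a, p_max)]
```

Feasibility requires the target not to exceed the weighted sum of the demand probabilities, matching the feasibility condition of problem (12).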

IV-B The Optimal Probabilistic Caching with Heterogeneous RDI

Having solved the joint rate-cost allocation problem, we next turn our attention to the optimal probabilistic caching, or more specifically, the maximum caching time of content class . If , should be seconds. In this case, a content item of with and is always cached until it is requested. If , then the hit ratio of should be . To achieve this hit ratio, we consider two possible scenarios. If , the maximum caching time is determined by . Otherwise, there must exist a probability and two rate-cost pairs and , , satisfying . In this case, we randomly set the maximum caching time to be with probability and to be with probability . When , the user does not cache the content item with probability .

IV-C Arithmetic Caching

In this subsection, we are interested in a practical situation in which a user may not have any global knowledge of the content arrival processes, namely and . It is therefore not possible for the user to formulate the joint rate-cost allocation problem (12). Fortunately, the structural result in Theorem 4 implies an arithmetic caching method without any need for and .

The design goal is to maximize the effective throughput while limiting the average buffer occupancy to be less than or equal to a target value . To achieve this goal, the user estimates its average storage cost locally, which is a function of , denoted by . If , i.e., the receiver buffer is under-utilized, then we should increase to achieve a higher effective throughput. If , i.e., the receiver buffer is over-utilized, then we should decrease to reduce the storage cost. There are many efficient algorithms to update , e.g., , where is the step size; how to increase the convergence speed, however, is beyond the scope of this paper. Given , the user decides whether and for how long a content file should be cached based on its RDI , which can be estimated from its keywords or labels. In other words, the user is capable of harvesting storage-efficient content files in air.
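One concrete, purely illustrative instance of such an update rule is a stochastic-approximation step on the maximum caching time: raise it when the estimated storage cost is below the target, lower it otherwise. The exponential cost model, the step size, and the names below are assumptions used only to make the sketch self-contained.

```python
import math

def buffer_cost(tau, mean_delay=2.0):
    """Assumed normalized storage cost: mean caching time under an
    exponential request delay with demand probability 1 (hypothetical)."""
    return mean_delay * (1.0 - math.exp(-tau / mean_delay))

def tune_caching_time(cost_target, step=0.5, iters=200):
    """Iteratively adjust tau so the estimated storage cost meets the
    target: tau <- tau + step * (target - cost). Illustrative update."""
    tau = 0.0
    for _ in range(iters):
        tau = max(0.0, tau + step * (cost_target - buffer_cost(tau)))
    return tau
```

Because the cost is increasing in the caching time, the feedback loop drives the cost estimate toward the target without any knowledge of the arrival statistics.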

IV-D Communication-Storage Tradeoff with Heterogeneous RDI

Given a target hit ratio , we may obtain the minimum storage cost, or the mean service time, by solving problem (12). As a result, the overall rate-cost function characterizes the optimal communication-storage tradeoff with heterogeneous RDI.

Let denote the overall cost-rate function, which is the inverse function of . In other words, presents the maximum hit ratio given a mean caching time . The domain of is given by , where . Furthermore, is bounded by . Similar to the proof of Lemma 3, we may show their concavity and convexity in the following.

Lemma 4

and are increasing concave and convex functions, respectively.

Noting that a convex function can be uniformly approximated by a -function, we may assume that and are differentiable. Without solving problem (12), the structural result (13) implies some useful properties of , as shown in the following lemma.

Lemma 5

The first order derivative of for and can be determined by and , respectively. When are concave for all , and . When are convex for all , and .

IV-E A Unified Framework of Probabilistic Caching

It is worth presenting a unified framework of probabilistic caching. Given the p.d.f.s of the maximum caching time of content class , , the hit ratio of content class is given by

from the law of total probability. As a result, the overall hit ratio is presented by


where denotes the overall undemand probability. From Eq. (4), which represents the random caching time for a given , the p.d.f. of the caching time of content class is determined by . Therefore, the p.d.f. and expectation of the caching time are given by


A careful reader may see that the minimization of the storage cost in Eq. (18) subject to a hit ratio constraint on in Eq. (16) is a variational problem, because are probability density functions. Fortunately, it is equivalent to the joint rate-cost allocation problem (12), which is a convex optimization. The solution to problem (12), namely given by Eq. (13), then allows us to infer as shown in subsection IV-B. This explains why we first focused on static caching, along with its probabilistic time-sharing policy, in Section III. The unified framework of probabilistic caching, however, is useful in the next section, where caching with a finite buffer is considered.

V Caching with Finite Buffer

In this section, we investigate caching with a finite buffer, the size of which is denoted by . By buffer size , we mean that the buffer may cache at most content items. (Throughout this section, we assume that all content items have the same file size, i.e., for all , so that the buffer may cache at most bits.) With a finite buffer, a content file cannot be cached when the buffer is full. Since dropped content files contribute nothing to the effective throughput, we have , where denotes the blocking probability of content items due to buffer overflow. As a result, we shall first derive the blocking probability . Then the effective throughput is maximized under the buffer size constraint. Interestingly, the cost-rate function of caching with an infinite buffer also plays a key role in the finite-buffer caching system.

Fig. 3: A model of finite buffer.

V-A Blocking Probability of Content Items

Using Kendall’s notation, we formulate a queue model of the buffer state in order to derive . (The queueing model of Lemma 2 is not a queue because customers may occupy different sizes of buffer space. Fortunately, the proof of Lemma 2 relies only on Little’s law, which holds for arbitrary queues.) Each content item is regarded as a customer, whose interarrival times, , have a general distribution. The customer’s time spent in the equivalent system is equal to the caching time of the corresponding content file, which is independent of the queue state. As a result, once a customer joins the queue, it is served immediately by one of the parallel servers. Its service time is also equal to the corresponding content item’s caching time. Furthermore, there are parallel servers in total given the buffer size . If all servers are busy when a content file arrives, it has to be dropped.

The model, and especially its special case model, have been extensively studied in the call admission control (CAC) of classic circuit-switched networks. In the model, the interarrival times are independent and obey the exponential distribution. Its exact blocking probability is given by the Erlang B formula, i.e., , which is defined as


where is the mean service time, or equivalently, the mean caching time.
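The Erlang B formula is numerically stable when evaluated with the standard recursion B(0, a) = 1 and B(m, a) = a B(m-1, a) / (m + a B(m-1, a)), where a is the offered load (here, the arrival rate times the mean caching time) and m the number of servers (here, the buffer size). A minimal sketch:

```python
def erlang_b(servers, offered_load):
    """Blocking probability via the Erlang B recursion;
    offered_load = arrival rate * mean caching time (in Erlangs)."""
    b = 1.0
    for m in range(1, servers + 1):
        b = offered_load * b / (m + offered_load * b)
    return b
```

The recursion avoids the factorials and large powers of the direct formula, so it remains accurate even for large buffer sizes.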

Next, we turn our attention to the more general model. A diffusion approximation provides an accurate estimate of the blocking probability in the heavy-traffic regime with large values of and . From [27], we have , which is defined as


where and denote the standard normal probability density and cumulative distribution functions, respectively, and denotes the asymptotic peakedness of the arrival process with respect to the p.d.f. of the caching time, . More particularly, the asymptotic peakedness is given by


where , as shown by Borovkov in [28]. For a renewal process , we have . In contrast to the model, the blocking probability of the model is determined by in Eq. (17), rather than by the mean caching time only. Furthermore, it is worth noting that Eqs. (19) and (20) are consistent with each other for Poisson arrivals, because the model is a special case of the model. More specifically, we have and in the model. As a result, Eq. (20) reduces to , which is approximately equal to according to the Hayward approximation [27].

Having obtained the blocking probability , we next present a useful bound and approximation to be adopted in the next subsection.

Lemma 6

For large values of and , the blocking probability is upper bounded by


where .


Our proof starts with the observation from Eq. (20) that . According to the Hayward approximation [27], we have . Recalling that , we have in Eq. (21). Therefore, is upper bounded by . Since is an increasing function of , Eq. (22) holds.

Lemma 6 implies that the blocking probability of a queue with arrival rate is upper bounded by that of an queue with arrival rate . More importantly, the upper bound in Eq. (22) relies only on the mean caching time , rather than on the p.d.f. of the caching time . For large and , Eq. (22) also provides an accurate approximation of the blocking probability, owing to the scaling property of the Erlang B formula. Alternatively, when , there exists a simpler approximation of , as shown in the following lemma.

Lemma 7

For , the blocking probability is approximated by


See Eq. (26) in subsection 6.3 of [27].

V-B Effective Throughput of Caching with Finite Buffer

In this subsection, we shall maximize the effective throughput of caching with finite buffer. Recalling that , we present the effective throughput maximization problem for the general system as