RICH Prefetching in Edge Caches for In-Order Delivery to Connected Cars

08/20/2019
by   Ahsan Mahmood, et al.
Politecnico di Torino

Content caching on the edge of 5G networks is an emerging and critical feature to quench the thirst for content of future connected cars. However, the tight packing of 5G cells, the finite storage capacity at the edge, and the need for content availability while driving motivate the need to develop smart edge caching strategies adapted to the mobility characteristics of connected cars. In this paper, we propose a scheme called RICH (RoadsIde CacHe), which optimally caches content at edge nodes where connected vehicles require it most. In particular, our scheme is designed to ensure in-order delivery of content chunks to end users. Unlike blind popularity decisions, the probabilistic caching used by RICH accounts for the user mobility information that the system can realistically acquire. Furthermore, we provide a complete system architecture and define the protocols through which the different system entities can interact. We assess the performance of our approach against state-of-the-art solutions, under realistic mobility datasets and system scenarios. Our RICH edge caching scheme significantly improves the content availability at the caches and reduces the required backhaul bandwidth, with beneficial effects for both the end users and the network operators.


I Introduction

Connected cars are considered by drivers as a projection of their homes on the road, and the same seamless connectivity and content-based services are expected on the road as well. Communication of connected cars with the network infrastructure can support a wide variety of applications, ranging from critical, safety-related applications to the provision of entertainment services (e.g., video distribution). An example of the former category is “See-Through” [28], an advanced driving assistance automotive use case, where vehicles receive a video stream of the road conditions from the vehicles in front of them, to clear the obstructed view and perform maneuvers such as overtaking and lane changing. Given the latency-sensitive nature of these applications, a deployment of servers and content at the network edge, hence close to vehicular users, is highly desirable [19]. This requirement, along with the expected increase in the number of connected cars, calls for a significant redesign of the network architecture in order to support high-performance connectivity and reduce the core network congestion due to cloud-based content and applications.

Mobile Edge Computing (MEC), one of the key technologies for 5G networks, provides computing and storage platforms at the edge of the mobile network [13, 20]. MEC can therefore be used to push content to edge nodes, in close proximity to connected cars. A major limitation of this approach, however, is that edge nodes do not have the same storage flexibility as the cloud, and efficient strategies have to be developed to store the right content at an Edge Node (EN) (e.g., cellular base station, AP, roadside unit). Also, the thirst for wireless capacity has led to a reduced coverage size of ENs, which requires content to be replicated in multiple ENs to meet the user demand.

It is worth stressing that, although caching policies have been widely investigated (e.g., [16, 2, 27]), connected cars add significant challenges to the problem of edge caching. They are, indeed, highly mobile, thus storage strategies should be optimized not with respect to the current content popularity, but rather with respect to the expected content popularity among the users that are about to enter the coverage of the ENs. Furthermore, the dynamics of connected cars, combined with the limited coverage size of ENs, require content to be stored where cars have a chance to actually download it, i.e., in slow-speed areas such as congested intersections.

Most of the existing mobility-based caching policies, e.g., [32, 8], require full knowledge of the trajectory of each car, raising concerns about drivers’ privacy. Also, they are not tailored to in-sequence delivery of content [11, 4, 9], typically required by future on-board streaming applications. A major design challenge for caching policies is therefore to rely only on coarse mobility information (sequences of waypoints, dwell time, etc.), while supporting in-sequence content delivery.

In this paper, we propose RICH (RoadsIde CacHe), a prefetching policy for edge caching specifically adapted to highly dynamic environments with a coarse knowledge of car trajectories. We consider an urban environment where cars can connect to ENs in order to download content, as shown in Fig. 1. Our approach is based on the knowledge of the sequence of travel waypoints and of some aggregate statistics about the distribution of the dwell time under the coverage of each EN. Note that this is in accordance with the current trend in 5G systems [1], which foresees a dynamic caching system to prefetch content in the ENs, based on future demand estimation obtained from users’ context information, such as direction and speed.

Furthermore, unlike classical works on caching, we consider a data streaming application in which the content is divided into fixed-size chunks, which strongly correlates the download process at each EN due to the in-sequence chunk delivery. Importantly, we also account for the correlation that mobility introduces in the request process among different ENs. Indeed, the instantaneous popularity of a chunk at an EN depends on the actual temporal and spatial trajectory of all cars interested in the corresponding content. The goal of our strategy is thus to cache in advance the chunks in the sequence of ENs traversed by the car, by choosing the chunks that will most likely be downloaded at each specific EN. This increases the cache hit probability, i.e., the probability that the content chunks are downloaded directly from the cache, which in turn greatly reduces the backhaul traffic and content access delay.

Fig. 1: Scenario: a car traversing multiple Edge Nodes (ENs).

In more detail, our novel contributions are as follows:

  • Architecture and protocol: we introduce a split content caching architecture, with a centralized module located in the backhaul, which manages the content stored in the caches located at the ENs. This centralized approach is amenable to being realized through the Software Defined Networking (SDN) paradigm. We also define the system protocol governing the interaction between various entities in the vehicular network.

  • Analytical formulation: we describe an analytical model capable of predicting the probability of downloading a chunk of a content from a given EN.

  • Caching scheme: we leverage the previous model to develop a mobility-aware prefetching strategy that selects what (i.e., which chunk) and where (i.e., in which ENs) to cache, based on the distribution of the dwell times.

  • Implementation: we evaluate our solution under a realistic urban traffic dataset of the city of Bologna. Our simulation model is comprehensive and it closely mimics a real scenario. We compare our proposed RICH scheme to two state-of-the-art prefetching policies: pure popularity-based caching (POP) and a mobility-aware caching strategy called netPredict [32].

II Network Architecture

Fig. 2: Network caching architecture

We consider an urban scenario in which vehicular users traverse coverage areas of several ENs, where each EN is responsible for the communication in a specific geographical area, as depicted in Fig. 1. Due to the dynamic nature of the vehicular network, along with a large number of users and a volatile wireless channel, users downloading data may experience severe service disruptions. Our goal is to design a caching architecture that provides users with a timely delivery of the content, while also reducing the backhaul bandwidth consumption.

We focus on a challenging application such as content streaming, for which in-order delivery should be ensured. We let ENs cache parts of the content requested by the vehicular users, and we envision the presence of a Prefetcher module that instructs, in advance, each EN on which part of the content to store. In this way, the content portion that each user is expected to download from an EN will be readily available for delivery at the EN when the user actually reaches it.

In the proposed architecture, shown in Fig. 2, the Prefetcher is located in the network backhaul, along with a Data Store module; both are connected with all the ENs. The Data Store provides a catalog of all the contents, each of which is assumed to be composed of a sequence of chunks, each univocally identified. The Prefetcher can acquire (e.g., through the user’s navigation system) or predict the sequence of ENs that the user will traverse in the near future. Also, thanks to past measurements, the distribution of the cars’ dwell time under each EN is known to the Prefetcher, as well as the available storage in the caches of the ENs and the users’ content requests. Based on such information, the Prefetcher decides which chunks of which content should be cached in advance at which EN, and instructs the ENs accordingly (a rough sketch of such an instruction is given below). Upon receiving the instructions from the Prefetcher, each EN retrieves the needed chunks from the Data Store before the arrival of the user in its coverage area. This allows leveraging non-real-time data transfer and thus relaxes the QoS requirements for the content transfer into the caches.
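As a rough illustration of the kind of instruction exchanged between the Prefetcher and an EN's Caching Agent, the sketch below models it as a simple record; field names are hypothetical and not taken from the paper.

from dataclasses import dataclass
from typing import FrozenSet

@dataclass(frozen=True)
class PrefetchInstruction:
    en_id: str                   # target Edge Node (e.g., "B")
    content_id: str              # content the user is streaming
    chunk_ids: FrozenSet[int]    # chunks to place in the Prefetch Cache
    car_id: str                  # car expected to consume the chunks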

Each EN is responsible for providing streaming service to the user, and, in order to provide timely content delivery, it utilizes a local caching storage, divided in two parts:

  • the Prefetch Cache, which contains the chunks prefetched from the Data Store, as per Prefetcher’s instructions;

  • the Standard Cache, which stores chunks that were not available in the Prefetch Cache at the time the user requested them, and they have been retrieved from the Data Store.

Notably, the latter cache mimics the behavior of a standard caching system, where a new chunk is stored just after a cache miss is experienced. On the contrary, the Prefetch Cache behaves similarly to a content delivery network (CDN) node, where the content is proactively stored into the node.

Moreover, the EN carries out its tasks via two agents:

  • the Streaming Agent reads the chunks from the cache and streams them to the user; one separate stream is managed for each user under coverage;

  • the Caching Agent interacts with the Prefetcher, fetches chunks from the Data Store, inserts the prefetched/missed chunks into the Prefetch/Standard Cache, and manages both the Prefetch and the Standard Cache according to the adopted eviction policy.

II-A System protocol

The Streaming Agent contacts the Caching Agent whenever a new content request arrives, as well as whenever a requested chunk is not available in the cache. The Caching Agent, in turn, informs the Streaming Agent as the needed chunk becomes available in the cache. The interactions between the aforementioned entities are further clarified by the space-time diagrams in Figs. 3 and 4, which depict the protocol followed, respectively, when a user starts requesting a content and while the car connects with the subsequent ENs to complete the content download, if necessary. In both cases, the proposed protocol is composed of two phases:

  • prefetching phase: an EN prefetches the chunks from the Data Store, as per Prefetcher’s instructions, and the EN serves the user on the basis of the chunks stored in the Prefetch Cache;

  • data recovery phase: the EN fetches the missing chunks from the Data Store and sends them in sequence using both the Prefetch and the Standard Cache, if needed.

In both figures, chunks downloaded from the Prefetch and from the Standard Cache are depicted in green and red, respectively. All control messages are depicted in blue.

We assume, for simplicity, that a generic content is divided into equal-sized chunks, each univocally identified, and we refer to the set of all the corresponding chunk identifiers. In the following, we consider a generic chunk identifier of a content, the corresponding actual data, and a generic user and/or its car (the notation is summarized in Tab. I).

In more detail, Fig. 3 refers to a scenario in which a car enters the coverage area of the first EN along its path and the user requests the first chunk of a content. At the EN, the request is forwarded to the Prefetcher (the actual implementation of this step is slightly more complex, since the request is sent to the Streaming Agent, which registers a new stream for the user and, in turn, forwards the chunk request to the Prefetcher through the Caching Agent; in the following we omit this level of detail in the interaction among the modules internal to the EN). Since the Prefetcher knows the sequence of ENs the car will traverse, it can run the prefetching policy (described in Sec. III) and instruct the first EN on the set of chunks of the content to prefetch. The EN, through the Caching Agent, checks which of the required chunks are already available in the Prefetch Cache and sends a message to the Data Store asking for the missing ones. The Data Store sequentially sends the missing chunks to the EN. The Caching Agent inserts the received chunks in the Prefetch Cache and informs the Streaming Agent about the data availability. Then the Streaming Agent initiates the data stream towards the user, sending the chunks in sequence.

If a chunk is not available in the Prefetch Cache, the system enters the data recovery phase, as the data must be retrieved directly from the Data Store. This situation may happen in two cases: (i) some prefetched chunks were evicted from the cache to make space for other chunks; (ii) all the prefetched chunks have been transmitted and the car is still under coverage. In both cases, the Streaming Agent first attempts to serve the car by getting the chunks from the Standard Cache. Otherwise, the missing chunks are fetched from the Data Store for transmission to the user. The amount of chunks requested in advance is chosen large enough to compensate for the round-trip time from the Prefetch Cache to the Data Store and to guarantee a continuous streaming from the Data Store to the user, using the EN as a relay until the car leaves the coverage area.

Fig. 4 shows only the prefetching phase occurring at the second EN and at the subsequent ENs traversed by the car. The data recovery phase, if any, is omitted as it is identical to the one in Fig. 3. When the car enters the coverage of the first EN, the Prefetcher instructs all the subsequent ENs to prefetch the required chunks before the arrival of the car in their coverage areas. This ensures that the user experiences no delay due to content prefetching in the subsequent ENs (note that the delay due to non-overlapping coverage areas, if any, can be avoided by assuming a sufficiently large playout buffer at the user; the buffer size would depend on the type of network deployment). As shown in Fig. 4, the EN receives a message listing the chunks to prefetch. For the subset of those chunks that are not yet available in the cache, the EN queries the Data Store and stores the corresponding chunks in the Prefetch Cache.

When the car enters the EN coverage, it sends a chunk request for the first missing chunk. The EN forwards this message to the Prefetcher in order to notify it about the car entering the coverage area. The EN is then responsible for sending to the car all the required chunks that are available in the local cache. If needed, the Prefetcher triggers a data recovery phase to compensate for chunks missing from the cache.

A generic user and/or its car
Content identifier
Generic chunk identifier
The set of all chunk identifiers of a content
Data corresponding to a chunk of a content
A generic subset of the chunk identifiers of a content, representing the chunks instructed-to-be/actually prefetched
TABLE I: Notation
Fig. 3: Protocol followed by the first EN traversed by a car.
Fig. 4: Protocol followed by the ENs after the first EN traversed by the car.

III RICH prefetching policy

Given the caching architecture discussed above, our goal is to maximize the Prefetch Cache hit probability across all cars. Indeed, the user throughput increases whenever the desired chunks are downloaded locally from the EN rather than from the backhaul, which also leads to a better utilization of the backhaul network resources. We therefore propose an efficient strategy, run at the Prefetcher, which selects the chunks to store in the Prefetch Cache (hereinafter simply called cache) of the ENs traversed by a car. We recall that the car path is expressed as the sequence of traversed ENs, and that the distribution of the dwell time under the ENs is obtained through historical data.

Note that the above caching problem could be formalized as an ILP problem by discretizing the time into steps and associating a binary decision variable with each step, each EN, and each chunk of every content, expressing whether that chunk of that particular content is stored in that particular EN at that time step. Then, assuming that the sequence of ENs traversed by the cars, the time at which they enter each EN, and the cars’ content demand are known, we could formally express the hit probability to be maximized over all cars and all chunks. However, such a formulation would need global knowledge of the system dynamics over an extended period of time; also, given that the cars’ dwell time is random, its solution would require stochastic optimization methods. In light of the limited knowledge that is available in practice and of the problem complexity, below we design an approximate method which, as shown in Sec. V, proves to be very efficient.

We start by motivating our approach through a toy-case example, highlighting the benefits of knowing the distribution of the cars’ dwell time under an EN. Then we theoretically evaluate the probability that a specific chunk is downloaded from an EN by a tagged car, which will be used to define our prefetching policy. Indeed, our approach consists in letting each EN store those chunks whose probability of being downloaded by a car is above a given threshold.

III-A A toy-case example

Consider a single EN covering the area of an intersection controlled by a traffic light. The dwell time of each car, and thus the number of downloaded chunks, depends on the traffic light state. Let us now focus on the subset of cars that request a specific content starting from its first chunk. Assume that 80% of these cars experience a green light, thus the corresponding dwell time under the EN is small and the users (labeled as “fast”) can download up to 10 chunks. The remaining 20% of these cars experience a red light, thus the corresponding dwell time under the EN is large and the users (labeled as “slow”) can download up to 100 chunks. Hence, the average number of chunks that can be downloaded by a car is 28.

Assume now that the system adopts a prefetching policy that makes its decisions based on the average number of chunks downloaded by the users, as, e.g., the state-of-the-art netPredict policy discussed later in Sec. IV-C. Such a policy will store just the first 28 chunks of the content in the cache. This can be seen as an inefficient decision for both kinds of users. Indeed, for the fast users, able to download just the first 10 chunks, the additional 18 chunks stored in the cache are completely useless and a waste in terms of cache occupancy. For the slow users, able to download up to 100 chunks, the cache is heavily underutilized since only the first 28 chunks are available.

In our work we propose instead to exploit the distribution of the dwell time under each EN to increase the performance of the overall caching system. Indeed, in the above example, two more clever solutions can be easily envisaged. The first is to store 10 chunks only: the slow users still experience cache misses as in the netPredict case, however we now avoid the waste of cache occupancy in the case of fast users and save room for other content. The second solution, instead, stores the first 100 chunks: in this way, all the users will find the requested chunks in the cache, at the cost of a higher cache occupancy.
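A quick numerical check of this example (shares and chunk counts taken from the text above; the served/wasted bookkeeping is only illustrative):

shares = {"fast": 0.8, "slow": 0.2}        # fraction of cars per class
reachable = {"fast": 10, "slow": 100}      # chunks each class can download

avg_chunks = sum(shares[c] * reachable[c] for c in shares)
print(avg_chunks)                          # 28.0, as stated in the text

cached = int(avg_chunks)                   # average-based policy caches the first 28 chunks
for c in shares:
    served = min(cached, reachable[c])     # chunks actually served from the cache
    wasted = max(0, cached - reachable[c]) # cached chunks the car cannot reach
    print(c, served, wasted)               # fast: 10 served, 18 wasted; slow: 28 served, 0 wasted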

Note that the possible inefficiencies arising in a single EN (i.e., cache) scenario are exacerbated in the presence of a sequence of ENs, due to the required in-sequence delivery of the chunks. Thus, we advocate the use of prefetching policies that are aware of the dwell-time distribution, and not only of the average dwell time under an EN, as, e.g., netPredict does.

III-B Chunk download probability for large caches

We now derive the probability that a user successfully downloads a chunk from the cache. Let us focus on one car and on one specific content that the user wishes to download from the traversed ENs. Let the car path be composed of $N$ ENs, and let EN $n$, with $1 \le n \le N$, be the $n$-th EN along the path.

We start by defining the chunk delivery process under a best-case scenario in which all ENs cache the whole content. This implicit assumption of very large cache is aimed at devising a simple model that will be tailored to small caches later in this section.

Let $X_n$ be the random variable representing the last chunk received from EN $n$, and let $X_0 = 0$ by definition. Then the set of chunks downloaded from EN $n$ is given by $\{X_{n-1}+1, \ldots, X_n\}$. Thus the probability that chunk $k$ is downloaded from EN $n$ is, for any $k$,

$p_n(k) = \Pr\left[ X_{n-1} < k \le X_n \right]$.    (1)

Note that if $N$ is large enough, the content will be downloaded for sure from some EN, thus $\sum_{n=1}^{N} p_n(k) = 1$.

Let $D_n$ be the random variable representing the total number of chunks downloaded from EN $n$ by the user. The probability density function (pdf) of $D_n$ depends mainly on two factors: (1) the mobility of the car, since, e.g., longer dwell times under the EN coverage typically imply larger amounts of downloaded data, and (2) the actual throughput obtained by the user when connected to the EN, which in turn depends on the wireless data rate and on the channel contention among other users. In Sec. IV-A, we will describe how to compute the pdf of $D_n$ in the urban scenario under study.

Based on our definitions, it is easy to see that, for $n \ge 1$,

$X_n = \sum_{i=1}^{n} D_i$.    (2)

The theorem below relates the download probability $p_n(k)$ to the distribution of the number of downloaded chunks $D_i$.

Theorem 1

Given a car traversing a sequence of $N$ ENs, the probability to download a specific chunk $k$ from EN $n$, with $1 \le n \le N$, can be expressed as

$p_n(k) = \Pr\left[\sum_{i=1}^{n} D_i \ge k\right] - \Pr\left[\sum_{i=1}^{n-1} D_i \ge k\right]$.    (3)

Proof:

Given (2), we can write (1), for any $k$, as $p_n(k) = \Pr[X_{n-1} < k \le X_n] = \Pr[X_n \ge k] - \Pr[X_{n-1} \ge k]$, since $X_{n-1} \le X_n$; substituting (2) yields (3).

Note that, as expected, for $n=1$ the expression in (3) becomes $p_1(k) = \Pr[D_1 \ge k]$, since the user will download chunk $k$ from the first EN only if the total number of chunks downloaded from EN 1 is at least $k$. Now, thanks to the well-known property of the expectation of non-negative integer random variables, we can claim:

Property 1

For any EN $n$, $\sum_{k \ge 1} p_n(k) = \mathbb{E}[D_n]$, i.e., summing the download probabilities over all chunks yields the average number of chunks downloaded from EN $n$.

The following corollary holds in the special case when the car dwell times under the ENs are i.i.d. random variables. Note that this case is addressed here for completeness, as well as to provide a more explicit expression of the $p_n(k)$'s; however, our approach does not require such an assumption.

Corollary 1

Let the random variables $D_i$ be i.i.d. and defined on a positive support, and let $f$ be their discrete pdf. Then, for any $k$, we have:

$p_n(k) = \sum_{j \ge k} f^{*n}(j) - \sum_{j \ge k} f^{*(n-1)}(j)$,    (4)

where $*$ is the convolution operator and $f^{*n}$ denotes the $n$-fold convolution of $f$ with itself.

The proof is reported in Appendix A. The complexity of computing the convolutions for all the ENs grows linearly with the number of ENs along the path; notably, such computation can be done offline, based on past statistics.

We now consider the case of an EN cache of limited size. Let $B_n$ be the available space, in terms of chunks, at EN $n$. Then we can introduce a discrete random variable, $\tilde{D}_n$, which represents the total number of cached chunks downloaded from EN $n$ by the user, given the available room in the cache of EN $n$. The pdf of $\tilde{D}_n$ is given by:

$\Pr[\tilde{D}_n = d] = \Pr[D_n = d]$ for $d < B_n$, and $\Pr[\tilde{D}_n = B_n] = \Pr[D_n \ge B_n]$.    (5)

Indeed, it is not possible to download more than $B_n$ cached chunks, and the events corresponding to a number of downloaded chunks larger than $B_n$ in the original model with large cache size now correspond to downloading all the available chunks.

Given the above distribution, the probabilities $p_n(k)$ can be computed as in (3), or, when the $\tilde{D}_i$'s are i.i.d., as in (7).

To further clarify the behavior of the download probability at each EN, we provide an example below.

  • Example 1: Consider a toy scenario in which a car traverses four ENs and large-sized caches are available. The $D_n$'s are i.i.d. with a symmetric triangular distribution and a mean value equal to 10 chunks, i.e., the average number of chunks downloaded at each EN is 10. Fig. 5 shows the download probabilities $p_n(k)$ at each EN $n$ for each chunk $k$, computed by applying Corollary 1. From Fig. 5, we observe that $p_1(k)$ at the first EN decreases as $k$ increases, since the randomness in the mobility reduces the probability of downloading further chunks. Due to the limited support of the distribution of $D_1$, $p_1(k)$ becomes zero for sufficiently large $k$. At the second EN, $p_2(k)$ is bell-shaped, since values of $k$ close to zero correspond to the case in which $D_1$ takes very small values (which is unlikely), i.e., the car speed is very high under the coverage of the first EN and the user does not have enough time to download any chunk. The maximum is obtained around chunk 15, which is reasonable since, in the case of deterministic mobility with $D_n = 10$ chunks for any $n$, the chunks downloaded from the second EN would be exactly those in an interval of 10 chunks centered around 15. The chunk download probability from the following ENs ($n \ge 3$) still exhibits a bell-shaped behavior, but with a larger support. This is due to the higher uncertainty on downloading a specific chunk from a given EN, which, in turn, is due to the increased randomness in the number of previously delivered chunks.
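The following Python sketch (not from the paper; the distribution parameters and helper names are illustrative assumptions) computes the download probabilities of Corollary 1 by repeatedly convolving the per-EN chunk-count pdf, and optionally applies the finite-cache truncation of (5). With a symmetric triangular pdf of mean 10 chunks and four ENs, it qualitatively reproduces the behavior described in Example 1.

import numpy as np

def triangular_pdf(mean=10):
    # Symmetric triangular pdf on {0, ..., 2*mean} (illustrative choice).
    support = np.arange(0, 2 * mean + 1)
    weights = (mean + 1.0) - np.abs(support - mean)
    return weights / weights.sum()

def tail_probs(pdf, num_chunks):
    # Pr[S >= k] for k = 1, ..., num_chunks, given the pdf of S (indexed by value).
    cdf = np.cumsum(pdf)
    tails = np.zeros(num_chunks)
    upto = min(num_chunks, len(cdf))
    tails[:upto] = 1.0 - cdf[:upto]          # Pr[S >= k] = 1 - Pr[S <= k-1]
    return tails

def download_probabilities(pdf_D, num_ens, num_chunks, cache_size=None):
    # Returns p[n-1][k-1] = probability that chunk k is downloaded from EN n.
    if cache_size is not None and cache_size + 1 < len(pdf_D):
        tail_mass = pdf_D[cache_size + 1:].sum()   # eq. (5): excess mass collapses onto B_n
        pdf_D = pdf_D[:cache_size + 1].copy()
        pdf_D[cache_size] += tail_mass
    p = np.zeros((num_ens, num_chunks))
    sum_pdf = np.array([1.0])                      # pdf of D_1 + ... + D_{n-1}, starts as a point mass at 0
    for n in range(num_ens):
        prev_tail = tail_probs(sum_pdf, num_chunks)  # Pr[sum of first n-1 terms >= k]
        sum_pdf = np.convolve(sum_pdf, pdf_D)        # add one more EN
        curr_tail = tail_probs(sum_pdf, num_chunks)  # Pr[sum of first n terms >= k]
        p[n] = curr_tail - prev_tail                 # eq. (3)
    return p

p = download_probabilities(triangular_pdf(), num_ens=4, num_chunks=60)
print(np.round(p[:, :20], 3))    # p_1(k) decreasing in k; later ENs bell-shaped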

III-C The RICH prefetching algorithm

Based on the previous definition of the chunk download probability, we define our prefetching policy. The goal of our scheme is to ensure that the probability with which a user can download a chunk from any of the ENs is greater than a given threshold $P_{th}$, with $0 \le P_{th} \le 1$. Our scheme thus identifies the smallest set of ENs that should cache each chunk so that its download probability exceeds $P_{th}$. This set depends on the combined probabilities of downloading the chunk from each EN. Thus, setting $P_{th}$ allows controlling the maximum amount of chunks to be downloaded from the backhaul.

The pseudocode of RICH is reported in Alg. 1, which, for each chunk $k$, returns the set of ENs that must store $k$ according to the RICH prefetch policy. After the initialization, for each chunk (lines 3-12), RICH considers the set of corresponding download probabilities (line 5). To avoid degenerate solutions, RICH considers only the ENs from which it is possible to download chunk $k$, i.e., those with $p_n(k) > 0$. The algorithm then iterates (lines 6-11) over the download probabilities in decreasing order (line 7). Chunk $k$ is first stored at the EN corresponding to the highest probability. If this probability is already greater than $P_{th}$, no other EN needs to cache $k$; otherwise, the EN corresponding to the second largest probability stores the chunk too. Eventually, chunk $k$ is cached at as many ENs, associated with the top probabilities, as necessary for their sum to exceed $P_{th}$. Notably, the probabilities can be summed since they refer to disjoint events.

1: Input: probability threshold $P_{th}$
2: Input: download probability $p_n(k)$ for any chunk $k$ and any EN $n$
3: for each chunk $k$ do ▷ For any chunk
4:     $P_{tot}(k) \leftarrow 0$, $\mathcal{E}(k) \leftarrow \emptyset$ ▷ Init
5:     $L \leftarrow \{p_n(k): p_n(k) > 0\}$ sorted in descending order ▷ Store download prob. in descending order for chunk $k$
6:     while $P_{tot}(k) < P_{th}$ and $L \neq \emptyset$ do ▷ Check threshold
7:         $p^{\star} \leftarrow$ remove the highest probability from $L$
8:         $n^{\star} \leftarrow$ EN index corresponding to $p^{\star}$
9:         $P_{tot}(k) \leftarrow P_{tot}(k) + p^{\star}$ ▷ Accumulate the probabilities
10:        $\mathcal{E}(k) \leftarrow \mathcal{E}(k) \cup \{n^{\star}\}$ ▷ Update the list of ENs
11:     end while
12: end for
13: return $\mathcal{E}$ ▷ ENs where to prefetch each chunk
14: return $P_{tot}$ ▷ Final download probability
Algorithm 1 RICH prefetch with a single threshold

At the end of the procedure, the accumulated probability represents the estimated probability of downloading chunk $k$ from any of the ENs selected for it, given the corresponding number of stored copies. Whenever this probability exceeds $P_{th}$, the prefetching procedure has run successfully. Otherwise, it fails, since it is not possible to find any set of ENs where to prefetch the chunk so as to satisfy the threshold; in such a case, we do not store any copy of the chunk. Note also that the last ENs of the path experience a significant border effect, since there are no subsequent ENs where to prefetch the chunks. In Alg. 1, the computational complexity of each iteration of the outer loop is dominated by the sorting in line 5 and by the fact that each chunk can be stored in at most $N$ ENs, i.e., it is $O(N \log N)$; thus, the overall complexity grows linearly with the number of chunks and as $N \log N$ with the number of ENs.
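The following Python sketch (illustrative, not the authors' implementation) mirrors Alg. 1: for each chunk, ENs are added in decreasing order of download probability until the accumulated probability reaches the threshold, and no copy is stored if the threshold cannot be met. It reuses the matrix p computed in the previous sketch.

def rich_prefetch(p, threshold):
    # placement[k] = list of EN indices that should cache chunk k+1;
    # final_prob[k] = accumulated download probability for that chunk.
    num_ens, num_chunks = p.shape
    placement, final_prob = [], []
    for k in range(num_chunks):
        # Only ENs from which chunk k can actually be downloaded (p > 0).
        candidates = sorted(
            (n for n in range(num_ens) if p[n, k] > 0),
            key=lambda n: p[n, k],
            reverse=True,
        )
        chosen, acc = [], 0.0
        for n in candidates:
            if acc >= threshold:
                break
            chosen.append(n)
            acc += p[n, k]           # disjoint events, so probabilities sum up
        if acc < threshold:          # threshold unreachable: store no copy at all
            chosen, acc = [], 0.0
        placement.append(chosen)
        final_prob.append(acc)
    return placement, final_prob

placement, final_prob = rich_prefetch(p, threshold=0.9)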

To understand the effect of setting $P_{th}$, we consider some extreme cases. If we set $P_{th}$ close to 0, a single copy of each chunk is stored in the most probable EN. If instead we set $P_{th} = 1$ and the threshold is reached, the chunk is stored in every EN from which the download is possible. Thus the value of $P_{th}$ can be numerically optimized to maximize a given performance metric, i.e., the overall cache hit probability.

Fig. 5: Download probability $p_n(k)$ for each EN and overall download probability, for the RICH caching policy with a single threshold, in the toy-example scenario.
Fig. 6: Number of copies for the single-threshold and multiple-thresholds RICH prefetching scheme.
Fig. 7: Example of number of copies for two subsequent ENs.
Fig. 8: Download probability $p_n(k)$ for each EN and overall download probability, for the RICH caching policy with multiple thresholds: a higher threshold for the first EN and a lower, common threshold for the subsequent ENs.

In the example of Fig. 5, we show the different $p_n(k)$ corresponding to a simple triangular distribution for $D_n$, together with the final overall download probability obtained with RICH for a given single threshold. The non-monotonic behavior of the overall probability is due to the different number of copies that are stored at the ENs for different chunks, as shown by the curve labeled “single-th” in Fig. 6. The behavior of the number of copies is due to two different effects. Intuitively, as $k$ increases, the uncertainty about the possibility to download chunk $k$ increases, thus the RICH caching algorithm compensates for it by creating a higher number of copies. Indeed, in Fig. 6 the number of copies grows from 1 to 3. At the same time, the uncertainty is usually larger for the chunks which are “between” two ENs, and thus the number of copies can be non-monotonic, as shown around chunk 26 in Fig. 6. To understand this second effect, consider the example shown in Fig. 7, depicting $p_n(k)$ for two subsequent ENs. The most probable chunks at each EN (whose download probability is greater than the threshold) will be stored as just one copy, whereas the other chunks (between the most probable chunks) will be stored as two copies, due to the uncertainty in the precise moment when the car will enter the second EN.

III-C1 Coping with uncertainty

We now propose two further enhancements to our scheme, in order to efficiently cope with chunk requests that are farther in the future.

Multi-threshold RICH policy. Our intuition is that a fixed threshold, as the one used in Alg. 1, may be unfit for chunks that are expected to be downloaded at farther ENs, for which the level of uncertainty is larger. Indeed, for such chunks the single-threshold RICH policy will naturally require a very large number of copies, thus wasting precious space in the cache that could be better used for other users. We can therefore consider a dedicated threshold $P_{th}^{(n)}$ for each EN $n$, leading to a decreasing sequence of thresholds $P_{th}^{(1)} \ge P_{th}^{(2)} \ge \ldots \ge P_{th}^{(N)}$. The actual values of the thresholds can be optimized numerically to maximize the cache hit probability. We define the range of chunks for which threshold $P_{th}^{(n)}$ is adopted as the set of all the chunks $k$ such that $p_n(k) \ge p_m(k)$ for any other EN $m$. The reason for this choice is that the chunks with the largest download probability at one EN must determine the actual threshold value to use at that EN (see the sketch below). Importantly, the complexity of optimizing the thresholds grows with $N$, i.e., the number of ENs in the path, and not with the number of chunks in a content (which may be arbitrarily large, depending on the specific kind of content).
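A minimal sketch of this variant, under the assumption stated above that each chunk adopts the threshold of the EN where its download probability is largest; it reuses rich_prefetch() from the previous sketch.

import numpy as np

def rich_prefetch_multi(p, thresholds):
    # thresholds[n] is the (decreasing) threshold associated with EN n.
    num_ens, num_chunks = p.shape
    placement, final_prob = [], []
    for k in range(num_chunks):
        n_star = int(np.argmax(p[:, k]))            # EN most likely to serve chunk k
        pl, pr = rich_prefetch(p[:, [k]], thresholds[n_star])
        placement.append(pl[0])
        final_prob.append(pr[0])
    return placement, final_prob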

Fig. 8 shows the same scenario as in Fig. 5, but using different threshold values for the four ENs: only the threshold of the first EN is kept high, while all the other thresholds are set to a lower value. The final overall download probability is lower than in the single-threshold case for the later chunks, but this is due to the smaller number of stored copies (usually 1 for most of the chunks), as shown in Fig. 6. This reflects the intuition that, for the chunks whose uncertainty is large, it is better not to “waste” cache storage with multiple copies.

Refreshing the policy. Due to the high level of uncertainty for “far-in-the-future” chunks, RICH compensates with a high number of copies, wasting a large amount of cache storage while achieving only a marginal performance improvement. Thus, we design RICH to prefetch contents just on the initial sequence of ENs, and to refresh the prefetching decision before the car enters the first EN not considered in the previous decision. This approach permits a better trade-off between the effectiveness of the prefetching decision and the time available to prefetch the content.

III-C2 Eviction policy

Consider now a generic scenario with multiple users served by one EN. Since the EN cache is finite, we need an eviction policy determining which chunks should be removed when the cache is full and which new chunks must be inserted. We adopt the following policy: the chunks removed with higher priority are the ones that have been already delivered, i.e., those destined to users not anymore under coverage. Among these chunks, the ones with the lowest download probability are evicted first. In this way, we can efficiently exploit the cache under chunk demands that vary in both space and time.
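A small sketch of the eviction order just described (the data model and field names are hypothetical): chunks whose target cars have already left coverage are evicted first, and, among those, the ones with the lowest download probability go first.

def eviction_order(cached_chunks):
    # cached_chunks: iterable of dicts with keys "delivered" (True if the
    # destination car already left coverage) and "download_prob" (float).
    # The first elements of the returned list are evicted first.
    return sorted(
        cached_chunks,
        key=lambda c: (not c["delivered"], c["download_prob"]),
    )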

IV Simulation scenario and methodology

Here, we first introduce the scenario and the real-world vehicular traces used to investigate the performance of RICH. Then we describe the methodology adopted for the comparison of RICH against state-of-the-art solutions.

IV-A Reference scenario

We take as reference scenario a real-world 2 km × 2 km urban area of the Italian city of Bologna, illustrated in Fig. 9. The detailed vehicular mobility traces were obtained to reproduce the experimental data gathered through the traffic detectors available at the intersections [3]. The total trace duration is 79 minutes, comprising 11,079 vehicles (approximately 950 are simultaneously on the map) and representing 120 minutes of the morning rush hours, under quite stationary traffic conditions.

Fig. 9: Reference urban area in the city of Bologna. Circles represent EN locations and the corresponding coverage areas (distance is expressed in meters).

In the above urban section, we select the major traffic arteries and place eight ENs along them, as represented in Fig. 9, at the main intersections regulated by traffic lights. Let {A, B, C, D, E, F, G, H} be the set of ENs; all ENs have the same cache size. We assume an ideal radio range of 100 m at each EN, thus the coverage areas of subsequent ENs (denoted by circles in the figure) do not overlap. Note, however, that even if we consider here non-overlapping coverage areas, our scheme can be applied under any scenario.

To investigate the effect of the prefetching policy across multiple ENs, we consider only those cars in the trace that enter the coverage of any three ENs. Among the large number of the possible combinations of three ENs, we consider only the ones with a large number of users and call them significant paths. In total, we identify 7 significant paths with a minimum number of cars equal to 45 each (see Tab. IV for details), obtaining 2,053 cars in total following a significant path. Given the vehicular traces of all the cars traversing the significant paths, for each EN we derive the empirical distribution of the car dwell times, needed by the Prefetcher.

Finally, although RICH can be implemented on top of any Infrastructure-to-Vehicle technology (ITS-G5/DSRC, WiFi or LTE), we opted for the IEEE 802.11a WiFi standard, operating however on the 5.9 GHz frequency band. The reasons for this choice are twofold. First, since our application is not strictly safety-related, we should not use ITS-G5 to transmit on the ITS-G5 bands (at least 5.875-5.905 GHz); at the same time, we want to avoid transmitting on the WiFi bands at 5.4 GHz, so as not to interfere with non-vehicular communications and to increase the capacity available for content streaming delivery. Second, ETSI is preparing a deregulation of the ITS band to allow WiFi access, e.g., 802.11a, under a ‘detect-and-avoid’ principle with respect to ITS-G5 technology [10, 22]. Thus, according to C-ITS [7], we test RICH on the Service Channel 1 (SCH 1) at 5.875-5.885 GHz and, accordingly, ignore potential coexistence with ITS-G5.

IV-B Simulation methodology

We developed a discrete-event simulator, based on OMNeT++ [23], which models the network architecture in Fig. 2 with the 8 ENs deployed as in Fig. 9. Even though our model implements both the Prefetch and the Standard Cache, in the following we investigate only the performance due to the Prefetch Cache.

We assume an ideal communication link from the ENs to the Prefetcher, with a negligible propagation delay, corresponding to a Prefetcher physically located in the same urban area. The Data Store, instead, is connected to the ENs through a 100 Mbps communication link with a propagation delay of 2 ms. We model the wireless access network with the detailed 802.11a model provided in the INET Framework 3.3 [14]. The simulation parameters are listed in Table II.

Parameter Value Parameter Value
WiFi 802.11a WiFi active scan false
Frequency band 5 GHz Beacon interval 100 ms
Bandwidth 10 MHz Receiver sensitivity -85 dBm
Max bitrate 54 Mbps SNIR threshold 4 dB
Rate control AARF Background noise power -101 dBm
MTU 1 kbytes Pathloss type Free space
Tx power 33 dBm Pathloss coefficient 3
Antenna gain 1 dB Max communication range 110 m
Retry limit 7 Max interference range 200 m
TABLE II: OMNeT++ simulation parameters

Each significant path comprises 3 ENs: we run the prefetching policy and evaluate the performance at the first two, to avoid the border effect due to the last EN, described in Sec. III-C. Table III shows the statistics about the temporal average of the number of users under each EN (given that at least one car is present), and the total number of distinct cars under coverage. EN A is located at the most crowded intersection, thus we expect heavy congestion and lower per-user throughput. On the other hand, H is at the least crowded intersection, for which we expect the least number of users on average, thus high per-user throughput.

EN A B C D E F G H
Avg. cars 27.51 13.36 7.88 4.52 8.35 6.02 3.42 2.79
Total cars 1580 1510 527 382 337 1456 161 206
TABLE III: Number of cars under each EN in Bologna urban area
Path ABC ABF CDE FBA FBC HDC HGA
No. of cars 54 555 337 810 91 45 161
TABLE IV: Significant paths identified in Bologna urban area

To evaluate the download probability $p_n(k)$ for any chunk $k$ and any EN $n$, we must determine the number of chunks downloaded at each EN. Let $T_n$ be the random variable of the dwell time of a generic car under EN $n$. Let $W$ be the total bandwidth available when a user is under the coverage of an EN. Let $C_n$ be the average number of cars under the coverage of EN $n$. Let $S$ be the total size in bits of each chunk. Approximately, we can claim that:

$D_n \approx \dfrac{T_n \, W}{C_n \, S}$.    (6)

Replacing the empirical values of $T_n$ and $C_n$ (the latter given in Table III) in (6), we estimate the distribution of $D_n$, and then the empirical $p_n(k)$ for any EN $n$ and chunk $k$. Finally, we evaluate $\tilde{D}_n$ to take into account the finite cache, by applying (5).
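A rough sketch of this estimation step (the bandwidth-sharing model and the numbers below are illustrative assumptions, not the paper's dataset): dwell-time samples are converted into chunk counts via (6), and the resulting histogram gives the empirical pdf of $D_n$.

import numpy as np

def chunk_count_pdf(dwell_times_s, bandwidth_bps, avg_cars, chunk_size_bits):
    # Empirical pdf of the number of chunks downloadable at one EN, via eq. (6).
    per_user_rate = bandwidth_bps / avg_cars            # rough share of the channel
    chunks = np.floor(
        np.asarray(dwell_times_s) * per_user_rate / chunk_size_bits
    ).astype(int)
    pdf = np.bincount(chunks).astype(float)
    return pdf / pdf.sum()

# Illustrative call: 65-kbyte chunks, 54 Mbps shared by 8 cars on average.
pdf_D = chunk_count_pdf([12, 35, 35, 60, 90], 54e6, 8, 65 * 1024 * 8)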

We remark that we account for content popularity while evaluating the performance of RICH. Indeed, each time a new car enters the coverage of any EN for the first time, it generates a content request according to a Zipf distribution whose exponent is set coherently with the value observed in [5]. The size of one content is 2,600 chunks, with each chunk being 65 kbytes large. The normalized cache size is defined as the cache size divided by the total size of all contents in the catalog. The catalog contains 10 contents.
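A minimal sketch of the request generation (the exponent value is a placeholder, since the value used in the paper is taken from [5]):

import numpy as np

def zipf_request(num_contents=10, exponent=0.8, rng=np.random.default_rng()):
    # Returns the index (0 = most popular) of the content requested by a new car.
    ranks = np.arange(1, num_contents + 1)
    probs = ranks ** (-exponent)
    return rng.choice(num_contents, p=probs / probs.sum())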

The performance metrics we consider are as follows (all the statistics are computed considering the total time period during which a car was under the coverage of any of the ENs):

  • cache throughput: average amount of data received by the user over time and downloaded directly from the cache;

  • cache hit probability: fraction of chunk requests that are satisfied by directly downloading the chunk from the cache, i.e., in the event of a cache hit;

  • backhaul traffic: average amount of data downloaded from the server over time, i.e., due to cache miss events;

  • normalized backhaul overhead: total amount of additional traffic transferred between the Data Store and the ENs with respect to the amount of traffic delivered to the users, normalized to the latter; it can also be negative in case of content reuse in the cache;

  • normalized cache size: size (i.e., maximum allowed occupancy) of each cache, normalized to the catalog size;

  • network cache occupancy: average total cache occupancy across the ENs, normalized to the catalog size.

IV-C Alternative prefetching policies

Below we describe our benchmark schemes.

POP: it is a mobility-agnostic approach that prefetches contents in decreasing order of popularity till storage saturation. It requires knowledge of the popularity level of all the contents in advance, but it is provably optimal in terms of hit probability for a single cache and under a stationary content request process [5]. We tailor the behavior of POP to our data streaming scenario, in which each content is divided into chunks. POP stores the chunks of the most popular contents in sequence; if the cache space is not sufficient for the whole content, only the initial chunks of the content are stored.

netPredict [32]: it exploits both spatial and temporal predictions of the car path based on its previous history. For a fair comparison with RICH, we assume that the spatial prediction in netPredict is perfect and, hence, the sequence of ENs traversed by the car is known in advance. Based on the knowledge of the average dwell time under each EN and of the average bandwidth available at each EN, netPredict stores a number of chunks given by their product (normalized to the chunk size). Thus, the communication model adopted in [32] is identical to (6), but it exploits just the average dwell time instead of the full distribution of $T_n$ as in RICH. In a nutshell, netPredict can be seen as a special case of RICH for deterministic $D_n$.
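As a reference, a one-line sketch of the netPredict-style decision described above (the per-user bandwidth sharing is an assumption): the number of consecutive chunks prefetched at an EN is obtained from the average dwell time and the average per-user bandwidth only.

def netpredict_chunks(avg_dwell_s, bandwidth_bps, avg_cars, chunk_size_bits):
    # Eq. (6) evaluated at the mean dwell time instead of its full distribution.
    return int(avg_dwell_s * (bandwidth_bps / avg_cars) // chunk_size_bits)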

IV-D Statistical analysis of the reference scenario

Cars may arrive under the coverage of the ENs following various paths, as shown in Table IV. Also, the number of cars on each path is different. Considering such traffic conditions, the dwell time distributions (and hence the distributions of $\tilde{D}_n$) of the cars at the ENs show interesting statistical properties (see the values in Table V for a cache size of 2,600 chunks). We observe that, if the incoming cars at some EN take different paths (with a significant number of cars on each path), the distribution of $\tilde{D}_n$ shows high values of skewness and kurtosis. For example, Table IV shows that EN B is involved in several paths. Since cars enter B’s coverage from different incoming and outgoing roads, the distribution of $\tilde{D}_B$ shows the highest values of skewness and kurtosis. A similar behavior can be observed for ENs A, D, and H. On the other hand, the incoming cars under the coverage of E and G follow only the paths CDE and HGA, respectively. Hence, the distributions of $\tilde{D}_E$ and $\tilde{D}_G$ show lower values of skewness and kurtosis.

EN A B C D E F G H
Skewness 2.11 2.48 0.41 1.74 -0.11 -0.15 -0.18 1.62
Kurtosis 8.57 11.60 2.71 6.89 2.63 2.22 1.68 5.98
TABLE V: Statistical description of the empirical distribution of $\tilde{D}_n$ in the Bologna urban area.

V Numerical results

As the first step, we use an exhaustive approach to find the combination of the thresholds at the three ENs that maximizes the cache hit probability. Table VI shows the values we obtained for different cache sizes, assumed to be the same at all ENs. In general, the optimal threshold value at the first EN is higher than the others, due to the smaller uncertainty on the cars’ mobility in the first hop. By construction, the threshold of the third EN depends on the chunks that are most probably downloaded from EN 3; however, such chunks can also be downloaded from EN 2. The higher value of the third threshold as compared to the second one allows some additional chunks to be stored in EN 2, which increases the cache hit probability. For cache sizes of 7,800 chunks or more, the optimal value of the thresholds does not change, because the large cache capacity is never fully utilized.

Cache size Optimal thresholds (1st, 2nd, 3rd EN)
#contents #chunks
1 2600 0.88 0.67 0.70
2 5200 0.92 0.58 0.67
3 7800 0.99 0.57 0.67
4 10400 0.99 0.57 0.67
5 11000 0.99 0.57 0.67
TABLE VI: Optimal thresholds for different cache sizes.
Fig. 10: Urban Bologna scenario: cache hit probability (left); network cache occupancy (center); cache throughput (right).
Fig. 11: Urban Bologna scenario: backhaul traffic (left); normalized backhaul overhead (center); joint user/operator utility function (right).

Edge network performance. We now evaluate the performance of the three policies at the network edge, which directly impacts the content access delay. Fig. 10(left) depicts the cache hit probability as a function of the normalized cache size, for the POP, netPredict and RICH prefetching schemes. As expected, a larger cache size improves the performance of all caching schemes, even if, beyond a certain cache size, the cache hit probability for RICH and netPredict becomes constant, since under the two policies the cache is never full. However, RICH outperforms netPredict and POP by up to 33% and 190%, respectively, for small cache sizes. This is due to the higher effectiveness of the RICH policy, which tends to store chunks only in those ENs from where they can be downloaded with high probability. For very large cache sizes, POP achieves a higher cache hit probability than RICH: under POP the cache is always fully occupied and each cache stores every content in the catalog, hence a hit is always guaranteed. Fig. 10(center) shows the network cache occupancy for the three caching schemes. By construction, regardless of the cache size, the cache occupancy is highest for POP. In comparison with netPredict, the cache occupancy of RICH is slightly higher, but the gain in the other performance metrics (e.g., cache hit probability, cache throughput and backhaul traffic) is significantly larger. Finally, Fig. 10(right) shows that RICH achieves a cache throughput of up to 83 Mbps (i.e., around 11.9 Mbps per single EN/cache), while the maximum value for netPredict is 64 Mbps.

Backhaul network performance. The performance gain provided by RICH over both netPredict and POP can be observed also in terms of backhaul traffic, since the higher cache hit probability of RICH implies a lower probability to access the server and retrieve the content from there. Fig. 11(left) shows the overall backhaul traffic due to cache misses. In the best case, RICH reduces such traffic by approximately 57% and 70% as compared to netPredict and POP, respectively.

Fig. 11(center) highlights that RICH incurs a lower backhaul overhead as compared to the other policies for large enough caches. In the best case, RICH achieves 27% and 67% lower overhead than netPredict and POP, respectively. The lower (more negative) values of the backhaul overhead indicate a better reuse of the chunks stored in the cache. For small caches, RICH incurs a higher backhaul overhead, due to the higher number of prefetched chunks and the lower chunk reuse, but it still achieves lower backhaul traffic than the other schemes (see Fig. 11(left)), thanks to the higher hit probability and content reuse.

A joint user/operator view. Here we compare the performance of the caching schemes in terms of a joint user/operator utility function. The user utility is modeled as an exponential function of the cache hit probability, since a larger hit probability implies a smaller latency to access the contents. The operator utility, instead, is modeled as an exponential function decreasing with the normalized cache size. Following a standard approach, we express the joint utility function as the product of the user and the operator utility, which is depicted in Fig. 11(right). Note that RICH outperforms netPredict, regardless of the cache size. Compared to POP, the RICH utility is significantly higher for small cache sizes, while for larger caches RICH allows for a great reduction in cache occupancy.

V-A Errors in the knowledge of car mobility

To assess the robustness of RICH, we evaluate two kinds of error in the system knowledge of the car mobility. The former affects the dwell time under an EN, and, thus, the actual number of downloaded chunks. The latter affects the knowledge of the sequence of traversed ENs.

Errors in the dwell time. We add a random error to the dwell time experienced by a car. Let us consider the observed dwell time of a car under a given EN and the minimum dwell time observed under the same EN, based on past statistics. We set the actual dwell time of the car to the observed dwell time plus an error term, lower-bounded by the minimum observed dwell time, where the error term is Gaussian distributed with mean $\mu$ and standard deviation $\sigma$. When $\sigma = 0$ s, all the dwell times under an EN are deterministically shifted by $\mu$. When $\mu > 0$, the car is slowed down, while, for $\mu < 0$, the car accelerates with respect to its original speed.
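A minimal sketch of this perturbation (the clamping at the minimum observed dwell time reflects our reading of the text):

import numpy as np

def perturbed_dwell_time(observed_s, min_observed_s, mu, sigma,
                         rng=np.random.default_rng()):
    # Gaussian error with mean mu and standard deviation sigma; the resulting
    # dwell time is never below the minimum observed under the same EN.
    return max(min_observed_s, observed_s + rng.normal(mu, sigma))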

Fig. 12(top) shows the cache throughput for different values of $\mu$ and $\sigma$ in the urban Bologna scenario, where the average dwell time across all the ENs is 39 s. For large negative values of $\mu$, most of the cars spend the minimum time under the coverage of the ENs, thus the cache throughput becomes very low. The cache throughput increases with $\mu$ as the cars get more time to download the chunks available in the cache. For large positive values of $\mu$, instead, the long coverage time does not have a significant effect on the cache hit events: all the stored chunks have already been downloaded and the extra coverage time cannot be exploited, thus the cache throughput decreases.

Fig. 12(bottom) shows the cache hit probability for different values of $\mu$. For negative $\mu$, cars spend less time under coverage, thus both cache hits and misses decrease. However, the cache hit probability remains high due to the large amount of data downloaded by cars under their first EN. For positive $\mu$, the cache hit probability decreases rapidly as $\mu$ increases, due to the higher number of cache misses when the cars experience a coverage time longer than expected. The hit probability is almost unchanged for small perturbations, while it decreases by at most 30% in the worst-case scenario, i.e., for the largest values of $\mu$ and $\sigma$ we considered.

Figs. 13(top) and 13(bottom) depict the backhaul traffic and the normalized backhaul overhead. Both metrics are low when $\mu$ is negative, as cars spend less time under coverage, hence the number of cache misses is small and few requests reach the Data Store. When $\mu$ is positive, both metrics increase due to the significant increase in the number of cache misses. Notably, due to the large content reuse, the backhaul overhead is negative.

In summary, only large errors in the dwell time (i.e., large values of $\mu$ or $\sigma$) have an evident impact on the performance, as they make the past statistics nearly useless. Otherwise, the performance is only slightly affected by errors, confirming the robustness of the proposed approach.

Fig. 12: Urban Bologna scenario: effect of random error in dwell time on the cache throughput (top) and on the cache hit probability (bottom).
Fig. 13: Urban Bologna scenario: effect of random error in the dwell time on the backhaul traffic (top) and on the normalized backhaul overhead (bottom).

Errors in the car path. In the considered scenario, RICH relies on the information on just two subsequent ENs traversed by a car along its path. Additionally, it runs when a car enters the coverage area of the first EN, thus guaranteeing some intrinsic level of robustness. Nevertheless, a car could change its actual trajectory by completely skipping the second EN along its expected path – an event that could arbitrarily affect the cache hit.

To understand the effect of errors in the cars’ path, we selected at random a set of 20% and 50% of cars that skip the second EN along their path, in the urban Bologna scenario. Fig. 14 shows the cache throughput (averaged over several simulation runs) versus the cache size. The cache throughput decreases by approximately 10% and 23%, respectively, with respect to the case where cars follow the original path, confirming the robustness of the proposed approach.

Fig. 14: Urban Bologna scenario: effect of error in the car path on the cache throughput.

V-B More detailed knowledge on car mobility

We now analyze the benefit that more detailed information on the car mobility could bring. To this end, we classify the cars into two categories: slow and fast. Slow cars are those dwelling under the coverage area of an EN for longer than 10 s; all others are classified as fast cars. Table VIII reports the coefficient of variation of the resulting dwell time, with and without such classification. As expected, for all ENs the variance of the dwell time for slow and fast cars is smaller, highlighting the gain brought by the additional information.

Figs. 15(top) and 15(bottom) depict the cumulative distribution of the number of chunks downloaded under EN A and under EN H, respectively, in the Bologna urban area, for a cache size equal to 2,600 chunks. For EN A, Fig. 15(top) shows that slow and fast cars can download, on average, around 100 chunks and 30 chunks, respectively. Observing the average number of cars in Table III, EN A is expected to be located at a very congested intersection. Indeed, only a few slow cars experience a dwell time large enough, and a radio congestion small enough, to download all the 2,600 chunks stored in the cache. EN H, instead, appears to be located at a much less congested intersection, according to the values in Table III. This implies a lower radio congestion and, hence, a higher download capability, as shown in Fig. 15(bottom). Indeed, under EN H, fast users download at least 170 chunks, with an average around 250, whereas slow users download around 1,000 chunks. For brevity, we omit the distribution of the number of chunks downloaded from the other ENs; the number of fast and slow cars per EN is reported in Table VII.

Using the dwell-time distribution conditioned on the fast/slow cars, we can obtain better results for both RICH and netPredict. The cache throughput of RICH now reaches about 91 Mbps (10% increase), while netPredict achieves 83 Mbps (30% increase). Similarly, the backhaul traffic is further reduced to 6 Mbps (80% decrease) for RICH, while it remains unchanged for netPredict. The performance of POP, instead, is unaffected as POP is mobility agnostic.

EN A EN B EN C EN D EN E EN F EN G EN H
Fast Slow Fast Slow Fast Slow Fast Slow Fast Slow Fast Slow Fast Slow Fast Slow
376 1204 948 562 182 345 215 167 0 337 1456 0 68 93 120 86
TABLE VII: Number of fast and slow cars at each EN in the urban Bologna scenario
EN Coefficient of variation
Fast Slow Combined
A 0.36 0.61 0.75
B 0.31 0.45 0.80
C 0.43 0.32 0.59
D 0.30 0.40 0.72
E 0.32 0.32
F 0.04 0.04
G 0.38 0.17 0.44
H 0.33 0.40 0.70
TABLE VIII: Coefficient of variation of the dwell time at each EN.
Fig. 15: Cumulative density function of the number of downloaded chunks measured in the trace, based on the slow/fast classification at EN A (top) and at EN H (bottom).

VI Related work

Several works have appeared on content caching at the edge of the cellular network. In particular, [4] proposes an optimal probabilistic content placement policy that maximizes the total cache hit probability for random network topologies, based on content popularity. Thus, unlike our work, the policy in [4] is oblivious to the actual mobility pattern of the users. A hybrid scenario, comprising MANET and cellular networks, is studied in [9], where each node estimates the content popularity and caches a content based on that. Importantly, this scheme can be considered as a distributed implementation of the POP policy that we use as benchmark in our work.

Regarding cellular backhaul networks, [31] investigates the effect of different criteria for identifying web content when accessing the caching system. The main idea is to avoid multiple application-level identifiers for the same content item, so as to prevent content from being duplicated in the caches. We remark that, in our study, we consider the streaming of content through a chunk-based approach and we assume that each chunk and each content item are uniquely identified.

Several works have employed cooperation among caching nodes to improve performance in heterogeneous cellular networks. As an example, [17] introduces a hierarchical caching architecture where caches are deployed in both small-cell and macro-cell base stations, and both can satisfy a user content request. Similarly, the work in [6] considers a cluster of small-cell base stations as one cache entity, with base stations adopting both cooperative caching techniques and cooperative transmissions. Unlike the above studies, our approach is based on a centralized prefetching scheme orchestrating the caches deployed at the different ENs.

A few works have investigated caching schemes that specifically account for user mobility. [29] proposes a mobility-aware cache placement strategy for Cloud Radio Access Networks. The authors leverage the user-mobility pattern to analytically estimate the content-request rate in different cells and minimize the network energy consumption. This work differs from ours, as our goal is to optimize the cache hit probability. [24] proposes a mobility- and popularity-aware caching scheme for a heterogeneous cellular network, comprising macro-cell and small-cell base stations, where only the latter are capable of caching content. The goal of the caching policy is to minimize the data downloaded from the macro-cell base station. More relevant to our study is [32], which considers the MobilityFirst network architecture introduced in [25] and proposes an approach, called netPredict, aiming to support smooth mobile content delivery. In MobilityFirst, a global identifier is associated with each user, so that the user mobility is recorded at each network node. Each of these nodes is equipped with two distinct buffers, similarly to our architecture in Fig. 2. The first one caches the most popular content items, exactly as the POP policy considered in our work. The second buffer stores content based on a prefetching policy that leverages the predicted sequence of nodes traversed by each user, along with the average dwell time and available bandwidth under each node. A similar prefetching policy is proposed in [8] for a cellular network scenario. Similarly to our work, the content is delivered to users by base stations using a chunk-based approach, and the specific mobility of each user is used to identify the chunks to prefetch along the user path. Unlike our study, however, both [32] and [8] assume that the caching policy knows or predicts the spatial and temporal trajectory of each user, in order to estimate the time intervals in which the user will be covered by each base station. Our approach instead requires only the knowledge of the distribution of the dwell times under each edge node at an aggregate level. This distribution can be estimated locally by each EN and does not require precise knowledge of the car trajectory: only the sequence of ENs is needed. This simplifies the prediction process and provides better user privacy.
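To make the contrast concrete, the sketch below shows the kind of aggregate, per-EN statistic that suffices for our approach: each EN builds an empirical dwell-time distribution from car association and de-association events, with no per-car trajectory prediction. This is a minimal sketch under assumed data structures; class, method, and field names are illustrative, not part of the RICH implementation.

```python
# Minimal sketch: per-EN empirical dwell-time distribution built locally from
# association/de-association events, without any trajectory prediction.
from collections import defaultdict

class DwellTimeEstimator:
    """Per-EN empirical distribution of car dwell times."""

    def __init__(self):
        self._enter = {}                   # (en_id, car_id) -> association time [s]
        self._samples = defaultdict(list)  # en_id -> list of dwell times [s]

    def on_associate(self, en_id, car_id, t):
        self._enter[(en_id, car_id)] = t

    def on_leave(self, en_id, car_id, t):
        t0 = self._enter.pop((en_id, car_id), None)
        if t0 is not None:
            self._samples[en_id].append(t - t0)

    def cdf(self, en_id, tau):
        """Empirical P(dwell time <= tau) under EN en_id."""
        s = self._samples[en_id]
        return sum(1 for x in s if x <= tau) / len(s) if s else 0.0

# Usage example with made-up identifiers and times.
est = DwellTimeEstimator()
est.on_associate("EN_A", "car42", t=100.0)
est.on_leave("EN_A", "car42", t=108.5)
print(est.cdf("EN_A", tau=10.0))  # -> 1.0
```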

Another body of work relevant to our study applies the Information-Centric Networking (ICN) paradigm to vehicular networks. In particular, [26] proposes an architecture for content-centric vehicular networks, which is compatible with our proposed prefetching approach. [15] presents a socially-aware vehicular information-centric system, which leverages the caching, computing, and communication capabilities of smart vehicles to facilitate content availability to mobile users. While in standard ICN-based architectures network nodes reactively store content during delivery to the user, [12] and [21] propose, similarly to our approach, to proactively prefetch content at ICN nodes. The study in [21] formulates the problem of optimally placing content chunks in the ICN-based network nodes as an integer linear programming problem, maximizing the content retrieval probability. Moreover, forward error correction coding is adopted to reconstruct the whole content once enough chunks have been received, independently of their order. The work in [21] is thus based on single content retrieval and not on content streaming, as in the RICH case. Interestingly, the authors in [30] advocate that it is critical to consider user mobility information for caching design in content-centric wireless networks. They assume that the user dwell time is estimated from the available data, and optimize the cache failure probability by solving a convex optimization problem; however, the details of the dwell-time estimator are not provided. Similarly, [33] proposes a mobility-aware cooperative caching scheme for content-centric 5G networks, where contents can be stored at the network edge as well as in the vehicular cloud. Furthermore, MEC servers are leveraged to compress the contents, thereby enhancing the storage capability of the edge nodes. The considered scenario is similar to that of [12], where vehicles are assumed to move at constant speed; thus, the approach is not applicable to our urban scenario.

Finally, a preliminary version of this work was included in our conference paper [18], where the basic ideas of our caching framework were sketched.

VII Conclusions

In this paper we study how to efficiently provide connected cars with streaming data as they drive along a road covered by wireless Edge Nodes (ENs). Our RICH prefetching policy determines the content chunks to store in the ENs' caches, based on the past statistics of the achievable data rates and of the dwell times experienced by the cars under their coverage. RICH only requires knowledge of the sequence of ENs traversed by a car, without any detailed information on its actual trajectory or real-time traffic conditions.
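As a toy illustration of how such aggregate statistics can drive prefetching decisions (and not the actual RICH optimization described earlier), the sketch below ranks chunk positions by the empirical fraction of cars able to reach them, given samples of how many chunks a car can download under an EN, and fills the cache with the highest-ranked positions. All names and the greedy fill are assumptions made for illustration only.

```python
# Toy illustration (not the RICH policy itself): turn samples of the number of
# chunks downloadable under an EN into per-position utilities and fill the
# cache with the positions most likely to be requested.

def chunk_utilities(downloadable_chunks_samples, max_chunks):
    """utility[k] = fraction of cars able to reach chunk position k."""
    n = len(downloadable_chunks_samples)
    return [sum(1 for c in downloadable_chunks_samples if c > k) / n
            for k in range(max_chunks)]

def fill_cache(utilities, cache_size):
    """Greedily select the chunk positions with the highest utility."""
    ranked = sorted(range(len(utilities)), key=lambda k: utilities[k],
                    reverse=True)
    return sorted(ranked[:cache_size])

# Example: samples of how many chunks cars managed to download under an EN.
samples = [30, 45, 80, 120, 300, 900, 1500]
util = chunk_utilities(samples, max_chunks=2600)
print(fill_cache(util, cache_size=5))
```

With in-order delivery, utility is non-increasing in the chunk position, so this toy greedy fill simply selects the earliest positions; the actual RICH policy additionally accounts for the chunks a car is expected to have already retrieved at the ENs traversed earlier along its path.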

Through extensive trace-driven simulations, our scheme was shown to improve the cache throughput and to reduce the backhaul traffic, with beneficial effects for both the users and the network operators. Furthermore, RICH was shown to be robust to possible errors in the knowledge of the cars' paths and dwell times.

VIII Acknowledgements

This work has been partially funded by the EC H2020 5G-Transformer Project (grant no. 761536), the EC H2020 5G-EVE Project and the EC H2020 HIGHTS project (grant no. 636537). EURECOM acknowledges the support of its industrial members, namely, BMW Group, IABG, Monaco Telecom, Orange, SAP, ST Microelectronics, and Symantec.

References

  • [1] (2014) 5G: challenges, research priorities, and recommendations. Note: Joint white paper, NetWorld 2000 Cited by: §I.
  • [2] C. Bernardini, T. Silverston, and O. Festor (2013) MPC: popularity-based caching strategy for content centric networks. In IEEE ICC, Cited by: §I.
  • [3] L. Bieker, D. Krajzewicz, A. Morra, C. Michelacci, and F. Cartolano (2015) Traffic simulation for all: a real world traffic scenario from the city of Bologna. In Modeling Mobility with Open Data, pp. 47–60. Cited by: §IV-A.
  • [4] B. Blaszczyszyn and A. Giovanidis (2015) Optimal geographic caching in cellular networks. In IEEE ICC, External Links: Document Cited by: §I, §VI.
  • [5] L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker (1999) Web caching and Zipf-like distributions: evidence and implications. In IEEE INFOCOM, pp. 126–134. External Links: ISSN 0743-166X Cited by: §IV-B, §IV-C.
  • [6] Z. Chen, J. Lee, T. Q. Quek, and M. Kountouris (2017) Cooperative caching and transmission design in cluster-centric small cell networks. IEEE Trans. on Wireless Comm.. Cited by: §VI.
  • [7] (2016) Cooperative Intelligent Transportation Systems (C-ITS) Platform. Note: Final Report Cited by: §IV-A.
  • [8] S. K. Dandapat, S. Pradhan, N. Ganguly, and R. Roy Choudhury (2013) Sprinkler: distributed content storage for just-in-time streaming. In ACM CellNet, Cited by: §I, §VI.
  • [9] M. Dehghan, A. Seetharamz, T. He, T. Salonidis, J. Kurose, and D. Towsley (2014) Optimal caching and routing in hybrid networks. In IEEE MILCOM, Cited by: §I, §VI.
  • [10] ETSI TR 103 319: Broadband Radio Access Networks (BRAN); 5 GHz high performance RLAN; mitigation techniques to enable sharing between RLANs and road tolling and intelligent transport systems in the 5 725 MHz to 5 925 MHz band. ETSI Standard. Cited by: §IV-A.
  • [11] M. Fiore, C. Casetti, and C.F. Chiasserini (2014) Caching strategies based on information density estimation in wireless ad hoc networks. IEEE Trans. on Vehicular Tech. 60 (5), pp. 2194–2208. Cited by: §I.
  • [12] D. Grewe, M. Wagner, and H. Frey (2016) PeRCeIVE: proactive caching in ICN-based VANETs. In IEEE VNC, Cited by: §VI.
  • [13] Y. C. Hu, M. Patel, D. Sabella, N. Sprecher, and V. Young (2015) Mobile edge computing — A key technology towards 5G. ETSI white paper 11 (11), pp. 1–16. Cited by: §I.
  • [14] INET framework. Cited by: §IV-B.
  • [15] J. A. Khan and Y. Ghamri-Doudane (2016) SAVING: socially aware vehicular information-centric networking. IEEE Comm. Mag. 54 (8), pp. 100–107. Cited by: §VI.
  • [16] A. G. Li Zhe (2012) Caching policies for in-network caching. IEEE ICCCN. Cited by: §I.
  • [17] J. Ma, J. Wang, and P. Fan (2016) A cooperation-based caching scheme for heterogeneous networks. IEEE Access. Cited by: §VI.
  • [18] A. Mahmood, C. Casetti, C. F. Chiasserini, P. Giaccone, and J. Harri (2016) Mobility-aware edge caching for connected cars. In IEEE WONS, Cited by: §VI.
  • [19] F. Malandrino, C. Chiasserini, and S. Kirkpatrick (2016) The price of fog: a data-driven study on caching architectures in vehicular networks. In ACM IoV-VoI, Cited by: §I.
  • [20] Y. Mao, C. You, J. Zhang, K. Huang, and K. B. Letaief (2017) A survey on Mobile Edge Computing: the communication perspective. IEEE Comm. Surveys & Tutorials 19 (4), pp. 2322–2358. External Links: Document, ISSN 1553-877X Cited by: §I.
  • [21] G. Mauri, M. Gerla, F. Bruno, M. Cesana, and G. Verticale (2017) Optimal content prefetching in NDN vehicle-to-infrastructure scenario. IEEE Trans. on Vehicular Tech. 66 (3), pp. 2513–2525. Cited by: §VI.
  • [22] K. Mohammad-Irfan and J. Härri (2017) Can IEEE 802.11p and Wi-Fi coexist in the 5.9 GHz ITS band?. In IEEE WoWMoM, Cited by: §IV-A.
  • [23] OMNeT++ discrete event simulator. Cited by: §IV-B.
  • [24] E. Ozfatura and D. Gündüz (2018) Mobility and popularity-aware coded small-cell caching. IEEE Comm. Lett. 22 (2), pp. 288–291. External Links: Document, ISSN 1089-7798 Cited by: §VI.
  • [25] D. Raychaudhuri, K. Nagaraja, and A. Venkataramani (2012) MobilityFirst: a robust and trustworthy mobility-centric architecture for the future Internet. ACM Mobile Comp. and Comm. Rev. 16 (3), pp. 2–13. Cited by: §VI.
  • [26] Z. Su, Y. Hui, and Q. Yang (2017) The next generation vehicular networks: a content-centric framework. IEEE Wireless Comm. 24 (1), pp. 60–66. Cited by: §VI.
  • [27] S. Tewari and L. Kleinrock (2006) Proportional replication in peer-to-peer networks. IEEE INFOCOM. Cited by: §I.
  • [28] (2017) Toward fully connected vehicles: Edge computing for advanced automotive communications. Note: White paper, 5GAA Cited by: §I.
  • [29] T. X. Tran, F. Kazemi, E. Karimi, and D. Pompili (2017) Mobee: mobility-aware energy-efficient coded caching in cloud radio access networks. In IEEE MASS, Vol. , pp. 461–465. External Links: Document, ISSN 2155-6814 Cited by: §VI.
  • [30] R. Wang, X. Peng, J. Zhang, and K. B. Letaief (2016) Mobility-aware caching for content-centric wireless networks: modeling and methodology. IEEE Comm. Mag. 54 (8), pp. 77–83. External Links: Document, ISSN 0163-6804 Cited by: §VI.
  • [31] S. Woo, E. Jeong, S. Park, J. Lee, S. Ihm, and K. Park (2013) Comparison of caching strategies in modern cellular backhaul networks. In ACM MobiSys, Cited by: §VI.
  • [32] F. Zhang, C. Xu, Y. Zhang, K. Ramakrishnan, S. Mukherjee, R. Yates, and T. Nguyen (2015) EdgeBuffer: caching and prefetching content at the edge in the MobilityFirst future internet architecture. In IEEE WoWMoM, Cited by: 4th item, §I, §IV-C, §VI.
  • [33] K. Zhang, S. Leng, Y. He, S. Maharjan, and Y. Zhang (2018) Cooperative content caching in 5G networks with mobile edge computing. IEEE Wireless Comm. 25 (3), pp. 80–87. Cited by: §VI.

Appendix A Proof of Corollary 1

Proof:

When ’s are i.i.d.,

Thus, (3) can be rewritten as:

(7)