
NV-FogStore: Device-aware hybrid caching in fog computing environments

Edge caching via the placement of distributed storage throughout the network is a promising solution to reduce the latency and network costs of content delivery. With the advent of 5G, billions of F-RAN (Fog-Radio Access Network) nodes will be created and used for edge caching. Hence, the total amount of memory deployed at the edge is expected to increase 100 times. The DRAM-based caches currently used in CDNs (Content Delivery Networks) are extremely power-hungry and costly. Our purpose is to reduce the cost of ownership and the recurring costs (of power consumption) in an F-RAN node while maintaining Quality of Service. To this end, we propose NV-FogStore, a scalable hybrid key-value storage architecture for the utilization of Non-Volatile Memories (such as RRAM, MRAM, Intel Optane) in the edge cache. We further describe in detail H-GREEDY, a novel, hierarchical, write-damage-, size- and frequency-aware content caching policy for our architecture. We show that our policy can be tuned as per performance objectives to lower the power, energy consumption and total cost relative to a DRAM-only system, for a relatively small trade-off in the average access latency.


I Introduction

I-A Rising Need for Edge Caching

Edge caching via the placement of distributed storage throughout the network is a promising solution to reduce the latency and network costs of content delivery [1, 2]. Employing caching in upcoming 5G wireless networks has been discussed and proposed in 3GPP 5G guidelines. For example, T-DOC R3-160688 [3] proposes to place an edge cache at an LTE base station, either embedded in the eNodeB or standalone.
The objective of local caching (using memory embedded in the eNodeB LTE base station) is to serve local user requests in a Fog-RAN cell and to reduce both backhaul congestion and delivery latency. It can reduce redundant streaming of popular multimedia content, reduce duplicate mobile caching, and prefetch content predicted for users through preference learning.
Currently deployed Content Delivery Networks (or CDNs) serve trillions of user requests a day from millions of nodes all across the globe and carry a majority of the internet traffic [4]. DRAM-based caches have been typically used in currently prevalent Content Delivery Networks [4, 5].
In the near future, billions of F-RAN (Fog-Radio Access Network) nodes will also be deployed for the purpose of caching. Hence, the total amount of memory deployed at the edge is expected to increase 100 times [1].

I-B Why NVM-Based?

Emerging non-volatile main memories (NVMs) such as RRAM, MRAM and the off-the-shelf Intel Optane are useful for applications that need a large memory or require lower power and energy consumption. NVMs can hold a large amount of frequently accessed data while meeting the low-latency and high-bandwidth constraints of 5G. Further, the cost of ownership can be reduced by using NVMs. Compared to Flash, an NVM device offers 10× faster reads and 5× better durability. By adding NVM to mitigate the access latency of the Flash/HDD layer, the overall cache size can be increased significantly.

DRAM is up to 8x more expensive and uses 25x more power per bit than NVM [6].

Our objective is to significantly reduce the total ownership cost by reducing the DRAM footprint and replacing it with NVM. If NVM is directly used as a DRAM-replacement without modification, it will wear out too quickly, due to its write durability constraints [6]. A possible alternative is to use a Hybrid DRAM-NVM based system, with intelligent caching policies such that most of the writes are performed in DRAM and block reads are performed in NVM.

Fig. 1: An Illustration of Fog-RAN which consists of Base-Station and mobile servers. Edge-Caching is used to reduce backhaul traffic.

I-C Contributions of this Work

  • We propose a scalable hybrid key-value storage architecture for caching in fog computing environments.

  • We formulate H-GREEDY, a hierarchical, write-damage-, size- and frequency-aware content eviction/placement policy for our architecture.

  • We compare our caching policy with traditional caching techniques by conducting extensive experiments on the traces of workloads envisioned in 5G for different configurations of file size, variance and Zipf’s popularity.

  • We show that our policy can be tuned as per performance objectives to lower the energy consumption and cost relative to a DRAM-only scenario, for a relatively small trade-off in latency.

II Background

II-A Key-Value Store

Key-value stores are used to persist data and to enhance both read and write performance. Key-value storage differs from traditional storage in how data access is implemented. In a traditional computer, an OS page (typically 4 KB) is fetched whenever a data access is required. The pages are fetched successively by looking up the B+-tree of the file system. Hence, the caching policies operate on OS pages of a fixed size.

In a key-value store, the path to retrieve data is a direct request to the object in memory or on disk. Hence, all the data is fetched at once, as a block, at the maximum bandwidth of the memory. For our architecture, we assume a simple exact-match hash table that interfaces with common REST HTTP requests such as GET (retrieve the value), PUT (store a value for a given key) and DELETE (delete the key-value pair from the store).
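As a minimal sketch of the exact-match interface assumed above, the three operations can be modelled with an in-memory hash table. The class name `KeyValueStore`, its method names and the example key are illustrative assumptions; they are not part of the NV-FogStore design, and a real deployment would map these methods onto HTTP handlers.

```python
from typing import Dict, Optional


class KeyValueStore:
    """Exact-match hash table exposing GET, PUT and DELETE (illustrative only)."""

    def __init__(self) -> None:
        self._table: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        # Store (or overwrite) the whole object under its key.
        self._table[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # A single hash lookup returns the whole object, or None on a miss.
        return self._table.get(key)

    def delete(self, key: str) -> bool:
        # Remove the key-value pair and report whether the key existed.
        return self._table.pop(key, None) is not None


store = KeyValueStore()
store.put("/videos/clip-1", b"payload")
hit = store.get("/videos/clip-1")
removed = store.delete("/videos/clip-1")
miss = store.get("/videos/clip-1")
```

Unlike page-based access, each lookup here returns the complete object, which is why the caching policy later reasons about whole-object sizes rather than fixed-size pages.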

II-B Edge Caching Policies

Web traffic typically follows patterns in the popularity of requests, the size of the requested content, etc. This is usually modelled by the Independent Reference Model (IRM), which we briefly describe in Section II-B1.

Previous works [7, 8] on finding the caching policy have shown that the problem can be formulated as a Linear Utility Maximization problem. In Section II-B2, we briefly describe how to write the Knapsack problem for a single-memory system and how an optimal caching policy is obtained from its solution. The notations used in the equations are given in Table I.

II-B1 Independent Reference Model

The file size is assumed to follow a normal distribution $\mathcal{N}(\mu, \sigma^2)$, where $\mu$ is the mean file size and $\sigma$ is the standard deviation, which can be determined statistically by trace generation over a period of time. The read density is given by the joint probability mass function of a file with parameters (size, popularity index) given as $(s, z)$,

(1)

The popularities of files for read and for update/delete operations each follow a Zipfian distribution, independent of each other, with 80% read requests and 20% write requests:

$$p(z) = \frac{z^{-\alpha}}{\sum_{j=1}^{N} j^{-\alpha}} \qquad (2)$$

where $z$ is the file index, $N$ is the number of files in the catalogue and $\alpha$ is the Zipf's parameter.
Requests for content arrive according to a Poisson process with a rate corresponding to their popularity and mean access rate; the Poisson processes for different contents are independent [9].
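The request model above can be sketched as a small synthetic trace generator. This is only an illustration under stated assumptions, not the trace generator of [21]: the function names, the clipping of sizes at 1, and the example parameter values are hypothetical. It relies on the standard fact that independent Poisson processes with rates proportional to popularity are equivalent to one Poisson process whose arrivals are assigned to files according to the popularity distribution.

```python
import random
from itertools import accumulate
from typing import List, Tuple

Request = Tuple[float, int, float, str]  # (arrival time, file index, size, operation)


def zipf_pmf(n_files: int, alpha: float) -> List[float]:
    """Zipf popularity: p(z) proportional to z**(-alpha) for z = 1..n_files."""
    weights = [z ** -alpha for z in range(1, n_files + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def generate_trace(n_files: int, n_requests: int, alpha: float,
                   mean_size: float, std_size: float, rate: float,
                   read_fraction: float = 0.8, seed: int = 0) -> List[Request]:
    rng = random.Random(seed)
    # Normally distributed file sizes, clipped at 1 so no size is non-positive
    # (the clipping is an assumption of this sketch).
    sizes = [max(1.0, rng.gauss(mean_size, std_size)) for _ in range(n_files)]
    cum = list(accumulate(zipf_pmf(n_files, alpha)))
    read_order = list(range(n_files))
    write_order = read_order[:]
    rng.shuffle(write_order)  # write popularity is ranked independently of reads
    trace: List[Request] = []
    now = 0.0
    for _ in range(n_requests):
        now += rng.expovariate(rate)  # exponential gaps give Poisson arrivals
        is_read = rng.random() < read_fraction
        order = read_order if is_read else write_order
        z = rng.choices(order, cum_weights=cum, k=1)[0]
        trace.append((now, z, sizes[z], "read" if is_read else "write"))
    return trace


trace = generate_trace(n_files=100, n_requests=2000, alpha=0.9,
                       mean_size=1e6, std_size=1e3, rate=50.0)
```

Fixing the seed makes a trace reproducible, which is convenient when the same trace has to be replayed against several policies.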

Total Catalogue of Requests
Estimated Popularity of Content
Number of writes to Content i
Size of Content i
Number of Writes Exponent
Average File Size and Variance
Content stored in DRAM, NVM
Size of DRAM, NVM
Current Minimum Rank in DRAM, NVM
Current Threshold for DRAM, NVM
Rank for Storage in DRAM, NVM of Content k
TABLE I: Summary of notations

II-B2 Caching Policy Derivation as a Knapsack Problem

Cache policies have often been designed to maximize the hit rate. This objective can be restated as choosing the set of contents $M$ that should be duplicated from the storage into the memory. It has been formulated as a Knapsack problem that describes the caching policy [8], as follows:

$$\max_{M} \sum_{i \in M} p_i \qquad (3)$$

subject to

$$\sum_{i \in M} s_i \le C,$$

where $p_i$ is the estimated popularity of content $i$, $s_i$ is its size and $C$ is the size of the memory.

It has been shown that, under Zipf's law for popularities [7], the asymptotic hit ratio is optimized if we cache contents greedily according to the ratio $p_i / s_i$.

Formally, an optimal eviction policy evicts a file in $M$ that satisfies the following:

$$\arg\min_{i \in M} \frac{p_i}{s_i} \qquad (4)$$
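The greedy rule for a single memory can be sketched as follows, assuming the ranking ratio is the estimated popularity divided by the size of each content. The function names and the toy catalogue are hypothetical, and the sketch keeps scanning past items that do not fit, which is one common greedy variant rather than a statement of the cited algorithms.

```python
from typing import Dict, List, Tuple

Catalogue = Dict[str, Tuple[float, float]]  # content id -> (popularity, size)


def density(catalogue: Catalogue, key: str) -> float:
    popularity, size = catalogue[key]
    return popularity / size


def greedy_cache(catalogue: Catalogue, capacity: float) -> List[str]:
    """Fill the cache in decreasing order of popularity/size."""
    ranked = sorted(catalogue, key=lambda k: density(catalogue, k), reverse=True)
    cached: List[str] = []
    used = 0.0
    for key in ranked:
        size = catalogue[key][1]
        if used + size <= capacity:  # skip items that no longer fit
            cached.append(key)
            used += size
    return cached


def eviction_candidate(cached: List[str], catalogue: Catalogue) -> str:
    """The cached content with the smallest popularity/size ratio is evicted first."""
    return min(cached, key=lambda k: density(catalogue, k))


catalogue = {"a": (0.5, 10.0), "b": (0.3, 2.0), "c": (0.2, 8.0)}
cached = greedy_cache(catalogue, capacity=12.0)
victim = eviction_candidate(cached, catalogue)
```

In this toy catalogue, "b" has the highest ratio (0.15), followed by "a" (0.05) and "c" (0.025), so a cache of size 12 holds "b" and "a", and "a" would be the first to be evicted.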

III Related Work

Several size-, frequency-, access-time- and device-aware caching policies for edge caching have been proposed in the literature [10, 11, 8, 7, 12, 13].

Writing files in a cache using the longest retention time damages the memory device thus reducing its lifetime. However, writing using a small retention time can increase the content retrieval delay, since, at the time a file is requested, the file may already have been expired from the memory.

This motivates us to consider a joint optimization wherein we obtain optimal policies for jointly minimizing the content retrieval delay (which is a network-centric objective) and the flash damage (which is a device-centric objective). Caching decisions now not only involve what to cache but also for how long to cache each file. We design provably optimal policies and numerically compare them against prior policies.

Many of these approaches have been shown to combine into a linear utility maximization objective, which translates to a Knapsack formulation for approaches that assume a single-memory system [8].

Reducing the DRAM footprint through the use of NVM has been explored in cloud-based scenarios [6]. Several key-value stores using DRAM/NVM-based systems in cloud databases have also been investigated [14, 15, 16].

Hybrid NVM-DRAM systems in computing have been explored, involving techniques for hot-page migration and mitigation of write amplification in the management of last-level caches, persistent memory and storage-class memory (SCM) [17, 18, 19].

However, no architecture and caching-policies are available for hybrid systems which take into account several device-aware properties of the system (such as endurance and unequal access times) for the cached-contents.

IV NV-FogStore

NV-FogStore targets the hybrid DRAM and NVM memory architecture, leveraging the NVM as the eventual persistent memory medium. Our Architecture (shown in Fig 2) has a flat-addressability of DRAM-NVM and content migration within the memories specified by the Memory Controller Instructions. The Tertiary Storage is assumed to be high endurance.

The key features of our architecture are described below:

Dynamic Memory Bank/Channel Rebalancing: NV-FogStore uses bank partitioning to handle a faster request rate at a similar memory size. Since balanced bank parallelism is crucial to achieving desirable performance in both DRAM and NVM, the bank utilization is balanced during the allocation of banks to incoming contents. Memory requests across channels can also be balanced horizontally to achieve higher memory utilization. The admission and eviction policies (described in Section V) take channel-utilization balancing into account using threshold ranking. The admission and eviction rank-threshold is tuned according to the channel utilization, and content migration takes place between NVM and DRAM channels to achieve better overall performance.
Write Amplification Aware: The NVM device can accommodate only a fixed total number of writes in its lifetime. Under a naive caching policy, the number of writes would significantly exceed the endurance limit of the device (i.e., its DWPD limit), which would cause it to wear out rapidly [6]. Therefore, our policy should only store blocks in NVM that have less frequent writes.

Fig. 2: Flat Addressable Memory Architecture of NV-FogStore

V H-GREEDY Policy

V-A Knapsack Formulation for a Hybrid System

Let M and M* be the sets of contents that should be duplicated in the DRAM and NVM respectively in order to reduce the expected HDD workload generated by the next request. Then the caching policy for a hybrid system can be written in the utility maximization formulation as a simple extension of the single-memory system described above, as follows:

(5)

subject to

V-B Hierarchical Rank-based Policy

From the insights of the knapsack formulation above, we propose a novel generalized algorithm, H-GREEDY, (Algorithm 1) for caching contents in NV-FogStore and dynamically moving data between the NVM and DRAM memories.

For our hierarchical greedy method, every content/file is provided with two ranks, each for content storage in DRAM and NVM.

Rank for content storage in DRAM (denoted by ) is given by, . Since minimum rank contents will be removed from memory, this rank means that content with a high number of reads and writes and lower file size will not be evicted to NVM.

Rank for content storage in NVM (denoted by ) is given by, . This rank indicates that content with lower popularity, higher file size and a high number of writes will be evicted from NVM to HDD Storage.

The popularity (read rate) and write rate of a content are updated only on the arrival of a request for that content. All the estimates are also refreshed periodically, after a fixed number of cycles, to do a complete refresh of the stored statistics. Due to the noisy popularity estimates, randomness is a side effect of our scheme.
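To make the qualitative behaviour of the two ranks concrete, the sketch below uses one possible pair of rank functions. Their exact form is an assumption of this example, chosen only to match the description above: the DRAM rank grows with popularity and write count and shrinks with size, while the NVM rank grows with popularity and shrinks with size and write count. The exponent `h` plays the role of the number-of-writes exponent; the `1 + writes` term, the class name and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ContentStats:
    popularity: float  # estimated read popularity
    writes: int        # number of writes observed
    size: float        # content size


def dram_rank(c: ContentStats, h: float) -> float:
    # Assumed form: write-heavy, popular, small contents rank high in DRAM.
    return c.popularity * (1 + c.writes) ** h / c.size


def nvm_rank(c: ContentStats, h: float) -> float:
    # Assumed form: write-heavy contents are penalised in NVM to limit wear.
    return c.popularity / (c.size * (1 + c.writes) ** h)


write_heavy = ContentStats(popularity=0.4, writes=9, size=2.0)
read_mostly = ContentStats(popularity=0.4, writes=0, size=2.0)

for h in (0.0, 1.0):
    print(h, dram_rank(write_heavy, h), nvm_rank(write_heavy, h),
          dram_rank(read_mostly, h), nvm_rank(read_mostly, h))
```

With `h = 0` both ranks reduce to popularity divided by size, so writes are ignored. With `h = 1` the write-heavy content is strongly preferred in DRAM and strongly penalised in NVM, while the read-mostly content is ranked identically in both memories.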

V-C Convergence Issues

Running caching policies is difficult due to various convergence issues [20]. Fast convergence is paramount to the performance of these policies in real-world scenarios. To allow this, we define a threshold number for each of DRAM and NVM, which depends on the utilization of the respective memory (Algorithm 1).

We also track the minimum rank of the content currently stored in DRAM and in NVM. The combination of these two numbers, threshold and minimum rank, forms the rank-threshold product used by our admission and eviction policies. This allows better convergence at lower memory utilization, since early requests to store content are not rejected because the threshold number is also lower.

V-D Overall Execution

The H-GREEDY policy executes as follows. Every incoming object request is added to a queue. If the object is already stored in the system, its properties are updated; otherwise, depending on the admission policy, it is either allocated a store (DRAM or NVM) or not stored. The store allocation for the content is decided by the rank-threshold product described above. Hence, when a channel is under-utilized, its utilization can be increased by tuning the threshold parameter. The content is also allocated a bank depending on where it is stored.

The threshold for both DRAM/NVM is then updated based on the current utilized capacity of the memories.

After a certain number of requests, the eviction policy is run on the contents with updated properties: contents are evicted from DRAM to NVM (and vice versa) or deleted from DRAM or NVM. This number of requests is a parameter known during the execution of the policy, and it can be quickly determined from the request rate and the total number of banks available in the entire system. The data is evicted using the same rank-threshold product described earlier. Hence, our eviction policy moves content in a two-level greedy approach that is write-damage aware.

Finally, after a certain number of cycles, the properties of all the objects are reset and cold objects are removed from the system. This number of cycles is a multiple of the DRAM latency cycles, determined and set by empirical observation.

1:Procedure : H-Greedy
2:for each cycle do
3:     if   then:
4:         Add Request to Queue
5:         Update Properties
6:     else
7:         
8:               
9:     Adjust DRAM and NVM Threshold
10:     if (then
11:     else       
12:     Content usage predicted
13:     
14:     if (then:
15:          Reset , Remove cold      
16:End Procedure
Algorithm 1 H-Greedy
1:Procedure : RunEvictions
2:for key in Updated: do
3:      GetRank(key)
4:     case key DRAM :
5:     if  and  then:
6:               Evict to NVM
7:     else
8:         if ( and then:
9:               Evict to HDD/Delete               
10:     case key NVM :
11:     if  and  then:
12:               Evict to DRAM
13:     else
14:         if  then:
15:               Evict to HDD/Delete               
16:     case key HDD :
17:     if ( and DRAMFree()) then:
18:          return DRAM      
19:     if  and NVMFree()) then:
20:          return NVM      
21:End Procedure
1:Procedure : AllocateStore
2: GetRank()
3:if ( and DRAMFree()) then:
4:     if (then: ; return DRAM      
5:if  and NVMFree()) then:
6:     if (then: ; return NVM      
7:return HDD
8:End Procedure
1:Procedure : GetRank
2:Rank of Content
3:;
4:return
5:End Procedure
Algorithm 2 Eviction and Admission Policies
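The admission and eviction flow can be illustrated with a small runnable sketch. It is not a transcription of Algorithms 1 and 2: the rank functions, the way the threshold is derived from utilization (`used / capacity / target`), the `target` value and all names are assumptions made for the example. What it does preserve is the structure described in the text: each memory compares a content's rank with its rank-threshold product, DRAM is tried before NVM, and evictions move content DRAM to NVM, NVM to DRAM, or out to HDD.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Content:
    popularity: float
    writes: int
    size: float


H = 1.0  # number-of-writes exponent h (assumed value)


def dram_rank(c: Content) -> float:
    return c.popularity * (1 + c.writes) ** H / c.size  # assumed form


def nvm_rank(c: Content) -> float:
    return c.popularity / (c.size * (1 + c.writes) ** H)  # assumed form


class Tier:
    def __init__(self, capacity: float, rank: Callable[[Content], float],
                 target: float = 0.8) -> None:
        self.capacity, self.rank, self.target = capacity, rank, target
        self.items: Dict[str, Content] = {}

    def used(self) -> float:
        return sum(c.size for c in self.items.values())

    def fits(self, c: Content) -> bool:
        return self.used() + c.size <= self.capacity

    def bar(self) -> float:
        """Rank-threshold product: utilization-based threshold x minimum stored rank."""
        if not self.items:
            return 0.0  # an empty tier admits anything
        threshold = self.used() / self.capacity / self.target
        return threshold * min(self.rank(c) for c in self.items.values())


def allocate_store(key: str, c: Content, dram: Tier, nvm: Tier) -> str:
    if dram.rank(c) > dram.bar() and dram.fits(c):
        dram.items[key] = c
        return "DRAM"
    if nvm.rank(c) > nvm.bar() and nvm.fits(c):
        nvm.items[key] = c
        return "NVM"
    return "HDD"


def run_evictions(dram: Tier, nvm: Tier) -> None:
    for key, c in list(dram.items.items()):
        if dram.rank(c) <= dram.bar():
            del dram.items[key]
            if nvm.rank(c) > nvm.bar() and nvm.fits(c):
                nvm.items[key] = c  # demote to NVM; otherwise it stays on HDD only
    for key, c in list(nvm.items.items()):
        if dram.rank(c) > dram.bar() and dram.fits(c):
            del nvm.items[key]
            dram.items[key] = c  # promote to DRAM
        elif nvm.rank(c) <= nvm.bar():
            del nvm.items[key]  # evict to HDD


dram, nvm = Tier(10, dram_rank), Tier(100, nvm_rank)
placed = [allocate_store("hot", Content(0.5, 9, 2), dram, nvm),
          allocate_store("big", Content(0.05, 0, 20), dram, nvm),
          allocate_store("huge", Content(0.01, 0, 500), dram, nvm)]
```

Because the threshold is below one while a memory is lightly used, early contents are admitted easily; once utilization passes the target, the threshold exceeds one and the lowest-ranked contents fall under the bar and are moved out on the next eviction run.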

VI Discussion and Performance Analysis

VI-A Experimental Setup

We simulate the structure of requests using an open-source web-traffic trace generator [21]; each file is given a size and a popularity index with values sampled from the respective distributions. The hybrid memory system is simulated in an in-house cycle-accurate simulator (Python-based, adapted from [22] and modified with the policies described), which runs on the generated request traces. Our system configuration consists of 256 GB of DDR4 DRAM and 4 TB of NVM, configured according to the JEDEC DDR4-SDRAM [23] and Intel Optane [24, 17] standards respectively. Several traces are generated with varying mean file size, file size variance and Zipf's parameter; each consists of 25 million user requests and 1 million unique objects.

VI-A1 Trace Configurations

For the evaluation in the sections below, we test different configurations of the NVM/DRAM system (defined by their cache capacities). We run the system on nine traces: three values of the Zipf's parameter, [0.9, 0.8, 0.7], and, for each of them, three values of the mean file size, [1 MB, 5 MB, 10 MB]. For simplicity, we assume the file size variance to be equal to the mean file size.

VI-A2 Cache Capacity

While our setup may seem too dependent on the assumptions of the configuration described above, our evaluation is actually agnostic to the configuration. To explain this, we define the cache capacity as the ratio of the size of the respective DRAM or NVM memory to the size of the total request trace.

Mathematically, the DRAM cache capacity is the size of the DRAM divided by the total size of the request trace, and the NVM cache capacity is the size of the NVM divided by the total size of the request trace (using the notations defined in Table I).

In other words, cache capacity signifies what percentage of all the content requested can altogether be stored in the given system. Since the number of requests in the trace is fixed, and system configuration in size of memory is fixed, the cache capacity is only dependent on the mean file size.

For the range of mean file sizes described above (i.e. [1 MB, 5 MB, 10 MB]), the cache capacities for the (DRAM, NVM) pairs can be written as [(0.256, 4), (0.05, 0.8), (0.012, 0.2)]. In the first pair, (0.256, 4), the DRAM can store 25.6% of the total size of the requested content trace and the NVM can store 4 times (400%) the total size of the trace. Similarly, in the second pair, the DRAM capacity is 5% and the NVM capacity is 80%, and in the third pair, they are 1.2% and 20% respectively. This covers a significant range of real-world hybrid caching configurations.

By defining the term cache capacity, we can evaluate our policy symbolically, without depending on the configuration.
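As a worked example of this definition, the first two (DRAM, NVM) pairs can be reproduced from the system configuration. The sketch assumes decimal units (1 GB = 10^9 bytes) and takes the total trace size to be the number of unique objects multiplied by the mean file size; both are assumptions of this example, as are the variable and function names.

```python
MB, GB, TB = 10**6, 10**9, 10**12

DRAM_BYTES = 256 * GB        # DRAM size from the experimental setup
NVM_BYTES = 4 * TB           # NVM size from the experimental setup
UNIQUE_OBJECTS = 1_000_000   # unique objects per trace


def cache_capacity(memory_bytes: int, mean_file_bytes: int,
                   n_objects: int = UNIQUE_OBJECTS) -> float:
    """Ratio of memory size to total trace size (assumed: n_objects x mean size)."""
    return memory_bytes / (n_objects * mean_file_bytes)


for mean_mb in (1, 5):
    dram_cap = cache_capacity(DRAM_BYTES, mean_mb * MB)
    nvm_cap = cache_capacity(NVM_BYTES, mean_mb * MB)
    print(f"mean {mean_mb} MB: DRAM {dram_cap:.3f}, NVM {nvm_cap:.1f}")
```

For a 1 MB mean file size this gives 0.256 and 4, and for 5 MB it gives about 0.051 and 0.8, in line with the first two pairs quoted above.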

Parameter              | DRAM System [23] | NVM System [24, 17]
Read (Write) Energy    | 51.2 nJ          | 102.4 (512) nJ
Read (Write) Bandwidth | 75 GB/s          | 2.2 (2.1) GB/s
Read (Write) Latency   | 75 ns            | 10 µs
Standby Power          | 1 W/GB           | 0.1 W/GB
Endurance              | Very high        | 30 DWPD
TABLE II: The parameters of NVM and DRAM

VI-B Performance Metrics

The performance metrics considered in the evaluation of the system are the cost (the initial purchase cost of the memory along with the replacement costs of memory due to wear), the power consumption of the system and the average latency of fetching the contents (the response time to user requests). These metrics are tracked in our comparison of NV-FogStore with a DRAM-only architecture.

Although cost (including replacement costs due to wear) can be difficult to measure, as it may be vendor-specific, we track the Cost Benefit Ratio (CBR; the higher the better), defined as follows:

(6)

This takes into account the different lifetimes and wear (predominantly due to writes) of DRAM and NVM in the overall cost.

Fig. 3: (a) Reads in DRAM/NVM/Misses (served by tertiary storage), (b) Writes in DRAM/NVM/Misses (served by tertiary storage) with varying parameter h, Zipf's parameter and mean file size. The X-axis labels are mean file size (in MB), DRAM cache capacity, NVM cache capacity. Cache capacity is defined as the ratio of the capacity of the memory to the total trace size.
Fig. 4: Average size of read/write requests in DRAM/NVM, with varying Zipf's parameter and mean file size. The X-axis labels are mean file size (in MB), DRAM cache capacity, NVM cache capacity. Cache capacity is defined as the ratio of the capacity of the memory to the total trace size.

VI-C Evaluation of the Hierarchical Greedy Policy

VI-C1 Write-Aware Evaluation

We run our policy on the generated traces (as described in Section VI-A) while varying the parameter h in our device-dependent model. In this subsection, we compare our policy with the previous state-of-the-art approach [8] to check that the write-aware policy actually works.

Figures 3 (a) and (b) depict the total number of reads and writes served in DRAM and NVM as the parameter h of our policy varies (from 0 to 2) on the nine trace configurations described in Section VI-A1.

In each of the nine configurations, as the parameter h increases, the fraction of writes served by NVM decreases and the fraction of reads served by NVM increases. Since reads served by NVM take more time while writes to NVM are reduced (less wear), this variation demonstrates the trade-off between access latency and endurance. Since 3GPP standards require requests to have a normalized delivery time of less than 4 milliseconds, it is possible to store several files in NVM using appropriately tuned parameters in our policy without compromising the latency requirement.

For further analysis across the configurations, as the mean file size is increased for a fixed Zipf's parameter (Figure 3), the DRAM/NVM cache capacity decreases. Hence, the number of misses increases. However, even at very low cache capacities, most of the requests are visibly served by the DRAM/NVM system.

Similarly, keeping the mean file size fixed and increasing the Zipf's parameter (from 0.7 to 0.9), more requests can be served by the DRAM, thus reducing the overall access time. This is because it is easier to estimate the popularity of each file correctly with a more skewed distribution. It can also be seen that most of the reads and writes can be served even when DRAM and NVM are a small fraction of the required capacity, and that the fraction of requests served increases with the Zipf's parameter. For modern internet traffic, the parameter lies close to 1, so we can expect most of the requests to be served in a relatively short time, with very few misses.

VI-C2 Request Size-Aware Evaluation

Fig. 4 shows the average size of the files stored in DRAM and NVM for the nine trace configurations. As seen, the average size of the files stored in NVM is much larger, so NVM is leveraged as a block device (for block reads) compared to DRAM. This demonstrates the size-aware nature of our policy.

Fig. 4 can also be analyzed for varying mean file size and Zipf's parameter. As the cache capacity decreases, the average file size decreases, because only the smallest, most popular files are given priority. As the Zipf's parameter increases, the distribution gets more skewed and the average file size that needs to be stored reduces. The reason is that content popularities decrease faster under a more skewed distribution, so the size of the files admitted reduces for the same rank.

VI-D Comparison with Optimal Offline Policy

We compare our methodology with the optimal offline policy. This is necessary to determine the effectiveness of our policy in real-time execution (where statistics of the data items, such as the popularities of the content, are determined on the go and are noisy) compared to an offline version in which the popularities, the number of writes and the sequence of requests for each content are known beforehand.

The offline optimal policy selects content with the H-GREEDY method and allocates the store based on the rank calculated from the properties known beforehand.

Figure 5 shows the comparison in detail, across all nine configurations, when the policy is run with parameter h = 2.

For simplification, we discuss average statistics across the nine configurations here. The average number of read requests served by DRAM/NVM/Misses are in ratio for the offline policy, whereas, in the online policy, they are in ratio . The average number of write requests served by DRAM/NVM/Misses are in the ratio for the offline policy, whereas, in the online policy, they are in the ratio . So, in comparison with an optimal offline policy, our online policy does comparatively well in serving read requests, although the share of write requests served by NVM, 28% in the online policy, is worse than the 22% of the optimal offline policy.

This makes sense because read requests are more numerous and read popularity estimation is therefore easier. Better heuristics may be created to close the gap between the offline and online policies.

Fig. 5: (a) Reads in DRAM/NVM/Misses (served by tertiary storage), (b) Writes in DRAM/NVM/Misses (served by tertiary storage) with h = 2, for varying Zipf's parameter and mean file size, showing a comparison of the online and offline policies. The X-axis labels are mean file size (in MB), DRAM cache capacity, NVM cache capacity. Cache capacity is defined as the ratio of the capacity of the memory to the total trace size.

VI-E Benchmarking with DRAM-Based System

Figure 6 compares NV-FogStore with a DRAM-only architecture. The parameters of the NVM and DRAM systems used for this comparison are given in Table II.

Since DRAM forms less than 10% of the capacity in NV-FogStore, both power consumption and cost are dramatically decreased. Although the average access time is increased compared to a DRAM-based system, it is about 2× lower than that of an NVM-only system. This increase in access time is acceptable because its effect on the overall delivery time is negligible by Fog-RAN standards [3].
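A back-of-the-envelope calculation illustrates why a small DRAM share lowers power. The sketch uses the standby power figures listed for the DRAM and NVM systems and, as an assumption of this example only, compares the hybrid configuration against a hypothetical DRAM-only system of the same total capacity. It covers standby power alone and does not reproduce the results of Fig. 6.

```python
DRAM_GB, NVM_GB = 256, 4000                 # 256 GB DRAM, 4 TB NVM (decimal units)
DRAM_W_PER_GB, NVM_W_PER_GB = 1.0, 0.1      # standby power of DRAM and NVM

dram_share = DRAM_GB / (DRAM_GB + NVM_GB)
hybrid_standby_w = DRAM_GB * DRAM_W_PER_GB + NVM_GB * NVM_W_PER_GB
# Hypothetical baseline: a DRAM-only system with the same total capacity.
dram_only_standby_w = (DRAM_GB + NVM_GB) * DRAM_W_PER_GB

print(f"DRAM share of capacity: {dram_share:.1%}")
print(f"Hybrid standby power:    {hybrid_standby_w:.0f} W")
print(f"DRAM-only standby power: {dram_only_standby_w:.0f} W")
```

Under these assumptions DRAM accounts for about 6% of the capacity, and the hybrid system draws roughly 656 W in standby against 4256 W for the same capacity built from DRAM alone.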

Even under different objectives, our proposed policy can be easily tuned to match the owner’s requirement. Preventing the storage of content with a large number of writes in NVM to increase lifetime is another trade-off with the latency of content fetch as it means that more popular contents will now be stored in NVM instead of DRAM.

Our scheme is configurable to explore the trade-off between the performance taken as average access latency and cost-benefit ratio.

Fig. 6: Effect of System Configuration of NV-FogStore on various Parameters such as Average Access Time, Power Consumption, Cost and Miss Rate. Miss Rate is zero for the first configuration.

VII Conclusions

In this paper, we investigate utilizing hybrid non-volatile and DRAM memory for content caching to optimize cost and power consumption in F-RAN (Fog-Radio Access Networks) and CDN (Content Delivery Networks) with storage under modern 5G workloads. An optimal caching policy takes into consideration the user requests, files recommended through user-preference learning and the local popularity of content, which determine the file trace produced and its key-value commands. Mechanisms of such a caching policy may be hand-optimized [25, 26, 27], or may incorporate machine-learned advice or value-function-based reinforcement learning techniques [28, 29, 30, 31].
Our work is focused on the design of memory policies that meet the demands of these emerging workloads while being deployed at the edge and that give a significant reduction in energy costs. For this, we proposed a write-amplification-aware, hierarchical greedy caching policy which is aware of both the size and the frequency of requested objects. Further optimizations, based on the rank-threshold product, are made to improve convergence. The performance of our policy is compared against prior policies using simulations. To the best of our knowledge, this is the first study exploring the usage of NVM in a Fog-RAN environment. Future work may involve the exploration of probabilistic caching approaches based on Dynq-LRU or simulated-annealing-based approaches for hybrid systems.

References

  • [1] G. S. Paschos, G. Iosifidis, M. Tao, D. Towsley, and G. Caire, “The role of caching in future communication systems and networks,” IEEE Journal on Selected Areas in Communications, vol. 36, no. 6, pp. 1111–1125, 2018.
  • [2] Y. Mao, C. You, J. Zhang, K. Huang, and K. B. Letaief, “A survey on mobile edge computing: The communication perspective,” IEEE Communications Surveys & Tutorials, vol. 19, no. 4, pp. 2322–2358, 2017.
  • [3] “3GPP Docs,” https://www.3gpp.org/ftp/tsg_ran/WG3_Iu/TSGR3_91bis/Docs/.
  • [4] E. Nygren, R. K. Sitaraman, and J. Sun, “The akamai network: a platform for high-performance internet applications,” ACM SIGOPS Operating Systems Review, vol. 44, no. 3, pp. 2–19, 2010.
  • [5] J. Dilley, B. Maggs, J. Parikh, H. Prokop, R. Sitaraman, and B. Weihl, “Globally distributed content delivery,” IEEE Internet Computing, vol. 6, no. 5, pp. 50–58, 2002.
  • [6] A. Eisenman, D. Gardner, I. AbdelRahman, J. Axboe, S. Dong, K. Hazelwood, C. Petersen, A. Cidon, and S. Katti, “Reducing dram footprint with nvm in facebook,” in Proceedings of the Thirteenth EuroSys Conference.   ACM, 2018, p. 42.
  • [7] P. R. Jelenković and A. Radovanović, “Optimizing lru caching for variable document sizes,” Combinatorics, Probability and Computing, vol. 13, no. 4-5, pp. 627–643, 2004.
  • [8] G. Neglia, D. Carra, and P. Michiardi, “Cache policies for linear utility maximization,” IEEE/ACM Transactions on Networking, vol. 26, no. 1, pp. 302–313, 2018.
  • [9] C. Fricker, P. Robert, and J. Roberts, “A versatile and accurate approximation for lru cache performance,” in 2012 24th International Teletraffic Congress (ITC 24).   IEEE, 2012, pp. 1–8.
  • [10] S. Shukla and A. A. Abouzeid, “Optimal device-aware caching,” IEEE Transactions on Mobile Computing, vol. 16, no. 7, pp. 1994–2007, 2016.
  • [11] G. Neglia, D. Carra, M. Feng, V. Janardhan, P. Michiardi, and D. Tsigkari, “Access-time-aware cache algorithms,” ACM Trans. Model. Perform. Eval. Comput. Syst., vol. 2, no. 4, pp. 21:1–21:29, Nov. 2017. [Online]. Available: http://doi.acm.org/10.1145/3149001
  • [12] S. Shukla and A. A. Abouzeid, “Proactive retention aware caching,” in IEEE INFOCOM 2017-IEEE Conference on Computer Communications.   IEEE, 2017, pp. 1–9.
  • [13] A. Dabirmoghaddam, M. M. Barijough, and J. Garcia-Luna-Aceves, “Understanding optimal caching and opportunistic caching at the edge of information-centric networks,” in Proceedings of the 1st ACM conference on information-centric networking.   ACM, 2014, pp. 47–56.
  • [14] H. Liu, L. Huang, Y. Zhu, and Y. Shen, “Librekv: A persistent in-memory key-value store,” IEEE Transactions on Emerging Topics in Computing, 2017.
  • [15] K. A. Bailey, P. Hornyack, L. Ceze, S. D. Gribble, and H. M. Levy, “Exploring storage class memory with key value stores,” in Proceedings of the 1st Workshop on Interactions of NVM/FLASH with Operating Systems and Workloads.   ACM, 2013, p. 4.
  • [16] B. Atikoglu, Y. Xu, E. Frachtenberg, S. Jiang, and M. Paleczny, “Workload analysis of a large-scale key-value store,” in ACM SIGMETRICS Performance Evaluation Review, vol. 40, no. 1.   ACM, 2012, pp. 53–64.
  • [17] L. Liu, M. Xie, and H. Yang, “Memos: revisiting hybrid memory management in modern operating system,” arXiv preprint arXiv:1703.07725, 2017.
  • [18] H. Liu, Y. Chen, X. Liao, H. Jin, B. He, L. Zheng, and R. Guo, “Hardware/software cooperative caching for hybrid dram/nvm memory architectures,” in Proceedings of the International Conference on Supercomputing, ser. ICS ’17.   New York, NY, USA: ACM, 2017, pp. 26:1–26:10. [Online]. Available: http://doi.acm.org/10.1145/3079079.3079089
  • [19] S. Mittal and J. S. Vetter, “A survey of software techniques for using non-volatile memories for storage and main memory systems,” IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 5, pp. 1537–1550, 2016.
  • [20] B. Jiang, P. Nain, and D. Towsley, “Lru cache under stationary requests,” ACM SIGMETRICS Performance Evaluation Review, vol. 45, no. 2, pp. 24–26, 2017.
  • [21] D. S. Berger, R. K. Sitaraman, and M. Harchol-Balter, “Adaptsize: Orchestrating the hot object memory cache in a content delivery network,” in 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17), 2017, pp. 483–498.
  • [22] J. Stevens, P. Tschirhart, M.-T. Chang, I. Bhati, P. Enns, J. Greensky, and Z. Chisti, “An integrated simulation infrastructure for the entire memory hierarchy: Cache, dram, nonvolatile memory, and disk,” Intel Technology Journal, vol. 17, no. 1, pp. 184–200, 2013.
  • [23] “JEDEC DDR4 Standards,” https://www.jedec.org/category/technology-focus-area/main-memory-ddr3-ddr4-sdram.
  • [24] “Optane Manual,” https://www.intel.com/content/dam/www/public/us/en/documents/product-briefs/optane-memory-brief.pdf.
  • [25] F. Qazi, O. Khalid, R. N. B. Rais, I. A. Khan et al., “Optimal content caching in content-centric networks,” Wireless Communications and Mobile Computing, vol. 2019, 2019.
  • [26] C. Ren, X. Lyu, W. Ni, H. Tian, and R. P. Liu, “Profitable cooperative region for distributed online edge caching,” IEEE Transactions on Communications, 2019.
  • [27] G. S. Paschos, A. Destounis, L. Vigneri, and G. Iosifidis, “Learning to cache with no regrets,” arXiv preprint arXiv:1904.09849, 2019.
  • [28] M. Tao, D. Gündüz, F. Xu, and J. S. P. Roig, “Content caching and delivery in wireless radio access networks,” arXiv preprint arXiv:1904.07599, 2019.
  • [29] J. Song, H. Song, and W. Choi, “Which one is better to cache requested contents or interfering contents?” IEEE Wireless Communications Letters, 2019.
  • [30] A. Sadeghi, F. Sheikholeslami, and G. B. Giannakis, “Reinforcement learning for adaptive caching with dynamic storage pricing,” arXiv preprint arXiv:1812.08593, 2018.
  • [31] A. Sadeghi, G. Wang, and G. B. Giannakis, “Adaptive caching via deep reinforcement learning,” arXiv preprint arXiv:1902.10301, 2019.