I Introduction
Caching content at the network edge can mitigate the heavy traffic burden at network peak times. Contents are proactively stored in caches at the Edge Nodes (ENs) or at the end-users during low-traffic periods, relieving network congestion at peak hours [1, 2]. Edge caching at the ENs can enable cooperative wireless transmission in the presence of shared cached contents across multiple ENs [4, 9, 14]. In contrast, caching of shared content at the users enables the multicasting of coded information that is useful simultaneously for multiple users [5, 6, 7, 8].
In practice, not all contents can be cached, and requested uncached contents should be fetched from a central server through finite-capacity fronthaul links. This more general set-up, illustrated in Fig. 1, was studied in [9, 10, 11, 12] (see also references therein) in the absence of users’ caches. These references consider as the performance metric of interest the overall delivery latency, including both fronthaul and wireless contributions. In particular, as in prior works [14, 15], the delivery latency is measured in the high Signal-to-Noise Ratio (SNR) regime. While [9, 10, 11] allow any form of delivery strategy, including interference alignment, in [12] the optimal high-SNR latency performance is studied under the assumption that wireless transmission can only use practical one-shot linear precoding strategies. Reference [12] presents a caching-fronthaul-wireless transmission scheme that is shown to be latency-optimal within a constant multiplicative factor.
In this paper, we extend the results in [12] by allowing caching not only at the ENs but also at the end-users. To this end, we consider the cloud-RAN scenario in Fig. 1 and evaluate the impact of both cooperative transmission opportunities at the ENs and multicasting opportunities brought by caching at the users. Caching at the users has, in fact, two potentially beneficial effects on the network performance. First, since users have already cached some parts of the library, they need not receive from the network the cached portion of the requested file; this is known as the local caching gain. Second, with a careful cache content placement, a common coded message can benefit more than one user, which is known as the global caching gain [5]. Assuming that the entire library is cached across ENs and users and that the fronthaul links are absent, reference [14] proved that the gains accrued from cooperative transmission by the ENs and the global caching gain provided by users’ caches are additive. Here, we generalize this conclusion by considering the role of finite-capacity fronthaul links and by allowing for partial caching of the library of popular files across ENs and users.
The rest of the paper is organized as follows. In Section II we describe the system model. Section III presents the main results along with an intuitive discussion. In Section IV we detail the proposed caching and delivery scheme. Then, we derive a converse in Section V, which is proved to be within a multiplicative gap of 3/2 as compared to the high-SNR performance achievable by the proposed scheme. Finally, Section VI concludes the paper.
II System Model
We consider a content delivery scenario, illustrated in Fig. 1, in which Edge Nodes (ENs), each with antennas, deliver requested contents to single-antenna users via a shared wireless medium. The contents library includes files, each of bits, which are collected in set . Furthermore, each file is divided into packets, collected in the set , where is an arbitrary integer and each packet consists of bits. Each EN is connected to a central server, where the library resides, via a wired fronthaul link of capacity bits per symbol of the wireless channel. Moreover, each EN is equipped with a cache of size files, for . In this paper, in contrast to [12], we assume that the users are also cache-enabled, each with a cache of size files, for . Henceforth, for simplicity, we assume that both and are integers, with extensions following directly as in [9, 12, 14].
The system operation includes two phases, namely the cache content placement and content delivery phases. In the first phase, each EN and each user caches uncoded fractions of the files in the library at network off-peak traffic hours and without knowing the actual requests of the users in the next phase. In the second phase, at network peak traffic hours, at any transmission slot, each active user requests access to one of the files in the library, i.e., user requests file , . For delivery, first, the cloud sends on each fronthaul link some uncoded fractions of the requested files to the ENs. For more general ways to use the fronthaul links, we refer to [9]. After fronthaul transmission, the ENs collaboratively deliver the requested contents to the users via the edge wireless downlink channel based on the cached contents and fronthaul signals.
The signal received by each user on the downlink channel is given as
(1) |
in which is the complex representation of the fading channel vector from EN to user ; is the transmitted vector from EN ; is unit-power additive Gaussian noise; and represents the Hermitian transpose. The fading channels are drawn from a continuous distribution and are constant in each transmission slot. The transmission power of each EN is constrained by . Furthermore, as in [12, 14], the ENs transmit using one-shot linear precoding, so that the vector transmitted by each EN at time slot is given as
(2) |
where is a symbol encoding file fraction , and is the corresponding beamforming vector. Furthermore, we assume that Channel State Information (CSI) is available to all the entities in the network.
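As a reading aid, the verbal definitions around (1) and (2) suggest a signal model of the following shape. This is only a sketch: the symbol names ($K_T$ for the number of ENs, $\mathbf{h}_{ki}$ for the channel from EN $i$ to user $k$, $\mathbf{x}_i$, $z_k$, $s_{nf}$, $\mathbf{v}_{i,nf}$, and the index set $\mathcal{S}_i$ of file fractions that EN $i$ transmits in the slot) are assumptions made for illustration and may differ from the authors' notation.

```latex
% Assumed notation; a sketch of the one-shot linear precoding model.
y_k = \sum_{i=1}^{K_T} \mathbf{h}_{ki}^{H}\,\mathbf{x}_i + z_k ,
\qquad
\mathbf{x}_i = \sum_{(n,f)\in\mathcal{S}_i} \mathbf{v}_{i,nf}\, s_{nf} .
```

"One-shot" here means that each symbol is sent in a single channel use, without coding across symbol extensions of the channel.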
The performance metric of interest is the Normalized Delivery Time (NDT) introduced in [9], which measures the high-SNR latency due to fronthaul and wireless transmissions. To this end, we write , hence allowing the fronthaul capacity to scale with the wireless edge with a scaling constant . Then, denoting the time required to complete the fronthaul and wireless edge transmissions as and (measured in symbol periods of the wireless channel) respectively, the total NDT is defined as the following limit over SNR and file size
(3) | ||||
In (3), the term represents the normalizing delivery time on an interference-free channel; the term is defined as the fronthaul NDT; and as the edge NDT.
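For orientation, a definition consistent with the description above, and with the NDT of [9], can be sketched as follows. The symbols are assumptions for illustration: $P$ denotes the SNR (power constraint), $L$ the file size in bits, $r$ the fronthaul scaling constant, and $T_F$, $T_E$ the fronthaul and edge transmission times.

```latex
% Assumed notation; a sketch of the NDT decomposition.
C_F = r \log P , \qquad
\delta = \lim_{P\to\infty}\,\lim_{L\to\infty}\,
         \frac{T_F + T_E}{L/\log P}
       = \delta_F + \delta_E ,
\qquad
\delta_F = \lim_{P\to\infty}\lim_{L\to\infty} \frac{T_F}{L/\log P}, \quad
\delta_E = \lim_{P\to\infty}\lim_{L\to\infty} \frac{T_E}{L/\log P}.
```

Under this reading, an edge NDT of 2 means that wireless delivery takes twice as long as sending one file over an interference-free link at high SNR.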
Accordingly, for given cloud and caching resources defined by the triple , the minimal NDT over all achievable policies is defined as
(4) |
where the infimum is over all uncoded caching, uncoded fronthaul, and one-shot linear edge transmission policies that ensure reliable delivery for any set of requested files [12, 9].
III Main Result
In this section we state our main result and its implications. We proceed by first proposing an achievable scheme and then proving its optimality within a constant multiplicative gap.
In the cache content placement phase, the scheme follows the standard approach of sharing a distinct fraction of each file with every subset of ENs and users of the appropriate sizes, hence satisfying the cache capacity constraints [14]. As a result, each fraction of any requested file is available at users, which we define as receive-side multiplicity, and at ENs. As we will see, in the content delivery phase, the transmit-side multiplicity , i.e., the number of ENs at which any fraction of a requested file is available, can be increased beyond by means of fronthaul transmission.
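The subset-based placement described above can be sketched in a few lines of Python. The function names and the parameters `m_en` and `m_user` (integer transmit-side and receive-side multiplicities obtained from caching alone) are hypothetical and chosen for illustration; the sketch only labels fractions and checks cache occupancy, it does not model file contents.

```python
from itertools import combinations
from fractions import Fraction


def place_fractions(num_ens, num_users, m_en, m_user):
    """Label one distinct file fraction per (EN subset, user subset) pair."""
    return [(ens, users)
            for ens in combinations(range(num_ens), m_en)
            for users in combinations(range(num_users), m_user)]


def cached_share(fractions, node, side):
    """Share of a file held by one EN (side=0) or one user (side=1)."""
    held = sum(1 for frac in fractions if node in frac[side])
    return Fraction(held, len(fractions))


# Illustrative sizes (assumed): 3 ENs, 4 users, multiplicities 2 and 1.
fractions = place_fractions(num_ens=3, num_users=4, m_en=2, m_user=1)
en_share = cached_share(fractions, node=0, side=0)    # m_en / num_ens
user_share = cached_share(fractions, node=0, side=1)  # m_user / num_users
```

By symmetry every EN stores the same share `m_en / num_ens` of each file and every user stores `m_user / num_users`, which is how the placement meets the cache capacity constraints when the multiplicities equal the cache sizes scaled by the number of nodes.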
As proved in [14], and briefly reviewed below, the content multiplicities and can be leveraged in order to derive a delivery scheme that serves simultaneously
(5) |
users at the maximum high-SNR rate of log(SNR). Unlike [14], however, here the transmit-side multiplicity is not fixed, since any uncached fraction of a file can be delivered to an EN by the cloud on the fronthaul. The multiplicity can hence be increased at the cost of a larger fronthaul delay . Therefore, the multiplicity should be chosen carefully, by accounting for the fronthaul latency as well as for the wireless NDT , which decreases with the number of users that can be served simultaneously. Our main result below obtains an approximately optimal solution in terms of minimum NDT.
Before detailing the main result, we briefly present how the scheme in [14] serves users simultaneously at rate log(SNR) by leveraging both multicasting and cooperative Zero-Forcing (ZF) precoding. Assume that (the complement case follows in a similar way). At any given time, ENs transmit simultaneously to deliver fractions of the requested files to users. To this end, the active ENs are grouped into all subsets of active ENs. Note that there are such groups, and that each EN generally belongs to multiple groups. All groups transmit at the same time, with each group collaboratively delivering a shared fraction of a file to the requesting user. Transmission by a group is done within the null space of the channel of other active users by means of ZF one-shot linear precoding. The interference created by this transmission to the remaining active users is removed by leveraging the information in the receive-side caches. This is possible since the caching strategy ensures that the message transmitted by a group of ENs is also available to users. Note that the scheme in [14] assumes but the extension described above is straightforward.
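The two interference-management mechanisms reviewed above can be illustrated with a small sketch. The first part builds a ZF beam for the simplest case of two cooperating transmit antennas nulling one unintended user. The second part counts served users under the assumption that a group with `antennas_per_en * m_en` antennas serves one intended user, nulls `antennas_per_en * m_en - 1` users by ZF, and relies on `m_user` users cancelling interference from their caches. That count is an assumed reading of (5), and all names are hypothetical.

```python
import math
import random


def inner(h, v):
    """Return h^H v for complex vectors given as Python lists."""
    return sum(a.conjugate() * b for a, b in zip(h, v))


def zf_beam_two_antennas(h_unintended):
    """Unit-norm beam orthogonal (in the h^H v sense) to a 2-antenna channel."""
    h0, h1 = h_unintended
    v = [h1.conjugate(), -h0.conjugate()]
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / norm for x in v]


def users_served(num_users, antennas_per_en, m_en, m_user):
    """Assumed count: intended user + ZF-nulled users + cache-cancelling users."""
    return min(num_users, antennas_per_en * m_en + m_user)


rng = random.Random(0)  # fixed seed for reproducibility
h_target = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
h_other = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]

v = zf_beam_two_antennas(h_other)
leak = abs(inner(h_other, v))    # interference at the nulled user
gain = abs(inner(h_target, v))   # useful signal at the intended user
```

With channels drawn from a continuous distribution, `leak` is zero up to rounding while `gain` is nonzero with probability one, which is what allows each served user to receive at a rate scaling as log(SNR).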
Based on the above-mentioned achievable scheme, along with an optimized transmit-side multiplicity, the following theorem characterizes the minimum NDT (4) to within a multiplicative constant equal to 3/2.
Theorem 1 (Multiplicative gap on minimum NDT).
The NDT
(6) |
is achievable, where
(7) |
and
(8) |
with
(9) |
and
(10) |
Moreover, the minimum NDT satisfies the inequalities
(11) |
Theorem 1 implies that the multiplicity in (7) is optimal in terms of NDT, up to a constant multiplicative gap. Importantly, in contrast to [12], the choice of in (7) depends also on the caching capacity at the users, and it reduces to the selection in [12] when . The first term in (6) is the fronthaul NDT required to convey the uncached portions of files to achieve the desired multiplicity in (7). The second term is the edge transmission NDT , which accounts for the local caching gain (i.e., ), and for the combined global caching gain due to the users’ caches and for the cooperation gain due to the ENs’ caches and to fronthaul transmission (i.e., ). The result hence generalizes the main conclusion from [13] and [14] that the gains from coded caching multicasting opportunities at the receive side and cooperation at the transmit side are additive.
Example 1.
The achievable NDT in (6), along with the lower bound derived in Lemma 1 in Section V, are plotted in Fig. 2 as a function of the users’ cache capacity for different values of the parameters , and . We set the number of ENs and users to and and the fronthaul rate to . Note that for non-integer values of , the achievable NDT is obtained by memory sharing between the receive-side multiplicities and [16]. It is observed that caching at the end-users is more effective when the number of EN transmit antennas and/or the transmit-side caches are small. Furthermore, when the transmit-side multiplicity is sufficient to serve all users at the same time, end-user caching only provides local caching gains. In particular, this happens when and , in which case the NDT is seen to decrease linearly with .
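The memory-sharing step mentioned in Example 1 can be sketched as a linear interpolation between the two neighbouring integer multiplicities. The helper below is a generic illustration: `ndt_at_integer` is a hypothetical callable returning the achievable NDT at an integer receive-side multiplicity, and the toy curve in the usage lines is made up for demonstration, not taken from the paper.

```python
import math


def memory_sharing_ndt(ndt_at_integer, t):
    """NDT at a possibly non-integer multiplicity t via memory sharing.

    Each file is split into two parts: a share alpha is handled with
    multiplicity floor(t) and a share 1 - alpha with ceil(t), where alpha
    is chosen so that the average multiplicity equals t.
    """
    lo, hi = math.floor(t), math.ceil(t)
    if lo == hi:
        return ndt_at_integer(lo)
    alpha = hi - t  # share of each file served with multiplicity lo
    return alpha * ndt_at_integer(lo) + (1 - alpha) * ndt_at_integer(hi)


def toy_curve(m):
    """Hypothetical decreasing NDT curve, for illustration only."""
    return 10 - 2 * m


halfway = memory_sharing_ndt(toy_curve, 1.5)
```

Because the two parts are delivered one after the other, their latencies add, so the achievable curve is the piecewise-linear interpolation of the integer points.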
IV Achievable Scheme
The achievable scheme generalizes the strategies proposed in [12] and [14] by accounting for fronthaul transmission and for the caches available at the users. As discussed in Section III, the cache content placement phase uses the same approach proposed in [14], which guarantees content replication of and at the transmit and receive sides, respectively.
In the content delivery phase, fronthaul transmission provides packets from the requested files to the ENs in order to increase the transmit-side multiplicity to the desired value . This is at the cost of the fronthaul delay
(12) |
given that bits need to be delivered for each requested file (see also [12]).
Based on the multiplicities and , the number of users that can be served at the same time is given by (5). Since each user has cached a -fraction of its requested file, the edge NDT is given by [12]
(13) |
The transmit-side multiplicity should be tuned such that the total delivery latency is minimized. First, we determine the maximum possible multiplicity from the following necessary conditions
(14) |
which result in given in (10). To proceed, we first focus on the case of , and find a close-to-optimal multiplicity . Then, based on the expression for , we propose a specific choice for the multiplicity for the general case where .
To start, in the case when , from (12) and (13), the total NDT is
(15) |
In order to optimize over , we find the (only) stationary point of the function in (15) as
(16) |
We then approximate the integer solution of the original problem to be the nearest positive integer smaller than , yielding (8).
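The optimization step above can be made concrete under an assumed functional form. Suppose the total NDT, as a function of the multiplicity $m$, has the shape $f(m) = a\,m + b/(c\,m + d)$, i.e., a fronthaul term growing linearly in $m$ plus an edge term decaying with $m$. The constants $a, b, c > 0$ and $d \ge 0$ are placeholders introduced here for illustration and are not the paper's expressions.

```latex
% Generic sketch with placeholder constants a, b, c, d.
f(m) = a\,m + \frac{b}{c\,m + d}, \qquad
f'(m) = a - \frac{b\,c}{(c\,m + d)^{2}} = 0
\;\Longrightarrow\;
m^{\star} = \frac{1}{c}\left(\sqrt{\frac{b\,c}{a}} - d\right),
\qquad
f''(m) = \frac{2\,b\,c^{2}}{(c\,m + d)^{3}} > 0 .
```

Since $f$ is convex wherever $c\,m + d > 0$, the stationary point is the unique minimizer over the reals, and rounding $m^{\star}$ down to a positive integer gives an integer candidate of the kind described in the text.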
For the general case , we propose the choice (7) for the transmit-side multiplicity. Accordingly, when , and hence the transmit-side caches are small, packets are sent over the fronthaul links so that the aggregate multiplicity is equal to the value selected above when . For the case , instead, the transmit-side multiplicity (7) only relies on EN caching, and fronthaul transmission is not carried out. In particular, when , the maximum multiplicity can be guaranteed directly by EN caching. Theorem 1 demonstrates the near-optimality of this choice.
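The case distinction described above can be summarized in a short sketch. It assumes the rule amounts to taking the larger of the multiplicity selected for the no-EN-cache case and the multiplicity already provided by the EN caches, capped at the maximum useful multiplicity. The function name, argument names, and the explicit cap are assumptions for illustration and are not the paper's formula (7).

```python
def choose_transmit_multiplicity(m_no_cache, m_cached, m_max):
    """Pick a transmit-side multiplicity and report whether fronthaul is used.

    m_no_cache: multiplicity selected for the case without EN caches
    m_cached:   multiplicity already guaranteed by EN caching alone
    m_max:      largest multiplicity that still helps (assumed given)
    """
    target = min(m_max, max(m_no_cache, m_cached))
    # Fronthaul is needed only to raise the multiplicity above what is cached.
    uses_fronthaul = target > m_cached
    return target, uses_fronthaul


small_cache = choose_transmit_multiplicity(m_no_cache=3, m_cached=1, m_max=5)
large_cache = choose_transmit_multiplicity(m_no_cache=3, m_cached=4, m_max=5)
```

In the first call the EN caches are small and fronthaul transmission tops the multiplicity up; in the second the caches alone already exceed the selected value and no fronthaul transmission takes place.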
An illustration of how the user cache capacity affects the transmit-side multiplicity is shown in Fig. 3 for , , . As increases, user-side caching becomes more effective, and less EN-side cooperation is needed to null out interference. Accordingly, the transmit-side multiplicity decreases with , and it depends on only when is sufficiently small.
V Multiplicative Optimality
In this section we demonstrate that the achievable NDT in Theorem 1 is within a multiplicative constant gap of the minimum NDT by proving (11). To this end, we extend the converse proof developed in [12] in order to account for the presence of users’ caches. First, without loss of generality, consider a split of each file into subsets of packets , such that each part is indexed by the subsets of indices and . The subset includes the packets of that are present at all the ENs after fronthaul transmissions and at all the users . We also define as the number of packets of file that are cached at all the ENs in , and at all the users in ; and as the number of packets from file that are transmitted to all the ENs in via the fronthaul for a given demand vector and cached at all the users in . Note that these quantities are well-defined for every policy. With these definitions, the NDT of any achievable policy can be lower bounded by the solution to the following optimization problem:
(17) | ||||
(18) |
where function is implicitly defined as the minimum edge NDT in (3) for given cache and fronthaul policies when the request vector is , while is a function for that satisfies conditions (e) and (f). We have also defined
(19) |
In (17), the equality (b) guarantees the availability of all the requested files; inequalities (c) are due to the fact that the size of the cached content of each EN is limited by the cache capacity ; similarly, inequalities (d) enforce the cache capacity constraint at each user ; inequalities (e) follow from the definition of fronthaul NDT in (3), since the left-hand side is the number of packets sent to EN via the fronthaul link; in (f), is upper bounded by since the maximum multiplicity is and hence the total number of bits required via fronthaul is .
In (17), the expression of function is generally unknown. Notwithstanding this complication, the following lemma gives a lower bound on the solution of the above optimization problem. The proof can be found in Appendix A.
Lemma 1.
The minimum value of the optimization problem in (17) is lower bounded by
(20) |
where we have defined
(21) |
VI Conclusions
For a cache-enabled cloud-RAN architecture where both the ENs and the end-users have caches, this paper has characterized the minimum delivery latency in the high-SNR regime to within a multiplicative gap of 3/2. Under the practical constraint that the ENs can only transmit using one-shot linear precoding, the main result shows that the cooperation gains accrued by EN cooperation via EN caching and fronthaul transmission are additive with respect to the multicasting gains offered by end-user caching.
Acknowledgements
Jingjing Zhang and Osvaldo Simeone have received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 Research and Innovation Programme (Grant Agreement No. 725731).
References
- [1] J. Kangasharju, J. Roberts, and K. Ross, “Object Replication Strategies in Content Distribution Networks,” Computer Communications, vol. 25, no. 4, pp. 376–383, 2002.
- [2] E. Nygren, R. K. Sitaraman, and J. Sun, “The Akamai Network: A Platform for High-Performance Internet Applications,” ACM SIGOPS Operating Systems Review, vol. 44, no. 3, pp. 2–19, 2010.
- [3] K. Shanmugam, N. Golrezaei, A. G. Dimakis, A. F. Molisch and G. Caire, “FemtoCaching: Wireless Content Delivery Through Distributed Caching Helpers,” IEEE Trans. Inf. Theory, vol. 59, no. 12, pp. 8402–8413, Dec. 2013.
- [4] A. Liu and V. K. N. Lau, “Exploiting Base Station Caching in MIMO Cellular Networks: Opportunistic Cooperation for Video Streaming,” IEEE Trans. Signal Process., vol. 63, no. 1, pp. 57–69, Jan. 2015.
- [5] M. A. Maddah-Ali and U. Niesen, “Fundamental limits of caching,” IEEE Trans. Inf. Theory, vol. 60, no. 5, pp. 2856–2867, 2014.
- [6] N. Karamchandani, U. Niesen, M. A. Maddah-Ali, and S. N. Diggavi, “Hierarchical Coded Caching,” IEEE Trans. Inf. Theory, vol. 62, no. 6, pp. 3212–3229, 2016.
- [7] R. Pedarsani, M. A. Maddah-Ali, and U. Niesen, “Online Coded Caching,” IEEE/ACM Trans. Netw., vol. 24, no. 2, pp. 836–845, Apr. 2016.
- [8] J. Zhang and P. Elia, “Fundamental limits of cache-aided wireless BC: Interplay of coded-caching and CSIT feedback,” IEEE Trans. Inf. Theory, vol. 63, no. 5, pp. 3142–3160, 2017.
- [9] A. Sengupta, R. Tandon, and O. Simeone, “Fog-aided wireless networks for content delivery: Fundamental latency tradeoffs,” IEEE Trans. Inf. Theory, vol. 63, no. 10, pp. 6650–6678, Oct. 2017.
- [10] R. Tandon and O. Simeone, “Cloud-aided wireless networks with edge caching: Fundamental latency trade-offs in fog radio access networks,” in Proc. IEEE Int. Symp. on Inf. Theory (ISIT), pp. 2029–2033, 2016.
- [11] S. M. Azimi, O. Simeone, A. Sengupta, and R. Tandon, “Online Edge Caching in Fog-Aided Wireless Network,” in Proc. IEEE Int. Symp. on Inf. Theory (ISIT), pp. 1217–1221, 2017.
- [12] J. Zhang and O. Simeone, “Fundamental Limits of Cloud and Cache-Aided Interference Management with Multi-Antenna Base Stations,” in Proc. IEEE Int. Symp. on Inf. Theory (ISIT), pp. 1425–1429, 2018.
- [13] S. P. Shariatpanahi, S. A. Motahari and B. H. Khalaj, “Multi-Server Coded Caching,” IEEE Trans. Inf. Theory, vol. 62, no. 12, pp. 7253–7271, Dec. 2016.
- [14] N. Naderializadeh, M. A. Maddah-Ali, and A. S. Avestimehr, “Fundamental Limits of Cache-Aided Interference Management,” IEEE Trans. Inf. Theory, vol. 63, no. 5, pp. 3092–3107, May 2017.
- [15] M. A. Maddah-Ali and U. Niesen, “Cache-aided interference channels,” in Proc. IEEE Int. Symp. on Inf. Theory (ISIT), pp. 809–813, 2015.
- [16] H. Ghasemi and A. Ramamoorthy, “Improved Lower Bounds for Coded Caching,” IEEE Trans. Inf. Theory, vol. 63, no. 7, pp. 4388–4413, July 2017.
Appendix A Proof of Lemma 1
To prove Lemma 1, we substitute the maximum over all the possible request vectors in the objective of (17) with an average over them. Since a maximum is never smaller than an average, the solution of the resulting problem yields a lower bound on the solution of the original problem. Mathematically, the objective (a) in (17) is substituted with
(22) | ||||
where we have defined . In order to deal with the unknown function , we need the following lemma.
Lemma 3.
Define as the subset of edge nodes that have access to the packet after the fronthaul transmission, and as the subset of users that have cached the packet . Then, the number of users that can be served at the same time is upper bounded by
(23) |
Proof.
Using the above lemma, we obtain the following lower bound on the minimum edge NDT:
(24) | ||||
since at most users can be served simultaneously when the multiplicities at the ENs and the users are and , respectively. Now we lower bound the first term in (22) as
where (a) holds because of Lemma 3; in (b) we have defined
(25) |
in (c) we have defined
(26) |
in (d) we have used the inequality
(27) |
with and ; and finally (e) results from the equality . This results from summing up (18b) for all request vectors and for all files in each request vector as follows:
(28) | ||||
Now we lower bound the term related to the fronthaul delay as follows:
(29) | ||||