I Introduction
A key challenge for future wireless networks is the increasing video traffic demand, which reached 70% of total mobile IP traffic in 2015 [2]. Classical downlink systems cannot meet this demand since they have limited resource blocks: as the number $N$ of simultaneous video transfers increases, the per-video throughput vanishes as $1/N$. Recently it was shown that scalable per-video throughput can be achieved if the communications are synergistically designed with caching at the receivers. Indeed, the recent breakthrough of coded caching [3] has inspired a rethinking of the wireless downlink. Different video subfiles are cached at the receivers, and video requests are served by coded multicasts. By careful selection of subfile caching and exploitation of the wireless broadcast channel, the transmitted signal is simultaneously useful for decoding at users who requested different video files. This scheme has been theoretically proven to scale well, and therefore has the potential to resolve the downlink bottleneck of future networks. Nevertheless, several limitations hinder its applicability in practical systems [4]. In this work, we take a closer look at the limitations that arise from the fact that coded caching was originally designed for a symmetric error-free shared link.
If instead we consider a realistic wireless channel, we observe that coded caching faces a short-term limitation. Namely, its performance is limited by the user in the worst channel condition, because the wireless multicast capacity is determined by the worst user [5, Chapter 7.2]. This is in stark contrast with standard downlink techniques such as opportunistic scheduling [6, 7, 8], which serve the user with the best instantaneous channel quality. Thus, a first challenge is to modify coded caching so that it exploits fading peaks, as opportunistic scheduling does.
In addition to the fast fading consideration, there is also a long-term limitation due to the network topology. Namely, ill-positioned users, e.g. users at the cell edge, may experience consistently poor channel quality during a whole video delivery. The classical coded caching scheme is designed to provide video files at equal data rates to all users, which leads to ill-positioned users consuming most of the air time and hence driving the overall system performance to low efficiency. In the literature of wireless scheduling without caches at receivers, this problem has been resolved by the use of fairness among user throughputs [7]. By allowing poorly located users to receive less throughput than others, precious air time is saved and the overall system performance is greatly increased. Since the sum throughput rate and egalitarian fairness are typically the two extreme objectives, past works have proposed the use of alpha-fairness [9], which allows one to select the coefficient $\alpha$ and drive the system to any desirable tradeoff point between the two extremes. Previously, alpha-fair objectives have been studied in the context of (i) multiple user activations [6], (ii) multiple antennas [10] and (iii) broadcast channels [11]. However, in the presence of caches at user terminals, the fairness problem is further complicated by the interplay between user scheduling and designing codewords for multiple users. In particular, we wish to shed light on the following questions: Which user requests shall we combine together to perform coded caching? How shall we schedule a set of users to achieve our fairness objective while adapting to time-varying channel quality?
To address these questions, we study content delivery over a realistic block-fading broadcast channel, where the channel quality varies across users and time. Although the decisions of user scheduling and codeword design are inherently coupled, we design a scheme which decouples these two problems while maintaining optimality through a specifically designed queueing structure. On the transmission side, we select the multicast user set dynamically depending on the instantaneous channel quality and on user urgency, captured by queue lengths. On the coding side, we adapt the codeword construction of [1] to the set of users chosen by the appropriate routing, which also depends on the past transmission-side decisions. Combined with an appropriate congestion controller, this approach is shown to achieve our alpha-fair objective.
More specifically, our approaches and contributions are summarized below:

We design a novel queueing structure which decouples the channel scheduling from the codeword construction. Although it is clear that the codeword construction needs to be adaptive to channel variation, our scheme ensures this through our backpressure that connects the user queues and the codeword queues. Hence, we are able to show that this decomposition is without loss of optimality (see Theorem 6).

We then provide an online policy consisting of (i) admission control of new files into the system; (ii) combination of files to perform coded caching; (iii) scheduling and power control of codeword transmissions to subsets of users on the wireless channel. We prove that the long-term video delivery rate vector achieved by our scheme is a near-optimal solution to the alpha-fair optimization problem under the restriction to policies that are based on the decentralized coded caching scheme [1].
Through numerical examples, we demonstrate the superiority of our approach versus (a) standard coded caching with multicast transmission, which is limited by the worst channel condition yet exploits the global caching gain, and (b) opportunistic scheduling with unicast transmissions, which exploits only the local caching gain. This shows that our scheme is not only the best among online decentralized coded caching schemes, but also manages to opportunistically exploit the time-varying fading channels.
I-A Related Work
Since coded caching was first introduced in [3] and its potential was recognized by the community, substantial efforts have been devoted to quantifying the gain in realistic scenarios, including decentralized placement [1], non-uniform popularities [12, 13], and more general network topologies (e.g. [14, 15, 16]). A number of recent works have studied coded caching by replacing the original perfect shared link with wireless channels [17, 18, 19, 20, 21, 22]. In particular, the harmful effect of coded caching over wireless multicast channels has been highlighted recently [23, 17, 18, 21], while similar conclusions and some directions are given in [23, 17, 18, 20]. Although [23] considers the same channel model and addresses a similar question as the current work, the two differ in their objectives and approaches. [23] highlights the scheduling part and provides a rigorous analysis of the long-term average per-user rate in the regime of a large number of users. In the current work, a new queueing structure is proposed to deal jointly with admission control, routing, and scheduling for a finite number of users.
Furthermore, most existing works have focused on offline caching, where both the cache placement and delivery phases are performed once, without capturing the random and asynchronous nature of video traffic. The works [24, 25] partly addressed the online aspect by studying cache eviction strategies and the delivery delay, respectively. In this work, we explore a different online aspect. Namely, we assume that the file requests from users arrive dynamically and that file delivery is performed continuously over time-varying fading broadcast channels.
Finally, online transmission scheduling over wireless channels has been extensively studied in the context of opportunistic scheduling [6] and network utility maximization [26]. Prior works emphasize two fundamental aspects: (a) the balancing of user rates according to fairness and efficiency considerations, and (b) the opportunistic exploitation of the time-varying fading channels. Some works study scheduling policies over a queued-fading downlink channel; [27] gives a max-weight-type policy and [28] provides a throughput-optimal policy based on a fluid limit analysis. To our knowledge, our work is the first to study coded caching in this setting. The new element in our study is the joint consideration of user scheduling with codeword construction for the coded caching delivery phase.
II Coded Caching over Wireless Channels
II-A System Model
We consider a content delivery network where a server (or a base station) wishes to convey requested files to $K$ user terminals over a wireless channel; in Fig. 1 we give an example with $K=3$. The wireless channel is modeled by a standard block-fading broadcast channel, such that the channel state remains constant over a slot and changes from one slot to another in an i.i.d. manner. Each slot is assumed to allow for $T_{\rm slot}$ channel uses. The channel output of user $k$ in any channel use of slot $t$ is given by
(1) $\mathbf{y}_k(t) = \sqrt{h_k(t)}\,\mathbf{x}(t) + \boldsymbol{\nu}_k(t),$
where the channel input $\mathbf{x}(t) \in \mathbb{C}^{T_{\rm slot}}$ is subject to the power constraint $\mathbb{E}[\|\mathbf{x}(t)\|^2] \le P\,T_{\rm slot}$; $\boldsymbol{\nu}_k(t) \sim \mathcal{N}_{\mathbb{C}}(\mathbf{0}, \mathbf{I}_{T_{\rm slot}})$ are additive white Gaussian noises with covariance matrix identity of size $T_{\rm slot}$, assumed independent of each other; $\{h_k(t)\}$ are channel fading coefficients independently distributed across time. At each slot $t$, the channel state $\mathbf{h}(t) = (h_1(t), \dots, h_K(t))$ is perfectly known to the base station while each user knows its own channel realization.
We follow the network model considered in [3] as well as its follow-up works. The server has access to $N$ equally popular files $W_1, \dots, W_N$, each $F$ bits long, while each user $k$ is equipped with a cache memory $Z_k$ of $MF$ bits, where $M \in \{0, 1, \dots, N\}$. We restrict ourselves to decentralized cache placement [1]. More precisely, each user $k$ independently caches a subset of $MF/N$ bits of file $i$, chosen uniformly at random for $i = 1, \dots, N$, under its memory constraint of $MF$ bits. For later use, we let $m = M/N$ denote the normalized memory size. By letting $W_{i|\mathcal{J}}$ denote the subfile of $W_i$ stored exclusively in the cache memories of the user set $\mathcal{J}$, the cache memory $Z_k$ of user $k$ after decentralized placement is given by
(2) $Z_k = \{ W_{i|\mathcal{J}} : \ \forall \mathcal{J} \subseteq \{1,\dots,K\} \text{ with } k \in \mathcal{J}, \ \forall i = 1, \dots, N \}.$
Under the assumption of large file size ($F \to \infty$), we use the law of large numbers to calculate the size of each subfile (measured in bits) as follows:
(3) $|W_{i|\mathcal{J}}| = m^{|\mathcal{J}|} (1-m)^{K-|\mathcal{J}|} F.$
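As an illustration of the subfile sizes under decentralized placement, the following minimal sketch assumes that every bit of a file is cached independently by each user with probability $m$, so that the expected size of the subfile held exclusively by a user set of size $j$ is $m^{j}(1-m)^{K-j}F$. The function name `subfile_sizes` and the numerical values are illustrative assumptions, not part of the original scheme description.

```python
from itertools import combinations

def subfile_sizes(num_users, m, file_bits):
    """Expected size (in bits) of the subfile cached exclusively by each user subset.

    Assumption: each bit is cached independently by every user with probability m.
    """
    sizes = {}
    users = range(1, num_users + 1)
    for size in range(num_users + 1):
        for subset in combinations(users, size):
            sizes[frozenset(subset)] = (m ** size) * ((1 - m) ** (num_users - size)) * file_bits
    return sizes

# Illustrative values: K = 3 users, normalized memory m = 1/2, F = 8000 bits.
sizes = subfile_sizes(num_users=3, m=0.5, file_bits=8000)
total_bits = sum(sizes.values())  # the 2^K subfiles partition the file
```

Since the $2^K$ subsets (including the empty set) partition the file, the sizes add up to $F$ by the binomial theorem.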
Once the requests of all users are revealed, decentralized coded caching proceeds to the delivery of the requested files (delivery phase). Assuming that user $k$ demands file $d_k$, and writing $\mathbf{d} = (d_1, \dots, d_K)$, the server generates and conveys the following codeword simultaneously useful to the subset of users $\mathcal{J}$:
(4) $V_{\mathcal{J}} = \bigoplus_{k \in \mathcal{J}} W_{d_k | \mathcal{J} \setminus \{k\}},$
where $\oplus$ denotes the bit-wise XOR operation. The central idea of coded caching is to create a codeword simultaneously useful to a subset of users by exploiting the receiver side information established during the placement phase. This multicasting operation leads to a gain: consider the uncoded delivery, in which the subfiles are sent sequentially. The total number of transmitted bits intended to the $|\mathcal{J}|$ users is equal to $|\mathcal{J}| \times |W_{i|\mathcal{J}\setminus\{k\}}|$. The coded delivery requires the transmission of only $|W_{i|\mathcal{J}\setminus\{k\}}|$ bits, yielding a reduction by a factor of $|\mathcal{J}|$. It can be shown that the transmitted signal as per (4) can be decoded correctly with probability 1 by all intended receivers. In order to further illustrate the placement and delivery of decentralized coded caching, we provide a three-user example in Fig. 1.
Example 1.
Let us assume that users 1, 2, 3 request files $A$, $B$, $C$, respectively. After the placement phase, a given file $A$ is partitioned into 8 subfiles, one per user subset. The codewords to be sent are the following:

$A_{\emptyset}$, $B_{\emptyset}$ and $C_{\emptyset}$ to users $1$, $2$ and $3$, respectively.

$A_2 \oplus B_1$ is intended to users $\{1,2\}$. Once received, user $1$ decodes $A_2$ by combining the received codeword with $B_1$ given in its cache. Similarly user $2$ decodes $B_1$. The same holds for codeword $B_3 \oplus C_2$ to users $\{2,3\}$ and codeword $A_3 \oplus C_1$ to users $\{1,3\}$, respectively.

$A_{23} \oplus B_{13} \oplus C_{12}$ is intended to users $\{1,2,3\}$. User $1$ can decode $A_{23}$ by combining the received codeword with $\{B_{13}, C_{12}\}$ given in its cache. The same approach is used for users $2$ and $3$ to decode $B_{13}$ and $C_{12}$, respectively.
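The XOR-based delivery of the three-user example can be sketched as follows. This is only an illustration: the byte strings standing in for the subfiles are hypothetical, the helper `xor_bytes` is not part of the original scheme description, and equal-length subfiles are assumed (which holds only approximately, for large files, under decentralized placement).

```python
def xor_bytes(*blocks):
    """Bitwise XOR of equal-length byte strings."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

# Hypothetical subfile contents. X_S denotes the part of file X cached exactly by users S.
A_2, B_1 = b"\x0f\xa0", b"\x33\x5c"                        # user 2 caches A_2, user 1 caches B_1
A_23, B_13, C_12 = b"\x01\x02", b"\x10\x20", b"\xaa\x55"

# Codeword for users {1, 2}: each user removes the subfile it already caches.
codeword_12 = xor_bytes(A_2, B_1)
user1_gets = xor_bytes(codeword_12, B_1)   # user 1 recovers A_2
user2_gets = xor_bytes(codeword_12, A_2)   # user 2 recovers B_1

# Codeword for users {1, 2, 3}: one transmission serves three users.
codeword_123 = xor_bytes(A_23, B_13, C_12)
user1_gets_3 = xor_bytes(codeword_123, B_13, C_12)  # user 1 recovers A_23
user3_gets_3 = xor_bytes(codeword_123, A_23, B_13)  # user 3 recovers C_12
```

A single coded transmission of the length of one subfile replaces two (respectively three) uncoded transmissions, which is the multicasting gain discussed above.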
In order to determine the user throughput under this scheme we must inspect the achievable transmission rate per codeword, then determine the total time to transmit all codewords, and finally extract the user throughput. To this aim, the next subsection will specify the transmission rates of each codeword by designing a joint scheduling and power allocation to subsets of users.
II-B Degraded Broadcast Channel with Private and Common Messages
The placement phase creates independent subfiles, each intended to a subset of users. We address the question of how the transmitter shall convey these subfiles while opportunistically exploiting the underlying wireless channel. We start by remarking that the channel in (1), for a given channel realization $\mathbf{h}$, is a stochastically degraded broadcast channel (BC), which achieves the same capacity region as the physically degraded BC [5, Sec. 5]. The capacity region of the degraded broadcast channel for $K$ private messages and a common message is well known [5]. Here, we consider the extended setup where the transmitter wishes to convey $2^K-1$ mutually independent messages, denoted by $\{M_{\mathcal{J}}\}$, where $M_{\mathcal{J}}$ denotes the message intended to the users in subset $\mathcal{J} \subseteq \{1,\dots,K\}$. We require that each user $k$ decode all messages $M_{\mathcal{J}}$ for $\mathcal{J} \ni k$. By letting $R_{\mathcal{J}}$ denote the multicast rate of the message $M_{\mathcal{J}}$, we say that the rate-tuple $\mathbf{R} = (R_{\mathcal{J}})_{\mathcal{J}}$ is achievable if there exist encoding and decoding functions which ensure the reliability and the rate condition as the slot duration is taken arbitrarily large. The capacity region is defined as the supremum of the achievable rate-tuples as shown in [23], where the rate is measured in bit/channel use.
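Theorem 1.
The capacity region $\Gamma(\mathbf{h})$ of a $K$-user degraded Gaussian broadcast channel with fading gains $h_1 \ge \dots \ge h_K$ and $2^K-1$ independent messages $\{M_{\mathcal{J}}\}$ is given by
(5) $R_{\{1\}} \le \log(1 + h_1 p_1)$
(6) $\sum_{\mathcal{J}:\, k \in \mathcal{J} \subseteq \{1,\dots,k\}} R_{\mathcal{J}} \le \log \frac{1 + h_k \sum_{j=1}^{k} p_j}{1 + h_k \sum_{j=1}^{k-1} p_j}, \quad k = 2, \dots, K$
for non-negative variables $\{p_j\}$ such that $\sum_{j=1}^{K} p_j \le P$.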
Proof.
The proof is straightforward and is based on rate-splitting and the private-message region of the degraded broadcast channel. For completeness, see details in Appendix IX-A. ∎
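To make the structure of the region concrete, the sketch below evaluates the standard superposition-coding rate bounds of a degraded Gaussian broadcast channel with unit noise variance: the user of rank $k$ treats the power assigned to stronger users as interference. The function name `degraded_bc_rates` and the numbers are illustrative assumptions; in the setting of Theorem 1, each value would bound the total rate of the messages whose weakest intended receiver is user $k$.

```python
import math

def degraded_bc_rates(gains, powers):
    """Rate bounds in bits/channel use, one per user.

    gains:  channel power gains, sorted in decreasing order (user 1 is the strongest).
    powers: power assigned to the messages decoded last by each user.
    Assumption: unit-variance Gaussian noise at every receiver.
    """
    if any(a < b for a, b in zip(gains, gains[1:])):
        raise ValueError("gains must be sorted in decreasing order")
    rates = []
    interference = 0.0  # power of the signals intended for stronger users
    for h, p in zip(gains, powers):
        rates.append(math.log2((1 + h * (interference + p)) / (1 + h * interference)))
        interference += p
    return rates

# Illustrative two-user case: gains (4, 1), power split (1, 3).
rates = degraded_bc_rates([4.0, 1.0], [1.0, 3.0])
```

The strongest user sees no interference after successive cancellation, whereas the weaker user's rate is reduced by the power allocated to the stronger one.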
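The achievability builds on superposition coding at the transmitter and successive interference cancellation at the receivers. For $K=3$, the transmit signal is simply given by
$x = x_1 + x_2 + x_3 + x_{12} + x_{13} + x_{23} + x_{123},$
where $x_{\mathcal{J}}$ denotes the signal corresponding to the message $M_{\mathcal{J}}$ intended to the subset $\mathcal{J} \subseteq \{1,2,3\}$. We suppose that all $x_{\mathcal{J}}$ are mutually independent Gaussian distributed random variables satisfying the power constraint. User 3 (the weakest user) decodes $\tilde{M}_3 = \{M_3, M_{13}, M_{23}, M_{123}\}$ by treating all the other messages as noise. User 2 first decodes the messages $\tilde{M}_3$ and then jointly decodes $\tilde{M}_2 = \{M_2, M_{12}\}$. Finally, user 1 (the strongest user) successively decodes $\tilde{M}_3$, $\tilde{M}_2$ and, finally, $M_1$.
Later in our online coded caching scheme, we will need to characterize specific boundary points of the capacity region that maximize a weighted sum rate. To this end, it suffices to consider the weighted sum rate maximization:
(7) $\max_{\mathbf{R} \in \Gamma(\mathbf{h})} \sum_{\mathcal{J} \subseteq \{1,\dots,K\}} \theta_{\mathcal{J}} R_{\mathcal{J}}.$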
We first simplify the problem using the following theorem.
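Theorem 2.
The weighted sum rate maximization with $2^K-1$ variables in (7) reduces to a simpler problem with $K$ variables, given by
(8) $\max_{\mathbf{p}} \sum_{k=1}^{K} \tilde{\theta}_k \log \frac{1 + h_k \sum_{j=1}^{k} p_j}{1 + h_k \sum_{j=1}^{k-1} p_j}$
where $\mathbf{p} = (p_1, \dots, p_K)$ is a positive real vector satisfying the total power constraint, and $\tilde{\theta}_k = \max_{\mathcal{J}:\, k \in \mathcal{J} \subseteq \{1,\dots,k\}} \theta_{\mathcal{J}}$ denotes the largest weight for user $k$.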
Proof.
The proof builds on the simple structure of the capacity region. We remark that, for a given power allocation of users $1$ to $k-1$, user $k$ sees the $2^{k-1}$ messages $M_{\mathcal{J}}$, for all $\mathcal{J}$ such that $k \in \mathcal{J} \subseteq \{1,\dots,k\}$, with equal channel gain. For a given set of $\{p_j\}_{j=1}^{k}$, the capacity region of these messages is a simple hyperplane characterized by $2^{k-1}$ vertices $\tilde{R}_k \mathbf{e}_i$ for $i = 1, \dots, 2^{k-1}$, where $\tilde{R}_k$ is the sum rate of user $k$ in the RHS of (6) and $\mathbf{e}_i$ is a vector with one for the $i$-th entry and zero for the others. Therefore, the weighted sum rate is maximized for user $k$ by selecting the vertex corresponding to the largest weight, denoted by $\tilde{\theta}_k$. This holds for any $k$. ∎
We provide an efficient algorithm to solve this power allocation problem as a special case of the parallel Gaussian broadcast channel studied in [29, Theorem 3.2]. Following [29], we define the rate utility function for user $k$ given by
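(9) $u_k(z) = \frac{\tilde{\theta}_k}{1/h_k + z} - \lambda$
where $\lambda$ is a Lagrangian multiplier. The optimal solution corresponds to selecting the user with the maximum rate utility at each $z$, and the resulting power allocation for user $k$ is
(10) $p_k^* = \left| \left\{ z : \left[\max_j u_j(z)\right]^+ = u_k(z) \right\} \right|$
with $\lambda$ satisfying
(11) $P = \max_k \left[ \frac{\tilde{\theta}_k}{\lambda} - \frac{1}{h_k} \right]^+ .$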
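The greedy structure of this power allocation can be illustrated numerically. The sketch below is a discretized approximation under stated assumptions, not the algorithm of [29] itself: it uses the marginal utility $w_k/(1/h_k+z)$ (weights in place of the largest per-user weights), sweeps the interference level $z$ from $0$ to the total power $P$ (which plays the role of the multiplier for a single fading block), and gives each small power increment to the user with the largest marginal utility. The function name `greedy_power_allocation` and the step count are hypothetical.

```python
import math

def greedy_power_allocation(weights, gains, total_power, steps=20000):
    """Approximate weighted-sum-rate power allocation for a degraded Gaussian BC.

    At each interference level z in [0, total_power], the increment dz goes to the
    user maximizing weight / (1/gain + z). Returns (powers, rates in bits/channel use).
    Assumption: unit noise variance; midpoint-rule discretization with `steps` points.
    """
    num_users = len(weights)
    dz = total_power / steps
    powers = [0.0] * num_users
    rates = [0.0] * num_users
    for i in range(steps):
        z = (i + 0.5) * dz
        utilities = [w / (1.0 / h + z) for w, h in zip(weights, gains)]
        k = max(range(num_users), key=utilities.__getitem__)
        powers[k] += dz
        rates[k] += dz / (1.0 / gains[k] + z) / math.log(2)
    return powers, rates

# Illustrative case: the weaker user (gain 1) has the larger weight.
w, h, P = [1.0, 3.0], [4.0, 1.0], 10.0
powers, rates = greedy_power_allocation(w, h, P)
weighted_sum = sum(wk * rk for wk, rk in zip(w, rates))
```

In this example the strong user receives only a small share of the power at low interference levels, and the rest goes to the heavily weighted weak user, which matches a brute-force search over the two-user power split.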
Throughout the paper, we assume that each slot is arbitrarily large to achieve transmission rates of the whole capacity region of the broadcast channel (as given above) without errors, for each possible channel realization. This is necessary to ensure the successful decoding of each subfile at the receivers.
II-C Application to Online Delivery
In this subsection, we apply the superposition encoding over different subsets of users, proposed in the previous subsection, to the online delivery phase of decentralized coded caching. Compared to the original decentralized coded caching in [1], we introduce two new ingredients: i) at each slot, the superposition-based delivery scheme is able to serve multiple subsets of users, such that each user shall decode multiple subfiles; ii) users' requests arrive randomly and each user decodes a sequence of its requested files. In the original framework [3, 1], the vector of user requests, denoted by $\mathbf{d}$, is assumed to be known by all users. This information is necessary for each user to recover its desired subfiles by XORing the received signal with the appropriate subfiles available in its cache. Let us return to the three-user example in Fig. 1. Upon the reception of $A_2 \oplus B_1$, user 1 must identify both its desired subfile ($A_2$) and the combined subfile available in its cache ($B_1$). Similarly, upon the reception of $A_{23} \oplus B_{13} \oplus C_{12}$, user 1 must identify its desired subfile $A_{23}$ and the combined subfiles $B_{13}, C_{12}$. In the case of a single request per user, the base station simply needs to disseminate the vector of user requests. However, if user requests arrive dynamically and the delivery phase is run continuously, we associate a header to identify each subfile (indices of the combined files and intended receivers), as we discuss in detail in Section V-C.
At the end of the whole transmission, as $t \to \infty$, each receiver decodes its sequence of requested files by applying a decoding function to the sequence of its received signals, the sequence of its channel states, and its cache. Namely, the output of the $k$-th user's decoding function at slot $t$ is given by
(12) 
where $D_k(t)$ is defined to be the number of files decoded by user $k$ up to slot $t$. Under the assumption that the slot duration is arbitrarily large, each receiver can successfully decode the sequence of the encoded symbols and reconstruct its requested files.
III Problem Formulation
Having specified the codeword generation and the transmission scheme over the broadcast channel, we formulate in this section the problem of alpha-fair file delivery.
We are now ready to define the feasible rate region as the set of the average numbers of successfully delivered files for the $K$ users. We let $\bar{r}_k$ denote the time-average delivery rate of user $k$, measured in files per slot. We let $\Lambda$ denote the set of all feasible delivery rate vectors.
Definition 1 (Feasible rate).
A rate vector $\bar{\mathbf{r}} = (\bar{r}_1, \dots, \bar{r}_K)$, measured in files/slot, is said to be feasible if there exists a file combining and transmission scheme such that
(13) 
where denotes the number of successfully delivered files to user up to .
It is worth noticing that, as the slot duration grows arbitrarily large, the number of decoded files coincides with the number of successfully delivered files under the assumptions discussed previously. In contrast to the original framework [3, 1], our rate metric measures the ability of the system to continuously and reliably deliver requested files to the users. Since finding the optimal policy is very complex in general, we restrict our study to a specific class of policies given by the following mild assumptions:
Definition 2 (Admissible class of policies $\Pi^{CC}$).
The admissible policies have the following characteristics:

The caching placement and delivery follow the decentralized scheme [1].

The users request distinct files, i.e. the IDs of the requested files of any two users are different.
Since we restrict our action space, the feasible rate region under this class of policies, denoted by $\Lambda^{CC}$, is smaller than the region $\Lambda$ of the original problem. However, the joint design of caching and online delivery appears to be a very hard problem; note that the design of an optimal code for coded caching alone is an open problem, and the proposed solutions are constant-factor approximations. Restricting the caching strategy to the decentralized scheme proposed in [1] makes the problem amenable to analysis and allows us to draw conclusions for general cases, such as the setup where users do not have symmetric rates. Additionally, if two users request the same file simultaneously, it would be more efficient to handle these transmissions as an exception, by naive broadcasting instead of the decentralized coded caching scheme; this yields a small efficiency benefit but further complicates the problem. Note, however, that the probability that two users simultaneously request the same parts of a video is very low in practice; hence, to simplify our model, we exclude this consideration altogether.
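Our objective is to solve the fair file delivery problem:
(14) $\bar{\mathbf{r}}^* = \arg\max_{\bar{\mathbf{r}} \in \Lambda^{CC}} \sum_{k=1}^{K} g_k(\bar{r}_k),$
where the utility function corresponds to the alpha-fair family of concave functions obtained by choosing:
(15) $g_k(x) = \begin{cases} \log(1 + x/d), & \alpha = 1 \\ \dfrac{(d+x)^{1-\alpha}}{1-\alpha}, & \alpha \neq 1 \end{cases}$
for some arbitrarily small $d > 0$ (used to extend the domain of the functions to $x = 0$). Tuning the value of $\alpha$ changes the shape of the utility function and consequently drives the system performance to different operating points: (i) $\alpha = 0$ yields the max sum delivery rate, (ii) $\alpha \to \infty$ yields the max-min delivery rate [9], (iii) $\alpha = 1$ yields the proportionally fair delivery rate [30]. Choosing $\alpha \in (0,1)$ leads to a tradeoff between max sum and proportionally fair delivery rates.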
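The effect of the fairness coefficient can be illustrated with a small sketch. It assumes the common alpha-fair form with a small positive offset, i.e. $\log(1+x/d)$ for $\alpha=1$ and $(d+x)^{1-\alpha}/(1-\alpha)$ otherwise; the function names, the offset value and the two candidate rate vectors are illustrative assumptions.

```python
import math

def alpha_fair_utility(x, alpha, offset=1e-3):
    """Alpha-fair utility of a non-negative rate x (assumed form with a small offset)."""
    if x < 0 or alpha < 0:
        raise ValueError("rate and alpha must be non-negative")
    if alpha == 1:
        return math.log(1 + x / offset)
    return (offset + x) ** (1 - alpha) / (1 - alpha)

def total_utility(rates, alpha, offset=1e-3):
    """Sum of the users' alpha-fair utilities for a given rate vector."""
    return sum(alpha_fair_utility(r, alpha, offset) for r in rates)

# Two hypothetical operating points of a 2-user system (files/slot).
equal_rates = [1.0, 1.0]
unequal_rates = [3.0, 0.1]
prefers_unequal_at_alpha0 = total_utility(unequal_rates, 0.0) > total_utility(equal_rates, 0.0)
prefers_equal_at_alpha2 = total_utility(equal_rates, 2.0) > total_utility(unequal_rates, 2.0)
```

With $\alpha = 0$ the objective is the plain sum rate and favors the unequal point, while a larger $\alpha$ penalizes the starved user heavily and favors the balanced point.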
The optimization (14) is designed to allow us to tune the performance of the system; we highlight its importance by an example. Suppose that $\Lambda^{CC}$ for a 2-user system is given by the convex set shown in Fig. 2. Different boundary points are obtained as solutions to (14). If we choose $\alpha = 0$, the system is operated at the point that maximizes the sum $\bar{r}_1 + \bar{r}_2$. The choice $\alpha \to \infty$ leads to the maximum $r$ such that $\bar{r}_1 = \bar{r}_2 = r$, while $\alpha = 1$ maximizes the sum of logarithms. The operating point A is obtained when we always broadcast to all users at the weakest user rate and use [3] for coded caching transmissions. Note that this results in a significant loss of efficiency due to the variations of the fading channel, and consequently A lies in the interior of $\Lambda^{CC}$. To reach the boundary of the region instead, we need to carefully group together users with good instantaneous channel quality, but also serve users with poor average channel quality. This shows the necessity of our approach when using coded caching in realistic wireless channel conditions.
IV Queued Delivery Network
This section presents the queued delivery network and then the feasible delivery rate region, based on stability analysis of the queueing model.
IV-A Queueing Model
At each time slot $t$, the controller admits $a_k(t)$ files to be delivered to user $k$, and hence $a_k(t)$ is a control variable. We equip the base station with the following types of queues:

User queues to store admitted files, one for each user. The buffer size of queue $k$ is denoted by $S_k(t)$ and expressed in number of files.

Codeword queues to store codewords to be multicast. There is one codeword queue for each subset of users $\mathcal{I} \subseteq \{1,\dots,K\}$. The size of codeword queue $\mathcal{I}$ is denoted by $Q_{\mathcal{I}}(t)$ and expressed in bits.
A queueing policy performs the following operations: (i) it decides how many files to admit into the user queues, in the form of the variables $a_k(t)$; (ii) it combines files destined to different users to create multiple codewords. When a new codeword is formed in this way, we denote this with the codeword routing control variable $\sigma_{\mathcal{J}}(t)$, which denotes the number of combinations among files from the subset $\mathcal{J}$ of users according to the coded caching scheme in [3]; (iii) it decides the encoding function for the wireless transmission. Below we explain in detail the queue operations and the queue evolution:

Admission control: At the beginning of each slot, the controller decides how many requests for each user, $a_k(t)$, should be pulled into the system from the infinite reservoir.

Codeword Routing: The admitted files for user $k$ are stored in queues $S_k(t)$ for $k = 1, \dots, K$. At each slot, files from subsets of these queues are combined into codewords by means of the decentralized coded caching encoding scheme. Specifically, the decision at slot $t$ for a subset of users $\mathcal{J}$, denoted by $\sigma_{\mathcal{J}}(t)$, refers to the number of combined requests for this subset of users. (Footnote 1: It is worth noticing that standard coded caching lets $\sigma_{\mathcal{J}} = 1$ for $\mathcal{J} = \{1,\dots,K\}$ and zero for all the other subsets. On the other hand, uncoded caching can be represented by $\sigma_{\mathcal{J}} = 1$ for $\mathcal{J} = \{k\}$, $k = 1,\dots,K$. Our scheme can therefore be seen as a combination of both, which explains its better performance.) The size of the user queue evolves as:
(16) $S_k(t+1) = \Big[ S_k(t) - \sum_{\mathcal{J}:\, k \in \mathcal{J}} \sigma_{\mathcal{J}}(t) \Big]^+ + a_k(t).$
If $\sigma_{\mathcal{J}}(t) > 0$, the server creates codewords by applying (4) for this subset of users as a function of the cache contents $\{Z_j : j \in \mathcal{J}\}$.

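Scheduling: The codewords intended to the subset $\mathcal{I}$ of users are stored in the codeword queue whose size is given by $Q_{\mathcal{I}}(t)$ for $\mathcal{I} \subseteq \{1,\dots,K\}$. Given the instantaneous channel realization $\mathbf{h}(t)$ and the queue state $\{Q_{\mathcal{I}}(t)\}$, the server performs multicast scheduling and rate allocation. Namely, at slot $t$, it determines the number $\mu_{\mathcal{I}}(t)$ of bits per channel use to be transmitted for the users in subset $\mathcal{I}$. By letting $b_{\mathcal{J},\mathcal{I}}$ denote the number of bits generated for codeword queue $\mathcal{I} \subseteq \mathcal{J}$ when coded caching is performed to the users in $\mathcal{J}$, codeword queue $\mathcal{I}$ evolves as
(17) $Q_{\mathcal{I}}(t+1) = \Big[ Q_{\mathcal{I}}(t) - T_{\rm slot}\, \mu_{\mathcal{I}}(t) \Big]^+ + \sum_{\mathcal{J}:\, \mathcal{I} \subseteq \mathcal{J}} b_{\mathcal{J},\mathcal{I}}\, \sigma_{\mathcal{J}}(t),$ where $b_{\mathcal{J},\mathcal{I}} = m^{|\mathcal{I}|-1}(1-m)^{|\mathcal{J}|-|\mathcal{I}|+1} F$.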
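The queue dynamics described in the list above can be sketched as follows. This is a minimal illustration under assumptions: user queues lose one file per combination involving that user and gain the admitted files, while codeword queues are drained by the scheduled rate times the number of channel uses per slot and filled by the bits produced by each combination. All names (`update_user_queues`, `update_codeword_queues`, `bits_per_combination`) and the numerical values are hypothetical; in particular, the number of bits generated per combination is passed in as data rather than computed.

```python
def update_user_queues(S, admitted, sigma):
    """S, admitted: dict user -> files; sigma: dict frozenset(users) -> combinations."""
    new_S = {}
    for k, backlog in S.items():
        served = sum(n for J, n in sigma.items() if k in J)
        new_S[k] = max(backlog - served, 0) + admitted.get(k, 0)
    return new_S

def update_codeword_queues(Q, mu, sigma, bits_per_combination, channel_uses):
    """Q: dict frozenset -> bits; mu: dict frozenset -> bits/channel use.

    bits_per_combination[(J, I)] is the number of bits added to queue I
    each time coded caching is performed for user set J (I must be a subset of J).
    """
    new_Q = {}
    for I, backlog in Q.items():
        arrivals = sum(
            bits_per_combination.get((J, I), 0.0) * n
            for J, n in sigma.items() if I <= J
        )
        new_Q[I] = max(backlog - channel_uses * mu.get(I, 0.0), 0.0) + arrivals
    return new_Q

# Illustrative slot: combine one request of users {1, 2}, transmit to queue {1}.
J12, I1 = frozenset({1, 2}), frozenset({1})
S_next = update_user_queues({1: 2, 2: 0}, {1: 1, 2: 1}, {J12: 1})
Q_next = update_codeword_queues(
    {I1: 100.0, J12: 50.0}, {I1: 0.5}, {J12: 1},
    {(J12, I1): 30.0, (J12, J12): 20.0}, channel_uses=100,
)
```

The clipping at zero reflects that a queue cannot be served beyond its current backlog.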
A control policy is fully specified by giving the rules with which the decisions (admission, codeword routing and scheduling) are taken at every slot $t$. The first step towards this is to characterize the set of feasible delivery rates, $\Lambda^{CC}$, which is the subject of the next subsection.
IV-B Feasibility Region
The main idea here is to characterize the set of feasible file delivery rates via the stability performance of the queueing system. To this end, let $\bar{a}_k$ denote the time-average number of admitted files for user $k$. We use the following definition of stability:
Definition 3 (Stability).
A queue $S(t)$ is said to be (strongly) stable if
$\limsup_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[S(t)] < \infty.$
A queueing system is said to be stable if all its queues are stable. Moreover, the stability region of a system is the set of all vectors of admitted file rates such that the system is stable.
If the queueing system we have introduced is stable, the rate of admitted files (input rate) is equal to the rate of successfully decoded files (output rate); hence we can characterize the system performance by means of the stability region of our queueing system. We let $\Gamma(\mathbf{h})$ denote the capacity region for a fixed channel state $\mathbf{h}$, as defined in Theorem 1. Then we have the following:
Theorem 3 (Stability region).
Let be a set to which a rate vector of admitted files belongs to, if and only if there exist , such that:
(18)  
(19) 
Then, the stability region of the system is the interior of , where the above inequalities are strict.
Constraint (18) says that the aggregate service rate is greater than the arrival rate, while (19) implies that the long-term average rate for the subset is greater than the arrival rate of the codewords intended to this subset. In terms of the queueing system defined, these constraints impose that the service rate of each queue should be greater than its arrival rate, thus rendering it stable. (Footnote 2: We restrict vectors to the interior of the region, since arrival rates at the boundary are exceptional cases of no practical interest and require special treatment.) The proof of this theorem relies on the existence of static policies, i.e. randomized policies whose decision distribution depends only on the realization of the channel state. See the Appendix, Section IX-B, for a definition and results on these policies.
Since the channel process $\{\mathbf{h}(t)\}$ is a sequence of i.i.d. realizations of the channel states (the same results hold if, more generally, $\{\mathbf{h}(t)\}$ is an ergodic Markov chain), we can obtain any admitted file rate vector $\bar{\mathbf{a}}$ in the stability region by a Markovian policy, i.e. a policy that chooses its decisions based only on the state of the system at the beginning of time slot $t$, namely the channel state and the queue lengths, and not on the time index itself. This implies that the queue-length process evolves as a Markov chain; therefore our stability definition is equivalent to that Markov chain being ergodic with every queue having finite mean under the stationary distribution. Therefore, if we develop a policy that keeps the user queues stable, then all admitted files will, at some point, be combined into codewords. Additionally, if the codeword queues are stable, then all generated codewords will be successfully conveyed to their destinations. This in turn means that all receivers will be able to decode the admitted files that they requested:
Lemma 4.
The region of all feasible delivery rates is the same as the stability region of the system, i.e. .
Proof.
Please refer to Appendix IX-C. ∎
Lemma 4 implies the following Corollary.
Corollary 5.
Solving (14) is equivalent to finding a policy $\pi$ such that
(20) $\bar{\mathbf{a}}^{\pi} = \arg\max_{\bar{\mathbf{a}}} \sum_{k=1}^{K} g_k(\bar{a}_k)$ s.t. the system is stable.
This implies that the solution to the original problem (14) in terms of the long-term average rates is equivalent to the new problem in terms of the admission rates stabilizing the system. The next section provides an explicit solution to this new problem.
V Proposed Online Delivery Scheme
V-A Admission Control and Codeword Routing
Our goal is to find a control policy that optimizes (20). To this aim, we need to introduce one more set of queues. These queues are virtual, in the sense that they do not hold actual file demands or bits, but are merely counters to drive the control policy. Each user $k$ is associated with a queue $U_k(t)$ which evolves as follows:
(21) $U_k(t+1) = \big[ U_k(t) - a_k(t) \big]^+ + \gamma_k(t),$
where $\gamma_k(t)$ represents the arrival process to the virtual queue and is an additional control parameter. We require these queues to be stable. The control algorithm actually seeks to optimize the time average of the virtual arrivals $\gamma_k(t)$. However, since $U_k(t)$ is stable, its service rate, which is the actual admission rate, is at least as large as the rate of the virtual arrivals, therefore giving the same optimizer. Stability of all other queues guarantees that admitted files are actually delivered to the users. With these considerations, $U_k(t)$ acts as a control indicator: when $U_k(t)$ is above $S_k(t)$ we admit files into the system, otherwise we set $a_k(t) = 0$. In particular, we control the way $U_k(t)$ grows over time using the actual utility objective $g_k(\cdot)$, such that a user with rate $x$ and rapidly increasing utility $g_k(x)$ (steep derivative at $x$) will also enjoy a rapidly increasing $U_k(t)$ and hence admit more files into the system.
In our proposed policy, the arrival process to the virtual queues is given by
(22) 
In the above, $V > 0$ is a parameter that controls the utility-delay tradeoff achieved by the algorithm (see Theorem 6).
We present our on-off policy for admission control and routing. For every user $k$, admission control chooses $a_k(t)$ demands given by
(23) 
For every subset $\mathcal{J} \subseteq \{1,\dots,K\}$, routing combines $\sigma_{\mathcal{J}}(t)$ demands of users in $\mathcal{J}$ given by
(24) 
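The on-off admission mechanism driven by the virtual queues can be sketched as follows. The sketch rests on several assumptions that are not confirmed by the text above: the virtual arrival is taken as the maximizer of $V g(x) - U x$ over $[0, \gamma_{\max}]$ for an alpha-fair utility with derivative $(d+x)^{-\alpha}$, admission is switched on when the virtual queue is at least as long as the user queue, and the virtual queue is served by the admitted files. The names `virtual_arrival`, `admit`, `update_virtual_queue`, `gamma_max` and `offset` are hypothetical, and the routing rule is not covered.

```python
def virtual_arrival(U, V, alpha, gamma_max, offset=1e-3):
    """Maximizer of V*g(x) - U*x over [0, gamma_max], assuming g'(x) = (offset + x)**(-alpha)."""
    if U <= 0:
        return gamma_max               # objective is increasing in x
    if alpha == 0:
        return gamma_max if V > U else 0.0
    x = (V / U) ** (1.0 / alpha) - offset  # stationary point of the concave objective
    return min(max(x, 0.0), gamma_max)

def admit(U, S, gamma_max):
    """On-off admission: admit gamma_max files when the virtual queue dominates the user queue."""
    return gamma_max if U >= S else 0.0

def update_virtual_queue(U, admitted, gamma):
    """Virtual queue served by the admitted files and fed by the virtual arrivals."""
    return max(U - admitted, 0.0) + gamma

# One illustrative step for a single user.
gamma = virtual_arrival(U=2.0, V=8.0, alpha=1.0, gamma_max=10.0, offset=0.0)
a = admit(U=5.0, S=3.0, gamma_max=2.0)
U_next = update_virtual_queue(U=5.0, admitted=a, gamma=gamma)
```

A larger $V$ increases the virtual arrivals, and hence the admissions, at the price of longer queues, which is the utility-delay tradeoff mentioned above.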
V-B Scheduling and Transmission
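In order to stabilize all codeword queues, the scheduling and resource allocation explicitly solve the following weighted sum rate maximization at each slot $t$, where the weight of the subset $\mathcal{J}$ corresponds to the queue length $Q_{\mathcal{J}}(t)$:
(25) $\boldsymbol{\mu}(t) = \arg\max_{\mathbf{r} \in \Gamma(\mathbf{h}(t))} \sum_{\mathcal{J} \subseteq \{1,\dots,K\}} Q_{\mathcal{J}}(t)\, r_{\mathcal{J}}.$
We propose to apply the power allocation algorithm of Subsection II-B to solve the above problem, by sorting users in decreasing order of channel gains and treating $Q_{\mathcal{J}}(t)$ as $\theta_{\mathcal{J}}$. Algorithm 1 summarizes our online delivery scheme.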
V-C Practical Implementation
When user requests arrive dynamically and the delivery phase is run continuously, it is not clear when and how the base station shall disseminate the useful side information to each individual user. This motivates us to consider a practical solution which associates a header to each subfile $W_{i|\mathcal{J}}$, for every file index $i$ and user subset $\mathcal{J}$. Namely, any subfile shall indicate the following information prior to the message symbols: a) the indices of the files; b) the identities of the users who cache (know) the subfiles. (Footnote 3: We assume here for the sake of simplicity that the overhead due to a header is negligible. This implies in practice that each of the subfiles is arbitrarily large.)
At each slot , the base station knows the cache contents of all users , the sequence of the channel state , as well as that of the demand vectors . Given this information, the base station constructs and transmits either a message symbol or a header at channel use in slot as follows.
(26) 
where denotes the header function, the message encoding function, respectively, at channel use in slot .
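As a rough illustration of the header described above, the following sketch models a header that carries the file index and the identities of the users caching a subfile. The class names `SubfileHeader` and `CodedPacket`, the choice of one header per combined subfile, and the helper `can_decode` are hypothetical and only serve to show how a receiver could use this side information.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class SubfileHeader:
    file_index: int             # a) index of the file the subfile belongs to
    cached_by: FrozenSet[int]   # b) identities of users who cache the subfile


@dataclass(frozen=True)
class CodedPacket:
    headers: Tuple[SubfileHeader, ...]  # one header per combined subfile (assumed)
    payload: bytes                      # XOR of the combined subfiles


def can_decode(packet: CodedPacket, user: int) -> bool:
    """A user can recover a subfile from an XOR combination if it caches
    all combined subfiles except at most one."""
    missing = [h for h in packet.headers if user not in h.cached_by]
    return len(missing) <= 1


# Hypothetical packet combining a subfile of file 0 cached by user 2
# with a subfile of file 1 cached by user 1.
packet = CodedPacket(
    headers=(
        SubfileHeader(file_index=0, cached_by=frozenset({2})),
        SubfileHeader(file_index=1, cached_by=frozenset({1})),
    ),
    payload=b"\x00",
)
```

With this packet, users 1 and 2 each miss exactly one of the two combined subfiles and can therefore decode it, whereas a third user caching neither subfile cannot.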
Example 2.
We conclude this section by providing an example of our proposed online delivery scheme for users as illustrated in Fig. 3.
We focus on the evolution of codeword queues between two slots, and . The exact backlog of codeword queues is shown in Table I. Given the routing and scheduling decisions ( and ), we provide the new states of the queues at the next slot in the same Table.
We suppose that . The scheduler uses (25) to allocate positive rates to user set and given by and multicasts the superposed signal . User decodes only . User decodes first , then subtracts it and decodes . Note that the subfile is simply a fraction of the file whereas the subfile is a linear combination of two fractions of different files. In order to differentiate between each subfile, each user uses the data information header existing in the received signal. In the next slot, the received subfiles are evacuated from the codeword queues.
For the routing decision, the server decides at slot to combine requested by user with requested by user and to process requested by user uncoded. Therefore, we have and otherwise. Given this codeword construction, the codeword queues receive inputs that change their states in the next slot, as described in Table I.
codeword queue storing XOR-packets intended for users in .  
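The combination of two requests in the example relies on the standard XOR mechanism of coded caching, which the following minimal sketch illustrates. The fragment contents and the names `frag_a2`, `frag_b1` and `xor_bytes` are hypothetical; the sketch assumes equal-length fragments and ignores the physical-layer superposition coding.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("fragments must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


# Fragment of file A requested by user 1 and cached by user 2.
frag_a2 = b"frag-A2"
# Fragment of file B requested by user 2 and cached by user 1.
frag_b1 = b"frag-B1"

# The server multicasts a single coded subfile to both users.
coded = xor_bytes(frag_a2, frag_b1)

# Each user removes the fragment it already caches to recover the one it wants.
recovered_by_user1 = xor_bytes(coded, frag_b1)
recovered_by_user2 = xor_bytes(coded, frag_a2)
```

A single multicast transmission thus serves two different requests, which is why the corresponding codeword queue drains both demands at once.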

user queue storing admitted files for user . 
virtual queue for the admission control.  
decision variable of number of combined requests for users in .  
decision variable for multicast transmission rate to users .  

decision variable of the number of admitted files for user in . 
the arrival process to the virtual queue in , given by eq. (22).  
cache content for user  
number of successfully decoded files by user up to slot .  
number of (accumulated) requested files by user k up to slot .  
time average delivery rate equal to in files/slot.  
mean of the arrival process.  
length of codeword intended to users from applying coded caching for user in .  
number of channel uses per slot.  
the capacity region for a fixed channel state .  
the set of all possible channel states.  
the probability that the channel state at slot is . 
VD Performance Analysis
Here we present the main result of the paper, by proving that our proposed online algorithm achieves near-optimal performance for all policies within the class :
Theorem 6.
Let denote the mean time-average delivery rate for user achieved by the proposed policy. Then
where is the sum of all queue lengths at the beginning of time slot , thus a measure of the mean delay of file delivery. The quantities and are constants that depend on the statistics of the system and are given in the Appendix.
The above theorem states that, by tuning the constant , the utility resulting from our online policy can be made arbitrarily close to the optimal one, at the price of a tradeoff between the guaranteed optimality gap and the upper bound on the total buffer length . This tradeoff is directly analogous to the convergence error vs. step size tradeoff of the subgradient method in convex optimization.
Sketch of proof.
For proving the Theorem, we use the Lyapunov function
and specifically the related drift-plus-penalty quantity, defined as: . The proposed algorithm is such that it minimizes (a bound on) this quantity. The main idea is to use this fact in order to compare the evolution of the drift-plus-penalty under our policy and under two "static" policies, that is, policies that take random actions (admissions, demand combinations and wireless transmissions), drawn from a specific distribution, based only on the channel realizations (and knowledge of the channel statistics). We can prove from Theorem 4 that these policies can attain every feasible delivery rate. The first static policy is one that achieves the stability of the system for an arrival rate vector such that . Comparing with our policy, we deduce strong stability of all queues and the bounds on the queue lengths by using a Foster-Lyapunov type of criterion. In order to prove near-optimality, we consider a static policy that admits file requests at rates and keeps the queues stable in a weaker sense (since the arrival rate is now on the boundary ). By comparing the drift-plus-penalty quantities and using telescopic sums and Jensen's inequality on the time average utilities, we obtain the near-optimality of our proposed policy.
VI Dynamic File Requests
In this section, we extend our algorithm to the case where there is no infinite amount of demands for each user; rather, each user requests a finite number of files at slot . Let be the number of files requested by user at the beginning of slot . We assume it is an i.i.d. random process with mean and such that almost surely. (Footnote 4: The assumptions can be relaxed to arrivals being ergodic Markov chains with finite second moment under the stationary distribution.)
In this case, the alpha fair delivery problem is to find a delivery rate that solves
Maximize  
s.t.  
where the additional constraints denote that a user cannot receive more files than the ones actually requested.
The fact that file demands are not infinite and arrive as a stochastic process is dealt with by introducing one "reservoir queue" per user, , which stores the file demands that have not been admitted, and an additional control decision on how many demands to reject permanently from the system, . At slot , no more demands than the ones that arrived at the beginning of this slot and the ones waiting in the reservoir queues can be admitted, therefore the admission control must have the additional constraint
and a similar restriction holds for the number of rejected files from the system, . The reservoir queues then evolve as
The above modification with the reservoir queues only further constrains the admission control of files to the system. The queuing system remains the same as described in Section V, with the user queues , the codeword queues and the virtual queues . Similar to the case with infinite demands, we can restrict ourselves to policies that are functions only of the system state at time slot , without loss of optimality. Furthermore, we can show that the alpha fair optimization problem is equivalent to the problem of controlling the admission rate. That is, we want to find a policy such that
s.t.  the queues are strongly stable  
The rules for scheduling, codeword generation, virtual queue arrivals and queue updating remain the same as in the case of infinite demands in subsections C and D of Sec. V. The only difference is that there are multiple possibilities for the admission control; see [7] and Chapter 5 of [31] for more details. Here we propose that at each slot , any demand that is not admitted gets rejected (i.e., the reservoir queues hold no demands), so that the admission rule is
(27) 
and the constants are set as . Using the same ideas employed in the performance analysis of the case with infinite demands and the ones employed in [7], we can prove that the utility-queue length tradeoff of Theorem 6 holds for the case of dynamic arrivals as well.
VII Numerical Examples
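The reservoir-queue bookkeeping of this section can be sketched as follows. The function names `reservoir_step` and `reject_all_unadmitted` are hypothetical, and the joint check that admitted plus rejected demands do not exceed the available ones is an assumption of this sketch, made so that the reservoir backlog never becomes negative.

```python
def reservoir_step(reservoir: int, arrivals: int, admitted: int, rejected: int) -> int:
    """Advance one user's reservoir queue by one slot.

    Available demands are those waiting in the reservoir plus the new arrivals;
    admitted and rejected demands both leave the reservoir.
    """
    available = reservoir + arrivals
    if admitted < 0 or rejected < 0:
        raise ValueError("admitted and rejected counts must be non-negative")
    if admitted + rejected > available:
        raise ValueError("cannot admit or reject more demands than are available")
    return available - admitted - rejected


def reject_all_unadmitted(reservoir: int, arrivals: int, admitted: int) -> int:
    """Number of demands to reject when every non-admitted demand is dropped."""
    return reservoir + arrivals - admitted


# With the reject-all rule, the reservoir is empty after every slot.
rejected = reject_all_unadmitted(reservoir=0, arrivals=3, admitted=2)
next_reservoir = reservoir_step(reservoir=0, arrivals=3, admitted=2, rejected=rejected)
```

Under the reject-all rule the reservoir state carries no memory across slots, so the admission decision in each slot is limited only by the arrivals of that slot.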
In this section, we compare our proposed delivery scheme with two other schemes described below, all building on the decentralized cache placement described in (2) and (3).

Our proposed scheme: We apply Algorithm 1 for slots. Using the scheduler (25), we calculate denoting the rate allocated to a user set at slot . As defined in (13), the long-term average rate of user measured in file/slot is given by
(28)
Notice that the numerator corresponds to the average number of useful bits received over a slot by user and the denominator corresponds to the number of bits necessary to recover one file.

Unicast opportunistic scheduling: For any request, the server sends the remaining bits to the corresponding user without combining any files. Here we only exploit the local caching gain. In each slot the transmitter sends with full power to the following user
where is the empirical average rate for user up to slot . The resulting long-term average rate of user measured in file/slot is given by
(29) 
Standard coded caching: We use decentralized coded caching among all users. For the delivery, non-opportunistic TDMA transmission is used. The server sends a sequence of codewords at the worst transmission rate. The number of packets to be multicast in order to satisfy one demand for each user is given by [1]
(30)
Thus the average delivery rate (in file per slot) is symmetric and given by