I Introduction
Enabled by the proliferation of ubiquitous sensing devices and pervasive wireless data connectivity, real-time monitoring has become a reality in large-scale cyber-physical systems, such as power grids, manufacturing facilities, and smart transportation systems. However, the unprecedented high dimensionality and generation rate of the sensing data also impose critical challenges on its timely delivery. In order to measure and ensure the freshness of information available to the central controller, a metric called Age of Information (AoI) has been introduced and analyzed in various networks [1]. Specifically, at time $t$, the AoI in the system is defined as $\delta(t) := t - U(t)$, where $U(t)$ is the time stamp of the latest received update at the destination. Since AoI depends on data generation as well as queueing and transmission, it exhibits fundamental differences from traditional network performance metrics, such as throughput and delay.
Modeling the status updating process as a queueing process, time average AoI has been analyzed in systems with a single server [1, 2, 3, 4, 5, 6, 7, 8], and multiple servers [9, 10, 11]. Peak Age of Information (PAoI) has been introduced and studied in [12, 13, 14]. The optimality properties of a preemptive Last Generated First Served service discipline are identified in [15].
AoI minimization has also been investigated, either by controlling the generation process of the updates [16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26], or by scheduling the transmission of updates that have already been generated [27, 28, 29, 30, 31]. The optimal status updating policy with knowledge of the server state has been studied in [16]. AoI-optimal sampling of a Wiener process is investigated in [17]. Under an energy harvesting setting, optimal status updating has been studied in [18, 19, 20, 21, 22, 23, 24, 25, 26]. Transmission scheduling in a broadcast channel has been studied in [27, 28, 29]. Reference [27] shows that a greedy policy which always tries to update the most outdated client is optimal in a symmetric setting. Reference [28] formulates the problem as a Markov Decision Process (MDP), and shows that the optimal policy is of switch type. It also proposes a sequence of finite-state approximations for the infinite-state MDP and proves its convergence. A restless-bandit-based formulation and a Whittle's-index-based scheduling policy have been studied in [29]. Different transmission scheduling policies for AoI minimization in a multiple access channel under throughput constraints on individual nodes have been analyzed in [30]. Age-optimal link scheduling in a multiple-source system with conflicting links is studied in [31], and the problem is shown to be NP-complete in general. Head-of-line age-based scheduling algorithms have been shown to be throughput optimal in wireless networks in [32].
In this paper, we investigate the optimal online transmission scheduling for a single link under the assumption that the link capacity is limited and each update takes multiple time slots to transmit. During the transmission of an update, new updates may arrive. Therefore, the source has to decide whether to switch to the new arrival, or to continue its current transmission and drop the new update. What makes the problem challenging is that the impact of a decision on the AoI evolution does not become clear immediately. This is because the instantaneous AoI at the destination is reset only when a transmission is completed. Even if the source decides to transmit an update at an earlier time, it may drop the update later before the transmission is complete, leading to uncertain AoI evolution in the system. To overcome this challenge, we first prove that within a broadly defined class of online policies, the optimal policy should be a renewal policy, and the decision-making over each renewal interval depends only on the arrival times of the updates in that interval. Then, we show that the optimal renewal policy has a multiple-threshold structure, which enables us to formulate the problem as an MDP, and identify the thresholds numerically through structured value iteration.
II System Model and Problem Formulation
We consider a single-link status monitoring system where the source keeps sending time-stamped status updates to a destination through a rate-limited link. We assume the time axis is discretized into time slots, which are labeled as $t = 1, 2, \ldots$. At the beginning of time slot $t$, an update packet is generated and arrives at the source according to an independent and identically distributed (i.i.d.) Bernoulli process with parameter $p$. We assume each update is of the same size, and it takes exactly $d$ time slots to transmit one update to the destination. Similar to [27, 28, 29], we assume that at most one update can be transmitted during each time slot, and there is no buffer at the source to store the updates that are not being transmitted. Therefore, once an update arrives at the source, the source needs to decide whether to transmit it and drop the one currently under transmission if there is any, or to drop the new arrival.
A status update policy is denoted as $\pi$, which consists of a sequence of transmission decisions $\{w_t\}$. We let $w_t \in \{0, 1\}$. Specifically, when an update arrives at the beginning of time slot $t$, $w_t$ can take both values 1 and 0: if $w_t = 1$, the source will start transmitting the new arrival in time slot $t$ and drop the unfinished update if necessary; we term this a switch. Otherwise, if $w_t = 0$, the source will drop the new arrival and continue the unfinished transmission; we term this a skip. When no update arrives in time slot $t$, we can show that dropping the update being transmitted is suboptimal. Thus, we restrict attention to policies under which $w_t$ can only take value 0 in this case, i.e., the source continues transmitting the unfinished update if there is one, or idles.
Let $S_n$ be the time slot when the $n$th update is completely transmitted to the destination. Then, the inter-update delays can be denoted as $X_n := S_n - S_{n-1}$, for $n = 1, 2, \ldots$. Without loss of generality, we assume $S_0 = 0$. Note that under the bufferless assumption, the AoI after a completed transmission is always equal to $d$, the number of time slots needed to transmit one update. An example sample path of the AoI evolution under a given status update policy is shown in Fig. 1. As illustrated, some updates are skipped when they arrive, while others are transmitted partially or completely.
We use to denote the total number of successfully delivered status updates over . Define as the total age of information experienced by the system over . Denote , i.e., the total AoI experienced by the receiver over the th epoch . Then,
We focus on a set of online policies , in which the information available for determining includes the decision history , the update arrival profile , as well as the update statistics (i.e., in this scenario). The optimization problem can be formulated as
(1) 
where the expectation in the objective function is taken over all possible update arrival sample paths.
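As an illustration of the model above, the following Python sketch simulates the bufferless single-link system and returns the time-average AoI under a given policy. It is only a sketch under stated assumptions: the function name `simulate_average_aoi`, the policy signature `policy(cur_arrival, new_arrival)` (both arguments measured in slots since the last delivery), the initial AoI, and the choice to sample the AoI at the end of each slot are all illustrative conventions that are not taken from the paper.

```python
import random


def never_switch(cur_arrival, new_arrival):
    """Always skip new arrivals while an update is being transmitted."""
    return False


def always_switch(cur_arrival, new_arrival):
    """Always preempt the current transmission in favor of the new arrival."""
    return True


def simulate_average_aoi(p, d, policy, horizon, seed=0):
    """Time-average AoI over `horizon` slots (illustrative conventions).

    p       : Bernoulli arrival probability per slot
    d       : number of slots needed to transmit one update
    policy  : callable(cur_arrival, new_arrival) -> bool, True means switch;
              both arguments are slot offsets from the last delivery
    """
    rng = random.Random(seed)
    age = d              # assumed initial AoI at the destination
    epoch_start = 0      # slot index of the most recent delivery
    cur_arrival = None   # arrival slot of the update under transmission
    total_age = 0
    for t in range(1, horizon + 1):
        if rng.random() < p:  # an update arrives at the beginning of slot t
            if cur_arrival is None:
                cur_arrival = t  # idle source starts transmitting at once
            elif policy(cur_arrival - epoch_start, t - epoch_start):
                cur_arrival = t  # switch: drop the unfinished update
            # otherwise skip: the new arrival is dropped (no buffer)
        age += 1
        if cur_arrival is not None and t - cur_arrival + 1 == d:
            age = d              # delivered update is exactly d slots old
            epoch_start = t
            cur_arrival = None
        total_age += age
    return total_age / horizon
```

For example, with an arrival in every slot (`p = 1.0`) and `d = 2`, never switching delivers one update every two slots, whereas always switching never completes a transmission, so the AoI grows without bound over the simulated horizon.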
III Structure of the Optimal Policy
Consider the th epoch, i.e., the duration between time slots and under any online policy in . Let be the time slot when the th update after arrives, and let . Denote the update arrival profile in epoch as . Then, we introduce the following definition.
Definition 1 (Uniformly Bounded Policy)
Under an online policy , if there exists a function , such that for any , the length of the corresponding epoch is upper bounded by , and , then this policy is a uniformly bounded policy.
Denote the subset of uniformly bounded policies as . Then, using techniques similar to the proof of Theorem 1 in [22], we can show the following theorem.
Theorem 1
Any uniformly bounded policy is suboptimal to a renewal policy. That is, form a renewal process. Besides, the decision over the th renewal epoch only depends on causally.
Due to space limitation, the proof of Theorem 1, as well as the proofs of Lemma 1, Lemma 3 and Theorem 2 are omitted.
Based on Theorem 1, in the following, we will focus on renewal policies that depend on only.
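To see why the distribution of the epoch length matters under a renewal policy, the following worked expression may help. It is a sketch under assumed notation and conventions that are not taken from the paper: $X$ denotes the length of a renewal epoch, $R$ the AoI summed over that epoch, $d$ the number of slots needed to transmit one update, and the AoI is sampled once per slot, starting at $d$ right after a delivery and growing by one per slot. The second line uses the renewal-reward theorem and assumes $X$ has a finite second moment.

```latex
% Assumed notation: X = epoch length, R = AoI summed over one epoch,
% d = slots per update, \delta(t) = AoI sampled once per slot.
\begin{align}
R &= \sum_{k=0}^{X-1} (d + k) = dX + \frac{X(X-1)}{2}, \\
\lim_{T\to\infty} \frac{1}{T}\sum_{t=1}^{T} \delta(t)
  &= \frac{\mathbb{E}[R]}{\mathbb{E}[X]}
   = d - \frac{1}{2} + \frac{\mathbb{E}[X^2]}{2\,\mathbb{E}[X]}.
\end{align}
```

Under these conventions, the average AoI depends on both the first and the second moment of the epoch length, so shortening the mean epoch length alone does not necessarily reduce the average AoI.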
Lemma 1
If the source is idle when an update arrives, it should start transmitting the update immediately.
Definition 2 (Sequential Switching Policy)
A sequential switching (SS) policy is a renewal policy under which the source switches to an update arriving at time slot only if it switches to all update arrivals prior to in the same epoch.
Remark: The definition of SS policy implies that once a source skips a new update arrival at , it will skip all of the upcoming update arrivals until it finishes the one being transmitted at . We point out that an SS policy is in general different from threshold type of policies, as it does not impose any threshold structure on when the source should skip or switch to a new update arrival.
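The SS property can be checked mechanically on the decisions taken within one epoch. The short Python sketch below assumes a hypothetical encoding that is not part of the paper: `decisions` lists, in order of arrival, the switch (`True`) or skip (`False`) decision for each update that arrives while the source is busy in a single epoch.

```python
def is_sequential_switching(decisions):
    """Return True if no switch follows a skip within the epoch.

    decisions: iterable of bools, one per arrival that finds the source
    busy, in order of arrival (True = switch, False = skip).
    """
    skipped = False
    for switch in decisions:
        if switch and skipped:
            return False  # a switch after an earlier skip violates SS
        if not switch:
            skipped = True
    return True
```

For instance, `[True, True, False, False]` satisfies the property, while `[True, False, True]` does not, since the source switches after having skipped an earlier arrival.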
Lemma 2
The optimal renewal policy in is an SS policy.
Proof: We prove this lemma through contradiction. Now assume the optimal policy is not an SS policy. Without loss of generality, we consider the first renewal epoch starting at time 0 (the beginning of time slot ). We assume under there exists a sample path under which the source transmits the new update arrival at time slot and does not switch to the next arrival at time slot in the same epoch, i.e., . Depending on the upcoming random arrivals, the sample path may evolve into different sample paths. Denote the set of such sample paths as , as they share the same history up to time slot . We can partition into two subsets:

: The source skips all the upcoming arrivals and finishes transmitting the update that arrives at .

: The source switches to some later arrival.
Let be the corresponding length of the renewal epoch under policy . Then, for sample paths in , and for sample paths in .
We now construct two policies and as follows. Under both and , the source will behave exactly the same as under for all sample paths not in . However, for the sample paths in , the actions the source will take after will be different. Specifically, under , the source will finish the update that arrives at time slot irrespective of other factors. Therefore, for all sample paths in under , the corresponding length of the renewal epoch under will be under . For , we will let the source first switch to the arrival at time slot , and then switch to a later arrival whenever the source switches under . Then, for the sample paths in under , the corresponding length of renewal epoch will be changed to under ; while for those in , .
Therefore, considering all possible sample paths under those policies, we have , which implies that there must exist a , , such that
(2) 
We will then construct a randomized policy , under which it follows with probability and follows with probability . Apparently, the expected length of the renewal epoch under , denoted as , will be the same as that under . Next, we will show that . Denote . Then, (2) can be expressed as
(3) 
which can be reduced to
(4) 
Since , , (III) implies that . Dividing both sides of (III) by , we have
Note that and form a valid distribution. Therefore, based on Jensen’s inequality, we have
(5)  
(6) 
which is equivalent to
(7) 
I.e.,
(8) 
Combining (2) and (8), we have
(9) 
i.e., the new policy achieves a lower expected average AoI than , which contradicts the assumption that is optimal.
Lemma 3
Under the optimal SS policy in , if the source is transmitting an update that arrives at the th time slot in an epoch when the new update arrives, then, there exists a threshold , , which depends on only, such that if the new update arrives before or at the th time slot in that epoch, the source will switch to the new arrival; otherwise, it will skip the new arrival and complete the current transmission.
Theorem 2
Under the optimal policy in , there exists a sequence of thresholds , such that if the source is transmitting an update that arrives in the th () time slot in a renewal epoch when a new update arrives, and the arrival time of the new update is before or at the th time slot in the epoch, the source will switch to the new arrival; otherwise, if the next update arrives after , or the update being transmitted arrives after , the source will skip all upcoming arrivals until it finishes the current transmission.
Theorem 2 indicates that the optimal decision of the source only depends on two parameters: the arrival time of the update being transmitted, and the arrival time of the new update, both relative to the beginning of the renewal epoch. Therefore, the problem is essentially an MDP. In Sec. IV, we will cast the problem as an MDP, and numerically search for the optimal thresholds and .
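The decision rule described by Theorem 2 can be sketched as a small lookup in Python. This is only an illustration: the helper name `make_threshold_policy`, the dictionary encoding of the thresholds, and the numerical values used in the usage note are hypothetical and are not the thresholds obtained in the paper.

```python
def make_threshold_policy(thresholds):
    """Build a multiple-threshold switching rule.

    thresholds: dict mapping the arrival slot (within the epoch) of the
    update under transmission to the latest slot at which a new arrival
    still triggers a switch. Arrival slots absent from the dict never
    lead to a switch.
    """
    def policy(cur_arrival, new_arrival):
        tau = thresholds.get(cur_arrival)
        return tau is not None and new_arrival <= tau

    return policy
```

With the hypothetical thresholds `{1: 4, 2: 3}`, an update that arrived in slot 1 is preempted by a new arrival in slot 3 but not by one in slot 5. Because arrival times only increase within an epoch, an arrival that is skipped is followed only by skips until the current transmission completes.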
IV MDP-Based Scheduling
IV-A MDP formulation
Motivated by the Markovian structure of the optimal policy in Theorem 2, we formulate the problem as an MDP as follows.
States: We define the state , where and are the AoI in the system, and the age of the unfinished update, at the beginning of time slot , respectively. is the update arrival status. Then, , , and the state space can be determined accordingly.
Actions: , as defined in Sec. II.
Transition probabilities: The transition probability from a state to another state under action , denoted as , is shown in Table I.
Cost: Let be the immediate cost after the action is taken at under state . We consider the instantaneous AoI after the action as the immediate cost, i.e.,
In order to reduce the computational complexity, we define an approximate MDP as follows: We define as the boundary AoI, and truncate the state space of the original MDP as . In the transition probabilities, we bound by , i.e., .
Then, the optimal policy can be determined through relative value iteration as follows:
(10) 
where is a reference state and we set it as . For each iteration , we need to update the optimal cost function for all states by minimizing the right-hand side of (10), which incurs a high computational complexity as the number of states increases. Motivated by [28], we then leverage the multiple-threshold structure of the optimal policy to reduce the computational complexity, as detailed in the structured value iteration algorithm in Algorithm 1.
With the multiple-threshold structure, Algorithm 1 does not need to seek the optimal action by equation (10) for all states in each iteration as the traditional value iteration algorithm does. Specifically, if the optimal action for a state is to skip the new arrival, the optimal action for state , must be to skip as well. Similarly, if the optimal action for a state is to switch to the new arrival, the optimal action for state , , must be to switch.
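For reference, plain (unstructured) relative value iteration on a truncated MDP can be sketched in Python as below. This is not Algorithm 1: it minimizes over all actions in every state and does not exploit the threshold structure. The state encoding and dynamics are assumptions made for this sketch, since Table I is not reproduced here: a state is `(delta, s, a)` with `delta` the AoI capped at `delta_max`, `s` the number of slots already spent on the unfinished update (0 when idle), and `a` the arrival indicator; the immediate cost is the AoI after the action; the reference state is `(d, 0, 0)`; and `delta_max >= d` is required.

```python
from itertools import product


def relative_value_iteration(p, d, delta_max, iterations=200):
    """Plain relative value iteration on an assumed truncated MDP.

    Returns (policy, gain): policy maps each state (delta, s, a) to an
    action (1 = transmit the new arrival, 0 = continue or idle), and gain
    is the last estimate of the average cost.
    """
    states = list(product(range(d, delta_max + 1), range(d), (0, 1)))
    ref = (d, 0, 0)  # assumed reference state
    V = {x: 0.0 for x in states}

    def actions(x):
        # Switching is only possible when a new update has arrived.
        return (0, 1) if x[2] == 1 else (0,)

    def step(x, w):
        delta, s, _ = x
        progress = 1 if w == 1 else (s + 1 if s > 0 else 0)
        if progress == d:
            return d, 0  # delivery: AoI resets to d and the source idles
        return min(delta + 1, delta_max), progress

    def q_value(x, w, values):
        delta_next, s_next = step(x, w)
        future = (p * values[(delta_next, s_next, 1)]
                  + (1 - p) * values[(delta_next, s_next, 0)])
        return delta_next + future  # cost = AoI after the action

    gain = 0.0
    for _ in range(iterations):
        TV = {x: min(q_value(x, w, V) for w in actions(x)) for x in states}
        gain = TV[ref]
        V = {x: TV[x] - gain for x in states}  # normalize by reference state

    policy = {x: min(actions(x), key=lambda w: q_value(x, w, V))
              for x in states}
    return policy, gain
```

In this sketch every state is visited in every iteration, which is the cost that a structured variant exploiting the threshold property aims to reduce.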
IV-B Numerical results
We then search for the optimal policy numerically using Algorithm 1. We set , , and the number of iterations to be . We set for the approximate MDP. Fig. 2(a) shows the optimal action for each state . We then plot the optimal action for each pair of arrival time of the update being transmitted and that of the new arrival in a renewal epoch in Fig. 2(b). We note the thresholds , , , . They are monotonically decreasing, as predicted by Theorem 2. When the update being transmitted arrives after the fourth time slot in that epoch, all upcoming updates will be skipped.
Then, we compare the average AoI under the optimal policy identified by Algorithm 1 and a myopic policy over time slots. Under the myopic policy, the source will never switch to a new update arrival until it finishes the one being transmitted. The performance gap is plotted in Fig. 3. As we observe, the optimal policy always outperforms the myopic policy. Although the myopic policy greedily minimizes the length of each epoch, it does not yield the minimum average AoI. This is because has a larger second moment in this case, leading to higher AoI. We note that when gets sufficiently small or large, the performance gap between the two policies becomes close to zero. This is because for such extreme cases, the multiple-threshold policy and the myopic policy become identical to each other.
References
 [1] S. K. Kaul, R. D. Yates, and M. Gruteser, “Real-time status: How often should one update?” in IEEE INFOCOM, Orlando, FL, USA, Mar. 2012, pp. 2731–2735.
 [2] ——, “Status updates through queues,” in Conference on Information Sciences and Systems (CISS), Princeton, NJ, USA, Mar. 2012, pp. 1–6.
 [3] R. D. Yates and S. K. Kaul, “Real-time status updating: Multiple sources,” in IEEE International Symposium on Information Theory (ISIT), Cambridge, MA, USA, Jul. 2012, pp. 2666–2670.
 [4] ——, “The age of information: Real-time status updating by multiple sources,” ArXiv e-prints, 2016. [Online]. Available: http://arxiv.org/abs/1608.08622
 [5] N. Pappas, J. Gunnarsson, L. Kratz, M. Kountouris, and V. Angelakis, “Age of information of multiple sources with queue management,” in IEEE International Conference on Communications (ICC), Jun. 2015, pp. 5935–5940.
 [6] E. Najm and R. Nasser, “Age of information: The gamma awakening,” in IEEE International Symposium on Information Theory (ISIT), Barcelona, Spain, Jul. 2016, pp. 2574–2578.
 [7] C. Kam, S. Kompella, G. D. Nguyen, J. E. Wieselthier, and A. Ephremides, “Age of information with a packet deadline,” in IEEE International Symposium on Information Theory (ISIT), Barcelona, Spain, Jul. 2016, pp. 2564–2568.
 [8] K. Chen and L. Huang, “Age-of-information in the presence of error,” in IEEE International Symposium on Information Theory (ISIT), Barcelona, Spain, Jul. 2016, pp. 2579–2583.
 [9] C. Kam, S. Kompella, and A. Ephremides, “Age of information under random updates,” in IEEE International Symposium on Information Theory (ISIT), Istanbul, Turkey, Jul. 2013, pp. 66–70.
 [10] ——, “Effect of message transmission diversity on status age,” in IEEE International Symposium on Information Theory (ISIT), Honolulu, HI, USA, Jun. 2014, pp. 2411–2415.
 [11] C. Kam, S. Kompella, G. D. Nguyen, and A. Ephremides, “Effect of message transmission path diversity on status age,” IEEE Trans. Inf. Theory, vol. 62, no. 3, pp. 1360–1374, Mar. 2016.
 [12] M. Costa, M. Codreanu, and A. Ephremides, “Age of information with packet management,” in IEEE International Symposium on Information Theory (ISIT), Honolulu, HI, USA, Jun. 2014, pp. 1583–1587.
 [13] ——, “On the age of information in status update systems with packet management,” IEEE Trans. Inf. Theory, vol. 62, no. 4, pp. 1897–1910, Apr. 2016.
 [14] L. Huang and E. Modiano, “Optimizing age-of-information in a multi-class queueing system,” in IEEE International Symposium on Information Theory (ISIT), Hong Kong, China, Jun. 2015, pp. 1681–1685.
 [15] A. M. Bedewy, Y. Sun, and N. B. Shroff, “Optimizing data freshness, throughput, and delay in multi-server information-update systems,” in IEEE International Symposium on Information Theory (ISIT), Barcelona, Spain, Jul. 2016, pp. 2569–2573.
 [16] Y. Sun, E. Uysal-Biyikoglu, R. D. Yates, C. E. Koksal, and N. B. Shroff, “Update or wait: How to keep your data fresh,” in IEEE INFOCOM, San Francisco, CA, USA, Apr. 2016, pp. 1–9.
 [17] Y. Sun, Y. Polyanskiy, and E. Uysal-Biyikoglu, “Remote estimation of the Wiener process over a channel with random delay,” CoRR, vol. abs/1701.06734, 2017.
 [18] R. D. Yates, “Lazy is timely: Status updates by an energy harvesting source,” in IEEE International Symposium on Information Theory (ISIT), Hong Kong, China, Jun. 2015, pp. 3008–3012.
 [19] B. T. Bacinoglu, E. T. Ceran, and E. Uysal-Biyikoglu, “Age of information under energy replenishment constraints,” in Information Theory and Applications Workshop, San Diego, CA, USA, Feb. 2015, pp. 25–31.
 [20] X. Wu, J. Yang, and J. Wu, “Optimal status update for age of information minimization with an energy harvesting source,” IEEE Trans. Green Commun. Netw., vol. 2, no. 1, pp. 193–204, Mar. 2018.
 [21] B. T. Bacinoglu and E. Uysal-Biyikoglu, “Scheduling status updates to minimize age of information with an energy harvesting sensor,” in IEEE International Symposium on Information Theory (ISIT), Jun. 2017.
 [22] A. Arafa, J. Yang, and S. Ulukus, “Age-minimal online policies for energy harvesting sensors with random battery recharges,” in IEEE International Conference on Communications (ICC), May 2018.
 [23] A. Arafa, J. Yang, S. Ulukus, and V. Poor, “Age-minimal online policies for energy harvesting sensors with incremental battery recharges,” in Information Theory and Applications Workshop, San Diego, CA, USA, Feb. 2018.
 [24] B. Tan Bacinoglu, Y. Sun, E. Uysal-Biyikoglu, and V. Mutlu, “Achieving the age-energy tradeoff with a finite-battery energy harvesting source,” in IEEE International Symposium on Information Theory (ISIT), Jun. 2018.
 [25] S. Feng and J. Yang, “Optimal status updating for an energy harvesting sensor with a noisy channel,” in IEEE INFOCOM  Workshop on Age of Information, Apr. 2018.
 [26] ——, “Minimizing age of information for an energy harvesting source with updating failures,” in IEEE International Symposium on Information Theory (ISIT), Jun. 2018.
 [27] I. Kadota, A. Sinha, E. Uysal-Biyikoglu, R. Singh, and E. Modiano, “Scheduling policies for minimizing age of information in broadcast wireless networks,” ArXiv e-prints, Jan. 2018.
 [28] Y.-P. Hsu, E. Modiano, and L. Duan, “Age of information: Design and analysis of optimal scheduling algorithms,” in IEEE International Symposium on Information Theory (ISIT), Jun. 2017, pp. 561–565.
 [29] Y.-P. Hsu, “Age of information: Whittle index for scheduling stochastic arrivals,” ArXiv e-prints, Jan. 2018.
 [30] I. Kadota, A. Sinha, and E. Modiano, “Optimizing age of information in wireless networks with throughput constraints,” in IEEE INFOCOM, Apr. 2018.
 [31] Q. He, D. Yuan, and A. Ephremides, “Optimal link scheduling for age minimization in wireless systems,” IEEE Trans. Inf. Theory, vol. PP, no. 99, pp. 1–1, 2017.
 [32] B. Li, A. Eryilmaz, and R. Srikant, “On the universality of age-based scheduling in wireless networks,” in IEEE INFOCOM, Apr. 2015, pp. 1302–1310.