Timely updates of the system state are of significant importance for state estimation and decision making in networked control and cyber-physical systems, such as UAV navigation, robotics control, mobility tracking, and environment monitoring systems. To evaluate the freshness of state updates, the concept of Age of Information, or simply age, was introduced to measure the timeliness of state samples received from a remote transmitter. Let be the generation time of the freshest received state sample at time . The age of information, as a function of , is defined as , which is the time difference between the freshest samples available at the transmitter and at the receiver.
Recently, the age of information concept has received significant attention because of the extensive applications of state updates in systems connected over communication networks. The states of many systems, such as UAV mobility trajectories and sensor measurements, take the form of a signal that may change slowly at some times and vary more dynamically at others. Hence, the time difference described by the age only partially characterizes the variation of the system state, and the state update policy that minimizes the age of information does not minimize the state estimation error. This result was first shown in [3, 4], where a challenging sampling problem of Wiener processes was solved and the optimal sampling policy was shown to have an intuitive structure. As the results therein hold only for non-stationary signals that can be modeled as a Wiener process, one may wonder whether, and how, [3, 4] can be extended to handle more general signal models.
In this paper, we generalize [3, 4] by exploring the problem of sampling an Ornstein-Uhlenbeck (OU) process. From the obtained results, we hope to find useful structural properties of the optimal sampler design that can potentially be applied to more general signal models. The OU process is the continuous-time analogue of the discrete-time AR(1) process and is defined as the solution to the stochastic differential equation (SDE) [5, 6]
where , , and are parameters and represents a Wiener process. The OU process is the only nontrivial continuous-time process that is stationary, Gaussian, and Markovian. It has been widely used to model financial prices such as interest rates, currency exchange rates, and commodity prices (with modifications), node mobility in mobile ad hoc networks, robotic swarms, and UAV networks [8, 9], and physical processes such as fluid dynamics.
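To make the signal model concrete, the following sketch simulates an OU path using the exact Gaussian transition of the process. The function name and parameter naming (mu for the long-run mean, theta for the mean-reversion rate, sigma for the volatility) are our own illustrative choices, assuming the common SDE form dX_t = θ(μ − X_t) dt + σ dW_t.

```python
import math
import random

def simulate_ou(x0, mu, theta, sigma, dt, n_steps, seed=0):
    """Simulate an OU path exactly on a uniform time grid.

    Uses the exact AR(1) transition of the OU process:
    X_{t+dt} = mu + (X_t - mu) * exp(-theta * dt) + noise,
    where the noise is Gaussian with variance
    sigma^2 * (1 - exp(-2 * theta * dt)) / (2 * theta).
    """
    rng = random.Random(seed)
    a = math.exp(-theta * dt)
    noise_std = sigma * math.sqrt((1 - a * a) / (2 * theta))
    path = [x0]
    for _ in range(n_steps):
        x = mu + (path[-1] - mu) * a + noise_std * rng.gauss(0.0, 1.0)
        path.append(x)
    return path
```

Because the AR(1) transition is exact, the discretization introduces no bias, unlike an Euler-Maruyama scheme.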
As shown in Fig. 1, samples of an OU process are forwarded to a remote estimator through a FIFO queue with i.i.d. service times. The service time distributions considered in this paper are quite general: they are only required to have a finite mean. This queueing model is useful for analyzing the robustness of remote estimation systems with occasionally long communication delays. For example, UAVs flying close to WiFi access points may suffer from long communication delays and instability issues if they receive strong interference from the WiFi access points. (This example was suggested to the authors by Thaddeus A. Roppel.) We remark that this paper focuses on a queueing model; however, our analysis can be readily applied to other channel models, such as erasure channels and fading channels.
The estimator utilizes causally received samples to construct an estimate of the real-time signal value . The quality of remote estimation is measured by the time-average mean-squared estimation error, i.e.,
Our goal is to find the optimal causal sampling policy that minimizes by choosing the sampling times, subject to a maximum sampling rate constraint. In practice, the cost (e.g., energy, CPU cycles, storage) of state updates increases with the average sampling rate. Hence, we strive to find the optimal tradeoff between estimation error and update cost. In addition, the unconstrained problem will also be solved. The contributions of this paper are summarized as follows:
The optimal sampling problem for minimizing the
under a sampling rate constraint is formulated as a constrained continuous-time Markov decision process (MDP) with an uncountable state space. Because of the curse of dimensionality, such problems often lack low-complexity solutions that are arbitrarily accurate. However, we were able to solve this MDP exactly: The optimal sampling policy is proven to be a threshold policy on the instantaneous estimation error, where the threshold is a non-linear function of a parameter . The value of is equal to the sum of the optimal objective value of the MDP and the optimal Lagrangian dual variable associated with the sampling rate constraint. If there is no sampling rate constraint, the Lagrangian dual variable is zero and hence is exactly the optimal objective value. Among the technical tools developed to prove this result is a free boundary method for finding the optimal stopping time of the OU process.
The optimal sampler design for the Wiener process in [3, 4] is a limiting case of the above result. By comparing the optimal sampling policies of the OU process and the Wiener process, we find that the threshold function changes according to the signal model, whereas the parameter is determined in the same way for both signal models.
Further, we consider a class of signal-ignorant sampling policies, where the sampling times are determined without using knowledge of the observed OU process. The optimal signal-ignorant sampling problem is equivalent to an MDP for minimizing the time-average of a nonlinear age function , which has been solved recently in . The age-optimal sampling policy is a threshold policy on the expected estimation error, where the threshold function is simply and the parameter is determined in the same way as above.
The above results hold for (i) general service time distributions with a finite mean and (ii) sampling problems both with and without a sampling rate constraint. Numerical results suggest that the optimal sampling policy is better than zero-wait sampling and the classic uniform sampling.
One interesting observation from these results is that the threshold function varies with respect to the signal model and sampling problem, but the parameter is determined in the same way.
I-a Related Work
The results in this paper are closely related to recent studies on the age of information, e.g., [1, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 13], which do not consider a signal model. The average age and average peak age have been analyzed for various queueing systems in, e.g., [1, 14, 16, 17]. The optimality of the Last-Come, First-Served (LCFS) policy, or more generally the Last-Generated, First-Served (LGFS) policy, was established for various queueing system models in [20, 21, 22, 23, 24]. Optimal sampling policies for minimizing non-linear age functions were developed in [15, 13]. Age-optimal transmission scheduling in wireless networks was investigated in, e.g., [18, 19, 25, 26, 27, 28, 29].
On the other hand, this paper also contributes to the area of remote estimation, e.g., [30, 31, 32, 33, 34, 35, 36], by adding a queue between the sampler and the estimator. In [32, 34], optimal sampling of Wiener processes was studied, where the transmission time from the sampler to the estimator is zero. Optimal sampling of OU processes was also considered in , where it was solved by discretizing time and using dynamic programming to solve the resulting discrete-time optimal stopping problems. In contrast, our optimal sampler of OU processes is obtained analytically. Remote estimation over several different channel models was recently studied in, e.g., [35, 36]. In [30, 31, 32, 33, 34, 35, 36], the optimal sampling policies were proven to be threshold policies. Because of the queueing model, our optimal sampling policy has a different structure from those in [30, 31, 32, 33, 34, 35, 36]. Specifically, in our optimal sampling policy, sampling is suspended when the server is busy and is reactivated once the server becomes idle. The optimal sampling policy for Wiener processes in [3, 4] is a limiting case of ours.
Ii Model and Formulation
Ii-a System Model
We consider the remote estimation system illustrated in Fig. 1, where an observer takes samples from an OU process and forwards the samples to an estimator through a communication channel. The channel is modeled as a single-server FIFO queue with i.i.d. service times. The system starts to operate at time . The -th sample is generated at time and is delivered to the estimator at time with a service time , which satisfy , , , and for all . Each sample packet contains the sampling time and the sample value . Let be the sampling time of the latest received sample at time . The age of information, or simply age, at time is defined as 
which is shown in Fig. 2. Because , can also be expressed as
The initial state of the system is assumed to satisfy , , and are finite constants. The parameters , , and in (1) are known at both the sampler and estimator.
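The age definition above can be sketched with a small helper (the function and argument names are our own, hypothetical choices; it assumes the generation times and delivery times of the samples are given as increasing lists):

```python
def age_of_information(t, sample_times, delivery_times):
    """Age Delta(t) = t - U(t), where U(t) is the generation time of
    the freshest sample delivered by time t.

    sample_times[i] is the generation time of sample i and
    delivery_times[i] its delivery time (delivery >= generation,
    both sequences increasing).
    """
    delivered = [s for s, d in zip(sample_times, delivery_times) if d <= t]
    if not delivered:
        raise ValueError("no sample delivered by time t")
    return t - max(delivered)
```

For instance, if samples generated at times 0, 2, 4 are delivered at times 1, 3, 6, then at time 5 the freshest delivered sample was generated at time 2, giving an age of 3.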
Let represent the idle/busy state of the server at time . We assume that whenever a sample is delivered, an acknowledgement is sent back to the sampler with zero delay. By this, the idle/busy state of the server is known at the sampler. Therefore, the information that is available at the sampler at time can be expressed as .
Ii-B Sampling Policies
In causal sampling policies, each sampling time is chosen by using the up-to-date information available at the sampler. To characterize this statement precisely, let us define the -fields
Hence, is a filtration (i.e., a non-decreasing and right-continuous family of -fields) of the information available at the sampler. Each sampling time is a stopping time with respect to the filtration such that
Let represent a sampling policy and represent the set of causal sampling policies that satisfy two conditions: (i) Each sampling policy satisfies (6) for all . (ii) The sequence of inter-sampling times forms a regenerative process [37, Section 6.1]: There exists an increasing sequence of almost surely finite random integers such that the post- process has the same distribution as the post- process and is independent of the pre- process ; further, we assume that , , and . (We will optimize , but operationally a nicer criterion is . These criteria correspond to two definitions of “average cost per unit time” that are widely used in the literature on semi-Markov decision processes. The two criteria are equivalent if is a regenerative process or, more generally, if has only one ergodic class; if no such condition is imposed, however, they can differ. Interested readers are referred to [38, 39, 40, 41, 42] for more discussion.)
From this, we can obtain that is finite almost surely for all . We assume that the OU process and the service times are mutually independent, and do not change according to the sampling policy.
A sampling policy is said to be signal-ignorant (signal-aware), if is (not necessarily) independent of . Let denote the set of signal-ignorant sampling policies, defined as
Ii-C MMSE Estimator
At any time , the estimator uses causally received samples to construct an estimate of the real-time signal value . The information available to the estimator consists of two parts: (i) , which contains the sampling time , sample value , and delivery time of the samples that have been delivered by time , and (ii) the fact that no sample has been received after the last delivery time . Similar to [32, 44, 4], we assume that the estimator neglects the second part of the information. (In [30, 31, 32, 33, 34, 35, 36], it was shown that when the sampler and estimator are jointly optimized, the MMSE estimator has the same expression with or without the second part of the information. We will consider the joint optimization of the sampler and estimator in future work.) Then, as shown in Appendix A, the minimum mean square error (MMSE) estimator is determined by
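A hedged sketch of the standard conditional-mean estimate for an OU process between sample deliveries follows; the exponential decay of the estimate toward the long-run mean is a consequence of the OU transition law, and the function and parameter names are illustrative, not taken from the paper.

```python
import math

def mmse_estimate(t, s_i, x_si, mu, theta):
    """Conditional mean of an OU process at time t given its value
    x_si at the latest received sampling time s_i.

    For the SDE dX_t = theta * (mu - X_t) dt + sigma dW_t, the
    conditional mean decays exponentially toward the long-run mean:
    E[X_t | X_{s_i} = x_si] = mu + (x_si - mu) * exp(-theta * (t - s_i)).
    """
    return mu + (x_si - mu) * math.exp(-theta * (t - s_i))
```

At the sampling instant the estimate equals the sample value, and long after the last delivery it approaches the stationary mean.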
Ii-D Problem Formulation
The goal of this paper is to find the optimal sampling policy that minimizes the mean-squared estimation error subject to an average sampling-rate constraint, which is formulated as the following problem:
where is the optimum value of (10) and is the maximum allowed sampling rate. When , this problem becomes an unconstrained problem.
Iii Main Result
Iii-a Signal-aware Sampling
Problem (10) is a constrained continuous-time MDP with a continuous state space. However, we found an exact solution to this problem. To present this solution, let us consider an OU process with initial state and parameter . According to (II-C), can be expressed as
In the sequel, we will see that and are the lower and upper bounds of , respectively. We will also need to use the following function (if , is defined as its right limit ):
where is the error function, defined as
The proof of Theorem 1 is explained in Section IV. The optimal sampling policy in Theorem 1 has a nice structure. Specifically, the -th sample is taken at the earliest time satisfying two conditions: (i) The -th sample has already been delivered by time , i.e., , and (ii) the estimation error is no smaller than a pre-determined threshold , where is a non-linear function defined in (18).
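The two-condition structure of the optimal policy can be sketched in discretized form as follows; this is an illustrative sketch of the stated structure, not the paper's algorithm, and all names are our own.

```python
def next_sampling_time(times, signal, estimate, d_prev, threshold):
    """Earliest grid time t >= d_prev at which the instantaneous
    estimation error |X_t - Xhat_t| reaches the threshold.

    times, signal, estimate are parallel lists from a discretized
    simulation; d_prev is the delivery time of the previous sample.
    Condition (i): wait until the previous sample is delivered.
    Condition (ii): the error must reach the threshold.
    Returns the sampling time, or None if the error never crosses.
    """
    for t, x, xhat in zip(times, signal, estimate):
        if t >= d_prev and abs(x - xhat) >= threshold:
            return t
    return None
```

Note that the policy is idle while the server is busy, consistent with the sampling-suspension structure discussed in Section IV.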
In Theorem 1, it holds that .
See Appendix M. ∎
Hence, . Further, because is strictly increasing on and , the inverse function and the threshold are properly defined and . We note that the service time distribution affects the optimal sampling policy in (17) and (18) through and .
The calculation of falls into two cases: In the first case, can be computed by solving (19) via the bisection search in Algorithm 1. For this case to occur, the sampling rate constraint (11) must be inactive at the obtained by Algorithm 1. Because , we can obtain , and hence (20) holds when the sampling rate constraint (11) is inactive. In the second case, is selected to satisfy the sampling rate constraint (11) with equality, which is implemented by using another bisection search in Algorithm 2. The upper and lower bounds for the bisection searches in Algorithms 1-2 are determined by Lemma 1. If , because , (20) is always satisfied and only the first case can occur. By comparing (19) and (22), it follows immediately that
If the sampling rate constraint is removed, i.e., , then .
The calculation of the expectations in Algorithms 1-2 can be greatly simplified by using the following lemma, which is obtained based on Dynkin’s formula [12, Theorem 7.4.1] and the optional stopping theorem.
See Appendix N. ∎
Because the ’s are i.i.d., the expectations in (3) are functions of and are independent of . One can improve the accuracy of the solution produced by Algorithms 1-2 by (i) reducing the tolerance and (ii) increasing the number of Monte Carlo realizations used to compute the expectations. Such a low-complexity solution to a constrained continuous-time MDP with a continuous state space is rare.
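Since Algorithms 1-2 are not reproduced in this section, the following generic sketch illustrates the bisection pattern they rely on; g is a hypothetical stand-in for the monotone scalar function being driven to zero (the optimality equation in Algorithm 1, or the sampling rate constraint met with equality in Algorithm 2).

```python
def bisect_root(g, lo, hi, tol=1e-9, max_iter=200):
    """Bisection search for a root of a monotone function g on [lo, hi].

    Assumes g(lo) and g(hi) have opposite signs. Each iteration halves
    the interval, so the error shrinks geometrically; in practice g is
    evaluated by Monte Carlo, so tol should not be set below the
    Monte Carlo noise level.
    """
    g_lo = g(lo)
    for _ in range(max_iter):
        if hi - lo < tol:
            break
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_mid == 0:
            return mid
        # keep the half-interval whose endpoints bracket the sign change
        if (g_lo > 0) == (g_mid > 0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, applied to g(x) = x² − 2 on [0, 2], the search converges to √2.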
Iii-A1 Sampling of Wiener Processes
Theorem 2 is an alternative form of Theorem 1 in [3, 4]. The benefit of the new expressions in (30)-(32) is that they characterize by the optimal objective value and the sampling rate constraint (11), in the same way as in Theorem 1, which appears to be more fundamental than the expression in [3, 4].
Iii-B Signal-ignorant Sampling
In signal-ignorant sampling policies, the sampling times are determined based only on the service times , but not on the observed OU process .
If , then the mean-squared estimation error of the OU process at time is
which is a strictly increasing function of the age .
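For concreteness, a hedged sketch of this error expression follows, using the standard conditional variance of an OU process; the parameter names match the SDE, and this is our reconstruction rather than a copy of the paper's display.

```python
import math

def ou_mse_of_age(delta, theta, sigma):
    """Mean-squared estimation error as a function of the age delta.

    For the conditional-mean estimator of an OU process, the error
    variance sigma^2 * (1 - exp(-2 * theta * delta)) / (2 * theta)
    is strictly increasing in delta and saturates at the stationary
    variance sigma^2 / (2 * theta).
    """
    return sigma ** 2 * (1 - math.exp(-2 * theta * delta)) / (2 * theta)
```

The saturation at the stationary variance is what distinguishes the OU case from the Wiener case, where the error grows without bound in the age.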
According to Lemma 4 and Fubini’s theorem, for every policy ,
Hence, minimizing the mean-squared estimation error among signal-ignorant sampling policies can be formulated as the following MDP for minimizing the expected time-average of the nonlinear age function :
where is the optimal value of (35). By (33), and are bounded. Problem (35) is one instance of the problems recently solved in Corollary 3 of  for general strictly increasing functions . From this, a solution to (35) is given by
Because , it follows immediately that .
Iii-C Discussions of the Results
The difference among Theorems 1-3 lies only in the expressions (17), (30), and (37) of the threshold policies, whereas the parameter is determined by the optimal objective value and the sampling rate constraint in the same manner. Later, in (55), we show that is exactly equal to the sum of the optimal objective value of the MDP and the optimal Lagrangian dual variable associated with the sampling rate constraint.
In signal-aware sampling policies (17) and (30), the sampling time is determined by the instantaneous estimation error , and the threshold is determined by the signal model. In the signal-ignorant sampling policy (37), the sampling time is determined by the expected estimation error at time . We note that if , then is the delivery time of the new sample. Hence, (37) requires that the expected estimation error upon the delivery of the new sample is no less than . Finally, it is worth noting that Theorems 1-3 hold for all distributions of the service times satisfying , and for both constrained and unconstrained sampling problems.
Iv Proof of the Main Result
We prove Theorem 1 in four steps: (i) We first show that sampling should be suspended when the server is busy, which can be used to simplify (10). (ii) We use an extended Dinkelbach's method and the Lagrangian duality method to decompose the simplified problem into a series of mutually independent per-sample MDPs. (iii) We utilize the free boundary method from optimal stopping theory to solve the per-sample MDPs. (iv) Finally, we use a geometric multiplier method to show that the duality gap is zero. This proof framework is an extension of that used in [13, 3, 4], where the most challenging part is finding the analytical solution of the per-sample MDP in Step (iii).
The OU process in (12) with initial state and parameter is the solution to the SDE
In addition, the infinitesimal generator of is [48, Eq. A1.22]
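For reference, assuming the SDE form dX_t = θ(μ − X_t) dt + σ dW_t introduced above, the generator acts on a twice-differentiable function f in the standard form (our reconstruction, not copied from [48]):

```latex
(\mathcal{A}f)(x) \;=\; \theta(\mu - x)\, f'(x) \;+\; \frac{\sigma^2}{2}\, f''(x).
```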
See Appendix D. ∎
Iv-B Suspend Sampling when the Server is Busy
By using the strong Markov property of the OU process and the orthogonality principle of MMSE estimation, we obtain the following useful lemma:
In (10), it is suboptimal to take a new sample before the previous sample is delivered.
See Appendix E. ∎
A similar result was obtained in for the sampling of Wiener processes. By Lemma 6, there is no loss of optimality in considering a sub-class of sampling policies in which each sample is generated and sent out after all previous samples are delivered, i.e.,
For any policy , the information used for determining includes: (i) the history of signal values and (ii) the service times of previous samples. Let us define the -fields and . Then, is the filtration (i.e., a non-decreasing and right-continuous family of -fields) of the OU process . Given the service times of previous samples, is a stopping time with respect to the filtration of the OU process , that is
Hence, the policy space can be expressed as
Let represent the waiting time between the delivery time of the -th sample and the generation time of the -th sample. Then, and for each . Given , is uniquely determined by . Hence, one can also use to represent a sampling policy.
Iv-C Reformulation of Problem (49)
In order to solve (49), let us consider the following MDP with a parameter :