The Kalman filter is a powerful tool for estimating the states of a linear Gaussian system. Combined with this filter, an expectation maximization algorithm can be used to estimate the parameters of the model. However, this algorithm cannot function in real time. Thus, we propose a new method that estimates the transition matrices and the states of the system in real time. The proposed method uses three ideas: estimation in an observation space, a time-invariant interval, and an online learning framework. Applied to a damped oscillation model, the method estimates the transition matrices accurately. In addition, by introducing localization and spatial uniformity to the proposed method, we demonstrate that noise can be reduced in high-dimensional spatio-temporal data. Moreover, the proposed method has potential for use in areas such as weather forecasting and vector field analysis.
A quick tool for noise reduction and short-term prediction is important for areas such as weather forecasting and adjusting a scanning probe microscope (SPM). In weather forecasting, engineers require a fast denoising method in order to use the results for instantaneous forecasting. Since SPMs require adjustment of some parameters for use, a method for real-time adjustment is helpful in carrying out experiments.
In recent years, denoising methods using deep learning, such as Noise2Noise and Noise2Void, have achieved great success. However, these methods cannot predict future states.
By focusing on the state space model, we can avoid a lack of interpretability. The Kalman filter (KF) is a powerful tool for estimating the states of the linear Gaussian state space model when the parameters are given. Combined with an expectation maximization (EM) algorithm, it allows us to estimate both the parameters and the states, i.e., to achieve noise reduction through the refined states and short-term forecasting from the constructed model. However, the algorithm requires a great deal of calculation time and is unsuitable for real-time situations.
We therefore propose a method by which to estimate the states and the state transition of the model in real time. Using three ideas, namely, assumption of an observation transition, a time-invariant interval, and an online learning framework, the method works well for some experiments. We refer to this method as linear operator construction with the Kalman filter (LOCK).
In addition, by introducing localization and spatial uniformity to the proposed method, referred to respectively as local LOCK (LLOCK) and spatially uniform LOCK (SLOCK), we can apply the proposed method to spatio-temporal data, such as image sequence data and grid sequence data. Applied to synthetic data, the proposed methods were found to be superior to the existing method and to be efficient in terms of calculation and memory cost.
We introduce the proposed methods in the following section. In Section 3, we state the experimental results obtained using a damped oscillation model, object moving, global flow, and local stationary flow data (our code is available at https://github.com/ZoneTsuyoshi/lock). Finally, we conclude the present paper in Section 4.
In this section, we first explain the notation and then describe the Kalman filter as a benchmark method. We then propose new methods in order to estimate the states of the system and the transition matrices in real time.
We use bold small letters, e.g., , to denote the variable as vector. The set of natural numbers is denoted as , and we denote as as the set of natural numbers that are less than or equal to , i.e., . For a matrix and a vector that depends on time , we use and for the time dependency. For a matrix and a vector , we use and to denote the -th element of the matrix and the -th element of the vector, respectively. Moreover, for a matrix and a vector , denotes the matrix
Similarly, for vectors and , denotes the vector . Moreover, for a set that satisfies , denotes the vector, the -th element of which corresponds to -th smallest value of the matrix.
Consider the state space model
where , , , and represent the state of the system, the observation, the system noise, and the observation noise, respectively, at a given time . In the linear scenario, using
we can obtain
Suppose the system noise and the observation noise each follow a zero-mean Gaussian distribution. Under this assumption, we can apply the Kalman filter (KF) to the estimation of the state vectors.
The filter uses the following update formula
where and represent the estimated value and the covariance, respectively, of the state at time given observation .
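As an illustration of the update formula, the following is a minimal NumPy sketch of one predict/update cycle of the standard Kalman filter. The symbol names (F for the transition matrix, H for the observation matrix, Q and R for the system and observation noise covariances, x and V for the state estimate and its covariance) are assumptions chosen for this sketch, since the notation of the original equations is not recoverable here.

```python
import numpy as np

def kalman_step(x, V, y, F, H, Q, R):
    """One predict/update cycle of the linear Gaussian Kalman filter.

    x, V : filtered state mean and covariance at the previous time step
    y    : observation at the current time step
    F, H : transition and observation matrices (names assumed for this sketch)
    Q, R : system and observation noise covariances
    """
    # Prediction step: propagate mean and covariance through the model.
    x_pred = F @ x
    V_pred = F @ V @ F.T + Q
    # Innovation covariance and Kalman gain K = V_pred H^T S^{-1}.
    S = H @ V_pred @ H.T + R
    K = np.linalg.solve(S, H @ V_pred).T  # valid because S and V_pred are symmetric
    # Update step: correct the prediction with the new observation.
    x_filt = x_pred + K @ (y - H @ x_pred)
    V_filt = V_pred - K @ H @ V_pred
    return x_filt, V_filt
```

Calling this function once per time step, with the output of one call fed into the next, gives the filtered state sequence.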
The expectation maximization (EM) algorithm is one of the most powerful tools for estimation of the transition matrix. However, this algorithm is unsuitable for a real-time situation due to the requirement for a sequence of smoothed values of and , where is the final time step of the time-series data. We can consider an ad hoc strategy, namely, expectation maximization KF (EMKF), which applies the algorithm at every step. However, in the next section, we show that this ad hoc strategy works only in limited situations. Thus, the following sections propose new methods for estimation of the transition matrices and the states in real time.
In order to overcome the real-time estimation problem, we propose a new method referred to as linear operator construction with the Kalman filter (LOCK) by introducing three ideas: assumption of an observation transition, a time-invariant interval, and an online learning framework.
First, we introduce the assumption of a linear and Gaussian observation transition
where is the noise of the observation transition at time and follows Gaussian distribution . If the observation matrices are regular, this assumption is satisfied by
Figure 1 shows these relationships.
Then, by taking the expectation, the following relation is satisfied:
Therefore, we can obtain an unbiased estimator
where and indicate the Moore-Penrose type pseudo-inverse matrix of and the estimated value of , respectively.
In addition, suppose these matrices are time-invariant in interval , then, we can obtain
where is a matrix composed of the observation vectors from time to time . Therefore, we can obtain the estimation algorithm
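The interval-based estimate described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names Y, H, G_hat, and F_hat are assumptions, the observation matrix is taken as constant over the interval, and the pseudo-inverse is applied to the matrix of earlier observations to solve the least-squares problem.

```python
import numpy as np

def estimate_transition(Y, H):
    """Estimate the observation transition and the state transition.

    Y : (d, tau + 1) array whose columns are consecutive observations
        over an interval in which the matrices are assumed time-invariant
    H : (d, n) observation matrix, assumed constant over the interval
    """
    Y_prev, Y_next = Y[:, :-1], Y[:, 1:]
    # Least-squares fit of Y_next ~ G @ Y_prev via the Moore-Penrose pseudo-inverse.
    G_hat = Y_next @ np.linalg.pinv(Y_prev)
    # Map the observation transition back to the state space.
    F_hat = np.linalg.pinv(H) @ G_hat @ H
    return G_hat, F_hat
```

With noise-free observations, an invertible H, and a Y_prev of full row rank, this recovers the transition matrix exactly; with noisy data it returns the least-squares fit.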
Moreover, based on an idea incorporated from online learning, we introduce parameters in order to treat outliers in the observation
where and are the learning rate and the cutoff distance, respectively, which control the maximum amount of difference between the old estimate and the new estimate. Algorithm 1 summarizes the LOCK method.
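One plausible form of this online update is sketched below. The exact rule is not recoverable from the text here, so the elementwise clipping, the function name online_update, and the parameter names eta (learning rate) and cutoff (cutoff distance) are all assumptions of this sketch.

```python
import numpy as np

def online_update(F_old, F_hat, eta=0.1, cutoff=1.0):
    """Move the current estimate toward a new estimate with bounded steps.

    F_old  : transition matrix currently used by the filter
    F_hat  : new estimate computed from the latest interval
    eta    : learning rate (assumed name)
    cutoff : cutoff distance bounding each elementwise difference (assumed name)
    """
    # Clip the difference so that a single outlying estimate cannot move
    # any element by more than eta * cutoff.
    step = np.clip(F_hat - F_old, -cutoff, cutoff)
    return F_old + eta * step
```

Under this form, a small eta makes the estimate change slowly, and a small cutoff limits the influence of outliers on each update.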
We introduced the LOCK method in the previous section. However, this method cannot function properly for the case in which , where and are the update interval and the dimension of the observation, respectively. In particular, applying LOCK to movie data of large dimension, due to the requirement for a large , is unrealistic for a real-time situation. Hence, we propose the local LOCK (LLOCK) method, which adopts localization in order to overcome this problem.
Consider a lattice point in spatio-temporal data. The variable of the point at time is affected only by the neighborhood of the point at time . This localization can reduce the effective observation dimension. In other words, in order to update the -th element of the transition matrix, we can use only variables of the observation in the vicinity of points and .
We define the localization matrix , where is the adjacency matrix
Using this matrix, we can obtain
where and are local indices of points and , respectively. Algorithm 2 summarizes the local calculation of the observation transition matrix .
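The local calculation can be sketched as follows, under one reading of the description: each element of the observation transition matrix is estimated from only the observations in the neighborhoods of its two points. The function name local_transition, the 0/1 adjacency matrix A, and the separate arrays Y_prev and Y_next are assumptions of this sketch, and the loop is written for clarity rather than speed.

```python
import numpy as np

def local_transition(Y_prev, Y_next, A):
    """Estimate the observation transition matrix element by element.

    Y_prev, Y_next : (d, tau) arrays of observations at consecutive times
    A              : (d, d) 0/1 adjacency matrix defining the neighborhoods
    """
    d = Y_prev.shape[0]
    G = np.zeros((d, d))
    for i in range(d):
        for j in np.flatnonzero(A[i]):
            # Local index set: neighbors of i and of j, plus i and j themselves.
            local = np.union1d(np.flatnonzero(A[i]), np.flatnonzero(A[j]))
            local = np.union1d(local, [i, j])
            # Least-squares fit restricted to the local variables.
            G_local = Y_next[local] @ np.linalg.pinv(Y_prev[local])
            # Positions of i and j inside the sorted local index set.
            li = np.searchsorted(local, i)
            lj = np.searchsorted(local, j)
            G[i, j] = G_local[li, lj]
    return G
```

Elements outside the neighborhood structure are never assigned and stay at zero, so the pseudo-inverse only ever acts on small local matrices.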
The LOCK and LLOCK methods consider the elements of the transition matrix as independent parameters. As a result, these methods treat many parameters and require a long update interval in order to avoid . Thus, both methods are unsuitable for rapid changes of the transition matrix. Therefore, we introduce the spatially uniform LOCK (SLOCK) method to solve this problem. This method considers multiple elements of the transition matrix as a common parameter. In other words, we assume that the parameters are spatially uniform in a given vector field. Since the number of parameters is smaller than that of LOCK or LLOCK, the algorithm of this method can update with a shorter interval.
First, we design a matrix , each element of which represents a parameter number: if two elements of this matrix are the same, then the corresponding two elements of are the same, i.e.,
In addition, similar to the localization matrix, the zero elements of correspond to the zero elements of , i.e.,
If we use the numbers in ascending order, then we have unique values, excluding zero, which are . Next, we use the vector to denote the values of
Thanks to this notation, we can obtain
where is the parameter vector. Using this equation, we can obtain the following update equations:
Algorithm 3 summarizes this method.
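The parameter-sharing idea can be sketched as a least-squares problem in the shared parameter vector. This is an illustration under assumptions rather than a reproduction of Algorithm 3: the names P (integer matrix of parameter numbers, with 0 marking a fixed zero), theta (parameter vector), and shared_parameter_estimate are hypothetical, and the online learning update is omitted.

```python
import numpy as np

def shared_parameter_estimate(Y_prev, Y_next, P):
    """Estimate a transition matrix whose elements share parameters.

    Y_prev, Y_next : (d, T) arrays of observations at consecutive times
    P              : (d, d) integer array; P[i, j] = k > 0 means that
                     element (i, j) equals theta[k - 1], 0 means zero
    """
    d, T = Y_prev.shape
    n_params = int(P.max())
    # Xi[t, i, k] sums the previous observations that parameter k + 1
    # multiplies when producing component i at time t.
    Xi = np.zeros((T, d, n_params))
    for k in range(n_params):
        mask = (P == k + 1).astype(float)
        Xi[:, :, k] = (mask @ Y_prev).T
    # Solve Y_next ~ Xi @ theta in the least-squares sense.
    theta, *_ = np.linalg.lstsq(
        Xi.reshape(T * d, n_params), Y_next.T.reshape(T * d), rcond=None
    )
    # Rebuild the full matrix from the shared parameters.
    G = np.where(P > 0, theta[np.maximum(P, 1) - 1], 0.0)
    return theta, G
```

Because only n_params unknowns are fitted instead of one per matrix element, far fewer time steps are needed for a well-posed fit, which is consistent with the shorter update interval mentioned above.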
We applied EMKF and LOCK to the damped oscillation model, SLOCK to object moving data and global flow data, and LLOCK to global flow data and local stationary flow data.
Damped oscillation model
represents the behavior of a damped oscillator, which is subject to a damping force , where is the position of the object, is the velocity, is the mass, is the oscillation constant, and is the damping constant. Using the Euler forward method, we can obtain
where is the step size. By combining and into the state vector , we can obtain
where is the true transition matrix of this model. Applying this model to the linear state space model, we can use the EMKF and LOCK methods.
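For concreteness, the Euler forward discretization can be written out as follows. This sketch assumes the standard damped oscillator equation m x'' = -k x - r x', with m the mass, k the oscillation constant, r the damping constant, and dt the step size; these symbol names are assumptions, since the original symbols are not recoverable here.

```python
import numpy as np

def damped_oscillator_transition(m, k, r, dt):
    """Euler forward transition matrix for the state (position, velocity).

    position_next = position + dt * velocity
    velocity_next = velocity + dt * (-(k / m) * position - (r / m) * velocity)
    """
    return np.array([
        [1.0, dt],
        [-k * dt / m, 1.0 - r * dt / m],
    ])
```

Multiplying the state vector by this matrix advances the noise-free model by one step; the time-variant experiments correspond to changing the constants over time.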
We conducted five experiments. In the first and second experiments, the parameters of the model are time-invariant. In the other experiments, the parameters of the model are time-variant. The five experiments use the same setting, excluding the model parameters, as shown in Table 1.
In experiment 1, we first simulated the true model from for adding system noise (referred to herein as “true” states). Then, by adding observation noise , we obtained “pseudo-observation” data. We then applied EMKF and LOCK to the observations and obtained the experimental results. Here, the initial transition matrix is the true one, and the initial state and the transition covariance differ from the conditions used to generate the true data.
Experiment 2 differs from the previous experiment with respect to the initial transition matrix and the number of simulations. The initial transition matrix is not the true one but rather follows isotropic Gaussian distribution , and each element of the matrix independently follows Gaussian distribution . Regarding the number of simulations, considering randomness, we executed 100 simulations and obtained the experimental results.
The main difference between experiment 3 and experiments 1 and 2 is whether parameters are time-invariant or time-variant. Experiments 3 and 4 use the same settings as experiments 1 and 2, respectively, excluding , , and (see Table 1). Experiments 4 and 5 have different distributions of the initial transition matrices, which are and for experiments 4 and 5, respectively.
|damped oscillation model||, , final time-step|
|state space model||, , ,|
|EMKF||, , #(iteration)|
|experiment 1||, , ,|
|experiment 2||, , ,|
|experiment 3||, ,|
|experiment 4||, ,|
|experiment 5||, ,|
Figure 2 illustrates the time transition of the true states, the observations, and the results estimated by LOCK and EMKF in experiment 1. The figure shows that both LOCK and EMKF estimate the states well.
Figure 3 shows similar results for experiment 3. Although the estimates of LOCK sometimes deviate from the true states, both methods work well overall for state estimation of the time-variant model.
In addition, Figures 4 and 5 show the time transition of the mean squared error (MSE) between the true matrices and the matrices estimated by LOCK and EMKF, respectively. The four panels correspond to the four elements of the matrices; e.g., the upper-left panel corresponds to . Figure 4 shows that the matrices estimated by LOCK are close to the true matrix, even if the initial transition matrices are far from the true matrix. On the other hand, as shown in Figure 5, EMKF works only when the initial transition matrix is similar to the true matrix. In the real world, since an initial transition matrix close to the true one is often unavailable, LOCK is expected to be better than EMKF from the perspective of approaching the true matrix.
Similar results for experiments 2 and 5 are presented in Appendix A.
We generated movie data in which an object moves in various directions for , referred to herein as “object moving” data. Details are presented in Appendix B.1. After generating the true data, we added Gaussian noise . Figure 6 shows the pseudo-observation at . The direction of transition changes often; the changes are summarized in Figure 7. In addition, we used the “adjacency distance” as the neighbors considered during the localization phase, as shown in Figure 8. We set the parameters of SLOCK as , , and and the state space model given by and .
Figure 9 shows the time transition of the MSE of the observations and of the results estimated by KF and SLOCK. Since the MSE of SLOCK is lower than that of both KF and the observations, SLOCK is effective for estimating the spatially uniform model.
In addition, we calculated the MSE for the transition matrices and executed sensitivity analysis for , , and (as shown in Appendix B.2).
We generated “global flow” data, where various objects exist in images and move in each direction in each interval, as shown in Figure 10. A more detailed description of the generating process is stated in Appendix C.1. We added Gaussian noise at each grid and obtained pseudo-observation data. We set the parameters of LLOCK to , , , and and the state space model given by and .
Figure 11 shows the time transition of the MSE of the observations and of the results estimated by KF and LLOCK. As shown in this figure, LLOCK provides better estimates, except near the change points. In this data setting, since the changes are rapid, the MSE of the proposed method is worse than that of the observations in the vicinity of the changes.
In addition, we conducted short-term prediction by KF and LLOCK. The results are shown in Figure 12. We run each method until and perform prediction thereafter, where the predicted MSE of the observations represents the MSE between the true state at a time and the observation at . These results show that the predictive ability of LLOCK is superior to that of KF and of the ad hoc use of the observation for this data.
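Short-term prediction of this kind can be sketched by repeatedly applying the most recently estimated transition matrix to the last filtered state. This is a minimal sketch assuming that the matrix is held fixed over the prediction horizon; the names predict_ahead, x_filt, F, and H are hypothetical.

```python
import numpy as np

def predict_ahead(x_filt, F, H, steps):
    """Predict future observations from the last filtered state.

    x_filt : filtered state at the last time step with an observation
    F, H   : transition and observation matrices, held fixed (assumption)
    steps  : number of future time steps to predict
    """
    preds = []
    x = x_filt
    for _ in range(steps):
        x = F @ x            # propagate the state one step ahead
        preds.append(H @ x)  # map the predicted state to observation space
    return np.stack(preds)
```

The prediction error can then be computed as the MSE between each predicted row and the true data at the corresponding future time.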
Figures regarding the MSE for the transition matrices and sensitivity analysis for , , and are shown in Appendix C.2.
Moreover, we measured the calculation time of updating the transition matrices for the case in which images are , , and . We applied SLOCK and LLOCK to data in this setting, as shown in Figure 13. From this figure, the calculation times of both methods are low enough to be executed in a real-time situation.
In addition, we calculated ideal memory cost and ad hoc memory cost for the proposed methods and the EM algorithm. Figure 14 shows these results when we assume a float-type array. Here, “ad hoc” means that we execute no memory-saving code, i.e., that we memorize the matrices as is. However, thanks to localization and the spatially uniform assumption, we need only manage arrays, referred to as the “ideal” cost, where represents the local dimension. According to this figure, the ideal memory cost is much less than that of the EM algorithm and the methods are easier to apply.
We generated “local stationary flow” data for which objects spring up at the boundary and move in four directions corresponding to each field, as shown in Figure 15. The left-hand panel represents the flow of the data. For example, the flow direction of the upper-left part of the image is upward. The other panels show the pseudo-observation data at , 5, and 10. A more detailed explanation of the generating process is given in Appendix D.1. After generating the true data, we added Gaussian noise and obtained the observations.
We then applied LLOCK to these data to confirm that the method can capture local information. We set the parameters of LLOCK as , , , and and the state space model given by and .
Figure 16 shows the time transition of the MSE of the observations and of the results estimated by KF and LLOCK. The figure shows that LLOCK has a lower MSE than both KF and the observations.
In addition, we performed short-term prediction with KF and LLOCK, as shown in Figure 17. Similar to the results for global flow data, the proposed method performs better than the other methods for short-term prediction.
The MSE regarding the transition matrix and sensitivity analysis for , , and are presented in Appendix D.2.
In the present paper, we propose three real-time methods to estimate the states and state transition matrices in a linear Gaussian state space model. The first proposed method, namely, linear operator construction with the Kalman filter (LOCK), approaches the true transition matrices and the true states when applied to the damped oscillation model. The advanced methods, namely, SLOCK and LLOCK, achieve better performance in terms of noise reduction and short-term prediction on the three synthetic datasets: object moving, global flow, and local stationary flow. These methods are also superior to the EM algorithm in terms of computational and memory cost: their calculation time and memory usage are much less than those of the existing method. Therefore, these methods have the potential to estimate the transition of data in real time in applications such as weather forecasting and object tracking.
Nevertheless, the proposed methods have three main drawbacks: dependence on a linear Gaussian framework, the need to tune hyper-parameters, and the restrictive assumption of SLOCK. First, as the proposed methods use a linear Gaussian formulation, we cannot directly apply these methods to nonlinear or non-Gaussian data. The proposed methods would need to be extended to provide interpretability for such data. Second, the proposed methods include the hyper-parameters , , and . An automatic tuning method for these parameters is necessary for application to real data. Finally, although SLOCK has better performance for spatially uniform data, this assumption is rather limiting for real data. In order to overcome this issue, we plan to develop a method that combines LLOCK and SLOCK, because their assumptions and their precision are in a trade-off relationship.
Here, we present the remaining results for the damped oscillation model. Figures 18 and 19 show the transition matrices estimated for experiment 2 by LOCK and EMKF, respectively. These figures show that the elements estimated by LOCK approach the true elements, whereas those estimated by EMKF do not.
In addition, Figures 20 and 21 correspond to experiment 5. Figure 20 indicates that the matrices estimated by LOCK can match the true matrix, excluding the last few updates, the noise ratio of which is higher than in other matrices. Figure 21 indicates that the 100 simulated results are similar in terms of the time transition of the matrices.
First, we constructed a array, the elements of which are 20. Then, we created random core points, the and coordinates of which exist from 7 to 16. Second, we selected two random core points from the points and generated a link, which links the two points, the width of which is . Third, we assigned the coordinates in the link linear values from
, added a Gaussian noise, mean and standard deviation of which are both 10, where, , , and are the minimum value, the maximum value, the iteration number, and the number of iterations, respectively. We iterated the second and third processes for times and obtained the base true image as shown in Figure 22.
Fourth, we set the direction list and direction change list as shown in Figure 7. In each direction, we constructed the translation matrix regarding the direction and then produced the image vector and the matrix. As a result, we obtained the true data. Finally, we added a Gaussian noise having a mean and standard deviation of 0 and 20, respectively, to the true data. We applied the absolute operator to the noise because the image data are greater than 0. Figure 22 shows images of the true data and the pseudo-observation data. This generating process is also shown on our GitHub page.
Figure 23 shows the Substantial MSE (SMSE) between the estimated transition matrices and the true transition matrices formulated by
where represents the number of non-zero elements of the localization matrix, i.e.,
From this figure, the estimated matrices are close to the true matrix, especially the elements having true values of zero.
In addition, we conducted sensitivity analysis as shown in Figure 24. This figure indicates that SLOCK is robust to the choice of its hyper-parameters for this data.
First, we made a matrix, the elements of which are 20. Then, we created random objects, the sizes of which are randomly chosen from two to four and the values of which are i.i.d. . Second, we set the directions at each interval and moved the objects in the corresponding direction at each time interval. Finally, we added a zero-mean Gaussian noise, the standard deviation of which is 20. This generating process is also shown on our GitHub page.
We calculated the SMSE between the true transition matrices and the estimated matrices as shown in Figure 25. The results indicate the estimated matrices are close to the true matrices, excluding rapid change points.
In addition, Figure 26 presents the results of sensitivity analysis for , , and . This figure indicates that LLOCK is robust to these hyper-parameters for the synthetic data.
First, we constructed a matrix, the elements of which are 20, where and indicate the number of time steps and the block length, respectively. Then, we created random objects, the sizes of which were randomly chosen from two to four and the values of which follow Gaussian distribution , that were substituted into the object values for the random elements of the matrix. Second, we set the directions of four square blocks, the size of which are , divided from a field, as shown in Figure 15. Then, we set the initial value from the first 15 columns of the source matrix. Third, we moved the objects following the block flow and set new source values from the column of the matrix. Finally, we added zero-mean Gaussian noise having a standard deviation of 20. This generating process is also shown on our GitHub page.
Figure 27 shows the time transition of the SMSE between the true transition matrices and the corresponding estimates. This figure shows that the results estimated by LLOCK are close to the values of the true matrix.
In addition, we conducted sensitivity analysis for , , and , as shown in Figure 28. This figure shows that LLOCK is a robust method for this local stationary flow data.
We used a GPU resource and Python. The following is the detailed environment of the present study.
CPU: Intel Xeon E5-2670 2.6GHz (8core) x 2
Memory: 64 GB
HDD: SAS 300 GB x 2 (RAID 1)
OS: SuSE 12.0 Enterprise LINUX
GPU: NVIDIA Tesla K20, NVIDIA Quadro K5000