The Kalman filter is one of the most powerful tools for estimating the states of a linear Gaussian system. Combined with this filter, the expectation maximization algorithm can also estimate the parameters of the model. However, this algorithm cannot operate in real time. We therefore propose a new method that estimates the transition matrices and the states of the system in real time. The method rests on two ideas: estimation in the observation space and an online learning framework. Applied to a damped oscillation model, the method estimated the transition matrices accurately. Furthermore, by introducing localization and spatial uniformity into the method, we demonstrate that it can reduce noise in high-dimensional spatio-temporal data. The methodology has potential in areas such as weather forecasting and vector field analysis.
A fast tool for noise reduction and short-term prediction is important in areas such as weather forecasting and the adjustment of scanning probe microscopes (SPMs). In weather forecasting, engineers need a fast denoising method so that the result can be used for immediate forecasts. Because SPMs require several parameters to be adjusted before use, a method for real-time adjustment helps users carry out their experiments.
Many noise reduction methods have been proposed to date [2], [7]. In recent years, denoising methods based on deep learning, such as Noise2Noise and Noise2Void, have achieved great success [21], [16], [14]. However, these methods cannot predict future states.
For predicting future states, combinations of convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have been proposed since ConvLSTM [18]. These methods impose no model constraint and are applied to image sequence data [1], [17], [19], [23]. However, they focus only on forecasting, and the resulting time-series model cannot be interpreted.
Focusing on the state space model avoids this lack of interpretability. The Kalman filter (KF) is a powerful tool for estimating the states of a linear Gaussian state space model when the parameters are given [12]. Combined with the expectation maximization (EM) algorithm, it can estimate both the parameters and the states, that is, it achieves noise reduction through the refined states and short-term forecasting from the constructed model [9]. However, the algorithm requires considerable computation time and is unsuitable for real-time use.
We therefore propose a method that estimates the states and the state transition of the model in real time. Using three ideas, namely an assumed observation transition, a time-invariant interval, and an online learning framework, the method performs well in several experiments. We call this method linear operator construction with Kalman filter (LOCK).
In addition, by introducing localization and spatial uniformity into our method, which yields local LOCK (LLOCK) and spatially uniform LOCK (SLOCK), we can handle spatio-temporal data such as image sequence data and grid sequence data. On synthetic data, our methods outperformed the existing method and were efficient in terms of computation and memory cost.
We introduce the methods in the following chapter. In Chapter 3, we report experimental results on a damped oscillation model, object moving data, global flow data, and local stationary flow data (our code is available at https://github.com/ZoneTsuyoshi/lock). Finally, we conclude the paper in Chapter 4.
In this section, we first explain the notation, then describe the Kalman filter as a benchmark method, and afterward propose new data assimilation methods that estimate the states of the system and the transition matrices in real time.
We use bold small letters, e.g., , to denote the variable as vector. The set of natural number is , furthermore, we use as the set of natural number that are less than , i.e., . For a matrix and a vector that depends on time , we use and for the time dependency. For a matrix and a vector , we use and to denote the element of the matrix and -the element of the vector, respectively. Moreover, for a matrix and a vector , denotes the matrix
Similar to this, for vectors and , denotes the vector . Also, for a set that meets , denotes the vector, whose -th element correspond to -th smallest value of the matrix.
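Consider the state space model
$$\mathbf{x}_t = f_t(\mathbf{x}_{t-1}) + \mathbf{v}_t, \tag{1}$$
$$\mathbf{y}_t = h_t(\mathbf{x}_t) + \mathbf{w}_t, \tag{2}$$
where $\mathbf{x}_t$, $\mathbf{y}_t$, $\mathbf{v}_t$, and $\mathbf{w}_t$ represent the state of the system, the observation, the system noise, and the observation noise, respectively, at a given time $t$. In the linear scenario, writing
$$f_t(\mathbf{x}) = \mathbf{F}_t \mathbf{x}, \tag{3}$$
$$h_t(\mathbf{x}) = \mathbf{H}_t \mathbf{x}, \tag{4}$$
we obtain
$$\mathbf{x}_t = \mathbf{F}_t \mathbf{x}_{t-1} + \mathbf{v}_t, \tag{5}$$
$$\mathbf{y}_t = \mathbf{H}_t \mathbf{x}_t + \mathbf{w}_t. \tag{6}$$
Suppose the system noise and the observation noise follow zero-mean Gaussian distributions, that is, $\mathbf{v}_t \sim \mathcal{N}(\mathbf{0}, \mathbf{Q}_t)$ and $\mathbf{w}_t \sim \mathcal{N}(\mathbf{0}, \mathbf{R}_t)$. Under this assumption, we can apply the Kalman filter (KF) to estimate the state vectors. The algorithm yields the update formulas
$$\mathbf{x}_{t|t-1} = \mathbf{F}_t \mathbf{x}_{t-1|t-1}, \tag{7}$$
$$\mathbf{V}_{t|t-1} = \mathbf{F}_t \mathbf{V}_{t-1|t-1} \mathbf{F}_t^{\top} + \mathbf{Q}_t, \tag{8}$$
$$\mathbf{K}_t = \mathbf{V}_{t|t-1} \mathbf{H}_t^{\top} \left( \mathbf{H}_t \mathbf{V}_{t|t-1} \mathbf{H}_t^{\top} + \mathbf{R}_t \right)^{-1}, \tag{9}$$
$$\mathbf{x}_{t|t} = \mathbf{x}_{t|t-1} + \mathbf{K}_t \left( \mathbf{y}_t - \mathbf{H}_t \mathbf{x}_{t|t-1} \right), \tag{10}$$
$$\mathbf{V}_{t|t} = \mathbf{V}_{t|t-1} - \mathbf{K}_t \mathbf{H}_t \mathbf{V}_{t|t-1}, \tag{11}$$
where $\mathbf{x}_{t|s}$ and $\mathbf{V}_{t|s}$ represent the estimated value and the covariance, respectively, of the state at time $t$ given the observations $\mathbf{y}_{1:s}$.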
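As an illustration, one predict/update cycle of the standard Kalman filter can be sketched in Python with numpy. This is a minimal sketch under the linear Gaussian assumptions above, not the code from the authors' repository; the function name `kalman_step` and its argument names are hypothetical.

```python
import numpy as np

def kalman_step(x_prev, V_prev, y, F, H, Q, R):
    """One predict/update cycle of the linear Gaussian Kalman filter.

    x_prev, V_prev: filtered mean and covariance at time t-1.
    y: observation at time t. F, H: transition and observation matrices.
    Q, R: system and observation noise covariances.
    """
    # Prediction step: propagate mean and covariance through the transition.
    x_pred = F @ x_prev
    V_pred = F @ V_prev @ F.T + Q
    # Kalman gain from the innovation covariance.
    S = H @ V_pred @ H.T + R
    K = V_pred @ H.T @ np.linalg.inv(S)
    # Filtering step: correct the prediction with the new observation.
    x_filt = x_pred + K @ (y - H @ x_pred)
    V_filt = V_pred - K @ H @ V_pred
    return x_filt, V_filt
```

Calling `kalman_step` once per incoming observation gives the filtered states; the transition matrix `F` is treated as known here, which is exactly the restriction the following sections try to remove.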
For the problem of estimating the transition matrix, the expectation maximization (EM) algorithm is one of the most powerful tools. However, the algorithm is unsuitable for real-time use because it requires the whole sequence of smoothed values $\mathbf{x}_{t|T}$ and $\mathbf{V}_{t|T}$, where $T$ is the final time step of the time-series data. One ad hoc strategy is expectation maximization KF (EMKF), which applies the algorithm at every step; however, we show in the next chapter that this method works only in limited situations. The following sections therefore propose new methods for estimating the transition matrices and the states in real time.
To overcome the real-time estimation problem, we propose a new method called linear operator construction with Kalman filter (LOCK) by introducing three ideas: assumption of observation transition, time-invariant interval, and online learning framework.
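First, we introduce the assumption of a linear Gaussian observation transition
$$\mathbf{y}_t = \mathbf{G}_t \mathbf{y}_{t-1} + \mathbf{u}_t, \tag{12}$$
where $\mathbf{u}_t$ is the noise of the observation transition at time $t$ and follows a zero-mean Gaussian distribution. If the observation matrices are regular, this assumption is satisfied by
$$\mathbf{G}_t = \mathbf{H}_t \mathbf{F}_t \mathbf{H}_{t-1}^{-1}, \tag{13}$$
$$\mathbf{u}_t = \mathbf{H}_t \mathbf{v}_t + \mathbf{w}_t - \mathbf{G}_t \mathbf{w}_{t-1}. \tag{14}$$
Figure 1 shows these relationships.
From equations (5), (6), and (12), we obtain
$$\mathbf{H}_t \mathbf{F}_t \mathbf{x}_{t-1} + \mathbf{H}_t \mathbf{v}_t + \mathbf{w}_t = \mathbf{G}_t \mathbf{H}_{t-1} \mathbf{x}_{t-1} + \mathbf{G}_t \mathbf{w}_{t-1} + \mathbf{u}_t, \tag{15}$$
and, taking the expectation,
$$\mathbf{H}_t \mathbf{F}_t \, \mathbb{E}[\mathbf{x}_{t-1}] = \mathbf{G}_t \mathbf{H}_{t-1} \, \mathbb{E}[\mathbf{x}_{t-1}] \tag{16}$$
holds because these noises follow zero-mean Gaussian distributions. Therefore, we obtain the unbiased estimator
$$\hat{\mathbf{F}}_t = \mathbf{H}_t^{+} \hat{\mathbf{G}}_t \mathbf{H}_{t-1}, \tag{17}$$
where $\mathbf{H}_t^{+}$ and $\hat{\mathbf{G}}_t$ represent the pseudo-inverse of $\mathbf{H}_t$ and the estimated value of $\mathbf{G}_t$, respectively.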
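In addition, as the second idea, suppose these matrices are time-invariant over an interval of $\tau$ steps. Then, for the observation transition matrix $\mathbf{G}_t$, we obtain
$$\mathbf{Y}_{t-\tau+1:t} \approx \mathbf{G}_t \mathbf{Y}_{t-\tau:t-1}, \tag{18}$$
where $\mathbf{Y}_{s:t} = (\mathbf{y}_s, \ldots, \mathbf{y}_t)$ is a matrix composed of the observation vectors from time $s$ to time $t$. Therefore, we obtain the estimation algorithm
$$\hat{\mathbf{G}}_t = \mathbf{Y}_{t-\tau+1:t} \mathbf{Y}_{t-\tau:t-1}^{+}. \tag{19}$$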
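The windowed estimate described above can be sketched as a least-squares fit over consecutive observation pairs, followed by the mapping back to state space through the pseudo-inverse of the observation matrix. The sketch below is an assumption-laden illustration, not the authors' implementation: the function names and the layout of `Y` (one observation per column) are hypothetical.

```python
import numpy as np

def estimate_observation_transition(Y):
    """Least-squares estimate of G from a window of observations.

    Y has shape (d, tau + 1); its columns are consecutive observations.
    """
    Y_prev, Y_next = Y[:, :-1], Y[:, 1:]
    # Solve Y_next ~= G @ Y_prev via the Moore-Penrose pseudo-inverse.
    return Y_next @ np.linalg.pinv(Y_prev)

def to_state_transition(G_hat, H_t, H_prev):
    """Map the observation transition estimate back to state space."""
    return np.linalg.pinv(H_t) @ G_hat @ H_prev
```

When the window holds at least as many linearly independent observations as the observation dimension and the data are noise-free, this recovers the generating matrix exactly; with noisy data it gives the least-squares fit.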
Moreover, incorporating an idea from online learning [4], we introduce parameters to handle outliers in the observations:
(20) | |||
(21) |
where and are the learning rate and the cutoff distance, respectively; the latter controls the maximum difference between the old estimate and the new estimate. Algorithm 1 summarizes the LOCK method.
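One plausible form of such an online update is sketched below. The exact update rule of the paper is not recoverable from the text here, so the form used (clip the element-wise difference to the cutoff, then scale by the learning rate) is an assumption, as are the names `online_update`, `eta`, and `c`.

```python
import numpy as np

def online_update(G_old, G_new, eta, c):
    """Move the old estimate toward the new one, robustly to outliers.

    ASSUMED form: each element-wise difference is clipped to [-c, c]
    (cutoff distance) and then scaled by the learning rate eta.
    """
    diff = np.clip(G_new - G_old, -c, c)
    return G_old + eta * diff
```

With this form, a single outlying window can change any element of the estimate by at most `eta * c`, which is the kind of bounded step the cutoff distance is described as enforcing.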
We introduced the LOCK method in the previous section; however, this algorithm cannot function properly when the update interval is shorter than the dimension of the observation. In particular, applying LOCK to movie data, which has a large dimension, is unrealistic in real time because it requires a long update interval. Hence, we propose the local LOCK (LLOCK) method, which adopts the idea of localization to overcome this problem [11].
Consider a lattice point in spatio-temporal data: the variable at the point at time $t$ is affected only by the neighborhood of the point at time $t-1$. This localization idea reduces the effective observation dimension. In other words, to update the $(i, j)$ element of the transition matrix, we need only the observation variables in the vicinity of the points $i$ and $j$.
We define localization matrix , where is adjacency matrix
(22) |
By using , bool type array of the matrix , we can get
(23) | |||
(24) | |||
(25) |
where and are local indices of the points and , respectively. Algorithm 2 summarizes the local calculation of the observation transition matrix .
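The effect of localization can be illustrated with a row-by-row least-squares fit in which each row of the observation transition matrix is estimated only from the neighbors of the corresponding point. This is a simplified sketch under assumed conventions (a boolean neighbor mask `A` and observations stored as columns of `Y`); it is not the paper's Algorithm 2, and the name `local_transition` is hypothetical.

```python
import numpy as np

def local_transition(Y, A):
    """Estimate G row by row using only the neighbours flagged in A.

    Y: array of shape (d, tau + 1), columns are consecutive observations.
    A: boolean array of shape (d, d); A[i, j] is True if j neighbours i.
    """
    d = Y.shape[0]
    Y_prev, Y_next = Y[:, :-1], Y[:, 1:]
    G = np.zeros((d, d))
    for i in range(d):
        nb = np.flatnonzero(A[i])  # indices of the neighbours of point i
        # Least-squares fit of y_i(t) on the neighbouring values at t-1.
        G[i, nb] = Y_next[i] @ np.linalg.pinv(Y_prev[nb])
    return G
```

Each row then needs only as many time steps as it has neighbors, rather than as many as the full observation dimension, which is why a shorter update interval becomes feasible.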
LOCK and LLOCK treat the elements of the transition matrix as independent parameters; as a result, they handle many parameters and require a long update interval relative to the (local) observation dimension. Both methods are therefore unsuitable when the transition matrix changes rapidly. To solve this problem, we introduce the spatially uniform LOCK (SLOCK) method, which treats multiple elements of the transition matrix as a common parameter. In other words, we assume that the parameters are spatially uniform in a given vector field. Because the number of parameters is smaller than in LOCK and LLOCK, this algorithm can update with a shorter interval.
First, we design a matrix , whose each element represents a parameter number, if two elements are same, the corresponding two elements of are same, i.e.,
(26) |
Also, similar to the localization matrix, zero element means the corresponding element of are zero, that is,
(27) |
We use the numbers in ascending order, then we have unique values, excluding zero, that are . By using this, we use a vector to denote the values of ,
(28) |
Thanks to this notation, we can obtain the equation
(29) | |||
(30) | |||
(31) |
where represents the parameter vector. Utilizing this equation, we can obtain following update equations
(32) | ||||
(33) | ||||
(34) |
Algorithm 3 summarizes this method.
We applied EMKF and LOCK to a damped oscillation model, SLOCK to object moving data and global flow data, and LLOCK to global flow data and local stationary flow data.
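The damped oscillation model
$$\frac{dx}{dt} = v, \tag{35}$$
$$m \frac{dv}{dt} = -kx - rv \tag{36}$$
represents the behavior of a damped oscillator subject to a damping force $-rv$, where $x$ is the position of the object, $v$ is the velocity, $m$ is the mass, $k$ is the oscillation constant, and $r$ is the damping constant. Using the forward Euler method, we obtain
$$x_t = x_{t-1} + \Delta t \, v_{t-1}, \tag{37}$$
$$v_t = v_{t-1} + \Delta t \left( -\frac{k}{m} x_{t-1} - \frac{r}{m} v_{t-1} \right), \tag{38}$$
where $\Delta t$ represents the step size. Combining $x_t$ and $v_t$ into the state vector $\mathbf{x}_t = (x_t, v_t)^{\top}$, we get
$$\mathbf{x}_t = \mathbf{F} \mathbf{x}_{t-1}, \tag{39}$$
$$\mathbf{F} = \begin{pmatrix} 1 & \Delta t \\ -\dfrac{k \Delta t}{m} & 1 - \dfrac{r \Delta t}{m} \end{pmatrix}, \tag{40}$$
where $\mathbf{F}$ is the true transition matrix of this model. Casting this model as a linear state space model, we can apply the EMKF and LOCK methods.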
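For concreteness, the forward Euler discretization of a damped oscillator leads to a 2x2 transition matrix that can be built as follows. The function name and the parameter names `m`, `k`, `r`, and `dt` (mass, oscillation constant, damping constant, step size) are illustrative assumptions.

```python
import numpy as np

def damped_oscillator_matrix(m, k, r, dt):
    """Forward-Euler transition matrix for the state (position, velocity).

    position' = position + dt * velocity
    velocity' = velocity + dt * (-(k / m) * position - (r / m) * velocity)
    """
    return np.array([[1.0, dt],
                     [-k * dt / m, 1.0 - r * dt / m]])
```

Multiplying the current state vector by this matrix advances the noise-free system by one step, so it can serve as the reference against which an estimated transition matrix is compared.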
We conducted five experiments: in the first and second, the parameters of the model are time-invariant, and the others correspond to the time-variant setting. All five use the same settings except for the model parameters, as shown in Table 1.
In experiment 1, we first simulated the true model from for while adding system noise , and call the result the “true” states. Adding observation noise then gives the pseudo “observation” data. We applied EMKF and LOCK to the observations to obtain the experimental results. Note that the initial transition matrix is the true one, whereas the initial state and the transition covariance differ from those used to generate the true data.
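The generation of "true" states and pseudo observations can be sketched as below. This is an illustrative sketch, not the authors' script: it assumes an identity observation matrix and isotropic Gaussian noise, and the names `simulate`, `q_std`, and `r_std` are hypothetical.

```python
import numpy as np

def simulate(F, x0, T, q_std, r_std, seed=0):
    """Simulate true states and noisy observations of a linear model.

    ASSUMPTIONS: the observation matrix is the identity, and both noises
    are isotropic Gaussian with standard deviations q_std and r_std.
    Returns arrays x (true states) and y (observations) of shape (T + 1, d).
    """
    rng = np.random.default_rng(seed)
    d = len(x0)
    x = np.empty((T + 1, d))
    y = np.empty((T + 1, d))
    x[0] = x0
    y[0] = x[0] + r_std * rng.standard_normal(d)
    for t in range(1, T + 1):
        # True state: transition plus system noise.
        x[t] = F @ x[t - 1] + q_std * rng.standard_normal(d)
        # Pseudo observation: true state plus observation noise.
        y[t] = x[t] + r_std * rng.standard_normal(d)
    return x, y
```

The array `y` would be what a filter sees, while `x` is kept aside to compute errors of the estimated states afterward.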
Experiment 2 differs from the previous experiment in two respects: the initial transition matrix and the number of simulations. First, the initial transition matrix is not the true one but follows an isotropic Gaussian distribution , that is, each element of the matrix independently follows the Gaussian distribution . Second, to account for randomness, we ran 100 simulations.
From experiment 3 onward, the main difference from the previous two is that the model is time-variant rather than time-invariant. Experiments 3 and 4 use the same settings as experiments 1 and 2, respectively, except for , , and (see Table 1). Experiment 5 differs from experiment 4 in the distribution of the initial transition matrices, which follow and in experiments 4 and 5, respectively.
damped oscillation model | , , final time-step |
---|---|
state space model | , , , |
LOCK | , |
EMKF | , , #(iteration) |
experiment 1 | , , , |
experiment 2 | , , , |
experiment 3 | , , |
, | |
experiment 4 | , , |
, | |
experiment 5 | , , |
, |
Figure 2 illustrates the time evolution of the true states, the observations, and the states estimated by LOCK and EMKF in experiment 1. Both LOCK and EMKF estimate the states well.
Figure 3 shows the corresponding results for experiment 3. Although the LOCK estimates occasionally deviate from the true states, both methods work well overall for state estimation in the time-variant model.
Figures 4 and 5 show the time evolution of the MSE between the true matrices and the matrices estimated by LOCK and EMKF, respectively. The four panels correspond to the four elements of the matrices, e.g., the upper left one corresponds to . The former figure shows that the matrices estimated by LOCK stay near the true one even if the initial transition matrix is far from it. In contrast, the latter figure shows that EMKF works only when the initial transition matrix is near the true one. In practice, since the initial transition matrix is often unknown, LOCK is expected to approach the true matrix better than EMKF.
Similar results for experiments 2 and 5 are given in Appendix A.
We made movie data in which an object moves in various directions for , called “object moving” data; details are given in Appendix B.1. After making the true data, we added Gaussian noise . Figure 6 shows the pseudo observation at . The direction of movement changes often, as summarized in Figure 7. We also use an “adjacency distance” to define the neighbors considered in the localization phase, as shown in Figure 8. We set the parameters of SLOCK to , , and of the state space model to , .
Figure 9 illustrates the time evolution of the MSE of the observations and of the results estimated by KF and SLOCK. Because the MSE of SLOCK is lower than that of KF and of the observations, the method is a powerful tool for estimating a spatially uniform model.
We also calculated the MSE of the transition matrices and carried out a sensitivity analysis for , , and ; these are shown in Appendix B.2.
We made “global flow” data, in which various objects exist in the images and move in the direction assigned to each interval, as shown in Figure 10; the generating process is described in more detail in Appendix C.1. After generating the data, we added Gaussian noise at each grid point to obtain the pseudo observation data. We set the parameters of LLOCK to , , , and of the state space model to , .
Figure 11 shows the time evolution of the MSE of the observations and of the results estimated by KF and LLOCK. LLOCK gives better estimates except near the change points. Because the changes are rapid in this data setting, the MSE of our method is worse than that of the observations in the vicinity of the changes.
We also conducted short-term prediction with KF and LLOCK; the result is shown in Figure 12. We run each method until and predict from at this time point, where the predicted MSE of the observations is the MSE between the true state at a time and the observation at . For this data, the predictive ability of LLOCK is superior to that of KF and of the ad hoc use of the observation.
Figures of the MSE for the transition matrices and of the sensitivity analysis for , , and are shown in Appendix C.2.
Moreover, we measured the computation time for updating the transition matrices when the images are , , and . We applied SLOCK and LLOCK to data in this setting, as shown in Figure 13. The computation time of both methods is low enough for real-time execution.
We also calculated the ideal memory cost and the ad hoc memory cost of our methods and of the EM algorithm; Figure 14 illustrates these under the assumption of float-type arrays. “Ad hoc” means that no memory-saving code is used, in other words, the matrices are stored as they are. However, thanks to the localization and spatial uniformity assumptions, we need to manage only arrays, which we call the “ideal” cost, where represents the local dimension. According to this figure, the ideal memory cost is much less than that of the EM algorithm, so our methods are easier to apply.
We generated “local stationary flow” data, in which objects appear at the boundary and move in one of four directions depending on the field, as shown in Figure 15. The left panel represents the flows of the data; for example, the flow direction in the upper left part of the images is up. The other panels show the pseudo observation data at time , 5, and 10. The generating process is described in more detail in Appendix D.1. After generating the true data, we added Gaussian noise to obtain the observations.
Afterward, we applied LLOCK to this data to check whether the method can capture local information. We set the parameters of LLOCK to , , , and of the state space model to , .
Figure 16 shows the time evolution of the MSE of the observations and of the results estimated by KF and LLOCK. LLOCK has a lower MSE than KF and the observations.
We also made short-term predictions with KF and LLOCK, as shown in Figure 17. As with the global flow data, our method performs better than the others for short-term prediction.
The MSE of the transition matrix and the sensitivity analysis for , , and are presented in Appendix D.2.
In this paper, we proposed three real-time methods for estimating the states and the state transition matrices of a linear Gaussian state space model. The first method, linear operator construction with Kalman filter (LOCK), approached the true transition matrices and the true states when applied to a damped oscillation model. The advanced methods, SLOCK and LLOCK, achieved better noise reduction and short-term prediction on the three synthetic data sets. These methods are also superior to the EM algorithm in computational and memory cost: their computation time and memory usage are much lower than those of the existing method. They therefore have the potential to estimate the transition of data in real time in applications such as weather forecasting and object tracking.
Nevertheless, our methods have three main drawbacks: dependence on the linear Gaussian framework, tuning of hyper-parameters, and the strict assumption of SLOCK. First, because the methods use a linear Gaussian formulation, they cannot be applied directly to nonlinear or non-Gaussian data; an extension that retains interpretability is needed for such data. Second, our methods include the hyper-parameters , , and . An automatic tuning method for these parameters is necessary for application to real data. Finally, although SLOCK performs better on spatially uniform data, this assumption is quite strict for real data. To overcome this issue, we aim to develop a method that combines LLOCK and SLOCK, because the strictness of their assumptions and their precision are in a trade-off relationship.
International Conference on Genetic and Evolutionary Computing, 563-572, DOI: 10.1007/978-981-13-5841-8_59.
In this section, we present the remaining results for the damped oscillation model. Figures 18 and 19 show the transition matrices estimated in experiment 2 by LOCK and EMKF, respectively. The elements estimated by LOCK approach the true elements, whereas those estimated by EMKF do not.
Figures 20 and 21 correspond to experiment 5. The former figure indicates that the matrices estimated by LOCK can follow the true matrix, except for the last few updates, whose noise ratio is higher than that of the others. The latter figure indicates that the 100 simulated results are similar to each other in terms of the time evolution of the matrices.
First, we made an array whose elements are 20. Then, we created random core points whose and coordinates lie between 7 and 16. Second, we selected 2 random core points from these points and generated a link between the two, whose width is . Third, we gave the coordinates in the link linear values from , and added Gaussian noise whose mean and standard deviation are 10, where , , , and represent the minimum value, the maximum value, the iteration number, and the number of iterations, respectively. We iterated the second and third steps times and obtained the base true image shown in Figure 22.
Fourth, we set the direction list and the direction change list as shown in Figure 7. For each direction, we made the corresponding translation matrix and multiplied the image vector by it to obtain the true data. Finally, we added Gaussian noise whose mean and standard deviation are 0 and 20 to the true data; we applied the absolute value operator to the noise because the image data are greater than 0. Figure 22 shows images of the true data and the pseudo observation data. The generating process is also available on our GitHub page.
Figure 23 shows Substantial MSE (SMSE) between the estimated transition matrices and the true transition matrices, formulated by
(41) |
where represents the number of non-zero elements of the localization matrix, that is,
(42) |
From this figure, the estimated matrices are close to the true ones, especially for the elements whose true values are zero.
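A possible reading of the SMSE is the sum of squared element-wise errors divided by the number of non-zero elements of the localization matrix, rather than by the total number of elements. The sketch below follows that reading, which is an assumption because the formula itself is not recoverable here; the names `smse` and `A` are hypothetical.

```python
import numpy as np

def smse(F_est, F_true, A):
    """Substantial MSE between estimated and true transition matrices.

    ASSUMED definition: squared errors summed over all elements and
    normalised by the number of non-zero elements of the mask A.
    """
    n_nonzero = np.count_nonzero(A)
    return np.sum((F_est - F_true) ** 2) / n_nonzero
```

Normalizing by the non-zero count keeps the many structurally zero elements of a localized matrix from diluting the error.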
We also conducted a sensitivity analysis, as shown in Figure 24. This figure indicates that SLOCK is robust to its hyper-parameters for this data.
First, we made a matrix whose elements are 20; then, we created random objects whose sizes are randomly chosen from two to four and whose values are i.i.d. . Second, we set a direction for each interval and moved the objects in the corresponding direction during that interval. Finally, we added zero-mean Gaussian noise whose standard deviation is 20. The generating process is also available on our GitHub page.
We calculated the SMSE between the true transition matrices and the estimated matrices, as shown in Figure 25. The result indicates that the estimated matrices are close to the true ones, except at the rapid change points.
Figure 26 shows the results of the sensitivity analysis for , , and . From this figure, LLOCK is robust to these hyper-parameters for the synthetic data.
First, we made a matrix whose elements are 20, where and represent the number of time steps and the block length, respectively. Then, we created random objects whose sizes are randomly chosen from two to four and whose values follow the Gaussian distribution , and substituted the object values for random elements of the matrix. Second, we set the directions of the four square blocks of size into which the field is divided, as shown in Figure 15. Then, we set the initial values from the first 15 columns of the source matrix. Third, we moved the objects along the block flow and set new source values from the column of the matrix. Finally, we added zero-mean Gaussian noise whose standard deviation is 20. The generating process is also available on our GitHub page.
Figure 27 shows the time evolution of the SMSE between the true transition matrices and their estimates. It can be observed from this figure that the results estimated by LLOCK are close to the true matrix.
We also conducted a sensitivity analysis for , , and , as shown in Figure 28. This figure demonstrates that LLOCK is a robust method for the local stationary flow data.
We used GPU resources and Python; our detailed environment is as follows.
Calculation machine
CPU: Intel Xeon E5-2670 2.6GHz (8core) x 2
Memory: 64GB
HDD: SAS 300GB 2 (RAID 1)
OS: SuSE 12.0 Enterprise LINUX
GPU: NVIDIA Tesla K20, NVIDIA Quadro K5000
Other: CUDA8.0
Python
Python 3.7.3
Anaconda 4.6.14
cupy-cuda80 5.4.0
numpy 1.16.2