Variable Splitting Methods for Constrained State Estimation in Partially Observed Markov Processes

05/17/2020 ∙ by Rui Gao, et al. ∙ Aalto University ∙ Universität Tübingen

In this letter, we propose a class of efficient, accurate, and general methods for solving state-estimation problems with equality and inequality constraints. The methods are based on recent developments in variable splitting and partially observed Markov processes. We first present the generalized framework based on variable splitting, and then develop efficient methods to solve the state-estimation subproblems arising in the framework. The solutions to these subproblems can be made efficient by leveraging the Markovian structure of the model, as is classically done in so-called Bayesian filtering and smoothing methods. The numerical experiments demonstrate that our methods outperform conventional optimization methods in both computational cost and estimation performance.


I Introduction

Many real-world applications in signal processing, such as target tracking, indoor positioning, and robotics, can be formulated as state-estimation tasks, where the aim is to restore the hidden states given a set of incomplete observations [1, 2]. Mathematically, the estimation task is formulated as inference in a statistical model, where the state of the dynamic system evolves under constraints which arise naturally from physical properties or model assumptions. The aim of this letter is to present methods for models which differ from these classical formulations in the sense that they contain additional inequality or equality constraints [3, 4] on the solution itself. For example, exploiting the trajectory geometry in ship tracking has been proven effective in enhancing the accuracy of the trajectory estimate [5, 6]. Imposing such constraints leads to more accurate or physically reasonable estimates, but also makes solving the problem significantly more challenging.

In recent decades, it has become common to formulate the methodology for state estimation in stochastic systems as special cases of Bayesian smoothers [2]. It is also well known that some of these algorithms can be used to efficiently solve optimization problems, even non-convex ones, which arise from finding the maximum a posteriori (MAP) estimate or similar point estimates of the system state. For instance, the Kalman smoother (KS) is equivalent to the batch least-squares solution [7], and iterated extended Kalman smoothers (IEKS) can be seen as Gauss–Newton methods for computing MAP estimates [8, 9]. A main advantage of these methods is that, by leveraging the Markov properties of the system, they remain computationally efficient in the large-data regime [10].

Variable splitting methods such as Peaceman–Rachford splitting (PRS) [11], the split Bregman method (SBM) [12], and the alternating direction method of multipliers (ADMM) [13, 14] are efficient optimization methods which can be applied to many kinds of constrained optimization problems. The main idea is to convert an inequality constrained problem into an equality constrained problem by variable splitting. Unfortunately, because of the nature of the problems, a direct solution via subgradient methods remains computationally expensive when the number of data points is large. However, it turns out that when applied to dynamic models, the computational demand can be lowered by using filtering- and smoothing-type methods [15, 16].

In this letter, we focus on state estimation with nonlinear equality and/or inequality constraints while leveraging the Markovian structure of the model. First, we develop a class of constrained smoother-based variable splitting methods, which can be instantiated by adopting different filters and smoothers as well as different variable splitting methods (i.e., ADMM, PRS, SBM), gaining the benefits of both. Their combination for solving constrained state-estimation problems leads to effective methods that have not previously been considered. For the special case of Gaussian-driven nonlinear systems, we present the constrained Kalman smoother and the constrained iterated extended Kalman smoother, which arise within the update steps of variable splitting. Our experiments demonstrate the promising performance of the methods.

I-A Problem Formulation

We consider general probabilistic state-space models, also called partially observed Markov processes (see [2]):

$$ x_k \sim p(x_k \mid x_{k-1}), \qquad y_k \sim p(y_k \mid x_k), \tag{1} $$

where $x_k \in \mathbb{R}^{n_x}$ denotes the $n_x$-dimensional state of the system, $y_k \in \mathbb{R}^{n_y}$ is an $n_y$-dimensional noisy measurement at the time step $k$, $p(x_k \mid x_{k-1})$ is the transition density of the Markovian state process, and $p(y_k \mid x_k)$ is the conditional probability density of the measurement. The prior distribution of the state is given as $p(x_0)$. A particularly important special case is a model of the form

$$ x_k = f(x_{k-1}) + q_{k-1}, \qquad y_k = h(x_k) + r_k, \tag{2} $$

where $h$ is a measurement function and $f$ is a state transition function. The initial state is assumed to be Gaussian with mean $m_0$ and covariance $P_0$. The errors $q_{k-1}$ and $r_k$ are assumed to be mutually independent zero-mean Gaussian random variables with known positive definite covariance matrices $Q_{k-1}$ and $R_k$.
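To make the model (2) concrete, the following minimal sketch simulates a state and measurement trajectory from it. The function name and interface are our own illustration, not part of the letter.

```python
import numpy as np

def simulate(f, h, Q, R, m0, P0, T, seed=0):
    """Draw x_{0:T}, y_{1:T} from model (2): x_k = f(x_{k-1}) + q_{k-1},
    y_k = h(x_k) + r_k, with x_0 ~ N(m0, P0)."""
    rng = np.random.default_rng(seed)
    x = rng.multivariate_normal(m0, P0)
    xs, ys = [x], []
    for _ in range(T):
        x = f(x) + rng.multivariate_normal(np.zeros(Q.shape[0]), Q)
        ys.append(h(x) + rng.multivariate_normal(np.zeros(R.shape[0]), R))
        xs.append(x)
    return np.array(xs), np.array(ys)
```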

The goal is to estimate the state sequence $x_{0:T} = (x_0, \ldots, x_T)$ from the noisy measurement sequence $y_{1:T} = (y_1, \ldots, y_T)$, but we also have a set of equality and inequality constraints on the state. To make the problem tractable, we replace the constraints on the state with constraints on the state estimate, which we choose to be the MAP estimate. We aim at computing

$$ \hat{x}_{0:T} = \operatorname*{arg\,min}_{x_{0:T}} \; -\log p(x_{0:T} \mid y_{1:T}) \quad \text{s.t.} \quad g(x_k) = 0, \quad c(x_k) \le 0, \quad k = 0, \ldots, T, \tag{3} $$

where $g$ and $c$ are the equality and inequality constraint functions. For the model (2), we obtain a nonlinear quadratic minimization problem. In particular, when the functions $f$, $h$, $g$, and $c$ are affine, the problem in (3) is a quadratic equality/inequality constrained optimization problem which can be solved in closed form.
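To illustrate the closed-form affine case, the equality-constrained part reduces to a linear KKT system. The sketch below is our own helper, assuming a quadratic cost $\tfrac{1}{2} x^\top P x + q^\top x$ and an affine constraint $G x = d$; inequality constraints additionally require an active-set or interior-point treatment.

```python
import numpy as np

def eq_constrained_qp(P, q, G, d):
    """Closed-form solution of min 0.5 x'P x + q'x  s.t.  G x = d,
    obtained by solving the KKT system for (x, lambda)."""
    n, m = P.shape[0], G.shape[0]
    kkt = np.block([[P, G.T],
                    [G, np.zeros((m, m))]])
    sol = np.linalg.solve(kkt, np.concatenate([-q, d]))
    return sol[:n], sol[n:]  # primal solution and Lagrange multipliers
```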

I-B Overview of Variable Splitting

Consider a minimization problem with equality and inequality constraints:

$$ \min_{x} \; L(x) \quad \text{s.t.} \quad g(x) = 0, \quad c(x) \le 0, \tag{4} $$

where $L$ is the cost function, and $g$, $c$ are constraint functions. Our approach to solving problems of the form (4) proceeds by introducing auxiliary constrained variables to separate the components in the cost function, which is called variable splitting [17]. More specifically, we introduce an additional variable $w$ and a barrier function $\varphi$ to replace the original inequality constraint, which leads to the equality constrained optimization problem

$$ \min_{x, w} \; L(x) + \varphi(w) \quad \text{s.t.} \quad g(x) = 0, \quad c(x) - w = 0. \tag{5} $$

We then define the so-called augmented Lagrangian function by introducing Lagrange multipliers $\lambda$, $\eta$ and penalty parameters $\rho_1, \rho_2 > 0$:

$$ \mathcal{L}_{\rho}(x, w; \lambda, \eta) = L(x) + \varphi(w) + \lambda^\top g(x) + \tfrac{\rho_1}{2} \| g(x) \|_2^2 + \eta^\top \big( c(x) - w \big) + \tfrac{\rho_2}{2} \| c(x) - w \|_2^2. $$

The process alternates among the updates of the split variables. For solving (5), the ADMM, PRS, and SBM variable splitting optimization methods are discussed below.
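For reference, the augmented Lagrangian above can be transcribed directly; `phi` stands for whichever barrier function is chosen, and the names are our own.

```python
import numpy as np

def aug_lagrangian(L, phi, g, c, x, w, lam, eta, rho1, rho2):
    """Evaluate L_rho(x, w; lam, eta) for problem (5)."""
    gx = g(x)
    res = c(x) - w                      # residual of the split constraint
    return (L(x) + phi(w)
            + lam @ gx + 0.5 * rho1 * (gx @ gx)
            + eta @ res + 0.5 * rho2 * (res @ res))
```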

ADMM was developed in part to blend the decomposability of dual ascent with the superior convergence properties of the method of multipliers [13]. Given initial values $w^{(0)}$, $\lambda^{(0)}$, $\eta^{(0)}$, and $\rho_1, \rho_2 > 0$, ADMM solves the constrained optimization problem (5) via the iterative steps

$$ \begin{aligned} x^{(i+1)} &= \operatorname*{arg\,min}_{x} \; \mathcal{L}_{\rho}(x, w^{(i)}; \lambda^{(i)}, \eta^{(i)}), && \text{(6a)} \\ w^{(i+1)} &= \operatorname*{arg\,min}_{w} \; \mathcal{L}_{\rho}(x^{(i+1)}, w; \lambda^{(i)}, \eta^{(i)}), && \text{(6b)} \\ \lambda^{(i+1)} &= \lambda^{(i)} + \rho_1 \, g(x^{(i+1)}), && \text{(6c)} \\ \eta^{(i+1)} &= \eta^{(i)} + \rho_2 \, \big( c(x^{(i+1)}) - w^{(i+1)} \big), && \text{(6d)} \end{aligned} $$

where $\lambda$, $\eta$ are Lagrange multipliers associated with the constraints, and $\rho_1$, $\rho_2$ are penalty parameters.
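A minimal sketch of iteration (6), assuming that the barrier $\varphi$ is the indicator function of $\{w \le 0\}$ so that the $w$-step (6b) reduces to a projection; the $x$-step (6a) is delegated to a user-supplied solver, which in Section II will be a Kalman smoother. The interface is our own illustration.

```python
import numpy as np

def admm(argmin_x, g, c, w, lam, eta, rho1, rho2, n_iter=50):
    """ADMM iteration (6) for problem (5) with phi = indicator of {w <= 0}."""
    for _ in range(n_iter):
        x = argmin_x(w, lam, eta)                  # (6a): x-subproblem
        w = np.minimum(c(x) + eta / rho2, 0.0)     # (6b): projection onto w <= 0
        lam = lam + rho1 * g(x)                    # (6c): equality multiplier
        eta = eta + rho2 * (c(x) - w)              # (6d): inequality multiplier
    return x, w, lam, eta
```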

The PRS method [11, 18] is similar to ADMM, but it computes the primal variable once and updates the Lagrange multipliers twice per iteration, via the intermediate multipliers $\bar{\lambda}^{(i+1)}$ and $\bar{\eta}^{(i+1)}$. The iteration can be written as

$$ \begin{aligned} x^{(i+1)} &= \operatorname*{arg\,min}_{x} \; \mathcal{L}_{\rho}(x, w^{(i)}; \lambda^{(i)}, \eta^{(i)}), && \text{(7a)} \\ \bar{\lambda}^{(i+1)} &= \lambda^{(i)} + \alpha \rho_1 \, g(x^{(i+1)}), && \text{(7b)} \\ \bar{\eta}^{(i+1)} &= \eta^{(i)} + \alpha \rho_2 \, \big( c(x^{(i+1)}) - w^{(i)} \big), && \text{(7c)} \\ w^{(i+1)} &= \operatorname*{arg\,min}_{w} \; \mathcal{L}_{\rho}(x^{(i+1)}, w; \bar{\lambda}^{(i+1)}, \bar{\eta}^{(i+1)}), && \text{(7d)} \\ \lambda^{(i+1)} &= \bar{\lambda}^{(i+1)} + \alpha \rho_1 \, g(x^{(i+1)}), && \text{(7e)} \\ \eta^{(i+1)} &= \bar{\eta}^{(i+1)} + \alpha \rho_2 \, \big( c(x^{(i+1)}) - w^{(i+1)} \big), && \text{(7f)} \end{aligned} $$

with the relaxation parameter $\alpha \in (0, 1)$ [18].
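The same skeleton adapts to the PRS iteration (7); only the multiplier bookkeeping changes. As before, the solver interface is our own illustration.

```python
def prs(argmin_x, argmin_w, g, c, w, lam, eta, rho1, rho2, alpha, n_iter=50):
    """PRS iteration (7): one primal x-update, two multiplier updates."""
    for _ in range(n_iter):
        x = argmin_x(w, lam, eta)                        # (7a)
        lam_bar = lam + alpha * rho1 * g(x)              # (7b)
        eta_bar = eta + alpha * rho2 * (c(x) - w)        # (7c)
        w = argmin_w(x, lam_bar, eta_bar)                # (7d)
        lam = lam_bar + alpha * rho1 * g(x)              # (7e)
        eta = eta_bar + alpha * rho2 * (c(x) - w)        # (7f)
    return x, w, lam, eta
```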

In SBM [12], we have updates for $x$, $w$ as follows:

$$ \begin{aligned} x^{(i+1)} &= \operatorname*{arg\,min}_{x} \; L(x) + \tfrac{\rho_1}{2} \| g(x) + b_1^{(i)} \|_2^2 + \tfrac{\rho_2}{2} \| c(x) - w^{(i)} + b_2^{(i)} \|_2^2, && \text{(8a)} \\ w^{(i+1)} &= \operatorname*{arg\,min}_{w} \; \varphi(w) + \tfrac{\rho_2}{2} \| c(x^{(i+1)}) - w + b_2^{(i)} \|_2^2, && \text{(8b)} \end{aligned} $$

repeated for $N$ inner iterations, and then update the extra (Bregman) variables $b_1$, $b_2$ by

$$ \begin{aligned} b_1^{(i+1)} &= b_1^{(i)} + g(x^{(i+1)}), && \text{(9a)} \\ b_2^{(i+1)} &= b_2^{(i)} + c(x^{(i+1)}) - w^{(i+1)}. && \text{(9b)} \end{aligned} $$

In particular, SBM is equivalent to the scaled ADMM [13] when the inner iteration number $N = 1$.
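Analogously, a sketch of the SBM iteration (8)-(9) with $N$ inner sweeps over the two subproblems; setting `n_inner=1` recovers the scaled-ADMM equivalence noted above. The interface is ours.

```python
def sbm(argmin_x, argmin_w, g, c, w, b1, b2, n_outer=20, n_inner=1):
    """Split Bregman iteration (8)-(9); b1, b2 are the Bregman variables."""
    for _ in range(n_outer):
        for _ in range(n_inner):
            x = argmin_x(w, b1, b2)        # (8a)
            w = argmin_w(x, b2)            # (8b)
        b1 = b1 + g(x)                     # (9a)
        b2 = b2 + c(x) - w                 # (9b)
    return x, w, b1, b2
```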

All of the methods discussed above, ADMM, PRS, and SBM, can solve optimization problems with equality and inequality constraints. Although the methods are slightly different, their updates of the primal variable are similar. However, when the dynamic system is described in terms of a partially observed Markov process, the primal minimization problems typically become very high-dimensional.

II The Proposed Approach

In this section, we first present the general constrained smoother-based variable splitting framework. The problems corresponding to the model (2) can then be solved by the constrained Kalman smoother and the constrained iterated extended Kalman smoother, respectively.

II-A The General Framework

The methods we employ here rely on the variable splitting methods above. Let $L_T$ be a family of cost functions

$$ L_T(x_{0:T}) = -\log p(x_0) - \sum_{k=1}^{T} \log p(x_k \mid x_{k-1}) - \sum_{k=1}^{T} \log p(y_k \mid x_k). \tag{10} $$

For the special case of (2), the function has the form (up to an additive constant)

$$ L_T(x_{0:T}) = \tfrac{1}{2} \| x_0 - m_0 \|_{P_0^{-1}}^2 + \tfrac{1}{2} \sum_{k=1}^{T} \| x_k - f(x_{k-1}) \|_{Q_{k-1}^{-1}}^2 + \tfrac{1}{2} \sum_{k=1}^{T} \| y_k - h(x_k) \|_{R_k^{-1}}^2, \tag{11} $$

where $\| z \|_{M}^2 = z^\top M z$. The equality/inequality constrained optimization problem corresponding to (5) is now formed as

$$ \min_{x_{0:T}, \, w_{0:T}} \; L_T(x_{0:T}) + \sum_{k=0}^{T} \varphi(w_k) \quad \text{s.t.} \quad g(x_k) = 0, \quad c(x_k) - w_k = 0, \quad k = 0, \ldots, T. \tag{12} $$

As discussed in Section I-B, the unified steps for solving (12) alternate minimization with respect to $x_{0:T}$, $w_{0:T}$, $\lambda_{0:T}$, and $\eta_{0:T}$, depending on the specific method we choose to use. All of the primal $x_{0:T}$ subproblems in ADMM, PRS, and SBM are of nonlinear least-squares type. The $w$, $\lambda$, and $\eta$ subproblems in (6), (7), and (8) remain the same as in the batch setting, except that we perform the updates for each $k$ separately.
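For reference, the batch objective (11) can be evaluated by the naive transcription below (our own helper); this is exactly the kind of monolithic computation that the smoother-based updates in the following subsections avoid forming explicitly.

```python
import numpy as np

def batch_cost(xs, ys, f, h, Q, R, m0, P0):
    """Evaluate L_T(x_{0:T}) in (11), up to an additive constant."""
    def quad(r, M):                      # ||r||^2 weighted by inv(M)
        return r @ np.linalg.solve(M, r)
    J = 0.5 * quad(xs[0] - m0, P0)
    for k in range(1, len(xs)):
        J += 0.5 * quad(xs[k] - f(xs[k - 1]), Q)     # dynamics term
        J += 0.5 * quad(ys[k - 1] - h(xs[k]), R)     # measurement term, y_k = ys[k-1]
    return J
```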

When $T$ is extremely large, these subproblems have high computational costs. Therefore, in the following, we explicitly leverage the Markov structure of the problem, which enables the use of computationally efficient Bayesian recursive smoothers, aiming at computing the MAP estimate of a constrained partially observed Markov process.

II-B Constrained Kalman Smoother (CKS) for Affine Systems

In this section, we present the method for solving the constrained state-estimation problem in affine systems. This solution, which is based on KS, will later be used within the nonlinear filters and smoothers. Let us now assume that the model and constraint functions are affine:

$$ f(x_{k-1}) = A_{k-1} x_{k-1} + u_{k-1}, \quad h(x_k) = H_k x_k + e_k, \quad g(x_k) = G_k x_k - d_k, \quad c(x_k) = C_k x_k - b_k, \tag{13} $$

where $A_{k-1}$ and $H_k$ are the transition and measurement matrices, $G_k$, $C_k$ are constraint matrices, and $u_{k-1}$, $e_k$, $d_k$, $b_k$ are given vectors.

If we apply, for example, the ADMM method to the optimization problem in (12), then in the affine case the $x_{0:T}$ subproblem (6a) becomes

$$ x_{0:T}^{(i+1)} = \operatorname*{arg\,min}_{x_{0:T}} \; L_T(x_{0:T}) + \sum_{k=0}^{T} \Big[ \tfrac{\rho_1}{2} \big\| G_k x_k - d_k + \tfrac{1}{\rho_1} \lambda_k^{(i)} \big\|_2^2 + \tfrac{\rho_2}{2} \big\| C_k x_k - b_k - w_k^{(i)} + \tfrac{1}{\rho_2} \eta_k^{(i)} \big\|_2^2 \Big]. \tag{14} $$

Similarly to [10], this minimization problem corresponds to the MAP state estimate of an affine state-space model, and this estimate can be computed using KS. We now define two artificial measurement noises $v_k^1$ and $v_k^2$ with covariances $\tfrac{1}{\rho_1} I$, $\tfrac{1}{\rho_2} I$, and two pseudo-measurements $z_k^1$, $z_k^2$:

$$ z_k^1 = d_k - \tfrac{1}{\rho_1} \lambda_k^{(i)}, \qquad z_k^2 = b_k + w_k^{(i)} - \tfrac{1}{\rho_2} \eta_k^{(i)}, \tag{15} $$

where $I$ is an identity matrix. The solution to (14) can then be computed by running KS on the constrained state-space model

$$ x_k = A_{k-1} x_{k-1} + u_{k-1} + q_{k-1}, \qquad y_k = H_k x_k + e_k + r_k, \qquad z_k^1 = G_k x_k + v_k^1, \qquad z_k^2 = C_k x_k + v_k^2. \tag{16} $$
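In code, forming the model (15)-(16) for one time step amounts to stacking the true measurement with the two pseudo-measurements. The sketch below is our own transcription and assumes SciPy for the block-diagonal covariance.

```python
import numpy as np
from scipy.linalg import block_diag

def augmented_step(yk, Hk, ek, Rk, Gk, dk, Ck, bk, wk, lamk, etak, rho1, rho2):
    """Stack y_k with the pseudo-measurements (15) so that a standard
    Kalman smoother update on model (16) solves the subproblem (14)."""
    z1 = dk - lamk / rho1                       # pseudo-measurement for g
    z2 = bk + wk - etak / rho2                  # pseudo-measurement for c
    y_aug = np.concatenate([yk - ek, z1, z2])   # fold the offset e_k into y_k
    H_aug = np.vstack([Hk, Gk, Ck])
    R_aug = block_diag(Rk, np.eye(z1.size) / rho1, np.eye(z2.size) / rho2)
    return y_aug, H_aug, R_aug
```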

II-C Constrained Iterated Extended Kalman Smoother (CIEKS)

The IEKS method [2, 8] is a nonlinear extension of KS, which approximates the nonlinear functions by linearization. When the state-space model and constraint functions are nonlinear, IEKS works by alternately linearizing all the nonlinear functions around a previous estimate.

At each iteration $j$, we form the affine approximations for the nonlinear functions $f$, $h$, $g$, and $c$, given by

$$ \begin{aligned} f(x_{k-1}) &\approx f(\hat{x}_{k-1}^{(j)}) + F_x(\hat{x}_{k-1}^{(j)}) \, (x_{k-1} - \hat{x}_{k-1}^{(j)}), \\ h(x_k) &\approx h(\hat{x}_k^{(j)}) + H_x(\hat{x}_k^{(j)}) \, (x_k - \hat{x}_k^{(j)}), \\ g(x_k) &\approx g(\hat{x}_k^{(j)}) + G_x(\hat{x}_k^{(j)}) \, (x_k - \hat{x}_k^{(j)}), \\ c(x_k) &\approx c(\hat{x}_k^{(j)}) + C_x(\hat{x}_k^{(j)}) \, (x_k - \hat{x}_k^{(j)}), \end{aligned} \tag{17} $$

where $F_x$ denotes the Jacobian of $f$ (and similarly $H_x$, $G_x$, $C_x$ those of $h$, $g$, $c$) and $\hat{x}_k^{(j)}$ is the current estimate. When we use the affine approximations in the $x_{0:T}$ subproblem, the state-space model (16) is obtained if we explicitly write

$$ \begin{aligned} A_{k-1} &= F_x(\hat{x}_{k-1}^{(j)}), & u_{k-1} &= f(\hat{x}_{k-1}^{(j)}) - A_{k-1} \hat{x}_{k-1}^{(j)}, \\ H_k &= H_x(\hat{x}_k^{(j)}), & e_k &= h(\hat{x}_k^{(j)}) - H_k \hat{x}_k^{(j)}, \\ G_k &= G_x(\hat{x}_k^{(j)}), & d_k &= G_k \hat{x}_k^{(j)} - g(\hat{x}_k^{(j)}), \\ C_k &= C_x(\hat{x}_k^{(j)}), & b_k &= C_k \hat{x}_k^{(j)} - c(\hat{x}_k^{(j)}). \end{aligned} \tag{18} $$

As discussed in Section II-B, we can then obtain the solution by running KS. Hence, the $x_{0:T}$ subproblem is solved by iterating these linearization and smoothing steps, as sketched below. CIEKS is equivalent to a constrained Gauss–Newton method [8], but the smoother here uses the Markov structure of the problem, which leads to a lower computational cost.
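A sketch of the resulting $x_{0:T}$ update: relinearize via (18), run CKS on the resulting affine model, and repeat until the trajectory stops changing. The `run_cks` solver interface and the convergence test are our own; the constraint functions $g$ and $c$ are linearized in the same way and enter through the pseudo-measurements of Section II-B.

```python
import numpy as np

def linearize(fun, jac, xhat):
    """Affine coefficients of (17)-(18): fun(x) ~= M @ x + v around xhat."""
    M = jac(xhat)
    return M, fun(xhat) - M @ xhat

def cieks_x_update(x_traj, f, Fx, h, Hx, run_cks, n_gn=10, tol=1e-6):
    """Gauss-Newton style iteration for the x-subproblem."""
    for _ in range(n_gn):
        # Coefficients A_{k-1}, u_{k-1} and H_k, e_k of the affine model (16).
        dyn = [linearize(f, Fx, xk) for xk in x_traj[:-1]]
        mea = [linearize(h, Hx, xk) for xk in x_traj[1:]]
        x_new = run_cks(dyn, mea)        # Kalman smoother on the affine model
        if np.max(np.abs(x_new - x_traj)) < tol:
            return x_new
        x_traj = x_new
    return x_traj
```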

II-D Convergence Results

In this section we present theoretical results for the proposed methods, which are derived from the combination of CKS/CIEKS and ADMM.

In the affine case, we have the following theorem.

Theorem 1 (Convergence of CKS-ADMM).

Let $Q_k$ and $R_k$ be positive definite matrices. Then the sequence $\{ x_{0:T}^{(i)}, w_{0:T}^{(i)}, \lambda_{0:T}^{(i)}, \eta_{0:T}^{(i)} \}$ generated by CKS-ADMM globally converges to a stationary point of (12).

Proof.

When the conditions $Q_k \succ 0$ and $R_k \succ 0$ are satisfied, substituting the affine functions (13) into the batch form (11) shows that the cost function is convex. Hence, the convergence result can be obtained from the standard ADMM convergence proof [13]. ∎

On the other hand, when all the functions and constraints are nonlinear, $L_T$ may not be convex; for this case we have the following local convergence theorem.

Theorem 2 (Convergence of CIEKS-ADMM).

Let $L_T$ be prox-regular [19] with constant $\mu$, and let the Jacobians $G_x$, $C_x$ have full column rank. Then there exists $\bar{\rho}$ such that, for $\rho_1, \rho_2 \ge \bar{\rho}$, the sequence generated by CIEKS-ADMM converges to a local minimum of (12).

Proof.

The proof is based on our paper [10], mutatis mutandis. ∎

III Experiments

In this section, we evaluate the performance of the proposed constrained smoother-based variable splitting methods, including CIEKS-PRS, CIEKS-SBM, and CIEKS-ADMM, which are examples of such combinations. Consider a four-dimensional ship tracking model [6], in which the state consists of the ship position and velocity. The measurements are the distances from the ship to two stationary sensors at known locations. The transition function is linear with Gaussian process noise covariance $Q$, and the nonlinear measurement function has Gaussian noise covariance $R$. We impose an inequality constraint on the ship position (the bound shown in Fig. 1) into the model. We combine CIEKS with ADMM as the proposed method (CIEKS-ADMM), and then compare it with the unconstrained estimate and the constrained robust Kalman–Bucy smoother (CKBS) [6]. The penalty parameters and the maximum number of iterations are fixed for all methods. The estimation results are plotted in Fig. 1. As can be seen, the results with the constraints are much closer to the ground truth than the unconstrained estimate.
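The exact numerical setup follows [6]. Purely as an illustrative sketch, a constant-velocity ship model with range measurements to two sensors could be set up as below; the sampling period `dt`, noise density `qc`, and sensor locations `s1`, `s2` are placeholder values of our own, not the values used in the experiment.

```python
import numpy as np

dt, qc = 0.5, 0.1                       # placeholder sampling period / noise density
A = np.block([[np.eye(2), dt * np.eye(2)],
              [np.zeros((2, 2)), np.eye(2)]])            # constant-velocity dynamics
Q = qc * np.block([[dt**3 / 3 * np.eye(2), dt**2 / 2 * np.eye(2)],
                   [dt**2 / 2 * np.eye(2), dt * np.eye(2)]])
s1, s2 = np.array([0.0, 0.0]), np.array([10.0, 0.0])     # placeholder sensor sites

def h(x):
    """Ranges from the ship position x[:2] to the two stationary sensors."""
    p = x[:2]
    return np.array([np.linalg.norm(p - s1), np.linalg.norm(p - s2)])

def Hx(x):
    """Jacobian of h, as needed by CIEKS in (17)."""
    p = x[:2]
    J = np.zeros((2, 4))
    J[0, :2] = (p - s1) / np.linalg.norm(p - s1)
    J[1, :2] = (p - s2) / np.linalg.norm(p - s2)
    return J
```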

Fig. 2 demonstrates the computational benefits of our CIEKS-based variable splitting methods, compared to the batch PRS, SBM, and ADMM methods, and CKBS. The left-hand plot depicts the value of the cost function versus the iteration number. We observe that all the methods converge in just a few iterations. Our methods, CIEKS-PRS, CIEKS-SBM, and CIEKS-ADMM, have the same convergence rate as the corresponding batch PRS, SBM, and ADMM, and they have superior convergence properties over CKBS [6]. Meanwhile, the constrained smoother-based variable splitting methods solve the problem fastest. The right-hand plot in Fig. 2 compares the running time (sec) of all the methods, confirming that our methods keep the growth of the running time very mild, in contrast to the batch methods. The proposed methods take around 15 iterations and complete in about 0.2 seconds. The benefit of our methods is highlighted by the fact that they can efficiently solve constrained state-estimation problems with extremely large numbers of data points. Table I shows that as $T$ increases, CKBS, PRS, SBM, and ADMM become significantly slower, whereas the CIEKS-based PRS, SBM, and ADMM yield significant speed improvements. In particular, the batch methods run out of memory ('–') for the largest values of $T$.

Fig. 1: Ground truth (dashed gray), bound (red line), and the estimates.
Fig. 2: The cost function values and the running time (sec) of all the methods versus the iteration number.
PRS      SBM      ADMM     CKBS     CIEKS-PRS   CIEKS-SBM   CIEKS-ADMM
569.9    375.8    382.5    23       9.4         3.4         4.0
3256     2193     2284     186      89.3        26.7        31.1
–        –        –        1506     847         212         298
–        –        –        –        8121        2057        2913
TABLE I: Average running time (sec) at different time step counts $T$; rows are ordered by increasing $T$, and '–' indicates that the method ran out of memory.

IV Conclusion

In this letter, we have developed a general framework for constructing constrained smoother-based variable splitting methods, which can be used to solve state-estimation problems with nonlinear equality and inequality constraints. The solution is computationally efficient because it leverages the Markov structure of the problem. The experiments demonstrate the computational benefits of the approach.

References

  • [1] Y. Bar-Shalom, X. R. Li, and T. Kirubarajan, Estimation with Applications to Tracking and Navigation.   Wiley, 2001.
  • [2] S. Särkkä, Bayesian Filtering and Smoothing.   Cambridge, U.K.: Cambridge Univ. Press, Aug. 2013.
  • [3] D. Simon and T. L. Chia, “Kalman filtering with state equality constraints,” IEEE Trans. Aerosp. Electron. Syst., vol. 38, no. 1, pp. 128–136, 2002.
  • [4] A. Aravkin, J. V. Burke, L. Ljung, A. Lozano, and G. Pillonetto, “Generalized Kalman smoothing: Modeling and algorithms,” Automatica, vol. 86, pp. 63–86, Dec. 2017.
  • [5] P. Davidson and R. Piché, “A survey of selected indoor positioning methods for smartphones,” IEEE Commun. Surveys Tuts., vol. 19, no. 2, pp. 1347–1370, 2016.
  • [6] B. M. Bell, J. V. Burke, and G. Pillonetto, “An inequality constrained nonlinear Kalman–Bucy smoother by interior point likelihood maximization,” Automatica, vol. 45, pp. 25–33, 2009.
  • [7] T. D. Barfoot, State Estimation for Robotics.   Cambridge, U.K.: Cambridge Univ. Press, 2017.
  • [8] B. M. Bell and F. W. Cathey, “The iterated Kalman filter update as a Gauss–Newton method,” IEEE Trans. Automat. Control, vol. 38, no. 2, pp. 294–297, Feb. 1993.
  • [9] B. Bell, “The iterated Kalman smoother as a Gauss–Newton method,” SIAM J. Optim., vol. 4, no. 3, pp. 626–636, Aug. 1994.
  • [10] R. Gao, F. Tronarp, and S. Särkkä, “Iterated extended Kalman smoother-based variable splitting for L1-regularized state estimation,” IEEE Trans. Signal Process., vol. 67, no. 19, pp. 5078–5092, Oct. 2019.
  • [11] D. W. Peaceman and H. H. Rachford, “The numerical solution of parabolic and elliptic differential equations,” J. Soc. Indust. Appl. Math., vol. 3, no. 1, pp. 28–41, 1955.
  • [12] T. Goldstein and S. Osher, “The split Bregman method for L1-regularized problems,” SIAM J. Imaging Sci., vol. 2, no. 2, pp. 323–343, Apr. 2009.
  • [13] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1–122, 2011.
  • [14] R. Gao, F. Tronarp, and S. Särkkä, “Combined analysis-L1 and total variation ADMM with applications to MEG brain imaging and signal reconstruction,” in 26th European Signal Process. Conf. (EUSIPCO).   Rome, Italy: IEEE, Sep. 2018, pp. 1930–1934.
  • [15] R. Gao, F. Tronarp, and S. Särkkä, “Regularized state estimation and parameter learning via augmented Lagrangian Kalman smoother method,” in 29th Int. Workshop on Machine Learning for Signal Process. (MLSP).   Pittsburgh, PA, USA: IEEE, Oct. 2019, pp. 1–6.
  • [16] S. Särkkä and L. Svensson, “Levenberg–Marquardt and line-search extended Kalman smoothers,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP).   Barcelona, Spain: IEEE, 2020.
  • [17] R. Glowinski, S. J. Osher, and W. Yin, Splitting Methods in Communication, Imaging, Science, and Engineering.   Cham, Switzerland: Springer, 2017.
  • [18] B. S. He, H. Liu, Z. R. Wang, and X. M. Yuan, “A strictly contractive Peaceman–Rachford splitting method for convex programming,” SIAM J. Optim., vol. 24, no. 3, pp. 1011–1040, 2014.
  • [19] R. A. Poliquin and R. T. Rockafellar, “Prox-regular functions in variational analysis,” Trans. Amer. Math. Soc., vol. 348, pp. 1805–1838, 1996.