1 Introduction
Factor graphs are a graphical representation of statistical independence statements regarding probability distributions. These probabilistic graphical models facilitate the application of inference algorithms by means of message passing on the graph
[1, 2, 3, 4]. Many classical algorithms such as recursive least squares (RLS) [4], linear Kalman filtering and smoothing
[5], and expectation maximization
[6] have already been formulated as message passing on a factor graph. This enables the simple generation of novel algorithms by adapting the factor graph of a known algorithm to a given problem, or by combining multiple known algorithms into a single graph and, hence, a joint algorithm. Algorithms adapted to a variety of problems, such as cooperative localization [7], sparse input estimation [8], and motion planning [9], have been derived. Despite the vast amount of literature on nonlinear Gaussian filtering and smoothing algorithms [10, 11, 12, 13, 14], to the authors’ knowledge very few efforts have been made to represent existing nonlinear filtering algorithms in the factor graph framework. Meyer, Hlinka, and Hlawatsch [15] describe sigma-point belief propagation (SPBP) algorithms on factor graphs, but they consider a non-sequential system model with observations depending on pairs of states, rendering their results inapplicable to sequential filtering and smoothing problems. Deisenroth and Mohamed [16] propose the use of expectation propagation (EP) as a general framework for Gaussian smoothers, similarly to [17], in which moments of distributions from the univariate exponential family are approximated using Gaussian quadrature. Both approaches result in iterative update rules with respect to the marginals.
In the present paper, we provide concise update rules for performing approximate Gaussian message passing through deterministic nonlinear nodes in factor graphs, thus enabling the representation of known nonlinear filtering and smoothing algorithms, and facilitating the derivation of new algorithms for various nonlinear problems. Similarly to [17, 16], we propose the use of numerical quadrature for computing moments; however, we focus our exposition on deriving efficient approximate message passing rules for the directed messages instead of the marginals, which results in non-iterative filtering and smoothing schemes for graphs without loops.
2 Nonlinear Transformations
Consider a deterministic, nonlinear transformation
(1) \( y = f(x) \)
of a random variable \( x \) with values in \( \mathbb{R}^n \). The probability density function (PDF) of \( y \) is given by
(2) \( p(y) = \int \delta\bigl(y - f(x)\bigr)\, p(x)\, \mathrm{d}x, \)
the expected value of \( y \) results as
(3) \( \mu_y = \mathrm{E}[f(x)] = \int f(x)\, p(x)\, \mathrm{d}x, \)
and the covariance as
(4) \( \Sigma_y = \int \bigl(f(x) - \mu_y\bigr)\bigl(f(x) - \mu_y\bigr)^{\mathsf{T}} p(x)\, \mathrm{d}x. \)
For arbitrary nonlinear functions \( f \) and PDFs \( p(x) \), the integrals in eqs. (3) and (4) rarely admit closed-form analytical solutions. Hence, the prior density \( p(x) \) is usually assumed to be normally distributed, and numerical quadrature procedures are employed to approximate both integrals, such as the unscented transform (UT) [10], Gauss–Hermite quadrature (GHQ) [11], the spherical–radial transform (SRT), which in a filtering setting results in the cubature Kalman filter (CKF) [13], or sparse-grid quadrature using the Smolyak rule [14]. The resulting estimates for \( \mu_y \) and \( \Sigma_y \) are then used as the parameters of a normal approximation to the PDF \( p(y) \), which amounts to approximate moment matching and, hence, approximate minimization of the Kullback–Leibler divergence between the actual distribution and its normal approximation [18]. All of the methods mentioned above have in common that they perform the approximation
(5) \( \int g(x)\, \mathcal{N}(x;\, \mu, \Sigma)\, \mathrm{d}x \approx \sum_{i=1}^{N} w_i\, g(x_i) \)
of a normally weighted integral, where the number \( N \) of integration points \( x_i \) and the weights \( w_i \) differ between methods, and \( \mathcal{N}(x;\, \mu, \Sigma) \) denotes the PDF of a normally distributed variable with mean \( \mu \) and covariance matrix \( \Sigma \). Note that in order to solve eq. (4) using the approximation (5), one chooses \( g(x) = \bigl(f(x) - \mu_y\bigr)\bigl(f(x) - \mu_y\bigr)^{\mathsf{T}} \), and hence \( g \) is a polynomial of degree \( 2d \) if \( f \) is a polynomial of degree \( d \). As a consequence, both the UT and the SRT quadrature formulas, which yield exact results for polynomials up to and including degree 3, do not calculate the covariance exactly for polynomials of degree \( d \geq 2 \) [13]. In particular, this includes bilinear functions
resulting from a multiplication of two Gaussian random variables. Gauss–Hermite quadrature formulas of arbitrary order can be readily constructed, but unfortunately these formulas suffer heavily from the curse of dimensionality
[13]. Sparse-grid quadrature rules, on the other hand, of which the classical UT has been shown to be a subset, can be flexibly adjusted in their degree of precision with a number of quadrature points growing only polynomially in the number of dimensions, hence alleviating the curse of dimensionality [14]. Having described a feasible strategy for performing the forward (filtering) pass through a nonlinear transformation, we will now consider the backward (smoothing) pass. If the inverse of the nonlinear transformation is available, the same approach that has been described so far can also be used to implement the backward pass. If, however, this is not the case, nonlinear Rauch–Tung–Striebel-type (RTS) smoothers can be derived, such as the unscented RTS smoother proposed by Särkkä [12]. In the following, a general nonlinear Gaussian RTS-smoother-type backward pass through a nonlinear function node in a factor graph is derived, following the derivation in [12] but employing a slightly more general setting to facilitate local interpretation in a factor graph context.
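As a concrete illustration of such a quadrature-based forward pass, the following sketch implements moment propagation with the unscented transform of [10]. The function name and the scaling parameters `alpha`, `beta`, `kappa` (and their defaults) are illustrative choices, not taken from table 1:

```python
import numpy as np

def unscented_moments(f, mu, Sigma, alpha=1.0, beta=2.0, kappa=0.0):
    """Approximate mean, covariance, and cross-covariance of y = f(x)
    for x ~ N(mu, Sigma) using the unscented transform [10]."""
    n = len(mu)
    lam = alpha ** 2 * (n + kappa) - n
    # Sigma points: mu and mu +/- the columns of a scaled matrix square root.
    L = np.linalg.cholesky((n + lam) * Sigma)
    pts = np.vstack([mu, mu + L.T, mu - L.T])         # (2n+1, n)
    wm = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))  # mean weights
    wc = wm.copy()                                    # covariance weights
    wm[0] = lam / (n + lam)
    wc[0] = wm[0] + (1.0 - alpha ** 2 + beta)
    # Propagate the sigma points through the nonlinearity, cf. eq. (5).
    Y = np.array([f(p) for p in pts])
    mu_y = wm @ Y
    d = Y - mu_y
    Sigma_y = (wc[:, None] * d).T @ d
    # Cross-covariance between x and y, needed for the smoothing gain later.
    Sigma_xy = (wc[:, None] * (pts - mu)).T @ d
    return mu_y, Sigma_y, Sigma_xy
```

For linear \( f \), the rule is exact. For \( f(x) = x^2 \) in one dimension with `beta=0` (a purely symmetric rule, comparable to the SRT), the mean is still exact but the returned variance is 0 instead of the true value 2, illustrating the degree-of-exactness limitation discussed above.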
Consider again the deterministic nonlinear transformation (1), and assume the parameters \( \mu_x, \Sigma_x, \mu_y, \Sigma_y, \mu_y' \), and \( \Sigma_y' \) of the filtering distributions
(6) \( p(x \mid z_1) = \mathcal{N}(x;\, \mu_x, \Sigma_x) \)
(7) \( p(y \mid z_1) = \mathcal{N}(y;\, \mu_y, \Sigma_y) \)
and the smoothing distribution of \( y \)
(8) \( p(y \mid z_1, z_2) = \mathcal{N}(y;\, \mu_y', \Sigma_y') \)
known, where \( z_1 \) denotes knowledge on \( x \), and \( z_2 \) denotes knowledge on \( y \). The aim is now to compute the parameters of a Gaussian approximation
(9) \( p(x \mid z_1, z_2) \approx \mathcal{N}(x;\, \mu_x', \Sigma_x') \)
to the smoothing distribution of \( x \), incorporating all available data.
to the smoothing distribution of incorporating all available data. To this end, further assume that the model satisfies the Markov conditions
(10) 
It then holds that
due to the assumed Markov property (10). It follows that
and by marginalization of
(11) 
If the joint distribution \( p(x, y \mid z_1) \)
is approximated by a normal distribution
\( p(x, y \mid z_1) \approx \mathcal{N}\!\left( \begin{bmatrix} x \\ y \end{bmatrix};\, \begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix}, \begin{bmatrix} \Sigma_x & \Sigma_{xy} \\ \Sigma_{xy}^{\mathsf{T}} & \Sigma_y \end{bmatrix} \right) \)
with suitably chosen cross-covariance matrix \( \Sigma_{xy} \), the marginalization (11) can be evaluated analytically to [12]
(12) \( \mu_x' = \mu_x + G\,(\mu_y' - \mu_y), \quad \Sigma_x' = \Sigma_x + G\,(\Sigma_y' - \Sigma_y)\, G^{\mathsf{T}}, \quad G = \Sigma_{xy} \Sigma_y^{-1}, \)
with the remaining quantities defined as in eqs. (17), (18), (21) and (22) in table 1, where \( \Sigma_{xy} \) has been approximated using any of the quadrature methods described previously.
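In code, the RTS-type update (12) amounts to one gain computation and two affine corrections. The following NumPy sketch uses illustrative names (the smoothed, i.e. primed, moments of \( y \) are passed as `mu_y_s`, `Sigma_y_s`); it is a sketch of the standard RTS form, not a literal transcription of table 1:

```python
import numpy as np

def rts_backward_update(mu_x, Sigma_x, mu_y, Sigma_y, Sigma_xy,
                        mu_y_s, Sigma_y_s):
    """Backward step (12): combine the filtered moments of x, the predicted
    moments of y, their cross-covariance, and the smoothed moments of y
    into the smoothed moments of x."""
    # Smoother gain G = Sigma_xy Sigma_y^{-1}, computed via a linear solve
    # instead of an explicit matrix inverse.
    G = np.linalg.solve(Sigma_y.T, Sigma_xy.T).T
    mu_x_s = mu_x + G @ (mu_y_s - mu_y)
    Sigma_x_s = Sigma_x + G @ (Sigma_y_s - Sigma_y) @ G.T
    return mu_x_s, Sigma_x_s
```

If the smoothed and predicted moments of \( y \) coincide (i.e., \( z_2 \) carries no additional information), the update leaves the filtered moments of \( x \) unchanged, as expected.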
3 Message passing on factor graphs
A factor graph is a graphical representation of a factorization of an arbitrary function. Forney-style factor graphs (FFGs) consist of nodes, which represent factors, and edges connecting these nodes, which represent the variables that each factor depends on [4]. Inference can be performed efficiently by means of message passing along the edges of the factor graph. Edges are undirected, but arrows are introduced to disambiguate between the messages \( \overrightarrow{\mu} \) and \( \overleftarrow{\mu} \) passed in and against the direction of an edge, respectively. Figure 1 shows an FFG representation of the PDF of the nonlinear state space model
(13) \( x_{k+1} = f(x_k) + g(u_k) + w_k, \quad y_k = h(x_k) + v_k, \)
with deterministic nonlinear functions \( f \), \( g \), and \( h \), inputs \( u_k \), and process and measurement noise \( w_k \) and \( v_k \), respectively. One of the main benefits of the factor graph framework is that it allows for the easy combination of existing algorithms from different fields, such as filtering/smoothing, sparse input estimation, parameter estimation, and control [6, 8, 19, 20], to derive powerful new algorithms.
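For reference, a model of this kind can be simulated directly. The sketch below assumes a scalar model in additive-noise form (the function and argument names are illustrative assumptions, not part of the paper):

```python
import numpy as np

def simulate(f, g, h, us, var_w, var_v, x0, rng):
    """Simulate a scalar state-space model in assumed additive-noise form:
        x[k+1] = f(x[k]) + g(u[k]) + w[k],   w[k] ~ N(0, var_w),
        y[k]   = h(x[k]) + v[k],             v[k] ~ N(0, var_v)."""
    xs, ys = [x0], []
    for u in us:
        x = xs[-1]
        ys.append(h(x) + rng.normal(scale=np.sqrt(var_v)))         # output
        xs.append(f(x) + g(u) + rng.normal(scale=np.sqrt(var_w)))  # transition
    return np.array(xs), np.array(ys)
```

With the noise variances set to zero, the trajectory reduces to the deterministic recursion, which is a convenient sanity check.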
The dashed box in fig. 1 represents the joint probability \( p(x_{k+1}, y_k \mid x_k, u_k) \), or equivalently the corresponding conditional distribution. Accordingly, for samples \( k = 0, \dots, N \) and initial state \( x_0 \),
this results in the Markov chain
\( p(x_0) \prod_k p(x_{k+1}, y_k \mid x_k, u_k) \)
representing the PDF of the complete sequential model. For linear functions \( f \), \( g \), and \( h \), well-known algorithms such as Kalman filtering and smoothing can be understood as special cases of the sum–product message passing algorithm on this model [4], effectively performing Gaussian message passing. Inspired by the conciseness of [4], the aim of the present contribution is to provide tabulated rules for approximate Gaussian message passing through the deterministic nonlinear transformation node depicted in fig. 2, representing the factor \( \delta\bigl(y - f(x)\bigr) \).
To this end, the approximation methods presented in the previous section may now be formulated as message passing on factor graphs, the rules for which are summarized in table 1. To obtain these rules, we identify the forward message with the filtering distribution, and the marginal message used in the backward pass with the smoothing distribution, with the parameters of the Gaussian distributions defined as in eqs. (6)–(9), and \( z_1 \) and \( z_2 \) denoting available knowledge on \( x \) and \( y \), respectively, as before. Eqs. (3)–(5) then directly yield the update rules (15) and (16) for the forward pass. Eq. (12) yields the update rules (17) and (18) for the backward pass in the mean-and-covariance parameterization of the Gaussian messages. This parameterization is, however, not necessarily desirable for practical applications, since passing backwards through a filtering graph in this parameterization may require multiple matrix inversions for each time slice [5]. Therefore, we additionally derive message passing rules in the
(14)
parameterization, which has proven advantageous for an efficient realization of the (backward) smoothing pass on factor graphs such as the one shown in fig. 1 [5]. Using the Woodbury formula [21] and [4], we obtain
corresponding to update rule (19). For , using [4]
we obtain
as summarized in update rule (20).
4 Nonlinear filtering and smoothing
The approach described in the previous sections can now be used to describe various new and existing nonlinear filtering and smoothing algorithms by performing forward and backward message passing along factor graphs such as the one shown in fig. 1.
One instance of the class of algorithms that can be derived within this framework is the following nonlinear modified Bryson–Frazier (MBF) smoother for state space models of the form (13) with linear output:
Nonlinear MBF Smoother:
The smoother is adapted from the factor graph formulation of the MBF smoother [22] for linear systems provided by Loeliger, Bruderer, Malmberg, et al. [8]. It requires just a single matrix inversion for each backward time step and is included here mainly to demonstrate the utility of the presented factor graph representation of nonlinear Gaussian message passing for deriving various nonlinear filters and smoothers. The smoother may be implemented using any kind of numerical quadrature procedure and amounts to standard message passing on a statistically linearized factor graph [23].
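To make the overall recipe concrete, the following sketch combines a Gauss–Hermite forward pass with the RTS-type backward update (12) for a scalar model with linear output. All names are illustrative; note that this sketch works in the moment parameterization with scalar divisions, whereas the MBF smoother of [8] instead uses the alternative parameterization (14) to avoid matrix inversions. The quadrature rule can be swapped for any of the methods of section 2:

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

# Gauss-Hermite nodes and weights for the standard normal weight [11];
# hermegauss uses the weight exp(-x^2/2), so we renormalize to sum to 1.
XS, WS = hermegauss(10)
WS = WS / np.sqrt(2.0 * np.pi)

def gh_moments(f, mu, var):
    """Moments of y = f(x) for scalar x ~ N(mu, var), cf. eq. (5)."""
    x = mu + np.sqrt(var) * XS
    y = f(x)
    mu_y = WS @ y
    var_y = WS @ (y - mu_y) ** 2
    cov_xy = WS @ ((x - mu) * (y - mu_y))
    return mu_y, var_y, cov_xy

def ghq_rts_smoother(f, var_w, var_v, mu0, var0, zs):
    """Scalar smoother for x[k+1] = f(x[k]) + w[k], z[k] = x[k] + v[k]."""
    n = len(zs)
    mu_p, var_p = np.empty(n), np.empty(n)  # predicted moments
    mu_f, var_f = np.empty(n), np.empty(n)  # filtered moments
    cov = np.empty(n)                       # Cov(x[k], f(x[k]))
    mu, var = mu0, var0
    for k, z in enumerate(zs):
        mu_p[k], var_p[k] = mu, var
        # Measurement update (linear output, standard Kalman step).
        K = var / (var + var_v)
        mu_f[k] = mu + K * (z - mu)
        var_f[k] = (1.0 - K) * var
        # Time update through the nonlinearity via quadrature.
        m, v, cov[k] = gh_moments(f, mu_f[k], var_f[k])
        mu, var = m, v + var_w
    # Backward RTS-type pass, cf. eq. (12).
    mu_s, var_s = mu_f.copy(), var_f.copy()
    for k in range(n - 2, -1, -1):
        G = cov[k] / var_p[k + 1]
        mu_s[k] = mu_f[k] + G * (mu_s[k + 1] - mu_p[k + 1])
        var_s[k] = var_f[k] + G ** 2 * (var_s[k + 1] - var_p[k + 1])
    return mu_f, var_f, mu_s, var_s
```

As expected of a smoother, the returned smoothed variances never exceed the filtered ones, and the final smoothed moments coincide with the final filtered moments.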
5 Conclusion
In this contribution, local message passing rules for factor graph nodes representing deterministic nonlinear transformations have been derived. For the forward pass, a linearization is performed using any numerical quadrature method, and for the backward pass, general Rauch–Tung–Striebel-type (RTS) update rules have been derived in two different message parameterizations. The resulting message passing rules can be employed in any factor graph and, in particular, can be used to perform filtering and smoothing on state space models with state transition, input, and measurement nonlinearities. Demonstrating the usefulness of transferring results from classical nonlinear filtering theory to the factor graph framework, the modified Bryson–Frazier (MBF) smoother is easily augmented to incorporate nonlinear state transitions and input nonlinearities while requiring only a single matrix inversion in each time step. In this way, the present contribution adds the capability of handling nonlinear systems to a range of existing algorithms, hence enabling a factor graph description of a variety of new and existing nonlinear filtering and smoothing algorithms.
Acknowledgements
The authors would like to thank Maximilian Pilz for interesting discussions on the subject of this article.
References
 [1] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum–product algorithm,” IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 498–519, 2001.
 [2] ——, “Factor graphs and the sum–product algorithm,” IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 498–519, 2001.
 [3] H.-A. Loeliger, “An introduction to factor graphs,” IEEE Signal Processing Magazine, vol. 21, no. 1, pp. 28–41, 2004.
 [4] H.-A. Loeliger, J. Dauwels, J. Hu, S. Korl, L. Ping, and F. R. Kschischang, “The factor graph approach to model-based signal processing,” Proceedings of the IEEE, vol. 95, no. 6, pp. 1295–1322, 2007.
 [5] F. Wadehn, L. Bruderer, V. Sahdeva, and H.-A. Loeliger, “New square-root and diagonalized Kalman smoothers,” in 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton). IEEE, 2016, pp. 1282–1290.
 [6] J. Dauwels, S. Korl, and H.-A. Loeliger, “Expectation maximization as message passing,” arXiv preprint, 2009. [Online]. Available: https://arxiv.org/abs/0910.2832
 [7] B. Li, N. Wu, H. Wang, P.H. Tseng, and J. Kuang, “Gaussian message passingbased cooperative localization on factor graph in wireless networks,” Signal Processing, vol. 111, pp. 1–12, 2015.
 [8] H.-A. Loeliger, L. Bruderer, H. Malmberg, F. Wadehn, and N. Zalmai, “On sparsity by NUV-EM, Gaussian message passing, and Kalman smoothing,” in Information Theory and Applications Workshop (ITA). IEEE, 2016.
 [9] J. Dong, M. Mukadam, F. Dellaert, and B. Boots, “Motion planning as probabilistic inference using Gaussian processes and factor graphs,” Robotics: Science and Systems, vol. 12, 2016.
 [10] E. A. Wan and R. van der Merwe, “The unscented Kalman filter for nonlinear estimation,” in Adaptive Systems for Signal Processing, Communications, and Control Symposium, 2000, pp. 153–158.
 [11] I. Arasaratnam and S. Haykin, “Discrete-time nonlinear filtering algorithms using Gauss–Hermite quadrature,” Proceedings of the IEEE, vol. 95, no. 5, pp. 953–977, 2007.
 [12] S. Särkkä, “Unscented Rauch–Tung–Striebel smoother,” IEEE Transactions on Automatic Control, vol. 53, no. 3, pp. 845–849, 2008.
 [13] I. Arasaratnam and S. Haykin, “Cubature Kalman filters,” IEEE Transactions on Automatic Control, vol. 54, no. 6, pp. 1254–1269, 2009.
 [14] B. Jia, M. Xin, and Y. Cheng, “Sparse-grid quadrature nonlinear filtering,” Automatica, vol. 48, no. 2, pp. 327–341, 2012.
 [15] F. Meyer, O. Hlinka, and F. Hlawatsch, “Sigma point belief propagation,” IEEE Signal Processing Letters, vol. 21, no. 2, pp. 145–149, 2014.
 [16] M. P. Deisenroth and S. Mohamed, “Expectation propagation in dynamical systems,” in Advances in Neural Information Processing Systems, 2012.
 [17] O. R. Zoeter and T. Heskes, “Gaussian quadrature based expectation propagation,” in Proc. 10th Int. Workshop Artificial Intell. Stat., 2005, pp. 445–452.

 [18] D. Barber, Bayesian Reasoning and Machine Learning. Cambridge University Press, 2012.
 [19] C. Hoffmann and P. Rostalski, “Linear optimal control on factor graphs – a message passing perspective,” in Proceedings of the 20th IFAC World Congress, 2017.
 [20] C. Hoffmann, E. Petersen, T. Handzsuj, G. Bellani, and P. Rostalski, “A factor graph-based change point detection algorithm, with an application to sEMG onset and activity detection,” in Proceedings of the Jahrestagung der Biomedizinischen Technik und Dreiländertagung der Medizinischen Physik, no. 62, 2017, pp. 116–120.
 [21] N. J. Higham, Accuracy and stability of numerical algorithms. SIAM, 2002.
 [22] G. J. Bierman, Factorization Methods for Discrete Sequential Estimation. Dover Publications, Inc., 1977.
 [23] R. van der Merwe, “Sigma-point Kalman filters for probabilistic inference in dynamic state-space models,” Ph.D. dissertation, 2004.