Factor graphs are a graphical representation of statistical independence statements regarding probability distributions. These probabilistic graphical models facilitate the application of inference algorithms by means of message passing on the graph [1, 2, 3, 4]. Many classical algorithms, such as recursive least squares (RLS), linear Kalman filtering and smoothing, and expectation maximization [6], have already been formulated as message passing on a factor graph. This enables the simple generation of novel algorithms by adapting the factor graph of a known algorithm to a given problem, or by combining multiple known algorithms into a single graph and, hence, a joint algorithm. Algorithms adapted to a variety of problems, such as cooperative localization [7], sparse input estimation [8], and motion planning [9], have been derived.
Despite the vast amount of literature on nonlinear Gaussian filtering and smoothing algorithms [10, 11, 12, 13, 14], to the authors' knowledge very few efforts have been made to represent existing nonlinear filtering algorithms in the factor graph framework. Meyer, Hlinka, and Hlawatsch [15] describe sigma-point belief propagation (SPBP) algorithms on factor graphs, but they consider a non-sequential system model with observations depending on pairs of states, which makes their results inapplicable to sequential filtering and smoothing problems. Deisenroth and Mohamed [16] propose the use of expectation propagation (EP) as a general framework for Gaussian smoothers, similarly to [17], in which moments of distributions from the univariate exponential family are approximated using Gaussian quadrature. Both approaches result in iterative update rules with respect to the marginals.
In the present paper, we aim to provide concise update rules for performing approximate Gaussian message passing through deterministic nonlinear nodes in factor graphs, thus enabling the representation of known nonlinear filtering and smoothing algorithms and facilitating the derivation of new algorithms for various nonlinear problems. Similarly to [17, 16], we propose the use of numerical quadrature for computing moments; however, we focus our exposition on deriving efficient approximate message passing rules for the directed messages instead of the marginals, which results in non-iterative filtering and smoothing schemes for graphs without loops.
2 Nonlinear Transformations
Consider a deterministic, nonlinear transformation
$$ y = f(x) \tag{1} $$
of a random variable $x$ with values in $\mathbb{R}^n$. The probability density function (PDF) of $y$ is given by
$$ p(y) = \int \delta\big(y - f(x)\big)\, p(x)\, \mathrm{d}x, \tag{2} $$
the expected value of $y$ results as
$$ \mathrm{E}[y] = \int f(x)\, p(x)\, \mathrm{d}x, \tag{3} $$
and the covariance as
$$ \mathrm{Cov}[y] = \int \big(f(x) - \mathrm{E}[y]\big)\big(f(x) - \mathrm{E}[y]\big)^{\mathsf{T}} p(x)\, \mathrm{d}x. \tag{4} $$
The variable $x$ is usually assumed to be normally distributed, and numerical quadrature procedures are employed to approximate both integrals, such as the unscented transform (UT), Gauss-Hermite quadrature (GHQ) [11], the spherical-radial transform (SRT), which in a filtering setting results in the Cubature Kalman Filter (CKF) [13], or sparse-grid quadrature using the Smolyak rule [14]. The resulting estimates for $\mathrm{E}[y]$ and $\mathrm{Cov}[y]$ are then used as the parameters of a normal approximation to the PDF $p(y)$, which amounts to approximate moment matching and, hence, approximate minimization of the Kullback-Leibler divergence between the actual distribution and its normal approximation [18].
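The moment-matching step described above can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function names `cubature_points` and `moment_match` are hypothetical, and the third-degree spherical-radial rule is picked here merely as one of the quadrature rules mentioned in the text; any other rule that returns points and weights could presumably be substituted.

```python
import numpy as np


def cubature_points(m, V):
    """Third-degree spherical-radial rule: 2n equally weighted points for N(m, V)."""
    n = m.size
    L = np.linalg.cholesky(V)  # V = L @ L.T
    offsets = np.sqrt(n) * np.hstack([np.eye(n), -np.eye(n)])  # shape (n, 2n)
    points = (m[:, None] + L @ offsets).T  # one quadrature point per row
    weights = np.full(2 * n, 1.0 / (2 * n))
    return points, weights


def moment_match(f, m, V, rule=cubature_points):
    """Approximate the mean and covariance of y = f(x) for x ~ N(m, V)."""
    points, weights = rule(m, V)
    Y = np.array([np.atleast_1d(f(x)) for x in points])  # shape (N, dim_y)
    m_y = weights @ Y                                     # weighted mean
    dY = Y - m_y
    V_y = (weights[:, None] * dY).T @ dY                  # weighted covariance
    return m_y, V_y


# Sanity check on a linear map, for which the rule is exact.
A = np.array([[1.0, 2.0], [0.0, 1.0]])
b = np.array([0.5, -1.0])
m = np.array([1.0, 2.0])
V = np.array([[2.0, 0.3], [0.3, 1.0]])
m_y, V_y = moment_match(lambda x: A @ x + b, m, V)
```

For the linear map used in the check, the result should coincide with the closed-form moments `A @ m + b` and `A @ V @ A.T`; for a genuinely nonlinear `f` the output is only an approximation whose quality depends on the chosen rule.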
All of the methods mentioned above have in common that they perform the approximation
$$ \int g(x)\, \mathcal{N}(x; m, V)\, \mathrm{d}x \approx \sum_{i=1}^{N} w_i\, g(x_i) \tag{5} $$
of a normally weighted integral, where the number $N$ of integration points $x_i$ and the weights $w_i$ differ between methods, and $\mathcal{N}(x; m, V)$ denotes the PDF of a normally distributed variable with mean $m$ and covariance matrix $V$. Note that in order to solve eq. (4) using the approximation (5), one chooses
$$ g(x) = \big(f(x) - \mathrm{E}[y]\big)\big(f(x) - \mathrm{E}[y]\big)^{\mathsf{T}}, $$
and hence $g$ is a polynomial of degree $2d$ if $f$ is a polynomial of degree $d$. As a consequence, both the UT and the SRT quadrature formulas, which yield exact results for polynomials up to and including degree 3, do not calculate the covariance exactly for polynomials of order $d \geq 2$. In particular, this includes bilinear functions
$$ f(x) = x_1 x_2 $$
resulting from a multiplication of two Gaussian random variables. Gauss-Hermite quadrature formulas of arbitrary order can be readily constructed, but unfortunately these formulas suffer heavily from the curse of dimensionality. Sparse-grid quadrature rules, on the other hand, of which the classical UT has been shown to be a subset, can be flexibly adjusted in their degree of precision with a number of quadrature points growing polynomially in the number of dimensions, hence alleviating the curse of dimensionality [14].
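The remark on bilinear functions can be checked numerically. The following sketch compares a third-degree spherical-radial rule with a three-point-per-dimension tensor-product Gauss-Hermite rule on the product of two independent standard normal variables, whose exact variance is 1. The helper names are hypothetical, and the choice of a standard normal input is an assumption made only to keep the example short.

```python
import numpy as np
from itertools import product
from numpy.polynomial.hermite_e import hermegauss


def cubature_rule(n):
    """Third-degree spherical-radial rule for N(0, I_n): 2n points."""
    pts = np.sqrt(n) * np.vstack([np.eye(n), -np.eye(n)])
    return pts, np.full(2 * n, 1.0 / (2 * n))


def gauss_hermite_rule(n, order):
    """Tensor-product Gauss-Hermite rule for N(0, I_n): order**n points."""
    # hermegauss integrates against exp(-x^2 / 2); normalize to a probability rule.
    nodes, weights = hermegauss(order)
    weights = weights / weights.sum()
    pts = np.array(list(product(nodes, repeat=n)))
    w = np.array([np.prod(c) for c in product(weights, repeat=n)])
    return pts, w


def variance(f, pts, w):
    """Quadrature estimate of the variance of the scalar f(x)."""
    y = np.array([f(x) for x in pts])
    mean = w @ y
    return w @ (y - mean) ** 2


def bilinear(x):
    return x[0] * x[1]  # exact variance is 1 for x ~ N(0, I_2)


var_cubature = variance(bilinear, *cubature_rule(2))
var_gh = variance(bilinear, *gauss_hermite_rule(2, order=3))
```

All four cubature points lie on the coordinate axes, where the product vanishes, so the third-degree rule reports a variance of zero, whereas the Gauss-Hermite rule integrates the fourth-degree integrand exactly. The price is the point count: `order**n` points for the tensor-product rule against `2n` for the cubature rule.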
Having described a feasible strategy for performing the forward (filtering) pass through a nonlinear transformation, we will now consider the backward (smoothing) pass. If the inverse of the nonlinear transformation is available, the same approach that has been described so far can also be used to implement the backward pass. If, however, this is not the case, nonlinear Rauch-Tung-Striebel-type (RTS) smoothers can be derived, such as the unscented RTS smoother proposed by Särkkä [12]. In the following, a general nonlinear Gaussian RTS smoother-type backward pass through a nonlinear function node in a factor graph is derived, following the derivation in [12] but employing a slightly more general setting to facilitate local interpretation in a factor graph context.
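Consider again the deterministic nonlinear transformation (1), and assume the parameters $m_x^f, V_x^f$, $m_y^f, V_y^f$, and $m_y^s, V_y^s$ of the filtering distributions
$$ p(x \mid D_x) = \mathcal{N}(x; m_x^f, V_x^f), \tag{6} $$
$$ p(y \mid D_x) = \mathcal{N}(y; m_y^f, V_y^f) \tag{7} $$
and the smoothing distribution of $y$
$$ p(y \mid D_x, D_y) = \mathcal{N}(y; m_y^s, V_y^s) \tag{8} $$
known, where $D_x$ denotes knowledge on $x$, and $D_y$ denotes knowledge on $y$. The aim is now to compute the parameters of a Gaussian approximation
$$ p(x \mid D_x, D_y) \approx \mathcal{N}(x; m_x^s, V_x^s) \tag{9} $$
to the smoothing distribution of $x$ incorporating all available data. To this end, further assume that the model satisfies the Markov condition
$$ p(x \mid y, D_x, D_y) = p(x \mid y, D_x). \tag{10} $$
It then holds that
$$ p(x, y \mid D_x, D_y) = p(x \mid y, D_x, D_y)\, p(y \mid D_x, D_y) = p(x \mid y, D_x)\, p(y \mid D_x, D_y) $$
due to the assumed Markov property (10). It follows that
$$ p(x, y \mid D_x, D_y) = \frac{p(x, y \mid D_x)}{p(y \mid D_x)}\, p(y \mid D_x, D_y) $$
and by marginalization of $y$
$$ p(x \mid D_x, D_y) = \int p(x \mid y, D_x)\, p(y \mid D_x, D_y)\, \mathrm{d}y. \tag{11} $$
If the joint distribution $p(x, y \mid D_x)$ is approximated by a normal distribution
$$ p(x, y \mid D_x) \approx \mathcal{N}\!\left( \begin{bmatrix} x \\ y \end{bmatrix}; \begin{bmatrix} m_x^f \\ m_y^f \end{bmatrix}, \begin{bmatrix} V_x^f & V_{xy} \\ V_{xy}^{\mathsf{T}} & V_y^f \end{bmatrix} \right), $$
the conditional $p(x \mid y, D_x)$ is Gaussian as well, and evaluating (11) yields
$$ m_x^s = m_x^f + G\,(m_y^s - m_y^f), \qquad V_x^s = V_x^f + G\,(V_y^s - V_y^f)\,G^{\mathsf{T}}, \qquad G = V_{xy}\,(V_y^f)^{-1}. \tag{12} $$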
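As an illustration of the RTS-type backward pass through a deterministic node, the sketch below computes the cross-covariance between input and output by quadrature and then applies the standard Gaussian conditioning formulas. It is a sketch under assumptions: the names `rts_backward`, `m_y_s`, and `V_y_s` are hypothetical, the gain is taken as the cross-covariance times the inverse predicted output covariance (the usual form in RTS-type smoothers), and the predicted output covariance is assumed to be invertible.

```python
import numpy as np


def cubature_points(m, V):
    """Third-degree spherical-radial rule: 2n equally weighted points for N(m, V)."""
    n = m.size
    L = np.linalg.cholesky(V)
    offsets = np.sqrt(n) * np.hstack([np.eye(n), -np.eye(n)])
    return (m[:, None] + L @ offsets).T, np.full(2 * n, 1.0 / (2 * n))


def rts_backward(f, m_x, V_x, m_y_s, V_y_s, rule=cubature_points):
    """Smoothed moments of x from its filtered moments and the smoothed moments of y = f(x)."""
    X, w = rule(m_x, V_x)
    Y = np.array([np.atleast_1d(f(x)) for x in X])
    m_y = w @ Y                              # predicted mean of y
    dX, dY = X - m_x, Y - m_y
    V_y = (w[:, None] * dY).T @ dY           # predicted covariance of y
    V_xy = (w[:, None] * dX).T @ dY          # cross-covariance of x and y
    G = np.linalg.solve(V_y.T, V_xy.T).T     # gain G = V_xy V_y^{-1}
    m_x_s = m_x + G @ (m_y_s - m_y)
    V_x_s = V_x + G @ (V_y_s - V_y) @ G.T
    return m_x_s, V_x_s


# Check on an invertible linear map y = A x, where x = A^{-1} y holds exactly.
A = np.array([[1.0, 0.5], [0.0, 2.0]])
m_x = np.array([0.0, 1.0])
V_x = np.array([[1.0, 0.2], [0.2, 0.5]])
m_y_s = np.array([0.3, 1.5])
V_y_s = np.array([[0.4, 0.1], [0.1, 0.3]])
m_x_s, V_x_s = rts_backward(lambda x: A @ x, m_x, V_x, m_y_s, V_y_s)
```

In the linear check the gain reduces to the inverse of `A`, so the smoothed moments of the input are simply the smoothed output moments mapped back through that inverse. No inverse of `f` is needed in the general case, which is the point of the RTS-type construction.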
3 Message passing on factor graphs
A factor graph is a graphical representation of a factorization of an arbitrary function. Forney-style factor graphs (FFGs) consist of nodes, which represent factors, and edges connecting these nodes, which represent the variables that each factor depends on. Inference can be efficiently performed by means of message passing along the edges of the factor graph. Edges are undirected, but arrows are introduced to disambiguate between messages $\overrightarrow{\mu}$ and $\overleftarrow{\mu}$ in and against the direction of an edge, respectively. Figure 1 shows an FFG representation of the PDF of the nonlinear state space model
$$ x_{k+1} = f(x_k) + g(u_k) + w_k, \qquad y_k = h(x_k) + v_k \tag{13} $$
with deterministic nonlinear functions $f$, $g$, and $h$, inputs $u_k$, and process and measurement noise $w_k$ and $v_k$, respectively. One of the main benefits of the factor graph framework is that it allows for the easy combination of existing algorithms from different fields, such as filtering/smoothing, sparse input estimation, parameter estimation, and control [6, 8, 19, 20], to derive powerful new algorithms.
The dashed box in fig. 1 represents the joint probability , or equivalently the conditional distribution . Accordingly, for samples and initial state
this results in the Markov chain
representing the PDF of the complete sequential model. For linear state transition, input, and measurement functions, well-known algorithms such as Kalman filtering and smoothing can be understood as special cases of the sum-product message passing algorithm on this model [4], effectively performing Gaussian message passing. Inspired by the conciseness of [4], the aim of the present contribution is to provide tabulated rules for approximate Gaussian message passing through the deterministic nonlinear transformation node depicted in fig. 2, representing the factor $\delta\big(y - f(x)\big)$.
To this end, the approximation methods presented in the previous section may now be formulated as message passing on factor graphs, the rules for which are summarized in table 1. To obtain these rules, we identify the forward message with the filtering distribution and the marginal message used in the backward pass with the smoothing distribution, with the parameters of the Gaussian distributions defined as in eqs. (6)-(9) and with the available knowledge on the input and the output of the node denoted as before. Eqs. (3)-(5) then directly yield the update rules (15) and (16) for the forward pass. Eq. (12) yields the update rules (17) and (18) for the backward pass in the mean and covariance parameterization of the Gaussian messages. This parameterization is, however, not necessarily desirable for practical applications, since passing backwards through a filtering graph in this parameterization may require multiple matrix inversions for each time slice. Therefore, we additionally derive message passing rules in the $(\tilde{\xi}, \tilde{W})$ parameterization, which has proven advantageous for efficient realization of the (backwards) smoothing pass on factor graphs such as the one shown in fig. 1 [8]. Using the Woodbury formula [21], we obtain the expressions summarized in update rule (20).
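The role of the Woodbury formula in switching parameterizations can be illustrated numerically on a single edge. The sketch below assumes the dual parameterization commonly used in the Gaussian message passing literature, namely a dual precision equal to the inverse of the sum of forward and backward covariances and a dual mean equal to that precision applied to the difference of forward and backward means; these definitions and all variable names are assumptions for illustration and are not taken from the text above.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_spd(n):
    """Random symmetric positive definite matrix (illustrative test data only)."""
    B = rng.standard_normal((n, n))
    return B @ B.T + n * np.eye(n)


n = 3
m_fwd, V_fwd = rng.standard_normal(n), random_spd(n)  # forward Gaussian message
m_bwd, V_bwd = rng.standard_normal(n), random_spd(n)  # backward Gaussian message

# Marginal as the normalized product of both messages: three matrix inversions.
W_fwd, W_bwd = np.linalg.inv(V_fwd), np.linalg.inv(V_bwd)
V_marg = np.linalg.inv(W_fwd + W_bwd)
m_marg = V_marg @ (W_fwd @ m_fwd + W_bwd @ m_bwd)

# Assumed dual parameterization: a single inversion of V_fwd + V_bwd.
W_dual = np.linalg.inv(V_fwd + V_bwd)
xi_dual = W_dual @ (m_fwd - m_bwd)

# Marginal recovered from the forward message and the dual quantities.
V_from_dual = V_fwd - V_fwd @ W_dual @ V_fwd
m_from_dual = m_fwd - V_fwd @ xi_dual
```

Both routes give the same marginal; the identity behind `V_from_dual` is exactly the Woodbury formula applied to the sum of the two precision matrices. This is one way to see why carrying the dual quantities through the backward pass can save matrix inversions.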
4 Nonlinear filtering and smoothing
The approach described in the previous sections can now be used to describe various new and existing nonlinear filtering and smoothing algorithms by performing forward and backward message passing along factor graphs such as the one shown in fig. 1.
One instance of the class of algorithms that can be derived from this framework is given by the following nonlinear Modified Bryson-Frazier (MBF) smoother for state space models of the form (13) with linear output :
Nonlinear MBF Smoother:
The smoother is adapted from the factor graph formulation of the MBF smoother [22] for linear systems provided by Loeliger, Bruderer, Malmberg, et al. [8]. It requires just a single matrix inversion for each backward time step and is included here mainly to demonstrate the utility of the presented factor graph representation of nonlinear Gaussian message passing for deriving various nonlinear filters and smoothers. The smoother may be implemented using any kind of numerical quadrature procedure and amounts to standard message passing on a statistically linearized factor graph [23].
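To make the notion of a statistically linearized node concrete, the following sketch replaces a nonlinear function by an affine map plus a residual covariance, with all quantities estimated by quadrature. The function name `statistical_linearization` and the use of the third-degree spherical-radial rule are assumptions made for this example; the sketch shows the general idea of statistical linearization rather than the exact rules of the smoother described above.

```python
import numpy as np


def cubature_points(m, V):
    """Third-degree spherical-radial rule: 2n equally weighted points for N(m, V)."""
    n = m.size
    L = np.linalg.cholesky(V)
    offsets = np.sqrt(n) * np.hstack([np.eye(n), -np.eye(n)])
    return (m[:, None] + L @ offsets).T, np.full(2 * n, 1.0 / (2 * n))


def statistical_linearization(f, m_x, V_x, rule=cubature_points):
    """Return A, b, V_e such that f(x) is approximated by A x + b + e, e ~ N(0, V_e)."""
    X, w = rule(m_x, V_x)
    Y = np.array([np.atleast_1d(f(x)) for x in X])
    m_y = w @ Y
    dX, dY = X - m_x, Y - m_y
    V_y = (w[:, None] * dY).T @ dY
    V_xy = (w[:, None] * dX).T @ dY
    A = np.linalg.solve(V_x, V_xy).T   # A = V_yx V_x^{-1}
    b = m_y - A @ m_x
    V_e = V_y - A @ V_x @ A.T          # covariance not explained by the affine fit
    return A, b, V_e


# For an affine function the linearization recovers the map itself.
A0 = np.array([[1.0, -1.0], [0.5, 2.0]])
b0 = np.array([0.2, -0.4])
m_x = np.array([1.0, 0.0])
V_x = np.array([[1.5, 0.4], [0.4, 0.8]])
A_hat, b_hat, V_e = statistical_linearization(lambda x: A0 @ x + b0, m_x, V_x)
```

Once `A_hat`, `b_hat`, and `V_e` are available for a node, the linear Gaussian message passing rules can be applied to the linearized node, which is what makes a single matrix inversion per backward step plausible.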
In this contribution, local message passing rules for factor graph nodes representing deterministic nonlinear transformations have been derived. For the forward pass, a linearization is performed using any numerical quadrature method, and for the backward pass, general Rauch-Tung-Striebel (RTS)-type update rules have been derived in two different message parameterizations. The resulting message passing rules can be employed in any factor graph and in particular can be used to perform filtering and smoothing on state space models with state transition, input, and measurement nonlinearities. Demonstrating the usefulness of transferring results from classical nonlinear filtering theory to the factor graph framework, the modified Bryson-Frazier (MBF) smoother is easily augmented to incorporate nonlinear state transitions and input nonlinearities, requiring only a single matrix inversion in each time step. In this way, the present contribution adds the capability of handling nonlinear systems to a range of existing algorithms, hence enabling a factor graph description of a wide range of new and existing algorithms.
The authors would like to thank Maximilian Pilz for interesting discussions on the subject of this article.
[1] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 498–519, 2001.
[2] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 498–519, 2001.
[3] H.-A. Loeliger, “An introduction to factor graphs,” IEEE Signal Processing Magazine, vol. 21, no. 1, pp. 28–41, 2004.
[4] H.-A. Loeliger, J. Dauwels, J. Hu, S. Korl, L. Ping, and F. R. Kschischang, “The factor graph approach to model-based signal processing,” Proceedings of the IEEE, vol. 95, no. 6, pp. 1295–1322, 2007.
[5] F. Wadehn, L. Bruderer, V. Sahdeva, and H.-A. Loeliger, “New square-root and diagonalized Kalman smoothers,” in 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton). IEEE, 2016, pp. 1282–1290.
[6] J. Dauwels, S. Korl, and H.-A. Loeliger, “Expectation maximization as message passing,” arXiv preprint, 2009. [Online]. Available: https://arxiv.org/abs/0910.2832
[7] B. Li, N. Wu, H. Wang, P.-H. Tseng, and J. Kuang, “Gaussian message passing-based cooperative localization on factor graph in wireless networks,” Signal Processing, vol. 111, pp. 1–12, 2015.
[8] H.-A. Loeliger, L. Bruderer, H. Malmberg, F. Wadehn, and N. Zalmai, “On sparsity by NUV-EM, Gaussian message passing, and Kalman smoothing,” in Information Theory and Applications Workshop (ITA). IEEE, 2016.
[9] J. Dong, M. Mukadam, F. Dellaert, and B. Boots, “Motion planning as probabilistic inference using Gaussian processes and factor graphs,” Robotics: Science and Systems, vol. 12, 2016.
[10] E. A. Wan and R. van der Merwe, “The unscented Kalman filter for nonlinear estimation,” in Adaptive Systems for Signal Processing, Communications, and Control Symposium, 2000, pp. 153–158.
[11] I. Arasaratnam and S. Haykin, “Discrete-time nonlinear filtering algorithms using Gauss-Hermite quadrature,” Proceedings of the IEEE, vol. 95, no. 5, pp. 953–977, 2007.
[12] S. Särkkä, “Unscented Rauch-Tung-Striebel smoother,” IEEE Transactions on Automatic Control, vol. 53, no. 3, pp. 845–849, 2008.
[13] I. Arasaratnam and S. Haykin, “Cubature Kalman filters,” IEEE Transactions on Automatic Control, vol. 54, no. 6, pp. 1254–1269, 2009.
[14] B. Jia, M. Xin, and Y. Cheng, “Sparse-grid quadrature nonlinear filtering,” Automatica, vol. 48, no. 2, pp. 327–341, 2012.
[15] F. Meyer, O. Hlinka, and F. Hlawatsch, “Sigma point belief propagation,” IEEE Signal Processing Letters, vol. 21, no. 2, pp. 145–149, 2014.
[16] M. P. Deisenroth and S. Mohamed, “Expectation propagation in dynamical systems,” in Advances in Neural Information Processing Systems, 2012.
[17] O. R. Zoeter and T. Heskes, “Gaussian quadrature based expectation propagation,” in Proc. 10th Int. Workshop Artificial Intell. Stat., 2005, pp. 445–452.
[18] D. Barber, Bayesian Reasoning and Machine Learning. Cambridge University Press, 2012.
[19] C. Hoffmann and P. Rostalski, “Linear optimal control on factor graphs – a message passing perspective,” in Proceedings of the 20th IFAC World Congress, 2017.
[20] C. Hoffmann, E. Petersen, T. Handzsuj, G. Bellani, and P. Rostalski, “A factor graph-based change point detection algorithm with an application to sEMG-onset and activity detection,” in Proceedings of the Jahrestagung der Biomedizinischen Technik und Dreiländertagung der Medizinischen Physik, no. 62, 2017, pp. 116–120.
[21] N. J. Higham, Accuracy and Stability of Numerical Algorithms. SIAM, 2002.
[22] G. J. Bierman, Factorization Methods for Discrete Sequential Estimation. Dover Publications, Inc., 1977.
[23] R. Van der Merwe, “Sigma-point Kalman filters for probabilistic inference in dynamic state-space models,” Ph.D. dissertation, 2004.