On Approximate Nonlinear Gaussian Message Passing On Factor Graphs

by Eike Petersen, et al.
Universität Lübeck

Factor graphs have recently gained increasing attention as a unified framework for representing and constructing algorithms for signal processing, estimation, and control. One capability that does not seem to be well explored within the factor graph toolkit is the ability to handle deterministic nonlinear transformations, such as those occurring in nonlinear filtering and smoothing problems, using tabulated message passing rules. In this contribution, we provide general forward (filtering) and backward (smoothing) approximate Gaussian message passing rules for deterministic nonlinear transformation nodes in arbitrary factor graphs fulfilling a Markov property, based on numerical quadrature procedures for the forward pass and a Rauch-Tung-Striebel-type approximation of the backward pass. These message passing rules can be employed for deriving many algorithms for solving nonlinear problems using factor graphs, as illustrated by a nonlinear modified Bryson-Frazier (MBF) smoother derived from the presented message passing rules.




1 Introduction

Factor graphs are a graphical representation of statistical independence statements regarding probability distributions. These probabilistic graphical models facilitate the application of inference algorithms by means of message passing on the graph [1, 2, 3, 4]. Many classical algorithms such as recursive least squares (RLS) [4], linear Kalman filtering and smoothing, and expectation maximization [6] have already been formulated as message passing on a factor graph. This enables the simple generation of novel algorithms by adapting the factor graph of known algorithms to a given problem, or by combining multiple known algorithms into a single graph and, hence, a joint algorithm. Algorithms adapted to a variety of problems such as cooperative localization [7], sparse input estimation [8], and motion planning [9] have been derived.

Despite the vast amount of literature on nonlinear Gaussian filtering and smoothing algorithms [10, 11, 12, 13, 14], to the authors’ knowledge very few efforts have been made to represent existing nonlinear filtering algorithms in the factor graph framework. Meyer, Hlinka, and Hlawatsch [15] describe sigma-point belief propagation (SPBP) algorithms on factor graphs, but they consider a non-sequential system model with observations depending on pairs of states, hence making their results inapplicable to sequential filtering and smoothing problems. Deisenroth and Mohamed [16] propose the use of expectation propagation (EP) as a general framework for Gaussian smoothers, similarly to [17], in which moments of distributions from the univariate exponential family are approximated using Gaussian quadrature. Both approaches result in iterative update rules with respect to the marginals.

In the present paper, we attempt to provide concise update rules for performing approximate Gaussian message passing through deterministic nonlinear nodes in factor graphs, thus enabling the representation of known nonlinear filtering and smoothing algorithms, and facilitating the derivation of new algorithms for various nonlinear problems. Similarly to [17, 16], we propose the use of numerical quadrature for computing moments; however, we focus our exposition on deriving efficient approximate message passing rules for the directed messages instead of the marginals, which results in non-iterative filtering and smoothing schemes for graphs without loops.

2 Nonlinear Transformations

Consider a deterministic, nonlinear transformation

$$ y = f(x) \tag{1} $$

of a random variable $X$ with values in $\mathbb{R}^n$. The probability density function (PDF) of $Y = f(X)$ is given by

$$ p_Y(y) = \int \delta\big(y - f(x)\big)\, p_X(x)\, \mathrm{d}x, \tag{2} $$

the expected value of $Y$ results as

$$ \mathrm{E}[Y] = \int f(x)\, p_X(x)\, \mathrm{d}x, \tag{3} $$

and the covariance as

$$ \mathrm{Cov}[Y] = \int \big(f(x) - \mathrm{E}[Y]\big)\big(f(x) - \mathrm{E}[Y]\big)^{\mathsf{T}}\, p_X(x)\, \mathrm{d}x. \tag{4} $$

For arbitrary nonlinear functions $f$ and PDFs $p_X$, the integrals in eqs. (3) and (4) rarely admit closed-form analytical solutions. Hence, the prior density $p_X$ is usually assumed to be normally distributed, and numerical quadrature procedures are employed to approximate both integrals, such as the unscented transform (UT) [10], Gauss-Hermite quadrature (GHQ) [11], the spherical-radial transform (SRT), which in a filtering setting results in the Cubature Kalman Filter (CKF) [13], or sparse-grid quadrature using the Smolyak rule [14]. The resulting estimates for $\mathrm{E}[Y]$ and $\mathrm{Cov}[Y]$ are then used as the parameters of a normal approximation to the PDF $p_Y$, which amounts to approximate moment matching and, hence, approximate minimization of the Kullback-Leibler divergence between the actual distribution and its normal approximation [18].

All of the methods mentioned above have in common that they perform the approximation

$$ \int g(x)\, \mathcal{N}(x; m, V)\, \mathrm{d}x \approx \sum_{i=1}^{N} w_i\, g(x_i) \tag{5} $$

of a normally weighted integral, where the number $N$ of integration points $x_i$ and the weights $w_i$ differ between methods, and $\mathcal{N}(x; m, V)$ denotes the PDF of a normally distributed variable with mean $m$ and covariance matrix $V$. Note that in order to solve eq. (4) using the approximation (5), one chooses

$$ g(x) = \big(f(x) - \mathrm{E}[Y]\big)\big(f(x) - \mathrm{E}[Y]\big)^{\mathsf{T}}, $$

and hence $g$ is a polynomial of degree $2d$ if $f$ is a polynomial of degree $d$. As a consequence, both the UT and the SRT quadrature formulas, which yield exact results for polynomials up to and including degree 3, do not calculate the covariance exactly for polynomials of order $d \geq 2$ [13]. In particular, this includes bilinear functions $f(x) = x_1 x_2$ resulting from a multiplication of two Gaussian random variables. Gauss-Hermite quadrature formulas of arbitrary order can be readily constructed, but unfortunately these formulas suffer heavily from the curse of dimensionality [13]. Sparse-grid quadrature rules, on the other hand, of which the classical UT has been shown to be a subset, can be flexibly adjusted in their degree of precision with a number of quadrature points growing polynomially in the number of dimensions, hence alleviating the curse of dimensionality [14].
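As a minimal sketch of such a quadrature-based forward pass, the following snippet computes the approximate mean and covariance of a transformed Gaussian variable with the third-degree spherical-radial rule (2n equally weighted points). The function names `cubature_rule` and `forward_moments` are illustrative choices and do not come from the paper. The example applies the rule to the bilinear function discussed above, where the mean is reproduced exactly but the variance is not.

```python
import numpy as np

def cubature_rule(m, V):
    """Third-degree spherical-radial rule: 2n points with equal weights."""
    n = m.size
    L = np.linalg.cholesky(V)                    # V = L @ L.T
    offsets = np.sqrt(n) * np.hstack([L, -L])    # columns are point offsets
    points = (m[:, None] + offsets).T            # shape (2n, n)
    weights = np.full(2 * n, 1.0 / (2 * n))
    return points, weights

def forward_moments(f, m_x, V_x):
    """Approximate mean and covariance of Y = f(X) for X ~ N(m_x, V_x)."""
    points, weights = cubature_rule(m_x, V_x)
    fx = np.array([np.atleast_1d(f(p)) for p in points])  # shape (2n, k)
    m_y = weights @ fx                                     # weighted mean
    dev = fx - m_y
    V_y = (weights[:, None] * dev).T @ dev                 # weighted covariance
    return m_y, V_y

# Bilinear example: product of two independent unit-variance Gaussians.
m_x = np.array([1.0, 2.0])
V_x = np.eye(2)
m_y, V_y = forward_moments(lambda x: x[0] * x[1], m_x, V_x)

# Closed-form variance of a product of independent Gaussians with unit variance:
# mu1^2 * 1 + mu2^2 * 1 + 1 * 1
exact_var = m_x[0] ** 2 + m_x[1] ** 2 + 1.0
print(m_y, V_y, exact_var)
```

For this input, the rule returns the exact mean 2, while the approximated variance is 5 against an exact value of 6, which illustrates the degree-of-precision limitation for bilinear functions.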

Having described a feasible strategy for performing the forward (filtering) pass through a nonlinear transformation, we will now consider the backward (smoothing) pass. If the inverse $f^{-1}$ of the nonlinear transformation is available, the same approach that has been described so far can also be used to implement the backward pass. If, however, this is not the case, nonlinear Rauch-Tung-Striebel-type (RTS) smoothers can be derived, such as the unscented RTS smoother proposed by Särkkä [12]. In the following, a general nonlinear Gaussian RTS smoother-type backward pass through a nonlinear function node in a factor graph is derived, following the derivation in [12] but employing a slightly more general setting to facilitate local interpretation in a factor graph context.

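Consider again the deterministic nonlinear transformation (1), and assume the parameters $\overrightarrow{m}_X$, $\overrightarrow{V}_X$, $\overrightarrow{m}_Y$, and $\overrightarrow{V}_Y$ of the filtering distributions

$$ p(x \mid \mathcal{D}_X) \approx \mathcal{N}(x; \overrightarrow{m}_X, \overrightarrow{V}_X), \tag{6} $$

$$ p(y \mid \mathcal{D}_X) \approx \mathcal{N}(y; \overrightarrow{m}_Y, \overrightarrow{V}_Y), \tag{7} $$

and the parameters $m_Y$ and $V_Y$ of the smoothing distribution of $Y$,

$$ p(y \mid \mathcal{D}_X, \mathcal{D}_Y) \approx \mathcal{N}(y; m_Y, V_Y), \tag{8} $$

to be known, where $\mathcal{D}_X$ denotes knowledge on $X$, and $\mathcal{D}_Y$ denotes knowledge on $Y$. The aim is now to compute the parameters of a Gaussian approximation

$$ p(x \mid \mathcal{D}_X, \mathcal{D}_Y) \approx \mathcal{N}(x; m_X, V_X) \tag{9} $$

to the smoothing distribution of $X$ incorporating all available data. To this end, further assume that the model satisfies the Markov condition

$$ p(\mathcal{D}_Y \mid x, y, \mathcal{D}_X) = p(\mathcal{D}_Y \mid y, \mathcal{D}_X). \tag{10} $$

It then holds that

$$ p(x \mid y, \mathcal{D}_X, \mathcal{D}_Y) = p(x \mid y, \mathcal{D}_X) $$

due to the assumed Markov property (10). It follows that

$$ p(x, y \mid \mathcal{D}_X, \mathcal{D}_Y) = p(x \mid y, \mathcal{D}_X)\, p(y \mid \mathcal{D}_X, \mathcal{D}_Y), $$

and by marginalization of $y$

$$ p(x \mid \mathcal{D}_X, \mathcal{D}_Y) = \int p(x \mid y, \mathcal{D}_X)\, p(y \mid \mathcal{D}_X, \mathcal{D}_Y)\, \mathrm{d}y. \tag{11} $$

If the joint distribution $p(x, y \mid \mathcal{D}_X)$ is approximated by a normal distribution

$$ p(x, y \mid \mathcal{D}_X) \approx \mathcal{N}\left( \begin{bmatrix} x \\ y \end{bmatrix}; \begin{bmatrix} \overrightarrow{m}_X \\ \overrightarrow{m}_Y \end{bmatrix}, \begin{bmatrix} \overrightarrow{V}_X & C \\ C^{\mathsf{T}} & \overrightarrow{V}_Y \end{bmatrix} \right) $$

with suitably chosen cross-covariance matrix $C$, the marginalization (11) can be evaluated analytically to [12]

$$ p(x \mid \mathcal{D}_X, \mathcal{D}_Y) \approx \mathcal{N}(x; m_X, V_X), \tag{12} $$

with $m_X$, $V_X$, $D$, and $C$ defined as in eqs. (17), (18), (21) and (22) in table 1, where $C$ has been approximated using any of the quadrature methods described previously.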
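The backward step can be sketched in a few lines, assuming the standard RTS-type form of the result: a gain D = C (forward covariance of Y)^{-1} built from the quadrature cross-covariance C of X and Y, followed by a mean and covariance correction. The function name `backward_pass` and its argument names are illustrative assumptions, not the paper's notation. The quadrature points and weights are passed in explicitly, so any of the rules mentioned above could be used.

```python
import numpy as np

def backward_pass(points, weights, f, m_x_fwd, V_x_fwd, m_y, V_y):
    """RTS-type backward step through the node y = f(x).

    points, weights : quadrature rule for N(m_x_fwd, V_x_fwd)
    m_y, V_y        : smoothed (marginal) moments of Y
    Returns the smoothed moments (m_x, V_x) of X.
    """
    fx = np.array([np.atleast_1d(f(p)) for p in points])
    m_y_fwd = weights @ fx                       # forward mean of Y
    dy = fx - m_y_fwd
    dx = points - m_x_fwd
    V_y_fwd = (weights[:, None] * dy).T @ dy     # forward covariance of Y
    C = (weights[:, None] * dx).T @ dy           # cross-covariance Cov[X, Y]
    D = np.linalg.solve(V_y_fwd, C.T).T          # gain D = C V_y_fwd^{-1}
    m_x = m_x_fwd + D @ (m_y - m_y_fwd)
    V_x = V_x_fwd + D @ (V_y - V_y_fwd) @ D.T
    return m_x, V_x

# Sanity check with the linear map y = 2x and X ~ N(1, 1):
# two cubature points at m +/- sqrt(V) with weights 1/2.
points = np.array([[2.0], [0.0]])
weights = np.array([0.5, 0.5])
m_x, V_x = backward_pass(points, weights, lambda x: 2.0 * x,
                         np.array([1.0]), np.array([[1.0]]),
                         m_y=np.array([3.0]), V_y=np.array([[1.0]]))
print(m_x, V_x)
```

Because X = Y / 2 holds exactly in this check, the smoothed moments of X must equal half the smoothed mean and a quarter of the smoothed variance of Y, which the sketch reproduces (mean 1.5, variance 0.25).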

3 Message passing on factor graphs

A factor graph is a graphical representation of a factorization of an arbitrary function. Forney-style factor graphs (FFGs) consist of nodes, which represent factors, and edges connecting these nodes, which represent the variables that each factor depends on [4]. Inference can be efficiently performed by means of message passing along the edges of the factor graph. Edges are undirected, but arrows are introduced to disambiguate between messages $\overrightarrow{\mu}$ and $\overleftarrow{\mu}$ in and against the direction of an edge, respectively. Figure 1 shows an FFG representation of the PDF of the nonlinear state space model

$$ x_k = f(x_{k-1}) + g(u_k) + w_k, \qquad y_k = h(x_k) + v_k, \tag{13} $$

with deterministic nonlinear functions $f$, $g$, and $h$, inputs $u_k$, and process and measurement noise $w_k$ and $v_k$, respectively. One of the main benefits of the factor graph framework is that it allows for the easy combination of existing algorithms from different fields, such as filtering/smoothing, sparse input estimation, parameter estimation, and control [6, 8, 19, 20] to derive powerful new algorithms.

Figure 1: One slice of a Forney-style factor graph representing the nonlinear state space model in eq. (13). Capital letters indicate the random variables associated with edges.

The dashed box in fig. 1 represents the joint probability $p(x_k, y_k \mid x_{k-1})$, or equivalently the product of conditional distributions $p(y_k \mid x_k)\, p(x_k \mid x_{k-1})$. Accordingly, for $N$ samples and initial state $X_0$ this results in the Markov chain

$$ p(x_0, \dots, x_N, y_1, \dots, y_N) = p(x_0) \prod_{k=1}^{N} p(x_k, y_k \mid x_{k-1}) \tag{14} $$

representing the PDF of the complete sequential model. For linear functions $f$, $g$, and $h$, well-known algorithms such as Kalman filtering and smoothing can be understood as special cases of the sum-product message passing algorithm on this model [4], effectively performing Gaussian message passing. Inspired by the conciseness of [4], the aim of the present contribution is to provide tabulated rules for approximate Gaussian message passing through the deterministic nonlinear transformation node depicted in fig. 2, representing the factor $\delta\big(y - f(x)\big)$.

Figure 2: Factor node representing the factor $\delta\big(y - f(x)\big)$.

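To this end, the approximation methods presented in the previous section may now be formulated as message passing on factor graphs, the rules for which are summarized in table 1. To obtain these rules, we identify the forward message $\overrightarrow{\mu}_X(x)$ with the filtering distribution $p(x \mid \mathcal{D}_X)$ and the marginal message used in the backward pass with the smoothing distribution $p(x \mid \mathcal{D}_X, \mathcal{D}_Y)$, with the parameters of the Gaussian distributions defined as in eqs. (6)-(9), and $\mathcal{D}_X$ and $\mathcal{D}_Y$ denoting available knowledge on $X$ and $Y$, respectively, as before. Eqs. (3)-(5) then directly yield the update rules (15) and (16) for the forward pass. Eq. (12) yields the update rules (17), (18) for the backward pass in the $(m, V)$ parameterization of the Gaussian messages. This parameterization is, however, not necessarily desirable for practical applications, since passing backwards through a filtering graph in this parameterization may require multiple matrix inversions for each time slice [5]. Therefore, we additionally derive message passing rules in the $(\tilde{\xi}, \tilde{W})$ parameterization, with $\tilde{W} = (\overrightarrow{V} + \overleftarrow{V})^{-1}$ and $\tilde{\xi} = \tilde{W}(\overrightarrow{m} - \overleftarrow{m})$ [4], which has proven advantageous for efficient realization of the (backwards) smoothing pass on factor graphs such as the one shown in fig. 1 [5]. Using the Woodbury formula [21] and $V = \overrightarrow{V} - \overrightarrow{V} \tilde{W} \overrightarrow{V}$ [4], we obtain

$$ \tilde{W}_X = \overrightarrow{V}_X^{-1}\, C\, \tilde{W}_Y\, C^{\mathsf{T}}\, \overrightarrow{V}_X^{-1}, $$

corresponding to update rule (19), where $C$ denotes the cross-covariance of $X$ and $Y$ from eq. (22). For $\tilde{\xi}_X$, using [4]

$$ m = \overrightarrow{m} - \overrightarrow{V} \tilde{\xi}, $$

we obtain

$$ \tilde{\xi}_X = \overrightarrow{V}_X^{-1}\, C\, \tilde{\xi}_Y, $$

as summarized in update rule (20).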
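The equivalence of the two backward parameterizations can be checked numerically. The sketch below assumes the dual rules take the form W~_X = Vx^{-1} C W~_Y C^T Vx^{-1} and xi~_X = Vx^{-1} C xi~_Y, with Vx the forward covariance of X and C the cross-covariance of X and Y, together with the conversions m = m_fwd - V_fwd xi~ and V = V_fwd - V_fwd W~ V_fwd from [4]. All variable names and the randomly generated test matrices are illustrative assumptions.

```python
import numpy as np

def backward_dual(C, V_x_fwd, W_y_tilde, xi_y_tilde):
    """Propagate (xi~, W~) backward through the node y = f(x)."""
    A = np.linalg.solve(V_x_fwd, C)          # V_x_fwd^{-1} C
    return A @ W_y_tilde @ A.T, A @ xi_y_tilde

rng = np.random.default_rng(0)
n, k = 3, 2
B = rng.normal(size=(n + k, n + k))
J = B @ B.T + np.eye(n + k)                  # joint covariance of (X, Y)
V_x_fwd, C, V_y_fwd = J[:n, :n], J[:n, n:], J[n:, n:]
m_x_fwd, m_y_fwd = rng.normal(size=n), rng.normal(size=k)

# Arbitrary backward information on Y in the dual parameterization.
G = rng.normal(size=(k, k))
W_y_tilde = np.linalg.inv(V_y_fwd + G @ G.T + np.eye(k))
xi_y_tilde = rng.normal(size=k)

# Route 1: dual rules, then conversion to marginal moments of X.
W_x_tilde, xi_x_tilde = backward_dual(C, V_x_fwd, W_y_tilde, xi_y_tilde)
m_x_dual = m_x_fwd - V_x_fwd @ xi_x_tilde
V_x_dual = V_x_fwd - V_x_fwd @ W_x_tilde @ V_x_fwd

# Route 2: marginal moments of Y, then the RTS-type rules in (m, V) form.
m_y = m_y_fwd - V_y_fwd @ xi_y_tilde
V_y = V_y_fwd - V_y_fwd @ W_y_tilde @ V_y_fwd
D = np.linalg.solve(V_y_fwd, C.T).T          # D = C V_y_fwd^{-1}
m_x = m_x_fwd + D @ (m_y - m_y_fwd)
V_x = V_x_fwd + D @ (V_y - V_y_fwd) @ D.T

print(np.allclose(m_x, m_x_dual), np.allclose(V_x, V_x_dual))
```

Note that the dual route only inverts the forward covariance of X, whereas the (m, V) route additionally needs the inverse of the forward covariance of Y, which is one way to see why the dual form can save matrix inversions in the backward pass.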

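Forward (filtering) pass (using a numerical quadrature procedure with quadrature points $x_i$ and weights $w_i$):

(15) $\overrightarrow{m}_Y = \sum_i w_i\, f(x_i)$
(16) $\overrightarrow{V}_Y = \sum_i w_i\, \big(f(x_i) - \overrightarrow{m}_Y\big)\big(f(x_i) - \overrightarrow{m}_Y\big)^{\mathsf{T}}$

Backward (smoothing) pass (valid under the Markov assumption that there is no path from $\mathcal{D}_Y$ to $X$ in the factor graph other than through $Y$):

(17) $m_X = \overrightarrow{m}_X + D\,(m_Y - \overrightarrow{m}_Y)$
(18) $V_X = \overrightarrow{V}_X + D\,(V_Y - \overrightarrow{V}_Y)\,D^{\mathsf{T}}$
(19) $\tilde{W}_X = \overrightarrow{V}_X^{-1}\, C\, \tilde{W}_Y\, C^{\mathsf{T}}\, \overrightarrow{V}_X^{-1}$
(20) $\tilde{\xi}_X = \overrightarrow{V}_X^{-1}\, C\, \tilde{\xi}_Y$

with

(21) $D = C\, \overrightarrow{V}_Y^{-1}$
(22) $C = \sum_i w_i\, \big(x_i - \overrightarrow{m}_X\big)\big(f(x_i) - \overrightarrow{m}_Y\big)^{\mathsf{T}}$

Table 1: Approximate Gaussian message passing rules for deterministic nonlinear transformation nodes.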

4 Nonlinear filtering and smoothing

The approach described in the previous sections can now be used to describe various new and existing nonlinear filtering and smoothing algorithms by performing forward and backward message passing along factor graphs such as the one shown in fig. 1. One instance of the class of algorithms that can be derived from this framework is given by the following nonlinear Modified Bryson-Frazier (MBF) smoother for state space models of the form (13) with a linear output function $h$:

Nonlinear MBF Smoother:

  1. Perform forward message passing using equations (15) and (16), as well as the previously proposed update rules (II.1), (II.2), (V.1), and (V.2) from [8].

  2. Perform backward message passing using equations (19) and (20) as well as the previously proposed update rules (II.6), (II.7), and either (V.4), (V.6), (V.8) or (V.5), (V.7), (V.9) from [8].

The smoother is adapted from the factor graph formulation of the MBF smoother [22] for linear systems provided by Loeliger, Bruderer, Malmberg, et al. [8]. It requires just a single matrix inversion for each backward time step and is included here mainly to demonstrate the utility of the presented factor graph representation of nonlinear Gaussian message passing for deriving various nonlinear filters and smoothers. The smoother may be implemented using any kind of numerical quadrature procedure and amounts to standard message passing on a statistically linearized factor graph [23].
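To illustrate how forward and backward passes combine into a complete smoother, here is a minimal scalar sketch. It is not the MBF smoother described above, since the update rules from [8] are not reproduced in this text. Instead it assumes the (m, V) parameterization, an input-free model x_k = f(x_{k-1}) + w_k with linear output y_k = x_k + v_k, and a two-point cubature rule. The process noise is folded into the transformed variable, which yields the familiar RTS gain. All names are illustrative.

```python
import numpy as np

def moments(f, m, V):
    """Scalar cubature rule: points m +/- sqrt(V), weights 1/2 each."""
    pts = np.array([m + np.sqrt(V), m - np.sqrt(V)])
    fx = f(pts)
    m_y = fx.mean()
    V_y = ((fx - m_y) ** 2).mean()
    C = ((pts - m) * (fx - m_y)).mean()      # cross-covariance of x and f(x)
    return m_y, V_y, C

def smooth(f, ys, m0, V0, Q, R):
    """Filter and smooth x_k = f(x_{k-1}) + w_k, y_k = x_k + v_k (scalar)."""
    n = len(ys)
    m_f, V_f = np.zeros(n + 1), np.zeros(n + 1)   # filtered moments
    m_p, V_p, C = np.zeros(n + 1), np.zeros(n + 1), np.zeros(n + 1)
    m_f[0], V_f[0] = m0, V0
    for k in range(1, n + 1):
        # Forward pass through the nonlinear node, then add process noise.
        m_y, V_y, C[k] = moments(f, m_f[k - 1], V_f[k - 1])
        m_p[k], V_p[k] = m_y, V_y + Q
        # Linear measurement update (standard Kalman step).
        K = V_p[k] / (V_p[k] + R)
        m_f[k] = m_p[k] + K * (ys[k - 1] - m_p[k])
        V_f[k] = (1.0 - K) * V_p[k]
    m_s, V_s = m_f.copy(), V_f.copy()             # smoothed moments
    for k in range(n, 0, -1):
        # RTS-type backward step with gain D = C / (predicted variance).
        D = C[k] / V_p[k]
        m_s[k - 1] = m_f[k - 1] + D * (m_s[k] - m_p[k])
        V_s[k - 1] = V_f[k - 1] + D ** 2 * (V_s[k] - V_p[k])
    return m_f, V_f, m_s, V_s

# Simulate a short trajectory from the assumed model and smooth it.
rng = np.random.default_rng(1)
Q, R = 0.1, 0.05
x, ys = 0.5, []
for _ in range(20):
    x = np.sin(x) + np.sqrt(Q) * rng.normal()
    ys.append(x + np.sqrt(R) * rng.normal())
m_f, V_f, m_s, V_s = smooth(np.sin, ys, m0=0.0, V0=1.0, Q=Q, R=R)
print(V_s[:3], V_f[:3])
```

In this sketch the smoothed variances never exceed the filtered ones, and the final smoothed estimate coincides with the final filtered estimate, as expected for a backward pass that starts from the last filtering result.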

5 Conclusion

In this contribution, local message passing rules for factor graph nodes representing deterministic nonlinear transformations have been derived. For the forward pass, a linearization is performed using any numerical quadrature method, and for the backward pass, general Rauch-Tung-Striebel (RTS)-type update rules have been derived in two different message parameterizations. The resulting message passing rules can be employed in any factor graph and in particular can be used to perform filtering and smoothing on state space models with state transition, input, and measurement nonlinearities. Demonstrating the usefulness of transferring results from classical nonlinear filtering theory to the factor graph framework, the modified Bryson-Frazier (MBF) smoother is easily augmented to incorporate nonlinear state transitions and input nonlinearities, requiring only a single matrix inversion in each time step. In this way, the present contribution adds the capability of handling nonlinear systems to a range of existing algorithms, hence enabling a factor graph description of a wide range of new and existing algorithms.


The authors would like to thank Maximilian Pilz for interesting discussions on the subject of this article.


  • [1] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 498–519, 2001.
  • [2] ——, “Factor graphs and the sum-product algorithm,” IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 498–519, 2001.
  • [3] H.-A. Loeliger, “An introduction to factor graphs,” IEEE Signal Processing Magazine, vol. 21, no. 1, pp. 28–41, 2004.
  • [4] H.-A. Loeliger, J. Dauwels, J. Hu, S. Korl, L. Ping, and F. R. Kschischang, “The factor graph approach to model-based signal processing,” Proceedings of the IEEE, vol. 95, no. 6, pp. 1295–1322, 2007.
  • [5] F. Wadehn, L. Bruderer, V. Sahdeva, and H.-A. Loeliger, “New square-root and diagonalized Kalman smoothers,” in Communication, Control, and Computing (Allerton), 54th Annual Allerton Conference on.   IEEE, 2016, pp. 1282–1290.
  • [6] J. Dauwels, S. Korl, and H.-A. Loeliger, “Expectation maximization as message passing,” arXiv preprint, 2009. [Online]. Available: https://arxiv.org/abs/0910.2832
  • [7] B. Li, N. Wu, H. Wang, P.-H. Tseng, and J. Kuang, “Gaussian message passing-based cooperative localization on factor graph in wireless networks,” Signal Processing, vol. 111, pp. 1–12, 2015.
  • [8] H.-A. Loeliger, L. Bruderer, H. Malmberg, F. Wadehn, and N. Zalmai, “On sparsity by NUV-EM, Gaussian message passing, and Kalman smoothing,” in Information Theory and Applications Workshop (ITA).   IEEE, 2016.
  • [9] J. Dong, M. Mukadam, F. Dellaert, and B. Boots, “Motion planning as probabilistic inference using Gaussian processes and factor graphs,” Robotics: Science and Systems, vol. 12, 2016.
  • [10] E. A. Wan and R. van der Merwe, “The unscented Kalman filter for nonlinear estimation,” in Adaptive Systems for Signal Processing, Communications, and Control Symposium, 2000, pp. 153–158.
  • [11] I. Arasaratnam and S. Haykin, “Discrete-time nonlinear filtering algorithms using Gauss-Hermite quadrature,” Proceedings of the IEEE, vol. 95, no. 5, pp. 953–977, 2007.
  • [12] S. Särkkä, “Unscented Rauch-Tung-Striebel smoother,” IEEE Transactions on Automatic Control, vol. 53, no. 3, pp. 845–849, 2008.
  • [13] I. Arasaratnam and S. Haykin, “Cubature Kalman filters,” IEEE Transactions on Automatic Control, vol. 54, no. 6, pp. 1254–1269, 2009.
  • [14] B. Jia, M. Xin, and Y. Cheng, “Sparse-grid quadrature nonlinear filtering,” Automatica, vol. 48, no. 2, pp. 327–341, 2012.
  • [15] F. Meyer, O. Hlinka, and F. Hlawatsch, “Sigma point belief propagation,” IEEE Signal Processing Letters, vol. 21, no. 2, pp. 145–149, 2014.
  • [16] M. P. Deisenroth and S. Mohamed, “Expectation propagation in dynamical systems,” in Advances in Neural Information Processing Systems, 2012.
  • [17] O. R. Zoeter and T. Heskes, “Gaussian quadrature based expectation propagation,” in Proc. 10th Int. Workshop Artificial Intell. Stat., 2005, pp. 445–452.
  • [18] D. Barber, Bayesian Reasoning and Machine Learning.   Cambridge University Press, 2012.
  • [19] C. Hoffmann and P. Rostalski, “Linear optimal control on factor graphs – a message passing perspective,” in Proceedings of the 20th IFAC World Congress, 2017.
  • [20] C. Hoffmann, E. Petersen, T. Handzsuj, G. Bellani, and P. Rostalski, “A factor graph-based change point detection algorithm, with an application to sEMG-onset and activity detection,” in Proceedings of the Jahrestagung der Biomedizinischen Technik und Dreiländertagung der Medizinischen Physik, no. 62, 2017, pp. 116–120.
  • [21] N. J. Higham, Accuracy and stability of numerical algorithms.   SIAM, 2002.
  • [22] G. J. Bierman, Factorization Methods for Discrete Sequential Estimation.   Dover Publications, Inc., 1977.
  • [23] R. Van der Merwe, “Sigma-point Kalman filters for probabilistic inference in dynamic state-space models,” Ph.D. dissertation, 2004.