Multiple Bayesian Filtering as Message Passing

07/01/2019
by Giorgio M. Vitetta, et al.

In this manuscript, a general method for deriving filtering algorithms that involve a network of interconnected Bayesian filters is proposed. This method is based on the idea that the processing accomplished inside each of the Bayesian filters and the interactions between them can be represented as message passing algorithms over a proper graphical model. The usefulness of our method is exemplified by developing new filtering techniques, based on the interconnection of a particle filter and an extended Kalman filter, for conditionally linear Gaussian systems. Numerical results for two specific dynamic systems evidence that the devised algorithms can achieve a better complexity-accuracy tradeoff than marginalized particle filtering and multiple particle filtering.


I. Introduction

It is well known that Bayesian filtering represents a general recursive solution to the nonlinear filtering problem (e.g., see [1, Sect. II, eqs. (3)-(5)]), i.e. to the problem of inferring the posterior distribution of the hidden state of a nonlinear state-space model (SSM). Unfortunately, this solution can be put in closed form in only a few cases [2]. For this reason, various filtering methods generating a functional approximation of the desired posterior probability density function (pdf) have been developed; these can be divided into local and global methods on the basis of the way the posterior pdf is approximated [3], [4]. On the one hand, local techniques, like extended Kalman filtering (EKF) [2], are computationally efficient, but may suffer from error accumulation over time; on the other hand, global techniques, like particle filtering (PF) [5], [6], may achieve high accuracy at the price, however, of unacceptable complexity and numerical problems when the dimension of the state space becomes large [7]-[9]. These considerations have motivated the investigation of various methods able to achieve high accuracy under given computational constraints. Some of these solutions are based on the idea of combining local and global methods; relevant examples of this approach are: 1) Rao-Blackwellized particle filtering (RBPF, also known as marginalized particle filtering) [10] and other techniques related to it (e.g., see [4]); 2) cascaded architectures based on the joint use of EKF and PF (e.g., see [11]). Note that, in the first case, the state vector is split into two disjoint components, namely a linear state component and a nonlinear state component; these are estimated by a bank of Kalman filters and by a particle filter, respectively. In the second case, instead, an extended Kalman filter and a particle filter are run over partially overlapped state vectors. In both cases, however, two heterogeneous filtering methods are combined in such a way that the resulting overall algorithm is forward only and, within each of its recursions, both methods are executed only once.

Another class of solutions, known as multiple particle filtering (MPF), is based on the idea of partitioning the state vector into multiple substates and running multiple particle filters in parallel, one on each subspace [9], [12]-[15]. The resulting network of particle filters requires the mutual exchange of statistical information (in the form of estimates/predictions of the tracked substates or of parametric distributions), so that, within each filter, the unknown portion of the state vector can be integrated out in both weight computation and particle propagation. In principle, MPF can be employed only when the selected substates are separable in the state equation, even if approximate solutions can be devised to circumvent this problem [15]. Moreover, the technical literature about MPF has raised three interesting technical issues that have received limited attention so far. The first issue concerns the possibility of coupling an extended Kalman filter with each particle filter of the network; the former filter should provide the latter with the statistical information required for integrating out the unknown portion of the state vector (see [14, Par. 3.2]). The second one concerns the use of filters having partially overlapped substates (see [13, Sec. 1]). The third (and final) issue concerns the iterative exchange of statistical information among the interconnected filters of the network. Some work related to the first issue can be found in [16], where the application of MPF to target tracking in a cognitive radar network has been investigated. In that case, however, the proposed solution is based on Rao-Blackwellisation; for this reason, each particle filter of the network is coupled not with a single extended Kalman filter, but with a bank of Kalman filters. The second issue has not been investigated at all, whereas limited attention has been paid to the third one; in fact, this last problem has been investigated only in [12], where a specific iterative method based on game theory has been developed. The need to employ iterative methods in MPF has also been explicitly recognised in [15], but no solution has been developed to meet it.

In this manuscript, we first focus on the general problem of developing filtering algorithms that involve multiple interconnected Bayesian filters; these filters are run over distinct (but not necessarily disjoint) subspaces and can exploit iterative methods in their exchange of statistical information. The solution devised for this problem (and called multiple Bayesian filtering, MBF, since it represents a generalisation of the MPF approach) is based on previous work on the application of factor graph theory to the filtering and smoothing problems [17]-[21]. More specifically, we show that: a) a graphical model can be developed for a network of Bayesian filters by combining multiple factor graphs, each referring to one of the involved filters; b) the pdfs computed by all these filters can be represented as messages passed on such a graphical model. This approach offers various important advantages. In fact, the expressions of all the passed messages can be derived by applying the same rule, namely the so-called sum-product algorithm (SPA) [17], [18], to the graphical model devised for the whole network. Moreover, iterative algorithms can be developed in a natural fashion once the cycles contained in this graphical model have been identified and the order according to which messages are passed on them (i.e., the message scheduling) has been established. The usefulness of our approach is exemplified mainly by illustrating its application to a network made up of two Bayesian filters. More specifically, we investigate the interconnection of an extended Kalman filter with a particle filter, and develop two new filtering algorithms under the assumption that the considered SSM is conditionally linear Gaussian (CLG). Simulation results for two specific SSMs evidence that the devised algorithms perform similarly to or better than RBPF and MPF, while requiring a smaller computational effort.

The remaining part of this manuscript is organized as follows. In Section II, after introducing factor graph theory and the SPA, the filtering problem is analysed from a factor graph perspective for a network of multiple interconnected Bayesian filters. In Section III, the tools illustrated in the previous section are applied to a network consisting of an extended Kalman filter interconnected with a particle filter; two new MBF algorithms are derived and their computational complexity is analysed in detail. The developed MBF algorithms are compared with EKF and RBPF, in terms of accuracy and execution time, in Section IV. Finally, some conclusions are offered in Section V.

II. Graphical Modelling for Multiple Bayesian Filtering

In this paragraph, we illustrate some basic concepts about factor graphs and the computation of the messages passed over them. Then, we derive a graphical model for representing the overall processing accomplished by multiple interconnected Bayesian filters as a message passing on it.

A. Factor Graphs and the Sum-Product Algorithm

A factor graph is a graphical model representing the factorization of any function expressible as a product of factors, each depending on a subset of the involved variables. In the following, Forney-style factor graphs are considered [17]. This means that the factor graph associated with such a function consists of nodes, edges (connecting distinct nodes) and half-edges (connected to a single node only). Moreover, the following rules are employed for its construction: a) every factor is represented by a single node (a rectangle in our pictures); b) every variable is represented by a unique edge or half-edge; c) the node representing a factor is connected with the edge (or half-edge) representing a given variable if and only if that factor depends on it; d) an equality constraint node (represented by a rectangle labelled by “=”) is used as a branching point when more than two factors are required to share the same variable. For instance, the factorisation of the function

(1)

can be represented through the factor graph shown in Fig. 1.

In this manuscript, factorisable functions represent joint pdfs. It is well known that the marginalization of such a pdf with respect to one or more of its variables can usually be split into a sequence of simpler marginalizations; our interest in the associated factor graph is motivated by the fact that the function resulting from each of these marginalizations can be represented as a message (conveying a joint pdf of the variables it depends on) passed along an edge of the graph itself. In this work, the computation of all the messages is based on the SPA (also known as belief propagation). This algorithm can be formulated as follows (e.g., see [17, Sec. IV]): the message emerging from a node representing a given factor, along the edge associated with a given variable, is expressed by the product of that factor and of the messages along all the incoming edges (except the one associated with the considered variable), integrated over all the involved variables except the considered one. Two simple applications of the SPA are illustrated in Fig. 2-a) and in Fig. 2-b), which refer to an equality constraint node and to a function node, respectively (note that, generally speaking, these nodes are connected to edges representing vectors of variables). On the one hand, the message emerging from the equality node shown in Fig. 2-a) is evaluated as

(2)

where the two factors on the right-hand side are the messages entering the node itself (if a single message enters an equality node, the two messages emerging from it are simply copies of it), and all these messages refer to the same vector of variables. On the other hand, the message emerging from the function node shown in Fig. 2-b), which refers to a function depending on two vectors of variables, is given by

(3)

where denotes the message entering it.

In applying the SPA, it is important to keep in mind that: a) the marginal pdf of a given variable is expressed by the product of the two messages associated with the edge representing it, but coming from opposite directions; b) the half-edge associated with a variable may be thought of as carrying a constant message of unit value as its incoming message; c) if a marginal pdf is required to be known only up to a scale factor, the involved messages can be freely scaled in their computation. The use of these rules and of those expressed by Eqs. (2) and (3) can be exemplified by considering again the function (1) (which is now assumed to represent the joint pdf of four continuous random variables) and showing how, thanks to these rules, one of its marginal pdfs can be evaluated in a step-by-step fashion. Once the messages feeding the graph are defined, applying Eqs. (2)–(3) to the factor graph shown in Fig. 1 leads to the ordered computation of the messages

(4)
(5)
(6)

and

(7)

Then, given the messages (6) and (7), referring to the same edge, but originating from opposite directions, the required marginal is evaluated as

(8)

This result is exact since the graph representing the joint pdf (1) is cycle free, i.e. it does not contain closed paths. When the considered graph does not enjoy this property, the SPA can still be employed (e.g., see [17, Par. III.A] and [18, Sec. V]), but its application leads to iterative message passing algorithms which, in general, produce approximate results. Moreover, the order according to which messages are passed along a cycle (i.e., the message scheduling) has to be properly selected. Despite this, it is widely accepted that the most important applications of the SPA refer to cyclic graphs [18].
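As a concrete illustration of the SPA rules discussed above, the following sketch computes an exact marginal on a tiny cycle-free chain with discrete variables; the chain and the factor tables are our own illustrative choices, not taken from the manuscript.

```python
import numpy as np

# Chain x1 -- f12 -- x2 -- f23 -- x3; each variable takes 2 values.
p1 = np.array([0.6, 0.4])                  # prior factor on x1
f12 = np.array([[0.9, 0.1], [0.2, 0.8]])   # factor f12(x1, x2)
f23 = np.array([[0.7, 0.3], [0.5, 0.5]])   # factor f23(x2, x3)

# SPA message from node f12 towards x2: sum over x1 of p1(x1) * f12(x1, x2)
m_f12_x2 = p1 @ f12

# SPA message from node f23 towards x2: sum over x3 of f23(x2, x3) * 1,
# since the half-edge x3 carries a constant unit incoming message
m_f23_x2 = f23 @ np.ones(2)

# The marginal of x2 is the normalised product of the two messages
# associated with the same edge but coming from opposite directions
p_x2 = m_f12_x2 * m_f23_x2
p_x2 /= p_x2.sum()
```

Because the chain is cycle free, this marginal is exact; on a cyclic graph the same local rules would have to be iterated and would generally yield an approximation.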

Figure 1: Factor graph representing the structure of the function (1) and message passing on it for the evaluation of the marginal .
Figure 2: Representation of the graphical models which Eqs. (2) (diagram a)) and (3) (diagram b)) refer to.

The last important issue related to the application of the SPA is the availability of closed-form expressions for the passed messages when, as in the filtering problem investigated in this manuscript, the involved variables are continuous. In the following, the pdfs of all the considered random vectors are Gaussian or are approximated through a set of weighted particles. In the first case, the pdf of a random vector is conveyed by the message

(9)

where and denote the mean and the covariance of , respectively. In the second case, instead, its pdf is conveyed by the message

(10)

where

(11)

represents the th component of the message (10), i.e. the contribution of the th particle and its weight to such a message. Fortunately, various closed-form results are available for these two types of messages; the few mathematical rules required in the computation of all the messages appearing in our filtering algorithms can be found in Tables I–III of [21, App. A, p. 1534].
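The two message parameterisations of Eqs. (9)–(11) can be sketched as simple containers; the class names, field names and the moment-matching conversion below are our own illustrative choices, not part of the manuscript.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class GaussianMessage:
    mean: np.ndarray        # mean vector of the conveyed pdf (Eq. (9))
    cov: np.ndarray         # covariance matrix of the conveyed pdf

@dataclass
class ParticleMessage:      # weighted Dirac mixture (Eqs. (10)-(11))
    particles: np.ndarray   # shape (Np, d): one row per particle
    weights: np.ndarray     # shape (Np,): non-negative, summing to one

    def mean(self):
        # Weighted sample mean of the particle set
        return self.weights @ self.particles

# A particle message can be projected onto a Gaussian one by moment
# matching, one of the conversions needed when heterogeneous filters
# exchange statistical information.
rng = np.random.default_rng(0)
pm = ParticleMessage(rng.normal(size=(500, 2)), np.full(500, 1 / 500))
gm = GaussianMessage(pm.mean(), np.cov(pm.particles.T, aweights=pm.weights))
```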

B. Graphical Modelling for a Network of Bayesian Filters and Message Passing on it

In this manuscript, we consider a discrete-time SSM whose dimensional hidden state in the th interval is denoted , and whose state update and measurement models are expressed by

(12)

and

(13)

respectively. Here, () is a time-varying dimensional (dimensional) real function and () the th element of the process (measurement) noise sequence (); this sequence consists of dimensional (dimensional) independent and identically distributed (iid) Gaussian noise vectors, each characterized by a zero mean and a covariance matrix (). Moreover, statistical independence between and is assumed for simplicity. Note that, from a statistical viewpoint, the SSM described by Eqs. (12)–(13) is characterized by the Markov model and the observation model for any .
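As a hedged illustration of the model (12)–(13), the snippet below simulates a toy scalar SSM driven by zero-mean iid Gaussian process and measurement noise; the specific functions `f` and `h`, the noise variances and the horizon are our own illustrative choices, not the manuscript's.

```python
import numpy as np

rng = np.random.default_rng(1)

def f(x, k):          # state update function in Eq. (12)
    return 0.9 * x + 0.5 * np.sin(0.1 * k)

def h(x, k):          # measurement function in Eq. (13)
    return x ** 2 / 20.0

Q, R, T = 0.1, 0.05, 50    # noise variances and time horizon (illustrative)
x = rng.normal()            # initial state drawn from a standard Gaussian prior
xs, ys = [], []
for k in range(T):
    x = f(x, k) + rng.normal(scale=np.sqrt(Q))   # process noise term
    y = h(x, k) + rng.normal(scale=np.sqrt(R))   # measurement noise term
    xs.append(x)
    ys.append(y)
```

Statistically, this generator realises exactly the Markov model and the observation model mentioned above: each new state depends only on the previous one, and each measurement only on the current state.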

In the following sections, we focus on the so-called filtering problem, which concerns the evaluation of the posterior pdf of the state at a given instant, given a) the initial pdf and b) the measurement vectors available up to that instant. It is well known that, if the pdf referring to the first observation interval is known, the computation of the posterior (i.e., filtered) pdf can be accomplished by means of an exact Bayesian recursive procedure, consisting of a measurement update step followed by a time update step. In [21, Sec. III], it is shown that, if this procedure is formulated with reference to the joint pdf of the state and the measurements (in place of the associated a posteriori pdf), each of its recursions can be represented as a forward only message passing algorithm over the cycle free factor graph shown in Fig. 3. Note that, in the following, the acronyms fp, fe, ms and pm are employed in the subscripts of various messages so that readers can easily understand their meaning; in fact, the messages these acronyms refer to convey a forward prediction, a forward estimate, measurement information and pseudo-measurement information, respectively. In the measurement update, the message going out of the equality node is computed as (see Eq. (2))

(14)

where

(15)

is the message feeding the considered graph. Note that the messages (15) and convey the predicted pdf (i.e., the forward prediction) of computed in the previous (i.e., in the th) recursion and the filtered pdf (i.e., the forward estimate) of computed in the considered recursion, respectively, whereas the message conveys the statistical information provided by the measurement (13).

In the time update, the message that emerges from the function node referring to the pdf is evaluated as (see Eq. (3))

(16)

such a message is equal to (see Eq. (15))
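For the special case of a linear Gaussian SSM, the measurement and time updates just described admit closed-form (Kalman) messages. The following scalar sketch carries out one such recursion in the two-step form of Eqs. (14) and (16); the model parameters are our own illustrative choices, not taken from the manuscript.

```python
import numpy as np

# Scalar linear Gaussian model: x_{k+1} = a*x_k + w_k,  y_k = c*x_k + e_k
a, c, q, r = 0.9, 1.0, 0.1, 0.05   # illustrative parameters

def kalman_recursion(m_pred, P_pred, y):
    # Measurement update: combine the forward prediction message with
    # the message conveying the measurement information (cf. Eq. (14))
    K = P_pred * c / (c * c * P_pred + r)       # Kalman gain
    m_filt = m_pred + K * (y - c * m_pred)      # forward estimate mean
    P_filt = (1.0 - K * c) * P_pred             # forward estimate variance
    # Time update: push the forward estimate through the Markov model
    # to obtain the new forward prediction (cf. Eq. (16))
    m_next = a * m_filt
    P_next = a * a * P_filt + q
    return m_filt, P_filt, m_next, P_next

m_filt, P_filt, m_next, P_next = kalman_recursion(0.0, 1.0, 0.8)
```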

Let us take into consideration now a network of interconnected Bayesian filters. In the following, we assume that:

a) All the filters of the network are fed by the same measurement vector (namely, (13)), work in parallel and cooperate in order to estimate the state vector ; in doing so, they can fully share their statistical information.

b) The th filter of the network (with , , , ), denoted F, works on a lower dimensional space and, in particular, estimates the portion (having size , with ) of the state vector ; therefore, the substate , representing the portion of not included in , can be considered as a nuisance vector for F.

c) The set , collecting the substates estimated by all the filters of the network, covers , but does not necessarily represent a partition of it. In other words, unlike MPF, some overlapping between the substates estimated by different filters is admitted. This means that the filtering algorithm running on the whole network may contain a form of redundancy, since one or more elements of the state vector can be independently estimated by different Bayesian filters.

We are interested in developing recursive filtering algorithms for the whole network of Bayesian filters. The approach we propose to solve this problem consists of the following three steps: S1) building a factor graph that allows us to represent the measurement and time updates accomplished by each filter of the network and its interactions with the other filters as message passing algorithms on it; S2) developing a graphical model for the whole network on the basis of the factor graphs devised in the first step; S3) deriving new filtering methods as message passing algorithms over the whole graphical model obtained in the second step.

Let us focus, now, on step S1. In developing a graphical model for filter F, the following considerations must be taken into account:

1) Since the portion of is unknown to F (and, consequently, represents a nuisance state), an estimate of its pdf must be provided by the other filters of the network; this allows F to integrate out the dependence of its Markov model and of its observation model on .

2) Filter F can benefit from the pseudo-measurements computed on the basis of the statistical information provided by the other filters of the network.

As far as the last point is concerned, it is worth pointing out that, in this manuscript, any pseudo-measurement represents a fictitious measurement computed on the basis of the statistical information provided by a filtering algorithm different from the one benefiting from it; despite this, it can be processed as if it were a real measurement, provided that its statistical model is known. In practice, a pseudo-measurement made available to the filter F is a random vector that, similarly to the real measurement (13), can be modelled as (the possible dependence of the pseudo-measurement (17) on the nuisance substate is ignored here for simplicity)

(17)

where is a time-varying dimensional function and is a zero mean dimensional noise vector. The evaluation of these fictitious measurements is often based on the mathematical constraints established by the Markov model of the considered SSM, as shown in the following section, where a specific network of filters is considered.

Based on the considerations illustrated above, the equations describing the measurement/time updates accomplished by F in the th recursion of the network can be formulated as follows. At the beginning of this recursion, F is fed by the forward prediction

(18)

originating from the previous recursion. In its first step (i.e., in its measurement update), it computes two filtered pdfs (i.e., two forward estimates), the first one based on the measurement (13), the second one on the pseudo-measurement (17). The first filtered pdf is evaluated as (see Eq. (14))

(19)

where

(20)

and are the messages conveying measurement information and a filtered (or predicted) pdf of provided by the other filters, respectively. Similarly, the second filtered pdf is evaluated as (see Eq. (14))

(21)

where (note that, if the pseudo-measurement (17) also depends on the nuisance substate, marginalization with respect to this substate is required in the computation of the following message)

(22)

is the message conveying pseudo-measurement information. Then, in its second step (i.e., in its time update), F computes the new forward prediction (see Eq. (16))

(23)

where has the same meaning as (see Eq. (20)), but is not necessarily equal to it (since more refined information about the nuisance substate could be made available by the other filters of the network after the message (20) has been computed).

Formulas (19)-(21) and (23) involve only products of pdfs and integrations of products; for this reason, their evaluation can be represented as a forward only message passing over the cycle free factor graph shown in Fig. 4. Note that, if this graph is compared with the one shown in Fig. 3, the following additional elements (identified by blue lines) are found:

1) Five equality nodes - Four of them make it possible to generate copies of the messages to be shared with the other filters of the network, whereas the remaining one is involved in the second measurement update of F.

2) A block in which the predicted/filtered pdfs , ; , and provided by the other filters of the network are processed - In this block, the messages (with and ) and are computed (see Eqs. (20), (22) and (23)); this block is connected to oriented edges only, i.e. to edges on which the flow of messages is unidirectional.

Given the graphical model represented in Fig. 4, step S2 can be accomplished by adopting the same conceptual approach as [21, Sec. III], where the factor graph on which RBPF and dual RBPF are based is devised by merging two sub-graphs referring to distinct substates. For this reason, a graphical model for the whole network of Bayesian filters can be developed by interconnecting distinct factor graphs, each structured like the one shown in that figure. For instance, in the case of two filters, this procedure results in the graphical model shown in Fig. 5. It is important to note that, in this case, if the substates estimated by F and F, respectively, do not form a partition of the state vector, they share a portion of it; the state variables making up this shared portion are separately estimated by the two Bayesian filters. The number of shared state variables can be considered as the degree of redundancy characterizing the considered network of filters. The presence of redundancy in a filtering algorithm may result in an improvement of estimation accuracy and/or tracking capability; however, this is obtained at the price of an increased complexity with respect to the case in which F and F are run on disjoint substates.

Once the graphical model for the whole network has been developed, step S3 can be easily accomplished. In fact, recursive filtering algorithms for the considered network can be derived by systematically applying the SPA to its graphical model, once a proper scheduling has been established for the exchange of messages among its Bayesian filters. Moreover, in developing a specific filtering algorithm to be run on a network of Bayesian filters, we must always keep in mind that:

1) Its th recursion is fed by the set of forward predictions generated in the previous recursion, and generates couples of filtered densities and new forward predictions. Moreover, similarly to MPF, a joint filtered density for the whole state is unavailable (unless the substate of one or more of the employed Bayesian filters coincides with the whole state vector) and multiple filtered/predicted pdfs are available for any substate shared by distinct filters.

2) Specific algorithms are needed to compute the pseudo-measurement and the nuisance substate pdfs in the F, F block appearing in Fig. 5. These algorithms depend on the considered SSM and on the selected message scheduling; for this reason, a general description of their structure cannot be provided.

3) The graphical model shown in Fig. 5, unlike the one illustrated in Fig. 3, is not cycle free; the presence of cycles is highlighted in the considered figure by showing the flow of messages along one of them. The presence of cycles raises the problems of a) identifying all the messages that can be iteratively refined and b) establishing the order according to which they are computed. Generally speaking, iterative message passing on the graphical model referring to a network of filters involves both the couple of measurement updates and the time update accomplished by each of the interconnected filters. In fact, this should allow each Bayesian filter to a) progressively refine the nuisance substate density employed in its measurement/time updates, and b) improve the quality of the pseudo-measurements exploited in its second measurement update. For this reason, if multiple iterations are run, the overall computational complexity of each recursion grows proportionally to their number.

In the following section, a specific application of the general principles illustrated in this paragraph is analysed.

Figure 3: Message passing over the factor graph representing the th recursion of Bayesian filtering. A SSM characterized by the Markov model and the observation model is considered.
Figure 4: Message passing over the factor graph representing the couple of measurement updates and the time update accomplished by the th Bayesian filter in the th recursion of the network it belongs to. The messages , , , , and are denoted , , , ,  and , respectively, to ease reading.
Figure 5: Graphical model based on the factor graph shown in Fig. 4 and referring to the interconnection of two Bayesian filters; the presence of a closed path (cycle) on which messages can be passed multiple times is highlighted by brown arrows.

III. Filtering Algorithms Based on the Interconnection of an Extended Kalman Filter with a Particle Filter

In this section we focus on the development of two new filtering algorithms based on the interconnection of an extended Kalman filter with a particle filter. We first describe the graphical models on which these algorithms are based. Then, we provide a detailed description of the computed messages and their scheduling in a specific case. Finally, we provide a detailed analysis of the computational complexity of the devised algorithms.

A. Graphical Modelling

In this section, we develop new filtering algorithms for the class of conditionally linear Gaussian SSMs [10], [20], [21]; this allows us to partition the state vector in the th interval as , where , () is its linear (nonlinear) component (with ). The devised algorithms rely on the following assumptions:

1) They involve two interconnected Bayesian filters, denoted F and F.

2) Filter F is a particle filter (in particular, a sequential importance resampling filter is employed [1]) and estimates the nonlinear state component only.

3) Filter F is an extended Kalman filter and works on the whole system state or on the linear state component only. Consequently, in the first case (denoted C.1 in the following), the nuisance substate of F is empty and both the interconnected filters estimate the nonlinear state component (for this reason, the corresponding degree of redundancy equals the size of that component). In the second case (denoted C.2 in the following), instead, the two filters estimate disjoint substates (consequently, the degree of redundancy is zero).

This network configuration has been mainly inspired by RBPF. In fact, similarly to RBPF, the filtering techniques we develop are based on the idea of concatenating a local filtering method (EKF) with a global method (PF). However, unlike RBPF, a single extended Kalman filter is employed in place of a bank of Kalman filters. It is also worth remembering that, on the one hand, the use of a particle filter interconnected with an extended Kalman filter for tracking disjoint substates has been suggested in [14, Par. 3.2], where, however, no filtering algorithm based on this idea has been derived. On the other hand, a filtering scheme based on the interconnection of the same filters, but working on partially overlapped substates, has been derived in [22], where it has also been successfully applied to inertial navigation.

Based on the graphical model shown in Fig. 5, the factor graph illustrated in Fig. 6 can be drawn for case C.1. It is important to point out that:

1) Filter F is based on linearised (and, consequently, approximate) Markov/measurement models of the considered SSM, whereas filter F relies on exact models, as explained in more detail below.

2) Since the nuisance substate is empty, no marginalization is required in F; for this reason, the messages ; (i.e., and ) visible in Fig. 5 do not appear in Fig. 6.

3) The new predicted pdf and the second filtered pdf computed by F (i.e., the messages and , respectively) feed the FF block, where they are jointly processed to generate the pseudo-measurement message () feeding F. Similarly, as shown below, the computation of the pseudo-measurement message exploited by F (i.e., of the message , ) requires the knowledge of a new predicted pdf that refers, however, to the linear state component only. In our graphical model, the computation of this prediction is accomplished by the FF block; this explains why the new predicted pdf () evaluated by F and referring to the whole state of the considered SSM, does not feed the FF block.

4) Particle resampling with replacement has been included in the portion of the graphical model referring to filter F. This important task, accomplished after the second measurement update of this filter, does not emerge from the application of the SPA to our graphical model and ensures that the particles emerging from it are all equally likely. Note also that, because of the presence of particle resampling, two versions of the second filtered pdf () become available, one before resampling, the other one after it. As shown in the next paragraph, the second version of this message is exploited in the computation of the pseudo-measurement message ().

In the remaining part of this paragraph, we first provide various details about the filters F and F, and the way pseudo-measurements are computed for each of them; then, we comment on how the factor graph shown in Fig. 6 should be modified if case C.2 is considered.

Figure 6: Graphical model based on the factor graph shown in Fig. 5 and referring to the interconnection of an extended Kalman filter (F) with a particle filter (F).

Filter F - Filter F is based on the linearized versions of Eqs. (12) and (13), i.e. on the models (e.g., see [2, pp. 194-195])

(24)

and

(25)

respectively; here, the linearization is carried out around the forward prediction (forward estimate) of the state computed by this filter in the corresponding recursion. Consequently, the approximate models

(26)

and

(27)

appear in the graphical model shown in Fig. 6.
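The role played by the linearized models (24)-(27) can be illustrated by a generic extended Kalman recursion; the following is a minimal sketch in which the functions `f`, `h` and their Jacobians stand in for the (unspecified) Markov and measurement models of Eqs. (24)-(25), and all names are illustrative rather than taken from the paper.

```python
import numpy as np

def ekf_step(x_est, P_est, f, F_jac, h, H_jac, Q, R, y):
    """One extended Kalman recursion based on first-order
    linearization of the transition f and measurement h."""
    # Time update: the Markov model is linearized around the last estimate
    x_pred = f(x_est)
    F = F_jac(x_est)
    P_pred = F @ P_est @ F.T + Q
    # Measurement update: the observation model is linearized
    # around the forward prediction
    H = H_jac(x_pred)
    S = H @ P_pred @ H.T + R              # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)   # Kalman gain
    x_new = x_pred + K @ (y - h(x_pred))
    P_new = (np.eye(len(x_new)) - K @ H) @ P_pred
    return x_new, P_new
```

Since the Jacobians are evaluated at point estimates, the resulting Markov/measurement models are approximate, which is exactly the limitation remark 1) attributes to this filter.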

Filter F - In developing filter F, we assume that the portion of Eq. (12) referring to the nonlinear state component (i.e., the last lines of the considered Markov model) and the observation model (13) can be put in the form (e.g., see [21, eqs. (3)-(4)])

(28)

and

(29)

respectively. In Eq. (28), () is a time-varying dimensional real function ( real matrix) and consists of the last elements of the noise term appearing in Eq. (12) (the covariance matrix of is denoted ); moreover, in Eq. (29), () is a time-varying dimensional real function ( real matrix). This explains why filter F is based on the exact pdfs

(30)

and

(31)

that appear in the graphical model shown in Fig. 6.
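Under conditionally linear models of the form (28)-(29), one recursion of a bootstrap-style particle filter for the nonlinear substate might look as sketched below; the functions `f_nl`, `A_nl`, `g`, `B`, the noise covariances `Qn`, `R` and the estimate `xL_est` of the linear substate are placeholders for the corresponding (unspecified) quantities in the text, so this is an illustration under stated assumptions, not the paper's exact algorithm.

```python
import numpy as np

def pf_step(particles, weights, xL_est, f_nl, A_nl, g, B, Qn, R, y, rng=None):
    """Bootstrap-style PF recursion for the nonlinear substate: particles
    are propagated through the (exact) Markov model, with the linear
    substate replaced by its current estimate, and then reweighted by
    the measurement likelihood implied by the observation model."""
    rng = np.random.default_rng() if rng is None else rng
    new_particles, new_weights = [], []
    for xN, w in zip(particles, weights):
        # Draw from the Markov model: x_next = f(xN) + A(xN) xL + noise
        mean = f_nl(xN) + A_nl(xN) @ xL_est
        xN_new = rng.multivariate_normal(mean, Qn)
        # Likelihood of y under the observation model y = g(xN) + B(xN) xL + v
        innov = y - (g(xN_new) + B(xN_new) @ xL_est)
        lik = np.exp(-0.5 * innov @ np.linalg.inv(R) @ innov)
        new_particles.append(xN_new)
        new_weights.append(w * lik)
    w_arr = np.array(new_weights)
    return np.array(new_particles), w_arr / w_arr.sum()
```

Unlike the extended Kalman filter, no linearization is needed here: the models are used exactly, at the price of representing the posterior by a particle set.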

Computation of the pseudo-measurements for filter F - This filter is fed by pseudo-measurement information about the whole state, i.e. about both the linear and nonlinear substates. On the one hand, pseudo-measurements about the nonlinear state component are provided by the particles contributing to the filtered pdf available after particle resampling. On the other hand, pseudo-measurements about the linear state component are evaluated by means of the same method employed by RBPF for this task. This method is based on the idea that the random vector (see [10, Par. II.D, p. 2283, eq. (24a)] and [21, Sec. III, p. 1524, eq. (9)])

(32)

depending on the nonlinear state component only, must equal the sum (see Eq. (28))

(33)

that depends on the linear state component. For this reason, realizations of (32) are computed in the FF block on the basis of the messages feeding this block and are treated as measurements of the linear state component.
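The RBPF-style trick expressed by Eqs. (32)-(33) can be sketched as a Kalman measurement update in which the pseudo-measurement plays the role of an observation of the linear substate; all names below are hypothetical, and the relation z = x_nl_next - f(x_nl) = A(x_nl) x_l + noise is the assumed form of the nonlinear part of the Markov model (28).

```python
import numpy as np

def pseudo_meas_update(xL, PL, x_nl, x_nl_next, f_nl, A_nl, Qn):
    """Update the linear-substate estimate (xL, PL) with the
    pseudo-measurement z = x_nl_next - f(x_nl), which equals
    A(x_nl) xL + noise and is therefore a linear measurement of xL."""
    z = x_nl_next - f_nl(x_nl)            # pseudo-measurement realization
    A = A_nl(x_nl)                        # plays the role of the H matrix
    S = A @ PL @ A.T + Qn                 # innovation covariance
    K = PL @ A.T @ np.linalg.inv(S)       # Kalman gain
    xL_new = xL + K @ (z - A @ xL)
    PL_new = (np.eye(len(xL)) - K @ A) @ PL
    return xL_new, PL_new
```

In words: the transition of the nonlinear substate, once its realizations are fixed by the particles, carries linear-Gaussian information about the linear substate, which a Kalman-type filter can absorb exactly.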

Computation of the pseudo-measurements for filter F - The messages feeding the FF block are employed for: a) generating a pdf of the linear state component, so that the dependence of the state update and measurement models (i.e., of the densities (30) and (31), respectively) on this substate can be integrated out; b) computing pseudo-measurement information about the nonlinear state component. As far as the last point is concerned, the approach we adopt is the same as that developed for dual RBPF in [21, Sec. V, pp. 1528-1529]. Such an approach relies on the Markov model

(34)

referring to the linear state component [20], [21]; in the last expression, () is a time-varying dimensional real function ( real matrix), and consists of the first elements of the noise term appearing in Eq. (12) (the covariance matrix of is denoted , and independence between and is assumed for simplicity). From Eq. (34) it is easily inferred that the random vector

(35)

equals the sum

(36)

that depends on the nonlinear state component only; for this reason, (35) can be interpreted as a pseudo-measurement about it. In this case, the generation of pseudo-measurement information can be summarised as follows. First, a set of pdfs, one for each of the particles conveyed by the particle-based message, is computed for the random vector (35) by exploiting the statistical information about the linear state component made available by F. Then, each of these pdfs is correlated with the pdf obtained for the same vector under the assumption that it is expressed by Eq. (36); this procedure results in a set of particle weights, different from those computed on the basis of (29) in the first measurement update of F.
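When both pdfs involved in the correlation step above are Gaussian, their correlation (i.e., the integral of their product) has a closed form: it equals one Gaussian evaluated at the mean of the other, with the two covariances summed. The sketch below illustrates this weight computation; the function name and arguments are ours.

```python
import numpy as np

def correlation_weight(z_mean, z_cov, m_mean, m_cov):
    """Integral of N(x; z_mean, z_cov) * N(x; m_mean, m_cov) over x,
    which equals N(z_mean; m_mean, z_cov + m_cov); this value can serve
    as an (unnormalized) particle weight."""
    d = np.atleast_1d(np.asarray(z_mean) - np.asarray(m_mean))
    C = np.atleast_2d(np.asarray(z_cov) + np.asarray(m_cov))
    k = d.size
    norm = 1.0 / np.sqrt((2 * np.pi) ** k * np.linalg.det(C))
    return norm * np.exp(-0.5 * d @ np.linalg.inv(C) @ d)
```

One such weight is computed per particle and the resulting set is then normalized, exactly as in an ordinary particle-filter measurement update.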

A graphical model similar to the one shown in Fig. 6 can be easily derived from the general model appearing in Fig. 5 for case C.2 too. The relevant differences with respect to case C.1 can be summarized as follows:

1) Filters F and F now estimate the two substates separately; consequently, each of them has a nonempty nuisance substate, namely the substate estimated by the other filter.

2) The FF block is fed by the predicted/filtered pdfs computed by F; such pdfs are employed for: a) providing F with a pdf of its nuisance substate, so that the dependence of the Markov model (see Eq. (34))

(37)

and of the measurement model (31) on this substate can be integrated out; b) generating pseudo-measurement information about the other substate only. As far as point a) is concerned, it is also important to point out that the approximate models on which F is based can be derived from Eq. (31) (Eq. (37)) after setting the nuisance substate equal to its prediction (estimate) evaluated on the basis of the corresponding message computed by F. Moreover, since Eqs. (29) and (34) exhibit a linear dependence on the linear state component, F becomes a standard Kalman filter.

The derivation of a specific filtering algorithm based on the graphical models described in this paragraph requires defining the scheduling of the messages passed on them and deriving mathematical expressions for such messages. These issues are investigated in detail in the following paragraph.

B. Message Scheduling and Computation

In this paragraph, a recursive filtering technique, called dual Bayesian filtering (DBF) and based on the graphical model illustrated in Fig. 6, is developed. In each recursion of the DBF technique, F is run before F; moreover, the presence of cycles in the graph on which it is based is accounted for by including a procedure for the iterative computation of the messages passed along these cycles. Our description of the selected scheduling relies on Fig. 7, which refers to a given recursion and to a given iteration accomplished within it (the iteration index ranges up to the overall number of iterations). It is important to point out that the following changes have been made in Fig. 7 with respect to Fig. 6:

1) A simpler notation has been adopted for the messages to ease reading; in particular, the integer parameter appearing in the superscript of some of them represents the iteration index.

2) Blue (red) arrows have been employed to identify Gaussian messages (messages in other forms).

3) The FF block is fed by the two filtered pdfs computed by F, but not by the predicted pdf, since the latter message is not needed for this task.

4) The forward prediction feeding F is involved in the proposed iterative procedure and may change from iteration to iteration because of resampling (in fact, this may lead to discarding a portion of the particles conveyed by this message); for this reason, its dependence on the iteration index has been explicitly indicated.

5) The same message is employed in F for integrating out the dependence of the Markov model