The problem of Bayesian smoothing for a state space model
(SSM) concerns the development of recursive algorithms able to estimate theprobability density function (pdf) of the model state on a given observation interval, given a batch of noisy measurements acquired over it ; the estimated pdf is known as a smoothed or smoothing pdf. A general strategy for solving this problem is based on the so called two-filter smoothing formula -; in fact, this formula allows to compute the required smoothing density by merging the statistical information generated in the forward pass of a Bayesian filtering method with those evaluated in the backward pass of a different filtering method, paired with the first one and known as backward information filtering (BIF). Unluckily, closed form solutions for this strategy can be derived for linear Gaussian and linear Gaussian mixture models only , . For this reason, all the existing smoothing algorithms based on the above mentioned formula and applicable to general nonlinear models are approximate and are based on sequential Monte Carlo techniques (e.g., see , ,  and references therein). Unluckily, the adoption of these algorithms, known as particle smoothers, may be hindered by their complexity, which becomes unmanageable when the dimension of the sample space for the considered SSM is large.
Recently, a factor graph approach has been exploited to devise a new filtering method, based on the parallel concatenation of two (constituent) Bayesian filters and called turbo filtering (TF) . In this manuscript, a new smoothing technique that employs TF in its forward pass and a new BIF scheme, based on the parallel concatenation of two backward information filters, is developed. Our derivation of the new BIF method, called backward information turbo filtering (BITF), is based on a general graphical model; this allows us to: a) represent any BITF algorithm as the interconnection of two soft-in soft-out (SISO) processing modules; b) represent the iterative processing accomplished by these modules as a message passing technique; c) derive the expressions of the passed messages by applying the sum-product algorithm (SPA) , , together with a specific scheduling procedure, to the graphical model itself; d) show how the statistical information generated by a BITF algorithm in the backward pass can be merged with those produced by the paired TF technique in the forward pass in order to evaluate the required smoothed pdfs. To exemplify the usefulness of this smoothing method, called turbo-smoothing (TS) in the following, we take into consideration the TF algorithms proposed in  for the class of conditionally linear Gaussian (CLG) SSMs and derive a BITF algorithm paired with them. This approach leads to the development of two new TS algorithms, one generating an estimate of the joint smoothing density over the whole observation interval, the other one an estimate of the marginal smoothing densities over the same interval. Our computer simulations for a specific CLG SSM evidence that, in the considered case, the derived TS algorithms perform very closely to the Rao-Blackwellized particle smoothing (RBPS) technique proposed in  and to the particle smoothers devised in .
The remaining part of this manuscript is organized as follows. A description of the considered SSMs is illustrated in Section 2. In Section 3, a general graphical model on which the processing accomplished in BITF and TS is based is illustrated; then, a specific instance of it, referring to a CLG SSM, is developed and the messages passed over it in BITF are defined. In Section 4, the scheduling and the computation of such messages are described, specific TS algorithms are developed, and the differences and similarities between these algorithms and other smoothing techniques are briefly analysed. A comparison, in terms of accuracy and execution time, between the proposed techniques and three smoothers recently appeared in the literature is provided in Section 5 for a specific CLG SSM. Finally, some conclusions are offered in Section 6.
2 Model Description
In this manuscript we focus on a discrete-time SSM whose -dimensional hidden state in the -th interval is denoted , and whose state update and measurement models are expressed by
respectively. Here, () is a time-varying -dimensional (-dimensional) real function and () the -th element of the process (measurement) noise sequence (); this sequence consists of -dimensional (-dimensional) independent and identically distributed
(iid) Gaussian noise vectors, each characterized by a zero mean and a covariance matrix(). Moreover, statistical independence between and is assumed.
In the following, two additional mathematical representations for the considered SSM are also exploited. The first one is approximate, being employed by an extended Kalman filter (EKF); in fact, it is based on the linearized versions of eqs. (1) and (2), namely (e.g., see [1, pp. 194-195])
respectively; here, , is the (forward) estimate of evaluated by the EKF in its -th recursion, , , is the (forward) prediction computed by the EKF in its -th recursion and .
The second representation is based on the additional assumption that the SSM described by eqs. (1)-(2) is CLG , , so that its state vector in the -th interval can be partitioned as ; here, , () is the so called linear (nonlinear) component of , with (). For this reason, following ,  and , the models
are adopted for the update of the linear () and nonlinear () components and for the measurement vector, respectively. In the state update model (5), () is a time-varying -dimensional real function ( real matrix) and consists of the first (last ) elements of if (); independence between and is also assumed for simplicity and the covariance matrix () is denoted (). In the measurement model (6), instead, () is a time-varying -dimensional real function ( real matrix).
In the next two Sections we focus on the problem of developing algorithms for the estimation of a) the joint smoothed pdf (problem P.1) and b) the sequence of marginal smoothed pdfs (problem P.2); here, is the duration of the observation interval and is a -dimensional vector. It is important to point out that: a) in solving both problems P.1 and P.2, the prior knowledge of the pdf of the initial state is assumed; b) in principle, if an estimate of the joint pdf is available, estimates of all the posterior can be evaluated by marginalization.
3 Graphical Modelling for the Parallel Concatenation of Bayesian Information Filters
In this Section, we derive the graphical models on which BITF and TS techniques are based. More specifically, starting from the factor graph representing Bayesian smoothing , we first develop a general graphical model for the parallel concatenation of two backward information filters. Then, a specific instance of this model is devised for the case in which the forward filters are an EKF and a particle filter (PF), and the considered SSM is CLG.
3.1 Graphical Model for the Parallel Concatenation of Bayesian Information Filters and Message Passing over it
The development of our BIF algorithms is based on the graphical approach illustrated in ref. [11, Sec. III]. This approach consists in representing Bayesian filtering and BIF as two recursive algorithms that compute, on the basis of the SPA, a set of probabilistic messages passed along the same (cycle free) factor graph; this graph is illustrated Fig. 1-a) and refers to a SSM characterized by the Markov model and the measurement model . More specifically, in the -th recursion of Bayesian filtering, messages are passed along the considered graph in the forward direction; moreover, the messages and (denoted and , respectively, in Fig. 1 and conveying a forward estimate of and a forward prediction of , respectively) are computed on the basis of the input message
for , , , (). Dually, in the -th recursion of BIF, messages are passed along the considered graph in the backward direction, and the messages and (denoted and , respectively, in Fig. 1 and conveying a backward prediction of and a backward estimate of , respectively) are computed on the basis of the input message
with (). Once the backward pass is over, a solution to problem P.2 becomes available, since the marginal smoothed pdf can be evaluated as111Note that, similarly as  and , the joint pdf is considered here in place of the posterior pdf .
or, equivalently, as
with . Note that, from a graphical viewpoint, formulas (9) and (10) can be related with the two different partitionings of the graph shown in Fig. 1-a) (where a specific partitioning is identified by a brown dashed vertical line cutting the graph in two parts).
In ref.  it has been also shown that the factor graph illustrated in Fig. 1-a) can be employed as a building block in the development of a larger graphical model that represents a turbo filtering scheme, i.e. the parallel concatenation of two (constituent) Bayesian filters (denoted F and F in the following). In this model, the graphs referring to F and F are interconnected in order to allow the mutual exchange of statistical information in the form of pseudo-measurements (conveyed by probabilistic messages). From a graphical viewpoint, the exploitation of these additional information in each filter requires:
a) modifying the graph shown in Fig. 1-a) in a way that each constituent filter can benefit from the pseudo-measurements provided by the other filter through an additional measurement update;
b) developing message passing algorithms over a proper graphical model for 1) the conversion of the statistical information generated by each constituent filter into a form useful to the other one and 2) the generation, inside each constituent filter, of the statistical information to be made available to the other filter.
As far as the need expressed at point a) is concerned, the graph of Fig. 1-a) can be easily modified by adding a new equality node and a new edge along which the message , conveying pseudo-measurement information, is passed; this results in the factor graph shown in Fig. 1-b). Note that, in the new graphical model, two forward estimates (backward estimates) are computed in the forward (backward) pass. The first estimate, represented by () is generated by merging () with the message () conveying measurement (pseudo-measurement) information, whereas the second one, represented by (), is evaluated by merging () with the message (). Moreover, similarly as the previous case, the smoothed pdf can be computed as
note also that each of these factorisations can be associated with one of the three distinct vertical cuts drawn in Fig. 1-b).
As far as point b) is concerned, in ref.  it is shown that, in any TF scheme, all the processing tasks related to the conversion (generation) of the statistical information emerging from (feeding) each constituent filter can be easily incorporated in a single module, called soft-in soft-out (SISO) module and whose overall processing can be represented as message passing over a graphical model including the factor graph shown in Fig. 1-b). For this reason, any TF scheme can be devised by linking (i.e., by concatenating) two SISO modules, each incorporating a specific filtering algorithm and exchanging probabilistic information in an iterative fashion. It is also important to point out that the two constituent filters are not required to estimate the whole system state. For this reason, in the following, we assume that: a) the filter F estimates the portion (with and ) of the state vector (the size of is denoted , with ); b) the portion of not included in is denoted , so that the equalities or hold. However, the vector () is required to be part of (or, at most, to coincide with) (), so that an overall estimate of the system state can be always generated on the basis of the posterior pdfs of and evaluated by F and F, respectively. In fact, this constraint on and leads to the conclusion that, generally speaking, the portion of , collecting state variables, is estimated by both F and F, being shared by and .
A similar conceptual approach is followed in the remaining part of this Paragraph to derive the general representation of the BIF technique paired with a given TF scheme, that is, briefly, a backward information turbo filtering (BITF) technique. This means that:
1) The general architecture we propose for BITF is based on the parallel concatenation of two constituent Bayesian information filters, that are denoted BIF and BIF in the following.
2) The processing accomplished by BIF (BIF) is represented as a message passing algorithm over the same graphical model as F (F).
3) BITF processing can be represented as the iterative exchange of probabilistic information between two distinct SISO modules.
4) The -th SISO module (with and ) incorporates a specific BIF algorithm, that can be represented as a message passing over a factor graph similar to that shown in Fig. 1-b) and that estimates the portion of .
The graphical model developed for the SISO module based on BIF is shown in Fig. 2. In this Figure, to ease the interpretation of message passing, three rectangles, labeled as BIF-IN, BIF and BIF-OUT, have been drawn; this allow us to easily identify the portions of the graphical model involved in a) the conversion of the statistical information provided from BIF into a form useful to BIF, b) BIF processing and c) the generation of the statistical information made available by BIF to BIF, respectively. A detailed description of the signal processing tasks accomplished within each portion is provided below.
BIF-IN - The statistical information provided by BIF to the considered SISO module is condensed in the messages and ; these convey a smoothed estimate of and pseudo-measurement information about , respectively. The first message is processed in two different ways. In fact, on the one hand, it is marginalised in the block labelled by the letter M (see Fig. 2) in order to generate the pdf (do not forget that the state vector is included in ); on the other hand, is processed jointly with in order to generate the message conveying pseudo-measurement information about (this is accomplished in the block called PM conversion, PMC; see Fig. 2). Then, the messages and are passed to BIF.
BIF - The message passing accomplished in this part refers to the BIF algorithm paired with F. The graphical model developed for it and the message passing accomplished over it are based on Fig. 1-b). Note, however, that: a) the message passing aims at computing the (backward) predicted density and the (backward) filtered density and on the basis of the backward estimate originating from the previous recursion, and of the messages and provided by BIF-IN; b) an approximate model of the considered SSM could be adopted in the evaluation of these densities. For this reason, generally speaking, we can assume that the BIF algorithm is based on the Markov model and on the observation model , representing the exact models and , respectively, or approximations of one or both of them. Note also that, in both the second measurement update and the time update accomplished by this algorithm, marginalization with respect to the unknown state component is made possible by the availability of the message .
BIF-OUT - This part is fed by the backward estimate of and by the smoothed estimate of (available after that the first measurement update has been accomplished by F). The second message follows two different paths, since a) it is passed to the other SISO module as it is and b) it is jointly processed with in order to generate the pseudo-measurement message feeding the other SISO module; the last task is accomplished in the pseudo-measurement generation (PMG) block.
A graphical model structurally identical to the one shown in Fig. 2 can be easily drawn for the SISO module based on BIF by interchanging () with (). Merging the graphical model shown in Fig. 2 with its counterpart referring to BIF results in the parallel concatenation architecture illustrated in Fig. 3 (details about the underlying graphical model are omitted for simplicity) and on which TS is based. It is important to point out that:
1. The overall graphical model derived for TS, unlike the one illustrated in Fig. 1, is not cycle free; therefore, the application of the SPA to it requires defining a proper message scheduling and, generally speaking, results in iterative algorithms.
2. At the end of the -th recursion of a TS algorithm, two smoothed densities, namely and , are available. This raises the problem of how these statistical information can be fused in order to get a single pdf for (and, in particular, a single smoothed estimate of) the -dimensional portion of estimated by both F/BIF and F/BIF. Unluckily, this remains an open issue. In our computer simulations, a simple selection strategy has been adopted in state estimation, since one of the two smoothed estimates of has been systematically discarded.
3.2 A Graphical Model for the Parallel Concatenation of the Bayesian Information Filters Paired with an Extended Kalman Filter and a Particle Filter
In the remaining part of this manuscript we focus on a specific instance of the proposed TS architecture, since we make the same specific choices as  for both the SSM and the filters employed in the forward pass. In particular, we focus on the CLG SSM described in Section 2 and assume that:
1) BIF is the backward filter associated with an EKF operating over the whole system state (so that and is empty). In other words, BIF is a backward Kalman filter based on a linearised model of the considered SSM.
2) BIF is a backward filter associated with a PF (in particular, a sequential importance resampling filter ) operating on the nonlinear state component only (so that and ) and representing it through a set of particles (note that elements of the system state are shared by the two BIF algorithms). This means that BIF is employed to compute new weights for all the elements of the particle set generated by the PF in the forward pass.
Based on the general models shown in Figs. 2 and 3, the specific graphical model illustrated in Fig. 4 (and referring to the -th recursion of backward filtering) can be drawn for the considered case. In the following, we provide various details about the adopted notation and the message passing within each constituent filter and from each filter to the other one.
Message passing within BIF - BIF is based on the approximate statistical models and ; these are derived from the linearised eqs. (3) and (4), respectively. Moreover, the (Gaussian) messages passed over its graph (enclosed within the upper rectangle appearing in Fig. 4) are , , , , , (), and , and are denoted , , , , , (), and , respectively, to ease reading.
Message passing within BIF - BIF is based on the exact statistical models , and , that are derived from the eqs. (5) (with ) and (6), respectively. Moreover, the messages processed by it and appearing in Fig. 4 refer to the -th particle predicted in the previous (i.e., in the -th) recursion of forward filtering and denoted , with ; such messages are , , , , , , and , and are denoted , , , , , , and , respectively, to ease reading.
Message passing from BIF to BIF - BIF is fed by the message and the message set conveying pseudo-measurement information; these messages are computed on the basis of the statistical information made available by BIF. More specifically, on the one hand, the message (denoted ) results from the marginalization of and is employed for marginalising the PF state update and measurement models (i.e., , and , respectively) with respect to . On the other hand, the pseudo-measurement message (denoted ) is evaluated in the PMG block by processing the messages and (denoted and resulting from the marginalization of ), under the assumption that is represented by the -th particle (conveyed by the message ).
As illustrated in the Appendix, the computation of the message involves the evaluation of the pdf of the random vector
defined on the basis of the state update equation (5) (with ) and conditioned on the fact that . This pdf, which is computed according to the joint statistical characterization of and provided by BIF, is conveyed by the message (not appearing in Fig. 4). Note also that from eq. (5) (with ) the equality
is easily inferred; the pdf of evaluated on the basis of the right-hand side (RHS) of eq. (15) is denoted in the following.
Message passing from BIF to BIF - BIF is fed by the message that, unlike the set passed to BIF, provides pseudo-measurement information about the whole state . This message is generated as follows. The message set produced by the PF is processed in the PMG block, that computes the set of pseudo-measurement messages referring to the linear state component only. Then, the two sets and are merged in the PMC block, where the information they convey are converted into the (single) message . Moreover, as illustrated in the Appendix, the message conveys a sample of the random vector 
such a sample is generated under the assumption that . The pdf of the random vector is evaluated on the basis of the joint statistical representation of the couple , produced by BIF and is conveyed by the message (not appearing in Fig. 4); note also that from eq. (5) (with ) the equality
is easily inferred; the pdf of evaluated on the basis of the RHS of eq. (17) is denoted in the following.
The rationale behind the message passing illustrated above can be summarized as follows. The message is extracted from the statistical information generated by BIF and is exploited by BIF to refine its backward estimate of the whole state; moreover, merging this estimate with the forward estimate allows to generate a more accurate statistical representation for and, consequently, for (these are conveyed by and , respectively); finally, these statistical information are exploited to aid BIF in the computation of more refined weights of the particles representing .
Given the graphical model shown in Fig. 4 and the messages passed over it, the derivation of a specific BITF algorithm requires: a) defining the mathematical structure of the input messages that feed the -th recursion of backward filtering and that of the output messages emerging from both backward filtering and smoothing in the same recursion; b) describing message scheduling; c) deriving mathematical expressions for all the computed messages. These issues are analysed in detail in Section 4.
4 Scheduling and Computation of Probabilistic Messages in Turbo Smoothing Algorithms for CLG Models
In this Section, the specific issues raised at the end of the previous Section and concerning the message passing accomplished over the graphical model shown in Fig. 4 are addressed. For this reason, we first provide various details about a) the messages feeding backward filtering, and b) the messages emerging from it and from the related smoothing. Then, we focus on the scheduling of such messages and on their computation. This allows us to develop two new smoothing techniques, one solving problem P.1, the other one problem P.2. Finally, these techniques are briefly compared with other particle smoothing methods available in the literature.
4.1 Input and Output Messages
The input messages feeding the -th recursion of backward filtering are generated in the -th recursion of the paired forward filtering and in the previous recursion (i.e., in the -th recursion) of the backward pass. In the following, various details about such messages are provided.
1. Input messages evaluated in the forward pass - A turbo filter, consisting of an EKF (denoted F) and a PF (denoted F), is employed in the forward pass of the devised TS algorithms and is run only once. Therefore, the forward predictions/estimates, provided by F (F) and made available to BIF (BIF), are expressed by Gaussian pdfs (sets of weighted particles), each conveyed by a Gaussian message (by a set of particle-dependent messages). The notation adopted in the following for these probabilistic information is summarized below.
Filter F - This filter, in its -th recursion, computes the forward prediction of , conveyed by the message222Considerations similar to the ones expressed for (18) and (19) can be repeated for the messages (22) and (23), respectively, defined below. (see Fig. 4)
This message is updated in the -th recursion of F on the basis of the measurement . This produces the Gaussian message
representing a forward estimate of ; the covariance matrix and the mean vector can be evaluated on the basis of the associated precision matrix (see [11, eqs. (14)-(17)])
and of the transformed mean vector
Filter F - This filter, in its -th recursion, computes the particle set , representing a forward prediction of ; the weight assigned to the particle is equal to for any , since the use of particle resampling in each recursion is assumed. The statistical information available about are conveyed by the message
with . The weight of (with ) is updated by F in its -th recursion on the basis of the measurement ; the new weight is denoted and is conveyed by the forward message
Note that the message set represents the forward estimate of computed by F in its -th recursion and that the message set (see eq. (22)) enters the graphical model developed for BIF along the half edge referring to (see Fig. 4).
2. Input messages evaluated in the backward pass - The -th recursion of backward filtering is fed by the input messages
that convey the pdf of the backward estimate of computed by BIF and the backward estimate of generated by BIF, respectively, in the previous recursion.
All the input messages described above are processed to compute: 1) the new backward estimates and , that represent the outputs emerging from the -th recursion of backward filtering; 2) the required smoothed information (in the form of probabilistic messages) by merging forward and backward messages. In the remaining part of this Paragraph, some essential information about the structure of such messages are provided; details about their computation are given in the next Paragraph.
1. Computation of backward estimates - The computation of the message (BIF) and of the message set (BIF) is accomplished as follows. First, the backward prediction
of and the message
conveying a backward weight for the -th particle representing (with ) are computed by BIF and BIF, respectively. Then, in BIF, the message (26) is merged with the pseudo-measurement message and the measurement message in order to compute
and (see eq. (24))
conveying a new weight for the -th particle . Then, the information conveyed by the message set is merged with that provided by the measurement-based set in order to evaluate the message (see eq. (25))