## 1 Introduction

The problem of *Bayesian smoothing* for a *state space model* (SSM) concerns the development of recursive algorithms able to estimate the *probability density function* (pdf) of the model state on a given observation interval, given a batch of noisy measurements acquired over it [1]; the estimated pdf is known as a *smoothed* or *smoothing* pdf. A general strategy for solving this problem is based on the so-called *two-filter smoothing formula* [2]-[3]; in fact, this formula makes it possible to compute the required smoothing density by merging the statistical information generated in the forward pass of a *Bayesian filtering* method with that evaluated in the backward pass of a different filtering method, *paired with the first one* and known as *backward information filtering* (BIF). Unfortunately, closed-form solutions for this strategy can be derived only for *linear Gaussian* and *linear Gaussian mixture* models [1], [4]. For this reason, all the existing smoothing algorithms based on the above-mentioned formula and applicable to general *nonlinear* models are approximate and rely on *sequential Monte Carlo* techniques (e.g., see [2], [5], [6] and references therein). Unfortunately, the adoption of these algorithms, known as *particle smoothers*, may be hindered by their complexity, which becomes unmanageable when the dimension of the sample space for the considered SSM is large.

Recently, a *factor graph* approach has been exploited to devise a new
filtering method, based on the *parallel concatenation* of two (*constituent*) Bayesian filters and called *turbo filtering* (TF) [7]. In this manuscript, a new smoothing technique is developed that
employs TF in its *forward* pass and, in its *backward* pass, a new BIF scheme based on the
parallel concatenation of two backward information filters.
Our derivation of the new BIF method, called *backward information
turbo filtering* (BITF), is based on a *general graphical model*; this
allows us to: a) represent any BITF algorithm as the interconnection of
two *soft-in soft-out* (SISO) processing modules; b) represent the
iterative processing accomplished by these modules as a message passing
technique; c) derive the expressions of the passed messages by applying the
*sum-product algorithm* (SPA) [8], [9], together with a specific *scheduling* procedure, to
the graphical model itself; d) show how the statistical information
generated by a BITF algorithm in the backward pass can be merged with that
produced by the paired TF technique in the forward pass in order to evaluate
the required smoothed pdfs. To exemplify the usefulness of this smoothing
method, called *turbo-smoothing* (TS) in the following, we consider the TF algorithms proposed in [7] for the
class of *conditionally linear Gaussian* (CLG) SSMs and derive a BITF
algorithm paired with them. This approach leads to the development of two
new TS algorithms, one generating an estimate of the joint smoothing density
over the whole observation interval, the other one an estimate of the
marginal smoothing densities over the same interval. Our computer
simulations for a specific CLG SSM show that, in the considered case,
the derived TS algorithms perform very similarly to the *Rao-Blackwellized particle smoothing* (RBPS) technique proposed in [10] and to the particle smoothers devised in [11].

The remaining part of this manuscript is organized as follows. The considered SSMs are described in Section 2. In Section 3, a general graphical model on which the processing accomplished in BITF and TS is based is illustrated; then, a specific instance of it, referring to a CLG SSM, is developed and the messages passed over it in BITF are defined. In Section 4, the scheduling and the computation of such messages are described, specific TS algorithms are developed, and the differences and similarities between these algorithms and other smoothing techniques are briefly analysed. A comparison, in terms of accuracy and execution time, between the proposed techniques and three smoothers that recently appeared in the literature is provided in Section 5 for a specific CLG SSM. Finally, some conclusions are offered in Section 6.

## 2 Model Description

In this manuscript we focus on a discrete-time SSM whose -dimensional
*hidden state* in the -th interval is denoted , and whose
*state update* and *measurement models* are expressed by

(1)

and

(2)

respectively. Here, () is a time-varying -dimensional (-dimensional) real function and () the -th element of the process (measurement) noise sequence (); this sequence consists
of -dimensional (-dimensional) *independent and identically
distributed*

(iid) Gaussian noise vectors, each characterized by a zero mean and a covariance matrix

(). Moreover, statistical independence between and is assumed.

In the following, two additional mathematical representations for the
considered SSM are also exploited. The first one is approximate, being
employed by an *extended Kalman filter* (EKF); in fact, it is based on
the *linearized* versions of eqs. (1) and (2), namely (e.g., see [1, pp. 194-195])

(3)

and

(4)

respectively; here, , is the (forward) *estimate* of
evaluated by the EKF in its -th recursion, , , is the (forward) *prediction*
computed by the EKF in its -th recursion and .
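Since eqs. (3) and (4) involve the Jacobians of the state update and measurement functions evaluated at the most recent EKF estimate/prediction, the linearisation step can be sketched as follows; the two-dimensional model below is purely hypothetical and only shows one way such Jacobians may be obtained numerically.

```python
import numpy as np

# Hypothetical two-dimensional SSM used only for illustration:
# f(x) = [0.9*x0 + 0.1*x0*x1, 0.8*x1],  h(x) = [x0**2].
def f(x):
    return np.array([0.9 * x[0] + 0.1 * x[0] * x[1], 0.8 * x[1]])

def h(x):
    return np.array([x[0] ** 2])

def jacobian(func, x, eps=1e-6):
    """Finite-difference Jacobian of func at the point x."""
    fx = func(x)
    J = np.zeros((fx.size, x.size))
    for j in range(x.size):
        xp = x.copy()
        xp[j] += eps
        J[:, j] = (func(xp) - fx) / eps
    return J

# The linearisation point is the forward estimate (for the state update) or
# the forward prediction (for the measurement) of the current EKF recursion.
x_point = np.array([1.0, 0.5])
F = jacobian(f, x_point)   # plays the role of the matrix in eq. (3)
H = jacobian(h, x_point)   # plays the role of the matrix in eq. (4)
```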

The second representation is based on the additional assumption that the SSM
described by eqs. (1)-(2) is CLG [10], [13], so that its state vector in the -th
interval can be partitioned as ; here, , () is the so-called
*linear* (*nonlinear*) *component* of , with (). For this reason, following [11],
[12] and [13], the models

(5)

and

(6)

are adopted for the update of the *linear* () and *nonlinear*
() components and for the measurement vector, respectively. In the
state update model (5), () is a
time-varying -dimensional real function ( real
matrix) and consists of the first (last ) elements of if (); independence between
and is also assumed
for simplicity and the covariance matrix () is denoted (). In
the measurement model (6), instead, () is a time-varying -dimensional real function ( real matrix).
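The practical consequence of the CLG structure expressed by eqs. (5) and (6) is that, once the trajectory of the nonlinear component is fixed, the linear component obeys a linear Gaussian model and can therefore be tracked exactly by a Kalman filter; this is what Rao-Blackwellized methods exploit. A minimal sketch, based on a purely hypothetical scalar model for each component:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical CLG model: the state is split into a nonlinear component xn
# and a linear component xl. Here xn evolves on its own, so that, once its
# trajectory is fixed, xl obeys a linear Gaussian model.
T = 40
q_n, q_l, r = 0.05, 0.1, 0.2
xn = np.zeros(T); xl = np.zeros(T); y = np.zeros(T)
for t in range(1, T):
    xn[t] = np.sin(xn[t-1]) + rng.normal(0, np.sqrt(q_n))          # nonlinear
    xl[t] = 0.5 * np.cos(xn[t-1]) + 0.9 * xl[t-1] \
            + rng.normal(0, np.sqrt(q_l))                          # linear in xl
    y[t] = xn[t] ** 2 + xl[t] + rng.normal(0, np.sqrt(r))          # linear in xl

# Conditioned on the xn trajectory, a scalar Kalman filter tracks xl exactly:
m, P = 0.0, 1.0
for t in range(1, T):
    m, P = 0.5 * np.cos(xn[t-1]) + 0.9 * m, 0.81 * P + q_l         # time update
    K = P / (P + r)                                                # gain (H = 1)
    m, P = m + K * (y[t] - xn[t] ** 2 - m), (1 - K) * P            # meas. update
```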

In the next two Sections we focus on the problem of developing algorithms
for the estimation of a) the *joint smoothed pdf* (problem P.1) and b) the sequence of
*marginal smoothed pdfs* (problem P.2); here, is the duration
of the observation interval and is a -dimensional vector. It is important to point out that: a) in solving both
problems P.1 and P.2, the prior knowledge of the pdf of
the initial state is assumed; b) in principle, if an estimate of the joint
pdf is available, estimates of all the posterior pdfs can be evaluated by
*marginalization*.
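Point b) above can be illustrated with a toy discrete example: given a joint smoothed distribution over two state variables, each marginal smoothed distribution is obtained by summing out the other variable (the numbers below are arbitrary).

```python
import numpy as np

# Arbitrary joint "smoothed" pmf over two discrete state variables x1, x2
# (three values each); rows index x1, columns index x2.
joint = np.array([[0.10, 0.05, 0.05],
                  [0.20, 0.10, 0.10],
                  [0.15, 0.15, 0.10]])

marg_x1 = joint.sum(axis=1)   # marginal smoothed pmf of x1: sum out x2
marg_x2 = joint.sum(axis=0)   # marginal smoothed pmf of x2: sum out x1
```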

## 3 Graphical Modelling for the Parallel Concatenation of Bayesian Information Filters

In this Section, we derive the graphical models on which BITF and TS
techniques are based. More specifically, starting from the factor graph
representing Bayesian smoothing [11], we first develop a
general graphical model for the parallel concatenation of two backward
information filters. Then, a specific instance of this model is devised for
the case in which the forward filters are an EKF and a *particle filter*
(PF), and the considered SSM is CLG.

### 3.1 Graphical Model for the Parallel Concatenation of Bayesian Information Filters and Message Passing over it

The development of our BIF algorithms is based on the graphical approach
illustrated in ref. [11, Sec. III]. This approach consists of
representing *Bayesian filtering* and BIF as two recursive algorithms
that compute, on the basis of the SPA, a set of probabilistic messages
passed along the *same* (cycle free) factor graph; this graph is
illustrated in Fig. 1-a) and refers to an SSM characterized by the
Markov model and the measurement model . More specifically, in the -th
recursion of Bayesian filtering, messages are passed along the
considered graph in the *forward* direction; moreover, the messages and (denoted and , respectively, in
Fig. 1 and conveying a *forward estimate* of
and a *forward prediction* of , respectively) are
computed on the basis of the input message

(7)

for , , , (). Dually, in the -th recursion of BIF, messages are
passed along the considered graph in the *backward* direction, and the
messages and
(denoted and , respectively, in Fig. 1 and
conveying a *backward prediction* of and a *backward estimate* of , respectively) are computed on the
basis of the input message

(8)

with (). Once the
backward pass is over, a solution to problem P.2 becomes available,
since the marginal smoothed pdf can be evaluated as (note that, similarly to [11] and [12], the *joint* pdf is considered here in place of the *posterior* pdf )

(9)

or, equivalently, as

(10)

with . Note that, from a graphical viewpoint, formulas (9) and (10) can be related to the two different partitionings of the graph shown in Fig. 1-a) (where a specific partitioning is identified by a brown dashed vertical line cutting the graph in two parts).
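As a concrete (and purely illustrative) instance of the merge expressed by eqs. (9)-(10), the following sketch applies the two-filter idea to a scalar linear Gaussian SSM with hypothetical parameters: the forward pass is a Kalman filter, the backward pass propagates the BIF message in information form (so that the uninformative terminal message has zero precision), and multiplying the two messages yields the smoothed marginals.

```python
import numpy as np

# Hypothetical scalar linear Gaussian SSM: x_t = a*x_{t-1} + w_t, y_t = c*x_t + v_t.
a, c, q, r = 0.9, 1.0, 0.1, 0.5
rng = np.random.default_rng(0)
T = 50
x = np.zeros(T); y = np.zeros(T)
x[0] = rng.normal(0, 1)
y[0] = c * x[0] + rng.normal(0, np.sqrt(r))
for t in range(1, T):
    x[t] = a * x[t-1] + rng.normal(0, np.sqrt(q))
    y[t] = c * x[t] + rng.normal(0, np.sqrt(r))

# Forward pass: Kalman filter (filtered means mf, variances Pf).
mf = np.zeros(T); Pf = np.zeros(T)
m, P = 0.0, 1.0
for t in range(T):
    if t > 0:
        m, P = a * m, a * a * P + q          # time update
    K = P * c / (c * c * P + r)              # measurement update
    m, P = m + K * (y[t] - c * m), (1 - K * c) * P
    mf[t], Pf[t] = m, P

# Backward pass: BIF message p(y_{t+1:T} | x_t) in information form
# (precision Lam, transformed mean eta); the terminal message is flat.
Lb = np.zeros(T); eb = np.zeros(T)
Lam, eta = 0.0, 0.0
for t in range(T - 2, -1, -1):
    Lam1 = Lam + c * c / r                   # fold in measurement y_{t+1}
    eta1 = eta + c * y[t+1] / r
    Lam = a * a * Lam1 / (1 + q * Lam1)      # propagate through the dynamics
    eta = a * eta1 / (1 + q * Lam1)
    Lb[t], eb[t] = Lam, eta

# Merge: smoothed posterior is proportional to (filtered pdf) x (BIF message).
Ps = 1.0 / (1.0 / Pf + Lb)
ms = Ps * (mf / Pf + eb)
```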

In ref. [7] it has also been shown that the factor
graph illustrated in Fig. 1-a) can be employed as a building
block in the development of a larger graphical model that represents a *turbo filtering* scheme, i.e. the *parallel concatenation* of two (*constituent*) Bayesian filters (denoted F and F in the
following). In this model, the graphs referring to F and F are
interconnected in order to allow the mutual exchange of statistical
information in the form of *pseudo-measurements* (conveyed by
probabilistic messages). From a graphical viewpoint, the exploitation of
this additional information in each filter requires:

a) modifying the graph shown in Fig. 1-a) so that each
constituent filter can benefit from the pseudo-measurements provided by the
other filter through an additional *measurement update*;

b) developing message passing algorithms over a proper graphical model for
1) the *conversion* of the statistical information generated by each constituent filter into a form useful to the other one and 2) the *generation*, inside each constituent filter, of the statistical information
to be made available to the other filter.

As far as the need expressed at point a) is concerned, the graph of Fig. 1-a) can be easily modified by adding a new equality node and a new
edge along which the message ,
conveying pseudo-measurement information, is passed; this results in the
factor graph shown in Fig. 1-b). Note that, in the new graphical
model, two *forward estimates* (*backward estimates*) are computed
in the *forward* (*backward*) pass. The first estimate,
represented by () is generated by merging
() with the message () conveying measurement
(pseudo-measurement) information, whereas the second one, represented by (), is evaluated by merging () with the message (). Moreover, similarly as the previous
case, the smoothed pdf can
be computed as

(11)

(12)

(13)

note also that each of these factorisations can be associated with one of the three distinct vertical cuts drawn in Fig. 1-b).

As far as point b) is concerned, in ref. [7] it is shown
that, in any TF scheme, all the processing tasks related to the conversion
(generation) of the statistical information emerging from (feeding) each
constituent filter can be easily incorporated in a single *module*,
called *soft-in soft-out* (SISO) module and whose overall processing
can be represented as message passing over a graphical model including the
factor graph shown in Fig. 1-b). For this reason, any TF scheme
can be devised by linking (i.e., by concatenating) two SISO modules, each
incorporating a specific filtering algorithm and exchanging probabilistic
information in an iterative fashion. It is also important to point out that
the two constituent filters are not required to estimate the whole system
state. For this reason, in the following, we assume that: a) the filter F estimates the portion (with and ) of
the state vector (the size of is
denoted , with ); b) the portion of not
included in is denoted ,
so that the equalities or hold. However, the vector () is required to be
part of (or, at most, to coincide with) (), so that an *overall estimate* of the system state can always be generated on the basis of the posterior pdfs
of and evaluated by F
and F, respectively. In fact, this constraint on and leads to the conclusion that,
generally speaking, the portion of , collecting state variables,
is estimated by both F and F, being shared by and .

A similar conceptual approach is followed in the remaining part of this
Paragraph to derive the general representation of the BIF technique paired
with a given TF scheme, that is, briefly, a *backward information turbo
filtering* (BITF) technique. This means that:

1) The general architecture we propose for BITF is based on the parallel
concatenation of two *constituent* Bayesian information filters, that
are denoted BIF and BIF in the following.

2) The processing accomplished by BIF (BIF) is represented as a
message passing algorithm over the *same* graphical model as F (F).

3) BITF processing can be represented as the iterative exchange of probabilistic information between two distinct SISO modules.

4) The -th SISO module (with and ) incorporates a specific BIF algorithm, which can be represented as message passing over a factor graph similar to that shown in Fig. 1-b) and which estimates the portion of .

The graphical model developed for the SISO module based on BIF is
shown in Fig. 2. In this Figure, to ease the interpretation of
message passing, three rectangles, labeled as BIF-IN, BIF and BIF-OUT, have been drawn; this allows us to easily identify the portions
of the graphical model involved in a) the *conversion* of the
statistical information provided by BIF into a form useful to BIF, b) BIF processing, and c) the *generation* of the
statistical information made available by BIF to BIF,
respectively. A detailed description of the signal processing tasks
accomplished within each portion is provided below.

BIF-IN - The statistical information provided by BIF to the
considered SISO module is condensed in the messages and ; these convey a
smoothed estimate of and pseudo-measurement
information about , respectively. The first
message is processed in two different ways. In fact, on the one hand, it is
marginalised in the block labelled by the letter M (see Fig. 2) in
order to generate the pdf (recall that the state vector is included in ); on the other hand, is
processed jointly with in order to
generate the message conveying
pseudo-measurement information about (this is
accomplished in the block called *PM conversion* (PMC); see Fig. 2). Then, the messages and are passed to BIF.

BIF - The message passing accomplished in this part refers to the BIF
algorithm paired with F. The graphical model developed for it and the
message passing accomplished over it are based on Fig. 1-b). Note,
however, that: a) the message passing aims at computing the (backward)
predicted density and
the (backward) filtered density and on the
basis of the backward estimate originating from the previous recursion, and of the messages and
provided by BIF-IN; b) an approximate model of the considered SSM
could be adopted in the evaluation of these densities. For this reason,
generally speaking, we can assume that the BIF algorithm is based on
the *Markov model* and on the *observation model* ,
representing the *exact* models and , respectively, or *approximations* of one or both of them. Note also that, in both the second
measurement update and the time update accomplished by this algorithm,
marginalization with respect to the unknown state component is made possible by the availability of the message .

BIF-OUT - This part is fed by the backward estimate of
and by the smoothed estimate of (available after the first measurement update has been
accomplished by F). The second message follows two different paths,
since a) it is passed to the other SISO module as it is and b) it is jointly
processed with in
order to generate the pseudo-measurement message feeding the other SISO module; the last task is accomplished in
the *pseudo-measurement generation* (PMG) block.

A graphical model structurally identical to the one shown in Fig. 2 can be easily drawn for the SISO module based on BIF by interchanging () with (). Merging the graphical model shown in Fig. 2 with its counterpart referring to BIF results in the parallel concatenation architecture illustrated in Fig. 3 (details about the underlying graphical model are omitted for simplicity) and on which TS is based. It is important to point out that:

1. The overall graphical model derived for TS, unlike the one illustrated in Fig. 1, is not cycle free; therefore, the application of the SPA to it requires defining a proper message scheduling and, generally speaking, results in iterative algorithms.

2. At the end of the -th recursion of a TS algorithm, two smoothed densities, namely and , are available. This raises the problem of how this statistical information can be fused in order to get a single pdf for (and, in particular, a single smoothed estimate of) the -dimensional portion of estimated by both F/BIF and F/BIF. Unfortunately, this remains an open issue. In our computer simulations, a simple selection strategy has been adopted in state estimation, since one of the two smoothed estimates of has been systematically discarded.

### 3.2 A Graphical Model for the Parallel Concatenation of the Bayesian Information Filters Paired with an Extended Kalman Filter and a Particle Filter

In the remaining part of this manuscript we focus on a specific instance of the proposed TS architecture, since we make the same specific choices as [7] for both the SSM and the filters employed in the forward pass. In particular, we focus on the CLG SSM described in Section 2 and assume that:

1) BIF is the backward filter associated with an EKF operating over the whole system state (so that and is empty). In other words, BIF is a backward Kalman filter based on a linearised model of the considered SSM.

2) BIF is a backward filter associated with a PF (in particular, a
*sequential importance resampling* filter [14])
operating on the nonlinear state component only (so that and ) and representing it through a set of particles (note
that elements of the system state are shared by the two BIF
algorithms). This means that BIF is employed to compute new weights
for all the elements of the particle set generated by the PF in the forward
pass.
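The role assigned above to BIF, namely recomputing weights for the particle set already generated in the forward pass, can be illustrated with a generic backward-reweighting marginal particle smoother; the sketch below is a standard construction for a hypothetical scalar model, not the BITF algorithm derived in the following Sections.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical scalar SSM: x_t = sin(x_{t-1}) + w_t, y_t = x_t + v_t.
T, N = 30, 300
q, r = 0.1, 0.2
x = np.zeros(T); y = np.zeros(T)
y[0] = x[0] + rng.normal(0, np.sqrt(r))
for t in range(1, T):
    x[t] = np.sin(x[t-1]) + rng.normal(0, np.sqrt(q))
    y[t] = x[t] + rng.normal(0, np.sqrt(r))

# Forward pass: bootstrap SIR particle filter (resampling at every step, so
# the stored forward weights are uniform).
parts = np.zeros((T, N))
parts[0] = rng.normal(0, 1, N)
for t in range(1, T):
    prop = np.sin(parts[t-1]) + rng.normal(0, np.sqrt(q), N)
    w = np.exp(-0.5 * (y[t] - prop) ** 2 / r)
    w /= w.sum()
    parts[t] = prop[rng.choice(N, N, p=w)]

# Backward pass: keep the forward particles and only recompute their weights
# so that each stored particle set targets the smoothed marginal.
w_s = np.full((T, N), 1.0 / N)
for t in range(T - 2, -1, -1):
    # trans[i, j] is proportional to p(x_{t+1}^j | x_t^i) under the Markov model
    trans = np.exp(-0.5 * (parts[t+1][None, :] - np.sin(parts[t])[:, None]) ** 2 / q)
    denom = trans.mean(axis=0)               # predictive density at x_{t+1}^j
    w = (trans * (w_s[t+1] / denom)[None, :]).sum(axis=1) / N
    w_s[t] = w / w.sum()

x_smooth = (w_s * parts).sum(axis=1)         # smoothed posterior means
```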

Based on the general models shown in Figs. 2 and 3, the specific graphical model illustrated in Fig. 4 (and referring to the -th recursion of backward filtering) can be drawn for the considered case. In the following, we provide various details about the adopted notation and the message passing within each constituent filter and from each filter to the other one.

*Message passing within* BIF - BIF is based on the *approximate* statistical models
and ; these are derived from the
linearised eqs. (3) and (4),
respectively. Moreover, the (Gaussian) messages passed over its graph
(enclosed within the upper rectangle appearing in Fig. 4) are , , , , , (), and , and are denoted , , , , , (), and , respectively, to ease reading.

*Message passing within* BIF - BIF is based on the *exact* statistical models , and , which are derived from eqs. (5) (with ) and (6), respectively. Moreover, the messages processed by it
and appearing in Fig. 4 refer to the -th particle *predicted* in the previous (i.e., in the -th) recursion of forward
filtering and denoted , with ;
such messages are , , , , , , and , and are denoted , , , , , , and ,
respectively, to ease reading.

*Message passing from* BIF *to* BIF - BIF is fed
by the message and the message set conveying pseudo-measurement
information; these messages are computed on the basis of the statistical
information made available by BIF. More specifically, on the one hand,
the message (denoted ) results from the
marginalization of and is employed for
marginalising the PF state update and measurement models (i.e., , and , respectively) with respect
to . On the other hand, the pseudo-measurement message
(denoted ) is evaluated in the PMG block by processing the messages and
(denoted and resulting from the marginalization of ), under the assumption that is represented by the -th particle (conveyed by the message
).

As illustrated in the Appendix, the computation of the message involves the evaluation of the pdf of the random vector

(14)

defined on the basis of the state update equation (5) (with ) and conditioned on the fact that . This pdf, which is computed according to the joint statistical characterization of and provided by BIF, is conveyed by the message (not appearing in Fig. 4). Note also that from eq. (5) (with ) the equality

(15)

is easily inferred; the pdf of evaluated on the basis
of the *right-hand side* (RHS) of eq. (15) is denoted in the following.

*Message passing from* BIF *to* BIF - BIF is fed
by the message that, unlike the set passed to BIF, provides pseudo-measurement
information about the *whole state* . This message is
generated as follows. The message set
produced by the PF is processed in the PMG block, that
computes the set of pseudo-measurement messages referring to the linear state component only. Then, the two
sets and are merged in the PMC block, where the
information they convey is *converted* into the (single) message . Moreover, as illustrated in the Appendix, the
message conveys a sample of the random
vector [13]

(16)

such a sample is generated under the assumption that . The pdf of the random vector is evaluated on the basis of the joint statistical representation of the couple , produced by BIF and is conveyed by the message (not appearing in Fig. 4); note also that from eq. (5) (with ) the equality

(17)

is easily inferred; the pdf of evaluated on the basis of the RHS of eq. (17) is denoted in the following.

The rationale behind the message passing illustrated above can be summarized as follows. The message is extracted from the statistical information generated by BIF and is exploited by BIF to refine its backward estimate of the whole state; moreover, merging this estimate with the forward estimate makes it possible to generate a more accurate statistical representation for and, consequently, for (these are conveyed by and , respectively); finally, this statistical information is exploited to aid BIF in the computation of more refined weights of the particles representing .

Given the graphical model shown in Fig. 4 and the messages passed over it, the derivation of a specific BITF algorithm requires: a) defining the mathematical structure of the input messages that feed the -th recursion of backward filtering and that of the output messages emerging from both backward filtering and smoothing in the same recursion; b) describing message scheduling; c) deriving mathematical expressions for all the computed messages. These issues are analysed in detail in Section 4.

## 4 Scheduling and Computation of Probabilistic Messages in Turbo Smoothing Algorithms for CLG Models

In this Section, the specific issues raised at the end of the previous Section and concerning the message passing accomplished over the graphical model shown in Fig. 4 are addressed. For this reason, we first provide various details about a) the messages feeding backward filtering, and b) the messages emerging from it and from the related smoothing. Then, we focus on the scheduling of such messages and on their computation. This allows us to develop two new smoothing techniques, one solving problem P.1, the other one problem P.2. Finally, these techniques are briefly compared with other particle smoothing methods available in the literature.

### 4.1 Input and Output Messages

The *input* messages feeding the -th recursion of backward
filtering are generated in the -th recursion of the paired forward
filtering and in the previous recursion (i.e., in the -th
recursion) of the backward pass. In the following, various details about
such messages are provided.

1. *Input messages evaluated in the forward pass* - A turbo filter,
consisting of an EKF (denoted F) and a PF (denoted F), is
employed in the forward pass of the devised TS algorithms and is run only
once. Therefore, the forward predictions/estimates, provided by F (F) and made available to BIF (BIF), are expressed by
Gaussian pdfs (sets of weighted particles), each conveyed by a Gaussian
message (by a set of particle-dependent messages). The notation adopted in
the following for this probabilistic information is summarized below.

Filter F - This filter, in its -th recursion, computes the
*forward prediction* of , conveyed by the message (see Fig. 4; considerations similar to the ones expressed for (18) and (19) can be repeated for the messages (22) and (23), respectively, defined below)

(18)

This message is updated in the -th recursion of F on the basis of the measurement . This produces the Gaussian message

(19)

representing a *forward estimate* of ; the covariance
matrix and the mean vector can
be evaluated on the basis of the associated precision matrix (see [11, eqs.
(14)-(17)])

(20) |

and of the transformed mean vector

(21)

respectively; here, , and . The message (18) enters the graphical model developed for BIF (see Fig. 4) along the half edge referring to .
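Eqs. (20) and (21) exploit the information form of the Gaussian measurement update, in which precision matrices and transformed mean vectors simply add. The following sketch, with hypothetical numbers, performs the update in information form and cross-checks it against the standard covariance-form Kalman update.

```python
import numpy as np

# Hypothetical Gaussian forward prediction (mean/covariance) and a linear
# measurement model y = H x + v, with v ~ N(0, R).
m_pred = np.array([1.0, -0.5])
P_pred = np.array([[0.5, 0.1],
                   [0.1, 0.3]])
H = np.array([[1.0, 2.0]])
R = np.array([[0.2]])
y = np.array([0.7])

# Information form: the precision matrix and the transformed mean of the
# forward estimate are obtained by adding the measurement terms.
W_pred = np.linalg.inv(P_pred)              # precision of the prediction
eta_pred = W_pred @ m_pred                  # transformed mean of the prediction
W_est = W_pred + H.T @ np.linalg.inv(R) @ H
eta_est = eta_pred + H.T @ np.linalg.inv(R) @ y
P_est = np.linalg.inv(W_est)                # covariance of the forward estimate
m_est = P_est @ eta_est                     # mean of the forward estimate

# Cross-check against the covariance-form Kalman update.
S = H @ P_pred @ H.T + R
K = P_pred @ H.T @ np.linalg.inv(S)
m_kf = m_pred + K @ (y - H @ m_pred)
P_kf = (np.eye(2) - K @ H) @ P_pred
```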

Filter F - This filter, in its -th recursion, computes the
particle set , representing a *forward prediction* of ; the weight assigned to the particle is equal to for any , since the use of *particle resampling* in each recursion is assumed. The statistical
information available about is conveyed by the
message

(22)

with . The weight of (with ) is updated by F in its -th recursion on the basis of the measurement ; the new weight is denoted and is conveyed by the forward message

(23)

Note that the message set
represents the *forward estimate* of computed by
F in its -th recursion and that the message set (see eq. (22)) enters the graphical
model developed for BIF along the half edge referring to (see Fig. 4).

2. *Input messages evaluated in the backward pass* - The -th
recursion of backward filtering is fed by the input messages

(24)

and

(25)

that convey the pdf of the backward estimate of computed by BIF and the backward estimate of generated by BIF, respectively, in the previous recursion.

All the input messages described above are processed to compute: 1) the new backward estimates and , which represent the outputs emerging from the -th recursion of backward filtering; 2) the required smoothed information (in the form of probabilistic messages) by merging forward and backward messages. In the remaining part of this Paragraph, some essential information about the structure of such messages is provided; details about their computation are given in the next Paragraph.

1. *Computation of backward estimates* - The computation of the message
(BIF) and of the
message set (BIF) is accomplished as follows. First, the *backward prediction*

(26)

of and the message

(27)

conveying a *backward weight* for the -th particle representing (with ) are computed by BIF and BIF, respectively.
Then, in BIF, the message (26) is merged with the pseudo-measurement message and the measurement message
in order to compute

(28)

and (see eq. (24))

(29)

respectively. Similarly, in BIF, the message (27) is merged first with the pseudo-measurement message in order to produce the message (see eq. (25))

(30)

conveying a new weight for the -th particle . Then, the information conveyed by the message set is merged with that provided by the measurement-based set in order to evaluate the message (see eq. (25))
