## 1 Introduction

The *nonlinear filtering problem* consists of inferring the *posterior distribution* of the hidden state of a nonlinear dynamic system
from a set of past and present measurements [1]. A
general recursive solution to this problem, known as *Bayesian filtering* (e.g., see [1, Sect. II, eqs. (3)-(5)]), is available
but, unfortunately, can be put in closed form only in a few cases [4].
In the past, various filtering methods generating a *functional
approximation* of the desired posterior pdf have been developed; these can
be divided into *local* and *global* methods on the basis of the
way the posterior pdf is approximated [2], [3]. On the one hand, local techniques, like
*extended Kalman filtering* (EKF), are computationally efficient, but may achieve limited accuracy in various scenarios; on the other hand, global techniques, like
*sequential Monte Carlo* (SMC) algorithms [5], [6] (also known as
*particle filtering*, PF [7], [8]), may achieve high accuracy at the price, however, of unacceptable complexity and numerical problems. These considerations have motivated the investigation of other methods able to achieve high accuracy under given computational constraints. Some of such solutions are based on the idea of
*combining* (i.e., *concatenating*) *local and global methods*; relevant examples of this approach are represented by a)
*marginalized particle filtering* (MPF) [9] and other techniques related to it (e.g., see [3] and [10]) and b)
*cascaded architectures* based on the joint use of EKF and PF (e.g., see [11] and [12]). Note that, in all these cases, two heterogeneous methods are combined in such a way that the resulting filtering algorithm is
*forward only* and, within its recursion, each of such methods is executed only once; for this reason, if the jargon of
*coding theory* is adopted in this context, such filtering algorithms can be seen as specific instances of the general concept of
*serial concatenation* [13], [14] of two (*constituent*) filtering methods.

In this manuscript, we focus on the novel concept of *parallel
concatenation* (PC) of Bayesian filters, i.e. on the idea of combining
two (constituent) filters in such a way that, within each recursion of the
resulting concatenated algorithm, they can iteratively refine their
statistical information through the mutual exchange of probabilistic (i.e.,
*soft*) information; this concept is dubbed *turbo filtering* (TF)
for its resemblance to the iterative (i.e., *turbo*) decoding of
concatenated channel codes [15]. More specifically, we first
develop a *general graphical model* that allows us to: a) represent the
PC of two Bayesian filters as the interconnection of two *soft-in
soft-out* (SISO) modules, b) represent the iterative processing accomplished
by these modules as a message passing technique and c) derive the
expressions of the passed messages by applying the *sum-product
algorithm* (SPA) [16], [17], together
with a specific *scheduling* procedure, to the graphical model itself.
Then, the usefulness of this approach is exemplified by developing two TF algorithms for the class of *conditionally linear Gaussian* (CLG) SSMs
[9]. Our computer simulations for a specific CLG SSM evidence
that, in the considered case, these algorithms perform very closely to MPF,
but are substantially faster.

It is worth mentioning that the TF principle has been formulated for the
first time in [18], where it has also been successfully
applied to inertial navigation. However, all the theoretical results
illustrated in this manuscript have been obtained later and have been
inspired by various results available in the literature about: a) the
representation of filtering methods as *message passing procedures on
factor graphs* (e.g., see [16], [17] and
[19]); b) the use of graphical models in the derivation and
interpretation of *turbo decoding* and *turbo equalization* [16], [17], [20].

The remaining part of this manuscript is organized as follows. The considered SSM is described in Section 2. In Section 3 a new graphical model describing the TF principle is devised; then, a specific case of that model, referring to the use of an extended Kalman filter and a particle filter as constituent filters, and to a CLG SSM, is analysed. The derivation of two TF algorithms based on the last model is illustrated in Section 4, whereas their interpretation from a coding theory perspective is discussed in Section 5. Such algorithms are compared with EKF and MPF, in terms of accuracy and execution time, in Section 6. Finally, some conclusions are offered in Section 7.

## 2 Model Description

In the following we focus on a discrete-time CLG SSM [9],
whose $D$-dimensional *hidden state* $\mathbf{x}_l$ in the $l$-th interval is partitioned
as $\mathbf{x}_l = [(\mathbf{x}_l^{(L)})^T, (\mathbf{x}_l^{(N)})^T]^T$; here, $\mathbf{x}_l^{(L)}$ ($\mathbf{x}_l^{(N)}$) is the so-called
*linear* (*nonlinear*) *component* of $\mathbf{x}_l$, with size $D_L$ ($D_N = D - D_L$). Following [9] and [10],
the models

$$\mathbf{x}_{l+1}^{(Z)} = \mathbf{f}_l^{(Z)}\big(\mathbf{x}_l^{(N)}\big) + \mathbf{A}_l^{(Z)}\big(\mathbf{x}_l^{(N)}\big)\,\mathbf{x}_l^{(L)} + \mathbf{w}_l^{(Z)}, \qquad Z \in \{L, N\}, \tag{1}$$

and

$$\mathbf{y}_l = \mathbf{g}_l\big(\mathbf{x}_l^{(N)}\big) + \mathbf{B}_l\big(\mathbf{x}_l^{(N)}\big)\,\mathbf{x}_l^{(L)} + \mathbf{e}_l \tag{2}$$

are adopted for the update of the *linear* ($Z = L$) and *nonlinear*
($Z = N$) components, and for the $P$-dimensional vector $\mathbf{y}_l$ of noisy measurements available in the
$l$-th interval, respectively. In the *state update model* (1), $\mathbf{f}_l^{(Z)}(\cdot)$ ($\mathbf{A}_l^{(Z)}(\cdot)$) is a time-varying $D_Z$-dimensional real function ($D_Z \times D_L$ real matrix) and $\mathbf{w}_l^{(Z)}$ is the $l$-th element of the process noise sequence $\{\mathbf{w}_k^{(Z)}\}$; this sequence consists of $D_Z$-dimensional
*independent and identically distributed* (iid) Gaussian noise
vectors, each characterized by a zero mean and a covariance matrix $\mathbf{C}_w^{(Z)}$ (independence between $\{\mathbf{w}_k^{(L)}\}$ and $\{\mathbf{w}_k^{(N)}\}$ is also assumed for simplicity). Moreover, in the
*measurement model* (2), $\mathbf{B}_l(\cdot)$ is a time-varying $P \times D_L$ real matrix, $\mathbf{g}_l(\cdot)$ is a time-varying $P$-dimensional real function and $\mathbf{e}_l$ is the $l$-th element of the measurement noise sequence $\{\mathbf{e}_k\}$; this sequence consists of $P$-dimensional iid Gaussian noise vectors (each characterized by a zero mean and a covariance matrix $\mathbf{C}_e$), and is independent of both $\{\mathbf{w}_k^{(L)}\}$ and $\{\mathbf{w}_k^{(N)}\}$.

In the following we take into consideration not only the detailed models (1) and (2), but also their more compact counterparts

$$\mathbf{x}_{l+1} = \mathbf{f}_l(\mathbf{x}_l) + \mathbf{w}_l \tag{3}$$

and

$$\mathbf{y}_l = \mathbf{h}_l(\mathbf{x}_l) + \mathbf{e}_l, \tag{4}$$

respectively, which refer to the whole state; here, $\mathbf{f}_l(\mathbf{x}_l)$ ($\mathbf{w}_l$) is a $D$-dimensional function
(Gaussian noise vector, whose covariance matrix can be easily
computed on the basis of the matrices $\mathbf{C}_w^{(L)}$ and $\mathbf{C}_w^{(N)}$) deriving from the ordered concatenation of the right-hand sides of (1) for $Z = L$ and $Z = N$ (of the vectors $\mathbf{w}_l^{(L)}$ and $\mathbf{w}_l^{(N)}$; see (1)), and $\mathbf{h}_l(\mathbf{x}_l) \triangleq \mathbf{g}_l(\mathbf{x}_l^{(N)}) + \mathbf{B}_l(\mathbf{x}_l^{(N)})\,\mathbf{x}_l^{(L)}$. Moreover, since EKF is employed in the TF algorithms developed
in the following, the *linearized* versions of (3) and (4) are also considered; these can be expressed as (e.g., see
[4, pp. 194-195])

$$\mathbf{x}_{l+1} \simeq \mathbf{f}_l(\hat{\mathbf{x}}_l) + \mathbf{F}_l\,(\mathbf{x}_l - \hat{\mathbf{x}}_l) + \mathbf{w}_l \tag{5}$$

and

$$\mathbf{y}_l \simeq \mathbf{h}_l(\hat{\mathbf{x}}_{l|l-1}) + \mathbf{H}_l\,(\mathbf{x}_l - \hat{\mathbf{x}}_{l|l-1}) + \mathbf{e}_l, \tag{6}$$

respectively; here, $\mathbf{F}_l \triangleq \left.\partial \mathbf{f}_l(\mathbf{x})/\partial \mathbf{x}\right|_{\mathbf{x} = \hat{\mathbf{x}}_l}$,
$\hat{\mathbf{x}}_l$ is the (forward) estimate of $\mathbf{x}_l$
evaluated by EKF in its $l$-th recursion, $\mathbf{H}_l \triangleq \left.\partial \mathbf{h}_l(\mathbf{x})/\partial \mathbf{x}\right|_{\mathbf{x} = \hat{\mathbf{x}}_{l|l-1}}$ and $\hat{\mathbf{x}}_{l|l-1}$ is the (forward) prediction computed by EKF in its $(l-1)$-th recursion.

In the following Section we focus on the so-called *filtering problem*,
which concerns the evaluation of the posterior pdf $f(\mathbf{x}_l|\mathbf{y}_{1:l})$ at an instant $l \geq 1$, given a) the initial pdf $f(\mathbf{x}_1)$ and b) the $lP$-dimensional *measurement* vector $\mathbf{y}_{1:l} \triangleq [\mathbf{y}_1^T, \mathbf{y}_2^T, \ldots, \mathbf{y}_l^T]^T$.
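To make the model above concrete, the sketch below simulates a scalar instance of the state update (1) and the measurement model (2). Every function, coefficient and noise level in it (`f_L`, `f_N`, `A_L`, `A_N`, `g`, `B` and the standard deviations) is an illustrative assumption, not a quantity taken from this manuscript or from [9].

```python
import math
import random

# Assumed scalar instances of the CLG model ingredients.
def f_L(xn):          # f_l^(L): nonlinear drive of the linear component
    return 0.5 * math.sin(xn)

def f_N(xn):          # f_l^(N): nonlinear drive of the nonlinear component
    return 0.9 * math.atan(xn)

A_L, A_N = 0.8, 0.1   # A_l^(L), A_l^(N): constant scalars here
def g(xn):            # g_l: nonlinear part of the measurement map
    return xn ** 2
B = 1.0               # B_l: linear part of the measurement map
SW_L, SW_N, SE = 0.1, 0.1, 0.2   # assumed noise standard deviations

def simulate(T, seed=0):
    """Draw a trajectory (x_L, x_N) and measurements y from the CLG SSM."""
    rng = random.Random(seed)
    xl, xn = 0.0, 0.0
    xs, ys = [], []
    for _ in range(T):
        y = g(xn) + B * xl + rng.gauss(0.0, SE)              # eq. (2)
        xs.append((xl, xn))
        ys.append(y)
        xl_next = f_L(xn) + A_L * xl + rng.gauss(0.0, SW_L)  # eq. (1), Z = L
        xn_next = f_N(xn) + A_N * xl + rng.gauss(0.0, SW_N)  # eq. (1), Z = N
        xl, xn = xl_next, xn_next
    return xs, ys

states, meas = simulate(100)
```

Note that each component's next value mixes a nonlinear function of the current nonlinear component with a *linear* map of the linear component, which is exactly the structural property that MPF and the TF schemes discussed below exploit.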

## 3 Graphical Modelling for Turbo Filtering

Let us first consider a SSM described by the *Markov model* $f(\mathbf{x}_{l+1}|\mathbf{x}_l)$ and the *observation model* $f(\mathbf{y}_l|\mathbf{x}_l)$ for any $l$. In this case, the computation of the posterior
pdf $f(\mathbf{x}_l|\mathbf{y}_{1:l})$ for $l = 1, 2, \ldots$ can be accomplished
by means of an exact *Bayesian recursive procedure*, consisting of a
*measurement update* (MU) step followed by a *time update* (TU)
step. Following [16, Sec. II, p. 1297], the equations
describing the $l$-th recursion of this procedure can
be easily obtained by applying the SPA to the Forney-style FG shown
in Fig. 1, if the joint pdf $f(\mathbf{x}_l, \mathbf{y}_{1:l})$
is considered in place of the associated a posteriori pdf $f(\mathbf{x}_l|\mathbf{y}_{1:l})$. In fact, given the measurement message $\vec{m}_{ms}(\mathbf{x}_l) = f(\mathbf{y}_l|\mathbf{x}_l)$, if the input message $\vec{m}_{fp}(\mathbf{x}_l) = f(\mathbf{x}_l, \mathbf{y}_{1:l-1})$ enters this FG, the message going out of the *equality node* is given by

$$\vec{m}_{fe}(\mathbf{x}_l) = \vec{m}_{ms}(\mathbf{x}_l)\,\vec{m}_{fp}(\mathbf{x}_l) = f(\mathbf{y}_l|\mathbf{x}_l)\,f(\mathbf{x}_l, \mathbf{y}_{1:l-1}) = f(\mathbf{x}_l, \mathbf{y}_{1:l}) \tag{7}$$

and, consequently, the message emerging from the *function node*
referring to the pdf $f(\mathbf{x}_{l+1}|\mathbf{x}_l)$ is expressed by

$$\vec{m}_{fp}(\mathbf{x}_{l+1}) = \int \vec{m}_{fe}(\mathbf{x}_l)\, f(\mathbf{x}_{l+1}|\mathbf{x}_l)\, d\mathbf{x}_l = f(\mathbf{x}_{l+1}, \mathbf{y}_{1:l}). \tag{8}$$

Here and in the following the acronyms *fp* and *fe* are employed in the subscripts of various messages, so that readers can easily understand their meaning; in fact, the messages these acronyms refer to represent a form of one-step *forward prediction* and of *forward estimation*, respectively. Eqs. (7) and (8) express the MU and the TU, respectively, that need to be accomplished in the $l$-th recursion of Bayesian filtering.
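The MU/TU recursion of eqs. (7) and (8) can be realized numerically for a scalar state discretized on a regular grid; in the sketch below, the Gaussian likelihood and transition kernel (and their parameters) are illustrative assumptions, not models taken from the manuscript.

```python
import math

GRID = [i * 0.1 - 5.0 for i in range(101)]   # grid over [-5, 5]
DX = 0.1                                     # grid spacing

def gauss(x, m, s):
    """Gaussian density N(x; m, s^2)."""
    return math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))

def measurement_update(pred, y, sigma_e=0.5):
    """Eq. (7): multiply the prediction by the likelihood f(y|x); normalize."""
    post = [p * gauss(y, x, sigma_e) for p, x in zip(pred, GRID)]
    z = sum(post) * DX
    return [p / z for p in post]

def time_update(post, sigma_w=0.3):
    """Eq. (8): integrate the estimate against the Markov kernel f(x'|x)."""
    pred = []
    for xp in GRID:
        s = sum(gauss(xp, 0.9 * x, sigma_w) * p for x, p in zip(GRID, post))
        pred.append(s * DX)
    return pred

prior = [gauss(x, 0.0, 1.0) for x in GRID]   # N(0, 1) prior on the grid
post = measurement_update(prior, y=1.0)       # MU with a measurement y = 1
pred = time_update(post)                      # TU toward the next interval
```

With a unit-variance prior centred at zero and a measurement $y = 1$ of standard deviation $0.5$, the MU shifts the posterior mean to about $0.8$, which is the usual precision-weighted combination.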

Let us now see how the FG illustrated in Fig. 1 can be exploited
to devise a graphical model efficiently representing the TF concept. As
already stated in the Introduction, any TF scheme results from the *parallel concatenation* of two *constituent* Bayesian filters (denoted F$_1$ and F$_2$ in the following), that can iteratively improve their
accuracy through the exchange of their statistical information. In practice,
in developing TF techniques, the following general rules are followed:
R1) the constituent filters operate on *partially overlapped*
portions of the system state; R2) the filter F$_1$ (F$_2$) is the
core of a *processing module* (called *soft-in soft-out*, SISO,
module in the following) receiving statistical information from F$_2$ (F$_1$) and generating new statistical information useful to F$_2$ (F$_1$); R3) each constituent filter relies on *exact*
Markov/observation models or on *approximate* (e.g., linearized) versions
of them. These rules can be motivated and implemented as follows. The *first rule* (i.e., R1) ensures that any TF algorithm
contains a form of *redundancy*, which represents the first of the two
fundamental properties characterizing each *error correction method*
employed in digital communications [13]. In our general
description of a TF scheme, it is assumed that (see Fig. 2-(a)):
1) filter F$_1$ (F$_2$) estimates the state vector $\mathbf{x}_l^{(1)}$ ($\mathbf{x}_l^{(2)}$) of size $N_1$ ($N_2$), with $N_1 \leq D$ ($N_2 \leq D$); 2) the portion of $\mathbf{x}_l$ not included in $\mathbf{x}_l^{(1)}$ ($\mathbf{x}_l^{(2)}$) is *contained in* (*or at most coincides with*) $\mathbf{x}_l^{(2)}$ ($\mathbf{x}_l^{(1)}$). This entails that: a) an *overall
estimate* of the system state can be generated on the basis
of the posterior pdfs of $\mathbf{x}_l^{(1)}$ and $\mathbf{x}_l^{(2)}$ evaluated by F$_1$ and F$_2$, respectively; b) the portion of $\mathbf{x}_l$, consisting of

$$n_D \triangleq N_1 + N_2 - D \tag{9}$$

elements, is estimated by both F$_1$ and F$_2$. Consequently, rule R1
requires the parameter $n_D$ (9), which represents the
*degree of redundancy* of the overall filtering algorithm, to be
strictly positive.

The *second rule* (i.e., R2) has been inspired by the fact
that, generally speaking, iterative decoders of concatenated channel codes
are made of multiple SISO modules, one for each constituent code. The
implementation of this rule in TF requires accurately defining the *nature* of the statistical information to be passed from each constituent
filter to the other one. Actually, this problem has already been tackled in
the development of MPF, where the information passed from a particle filter
to a bank of Kalman filters takes the form of *pseudo-measurements*
(PMs) evaluated on the basis of the *mathematical constraints*
established by the state update equations [9]. The use of PMs
allows us to exploit the *memory* characterizing the time evolution of
dynamic models (and representing the second fundamental property of each
*error correction method* employed in digital communications).
Moreover, PMs can be processed as if they were *real measurements* [9]; for this reason, their use can be incorporated in the FG shown in Fig. 1 by including a new MU, i.e. by adding a new
equality node through which the message emerging from the first MU (i.e.,
from the MU based on real measurements) is merged with a message conveying PM
information. This idea is implemented in the graphical model shown in Fig. 2-(b) (note that *oriented* edges are used in our graphical models wherever
message passing along such edges can be accomplished along a single
direction only), which provides a detailed
description of the *overall processing* accomplished by a SISO module
based on F$_1$ (a similar model can easily be drawn for F$_2$ by
interchanging the roles of F$_1$ and F$_2$ in that figure). In fact, this model represents the F$_1$
filtering algorithm (F$_1$ block), the *conversion* of the
statistical information provided by F$_2$ into a form useful to F$_1$
(F$_1$-IN block) and the *generation* of the statistical information
made available by F$_1$ to F$_2$ (F$_1$-OUT block). Its structure can
be explained as follows:

1. The algorithm employed by F$_1$ is based on a *Markov model* and on an *observation model* that represent the *exact* models $f(\mathbf{x}_{l+1}|\mathbf{x}_l)$ and $f(\mathbf{y}_l|\mathbf{x}_l)$, respectively, or *approximations*
of one or both of them (as required by the *third rule*, i.e. by
R3). The pdf of the state component unknown to F$_1$ is provided by F$_2$
through a dedicated message. Moreover, as already stated above, the
forward estimate of $\mathbf{x}_l^{(1)}$ is computed by F$_1$ in two distinct MU
steps, the first one involving the message based on the *measurement* $\mathbf{y}_l$, the second one involving the message conveying the PM information computed by
F$_2$; these steps generate two distinct forward-estimate messages.

2. The forward estimate computed by F$_1$ is passed to F$_2$
together with the PM message. The latter is evaluated on the
basis of the forward estimates available *before* and *after* the second MU of F$_1$.
Note also that this computation is carried out in the block called *PM generation* (PMG) inside the F$_1$-OUT block.

3. The statistical information made available by F$_2$ to F$_1$ is
condensed in two messages. The first one can be computed by
*marginalizing* the forward estimate generated by F$_2$,
since, generally speaking, the state component unknown to F$_1$ is a portion of $\mathbf{x}_l^{(2)}$
(marginalization is accomplished in the block labelled with the letter M in Fig. 2-(b)); moreover, this marginal is
processed jointly with the forward estimate itself to generate the
PM message feeding the second MU of F$_1$ (this is accomplished in the block called
*PM conversion*, PMC, inside the F$_1$-IN block).

Merging the graphical model shown in Fig. 2-(b) with its
counterpart referring to F$_2$ results in the PC architecture shown in
Fig. 3. This model, unlike the one illustrated in Fig. 1, is *not cycle free*. For this reason, generally speaking, the
application of the SPA to it leads to *iterative algorithms* with no
natural termination and whose accuracy can be substantially influenced by
the adopted *message scheduling* [16], [17]. This consideration and the possibility of choosing
different options for F$_1$ and F$_2$ lead easily to the conclusion that
the graphical models shown in Figs. 2-(b) and 3 can be
employed to develop an entire *family* of filtering algorithms, called
*turbo filters*.

In the remaining part of this manuscript we focus on a specific instance of
the proposed PC architecture, since we make specific choices for both the
SSM and the two filters. In particular, we focus on the CLG SSM described in
Section 2 and assume that F$_1$ is an *extended
Kalman filter* operating over the whole system state (so that $\mathbf{x}_l^{(1)} = \mathbf{x}_l$ and $N_1 = D$), whereas F$_2$ is
a *particle filter* (in particular, a *sequential importance
resampling*, SIR, filter [1]) operating on the nonlinear
state component only (so that $\mathbf{x}_l^{(2)} = \mathbf{x}_l^{(N)}$
and $N_2 = D_N$); note that, in this case,
the degree of redundancy is $n_D = D_N$ (see (9)). Our
choices aim at developing a new concatenated filtering algorithm in which an
extended Kalman filter is aided by a particle filter in its most difficult
task, i.e. in the estimation of the nonlinear state component. Moreover, the
proposed TF scheme can easily be related to MPF, since the latter technique can be
considered as a form of *serial concatenation* of PF with Kalman
filtering. However, our TF instance, unlike MPF, employs a *single*
(extended) Kalman filter in place of a bank of Kalman filters; moreover, such
a filter estimates the whole system state, instead of its nonlinear
component only.
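For readers unfamiliar with the constituent filter F$_2$, the sketch below shows one recursion of a generic sequential importance resampling (SIR) particle filter for a scalar nonlinear state. The Markov model, the likelihood and all constants are illustrative assumptions and do not reproduce the specific F$_2$ of this manuscript.

```python
import math
import random

def sir_step(particles, y, rng, sigma_w=0.3, sigma_e=0.5):
    """One SIR recursion: propagate, weight by the likelihood, resample."""
    # Propagate each particle through the (assumed) Markov model.
    prop = [0.9 * math.atan(p) + rng.gauss(0.0, sigma_w) for p in particles]
    # Weight each propagated particle by the (assumed) Gaussian likelihood.
    w = [math.exp(-0.5 * ((y - p) / sigma_e) ** 2) for p in prop]
    tot = sum(w)
    w = [wi / tot for wi in w]
    # Multinomial resampling: draw indices proportionally to the weights.
    cdf, acc = [], 0.0
    for wi in w:
        acc += wi
        cdf.append(acc)
    out = []
    for _ in prop:
        u = rng.random()
        j = next((i for i, c in enumerate(cdf) if c >= u), len(cdf) - 1)
        out.append(prop[j])
    return out

rng = random.Random(1)
pts = [rng.gauss(0.0, 1.0) for _ in range(200)]   # initial particle cloud
pts = sir_step(pts, y=0.8, rng=rng)                # one recursion with y = 0.8
```

After weighting by a measurement at $y = 0.8$, the resampled cloud concentrates between the prior mean and the measurement, illustrating how the particle set tracks a posterior that a single Gaussian may represent poorly.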
Based on the general models shown in Figs. 2-(b)
and 3, the specific graphical model illustrated in Fig. 4
can be drawn for the considered case. This model deserves the following
comments:

1. The upper (lower) rectangle delimited by a grey line allows the message passing accomplished by EKF (PF) to be easily identified.

2. Filter F$_1$ is based on the *approximate* Markov and observation models that can easily be derived from the linearised eqs. (5) and (6), respectively. Moreover, the (Gaussian) messages it processes are renamed in Fig. 4 to ease reading.

3. Filter F$_2$ is based on the *exact* models that can easily be derived
from eqs. (1) (with $Z = N$) and (2),
respectively. Moreover, the messages processed by it and appearing in Fig. 4 refer to the $j$-th particle *predicted* in the previous
(i.e. $(l-1)$-th) recursion, with $j = 0, 1, \ldots, N_p - 1$ (where $N_p$ represents the overall number of
particles); such messages are also renamed in Fig. 4 to ease reading.

4. The messages generated by F$_1$ undergo *marginalization* in the block
labelled with the letter M. Based on the general model shown in Fig. 2-(b), we exploit
the messages available before and after the second MU of F$_1$ to compute the PM message in the block called PMG.
Moreover, the marginalized forward estimate is employed for
marginalising the PF state
update and measurement models; this allows us to compute
the corresponding input messages of F$_2$.

5. The forward estimate produced by PF is
processed in the block called PMG in order to generate the PM
message (the companion message is not required in this case; see the next Section).
Moreover, the two resulting sets of $N_p$ messages are merged in the block called PMC, where the
information they convey is *converted* into the (single) PM message feeding F$_1$.

6. At the end of the $l$-th recursion, a single statistical model is
available for $\mathbf{x}_l^{(L)}$. On the contrary, two models are
available for $\mathbf{x}_l^{(N)}$, one particle-based, the other one
Gaussian, since this state component is shared by F$_1$ and F$_2$; note
that the former model, unlike the latter, is able to represent a *multimodal* pdf.
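One simple way of reconciling the two coexisting representations of the shared component is *moment matching*, i.e. fitting a Gaussian to the mean and variance of the weighted particle cloud. This sketch is only an assumed illustration of that generic conversion; the manuscript's M/PMC blocks are not claimed to perform exactly this operation.

```python
def moment_match(particles, weights):
    """Return the (mean, variance) of a weighted particle set."""
    tot = sum(weights)
    mean = sum(w * p for p, w in zip(particles, weights)) / tot
    var = sum(w * (p - mean) ** 2 for p, w in zip(particles, weights)) / tot
    return mean, var

# Three particles with normalized weights; the fitted Gaussian has
# mean 2.0 and variance 0.5.
m, v = moment_match([1.0, 2.0, 3.0], [0.25, 0.5, 0.25])
```

The fitted Gaussian necessarily discards any multimodality of the particle cloud, which is precisely the limitation of the Gaussian model noted in comment 6 above.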

Let us now focus on the evaluation of the PMs for the considered TF scheme.
On the one hand, the PM messages
evaluated for F$_2$ are exploited to improve the estimation accuracy for
the *nonlinear state component only*. Their computation involves the
pdf of the random vector

$$\mathbf{z}_l^{(N)} \triangleq \mathbf{x}_{l+1}^{(N)} - \mathbf{A}_l^{(N)}\big(\mathbf{x}_l^{(N)}\big)\,\mathbf{x}_l^{(L)}, \tag{10}$$

defined on the basis of the state update equation (1) (with $Z = N$). This pdf needs to be evaluated for each of the $N_p$ particles representing $\mathbf{x}_l^{(N)}$; in the following, its expression associated with the $j$-th particle (i.e., conditioned on the $j$-th particle value of $\mathbf{x}_l^{(N)}$) is evaluated on the basis of the joint pdf of $\mathbf{x}_l^{(L)}$ and $\mathbf{x}_{l+1}^{(N)}$ provided by F$_1$. Note also that, based on (1) (with $Z = N$), the vector (10) is expected to equal the sum

$$\mathbf{f}_l^{(N)}\big(\mathbf{x}_l^{(N)}\big) + \mathbf{w}_l^{(N)}, \tag{11}$$

which depends on $\mathbf{x}_l^{(N)}$ only.

On the other hand, the PM message evaluated
for F$_1$ is expected to improve the estimation accuracy for the *whole state*. For this reason, in our TF techniques, its computation involves
the two message sets generated by F$_2$ and referring to
the two distinct components of $\mathbf{x}_l$. The first set conveys a particle-based representation of $\mathbf{x}_l^{(N)}$. The second one,
instead, represents the pdf of the random vector [9]

$$\mathbf{z}_l^{(L)} \triangleq \mathbf{x}_{l+1}^{(N)} - \mathbf{f}_l^{(N)}\big(\mathbf{x}_l^{(N)}\big), \tag{12}$$

conditioned on each particle. This pdf is evaluated on the basis of the joint representation of the couple $(\mathbf{x}_l^{(N)}, \mathbf{x}_{l+1}^{(N)})$ produced by F$_2$; note also that, based on (1) (with $Z = N$), the quantity (12) is expected to equal the sum

$$\mathbf{A}_l^{(N)}\big(\mathbf{x}_l^{(N)}\big)\,\mathbf{x}_l^{(L)} + \mathbf{w}_l^{(N)}, \tag{13}$$

which depends on $\mathbf{x}_l^{(L)}$ and $\mathbf{x}_l^{(N)}$ only.
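The constraint behind eqs. (12)-(13) can be illustrated in scalar form: given particle pairs $(x_l^{(N)}, x_{l+1}^{(N)})$ from F$_2$, the residual $z = x_{l+1}^{(N)} - f^{(N)}(x_l^{(N)})$ behaves, by eq. (1), like a noisy linear observation $A^{(N)} x^{(L)} + w^{(N)}$ of the linear component, and can therefore be fused into a Gaussian belief on $x^{(L)}$ with a Kalman-style MU. All constants below are illustrative assumptions, and the fusion shown is a generic sketch rather than the manuscript's exact PMC processing.

```python
import math

A_N, VAR_W_N = 0.5, 0.09          # assumed scalar A^(N) and Var[w^(N)]

def f_N(xn):                       # assumed scalar f^(N)
    return 0.9 * math.atan(xn)

def pm_update(mean_L, var_L, xn_pairs):
    """Fuse each particle's pseudo-measurement into the belief on x_L."""
    for xn_curr, xn_next in xn_pairs:
        z = xn_next - f_N(xn_curr)           # pseudo-measurement, cf. eq. (12)
        s = A_N * A_N * var_L + VAR_W_N       # innovation variance
        k = var_L * A_N / s                   # Kalman gain for the PM
        mean_L += k * (z - A_N * mean_L)
        var_L *= (1.0 - k * A_N)
    return mean_L, var_L

# Noise-free transitions generated with a true x_L = 2.0: each PM equals
# A_N * 2.0, so the belief should move from 0.0 toward 2.0 and tighten.
pairs = [(xn, f_N(xn) + A_N * 2.0) for xn in (-1.0, 0.0, 0.5, 1.0)]
m, v = pm_update(0.0, 1.0, pairs)
```

After four pseudo-measurements the mean climbs to roughly $1.8$ with a much reduced variance, showing how the particle stream carries information about the linear component even though no direct measurement of it is used here.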

Two specific message schedules for the graphical model shown in Fig. 4 are proposed in the following Section, where the computation of all the involved messages is also analysed in detail.

## 4 Message Passing in Turbo Filtering

In this Section two different options are considered for the scheduling of the messages appearing in Fig. 4. The first option consists in running EKF before PF within each iteration, whereas the second one consists in doing the opposite; the resulting algorithms are dubbed TF#1 and TF#2, respectively. The message scheduling adopted in TF#1 is represented in Fig. 5, which refers to the $i$-th iteration accomplished within the $l$-th recursion (with $i = 1, 2, \ldots, n_{it}$, where $n_{it}$ is the overall number of iterations); this explains why iteration-dependent superscripts have been added to all the corresponding messages appearing in Fig. 4.

As far as the evaluation of the messages passed in TF#1 and TF#2 is
concerned, this is mainly based on three *computational rules* (CRs)
resulting from the application of the SPA to equality nodes and function
nodes. More specifically, the first computational rule, denoted CR1, applies
to an *equality constraint node*; if $m_1(\mathbf{x})$ and $m_2(\mathbf{x})$ denote the
messages entering it, the message $m_3(\mathbf{x}) = m_1(\mathbf{x})\,m_2(\mathbf{x})$ emerges
from it. In particular, if $m_i(\mathbf{x}) = \mathcal{N}(\mathbf{x}; \boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i)$ (with $i = 1$ and $2$), then $m_3(\mathbf{x}) = \mathcal{N}(\mathbf{x}; \boldsymbol{\mu}_3, \boldsymbol{\Sigma}_3)$; moreover, the precision matrix $\mathbf{W}_3$ and
the transformed mean vector $\mathbf{w}_3$ associated with $\boldsymbol{\Sigma}_3$
and $\boldsymbol{\mu}_3$, respectively, are given by (see [16, Table 2, p.
1303, eqs. (II.1) and (II.2)])

$$\mathbf{W}_3 = \mathbf{W}_1 + \mathbf{W}_2 \tag{14}$$

and

$$\mathbf{w}_3 = \mathbf{w}_1 + \mathbf{w}_2, \tag{15}$$

respectively, where $\mathbf{W}_i \triangleq \boldsymbol{\Sigma}_i^{-1}$ and $\mathbf{w}_i \triangleq \mathbf{W}_i\,\boldsymbol{\mu}_i$ for $i = 1$, $2$, $3$. The second computational rule, denoted CR2, applies to a node representing the function $f(\mathbf{x}_{out}|\mathbf{x}_{in})$; if $m_{in}(\mathbf{x}_{in})$ denotes the message entering it, the message emerging from it is given by

$$m_{out}(\mathbf{x}_{out}) = \int m_{in}(\mathbf{x}_{in})\, f(\mathbf{x}_{out}|\mathbf{x}_{in})\, d\mathbf{x}_{in}. \tag{16}$$

In particular, if $f(\mathbf{x}_{out}|\mathbf{x}_{in}) = \mathcal{N}(\mathbf{x}_{out}; \mathbf{A}\,\mathbf{x}_{in} + \mathbf{b}, \mathbf{C}_w)$ and $m_{in}(\mathbf{x}_{in}) = \mathcal{N}(\mathbf{x}_{in}; \boldsymbol{\mu}_{in}, \boldsymbol{\Sigma}_{in})$, then

$$m_{out}(\mathbf{x}_{out}) = \mathcal{N}(\mathbf{x}_{out}; \boldsymbol{\mu}_{out}, \boldsymbol{\Sigma}_{out}), \tag{17}$$

with $\boldsymbol{\mu}_{out} = \mathbf{A}\,\boldsymbol{\mu}_{in} + \mathbf{b}$ and $\boldsymbol{\Sigma}_{out} = \mathbf{A}\,\boldsymbol{\Sigma}_{in}\,\mathbf{A}^T + \mathbf{C}_w$ (see [16, Table 2,
p. 1303, eqs. (II.7) and (II.9); Table 3, p. 1304, eqs. (III.1) and (III.3)]). Finally, the third computational rule, denoted CR3,
applies to a node representing the function $f(\mathbf{z}) = \mathcal{N}(\mathbf{z}; \boldsymbol{\eta}, \mathbf{C})$ and fed by the
message $m_{in}(\mathbf{z}) = \mathcal{N}(\mathbf{z}; \boldsymbol{\mu}_{in}, \boldsymbol{\Sigma}_{in})$; the output message is the *constant*
message

$$m_{out} = \int \mathcal{N}(\mathbf{z}; \boldsymbol{\eta}, \mathbf{C})\,\mathcal{N}(\mathbf{z}; \boldsymbol{\mu}_{in}, \boldsymbol{\Sigma}_{in})\, d\mathbf{z} = \mathcal{N}(\boldsymbol{\eta}; \boldsymbol{\mu}_{in}, \mathbf{C} + \boldsymbol{\Sigma}_{in}), \tag{18}$$

where $\mathbf{z}$, $\boldsymbol{\eta}$ and $\boldsymbol{\mu}_{in}$ are $n_z$-dimensional and $n_z$ is the size of $\mathbf{z}$.
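In scalar form, CR1 and CR2 reduce to a few lines: at an equality node the precisions add and the precision-weighted means add, mirroring eqs. (14)-(15), while an affine function node $x_{out} = a\,x_{in} + w$ maps the mean through the function and adds the noise variance, mirroring eqs. (16)-(17). The scalar parameters below are illustrative.

```python
def cr1_equality(m1, v1, m2, v2):
    """CR1: combine two scalar Gaussian messages at an equality node."""
    w_out = 1.0 / v1 + 1.0 / v2         # precisions add (cf. eq. (14))
    t_out = m1 / v1 + m2 / v2           # transformed means add (cf. eq. (15))
    return t_out / w_out, 1.0 / w_out   # back to (mean, variance)

def cr2_affine(m_in, v_in, a, var_w):
    """CR2: propagate a scalar Gaussian message through x_out = a*x_in + w."""
    return a * m_in, a * a * v_in + var_w   # cf. eqs. (16)-(17)

# Two unit-variance messages centred at 0 and 2 combine into a message
# centred at their midpoint with half the variance.
m, v = cr1_equality(0.0, 1.0, 2.0, 1.0)
```

Working with precisions and transformed means, as in eqs. (14)-(15), is what makes the repeated MUs of the TF recursion cheap: each fusion is a sum rather than a product of densities.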

In the following we show how, applying the above mentioned CRs, simple
formulas can be derived for all the messages passed in the graphical model shown
in Fig. 5. Before doing this, however, we need to define the *input* messages for the considered recursion; these are

$$\vec{m}_{fp}(\mathbf{x}_l) = \mathcal{N}\big(\mathbf{x}_l;\, \hat{\mathbf{x}}_{l|l-1},\, \mathbf{P}_{l|l-1}\big) \tag{19}$$

for the EKF (upper part of the graphical model) and the set of $N_p$ messages $\{\vec{m}_{fp}^{(j)}(\mathbf{x}_l^{(N)})\}$ for the PF (lower part of the graphical model), where

$$\vec{m}_{fp}^{(j)}\big(\mathbf{x}_l^{(N)}\big) = \delta\big(\mathbf{x}_l^{(N)} - \mathbf{x}_l^{(N),j}\big), \tag{20}$$

with $j = 0, 1, \ldots, N_p - 1$; in the following we also assume that the
available particles are collected in the set $S_l \triangleq \{\mathbf{x}_l^{(N),j}\}$. On the other hand, the *output* messages are $\vec{m}_{fp}(\mathbf{x}_{l+1})$ (for EKF) and $\{\vec{m}_{fp}^{(j)}(\mathbf{x}_{l+1}^{(N)})\}$ (for PF); since, as shown below, the devised TF
algorithms preserve the mathematical structure of the filtered densities
from recursion to recursion,
these output messages have the same functional form
as (19) and (20) (for any $j$),
respectively.

It is also worth mentioning that not all the messages appearing in Fig. 5 depend on the iteration index $i$. More specifically, the following messages are computed only once:

1. The messages evaluated by EKF in its *first* MU. In
particular, the message emerging from the first equality node is computed as (see
Fig. 5)

$$\vec{m}_1(\mathbf{x}_l) = \vec{m}_{ms}(\mathbf{x}_l)\,\vec{m}_{fp}(\mathbf{x}_l), \tag{21}$$

where $\vec{m}_{ms}(\mathbf{x}_l)$ is the message conveying the information provided by $\mathbf{y}_l$, whose statistical representation is expressed by the pdf $f(\mathbf{y}_l|\mathbf{x}_l)$ (resulting from the linearised equation (6)); therefore, it can be expressed as

$$\vec{m}_{ms}(\mathbf{x}_l) = \mathcal{N}\big(\mathbf{y}_l;\, \mathbf{h}_l(\hat{\mathbf{x}}_{l|l-1}) + \mathbf{H}_l\,(\mathbf{x}_l - \hat{\mathbf{x}}_{l|l-1}),\, \mathbf{C}_e\big) \tag{22}$$

or, equivalently, as (see [16, Table 3, p. 1304, eqs. (III.5) and (III.6)])

$$\vec{m}_{ms}(\mathbf{x}_l) = \mathcal{N}\big(\mathbf{x}_l;\, \boldsymbol{\mu}_{ms},\, \boldsymbol{\Sigma}_{ms}\big); \tag{23}$$

here, the covariance matrix $\boldsymbol{\Sigma}_{ms}$ and the mean vector $\boldsymbol{\mu}_{ms}$ can be evaluated from the associated precision matrix

$$\mathbf{W}_{ms} = \mathbf{H}_l^T\,\mathbf{C}_e^{-1}\,\mathbf{H}_l \tag{24}$$

and the transformed mean vector

$$\mathbf{w}_{ms} = \mathbf{H}_l^T\,\mathbf{C}_e^{-1}\,\boldsymbol{\nu}_l, \tag{25}$$

respectively, where $\boldsymbol{\nu}_l \triangleq \mathbf{y}_l - \mathbf{h}_l(\hat{\mathbf{x}}_{l|l-1}) + \mathbf{H}_l\,\hat{\mathbf{x}}_{l|l-1}$. Therefore, (21) can be put in the form

$$\vec{m}_1(\mathbf{x}_l) = \mathcal{N}\big(\mathbf{x}_l;\, \boldsymbol{\mu}_1,\, \boldsymbol{\Sigma}_1\big), \tag{26}$$

where the covariance matrix $\boldsymbol{\Sigma}_1$ and the mean vector $\boldsymbol{\mu}_1$ can be evaluated from the associated precision matrix (see CR1, eq. (14))

$$\mathbf{W}_1 = \mathbf{W}_{fp} + \mathbf{W}_{ms} \tag{27}$$

and the transformed mean vector (see CR1, eq. (15))

$$\mathbf{w}_1 = \mathbf{w}_{fp} + \mathbf{w}_{ms}, \tag{28}$$

respectively; here, $\mathbf{W}_{fp} \triangleq \mathbf{P}_{l|l-1}^{-1}$ and $\mathbf{w}_{fp} \triangleq \mathbf{W}_{fp}\,\hat{\mathbf{x}}_{l|l-1}$. The marginal message referring to the nonlinear state component, instead, is easily obtained from (26) by marginalizing the last message with respect to the linear component; this produces

$$\vec{m}_1\big(\mathbf{x}_l^{(N)}\big) = \mathcal{N}\big(\mathbf{x}_l^{(N)};\, \boldsymbol{\mu}_1^{(N)},\, \boldsymbol{\Sigma}_1^{(N)}\big), \tag{29}$$

where $\boldsymbol{\mu}_1^{(N)}$ and $\boldsymbol{\Sigma}_1^{(N)}$ are extracted from the mean vector $\boldsymbol{\mu}_1$ and the covariance matrix $\boldsymbol{\Sigma}_1$ of (26), respectively.

2. The output messages of the considered recursion, since they are evaluated on the basis of the forward estimates computed by EKF and PF, respectively, in the last (i.e. $n_{it}$-th) iteration.

In the following, a detailed description of the messages passed in TF#1 is provided. The formulas derived for this algorithm can easily be re-used in the computation of the messages passed in TF#2; for this reason, after developing TF#1, we limit ourselves to providing a brief description of the scheduling adopted in TF#2.

The scheduling illustrated in Fig. 5 for TF#1 consists in computing the involved (iteration-dependent) messages according to the order indicated in that figure. Therefore, the evaluation of these messages can be organized according to the four steps described below, to be carried out for $i = 1, 2, \ldots, n_{it}$. Note that, in our description of the TF#1 scheduling, particle-dependent messages always refer to the $j$-th particle (with $0 \leq j \leq N_p - 1$) and that, generally speaking, the structure of the particle set changes from iteration to iteration, even if it preserves its cardinality; moreover, the particle set available at the beginning of the $i$-th iteration is
