1 Introduction
The nonlinear filtering problem in continuous time arises in many applications in finance, economics and engineering; see e.g. [1]. We consider the case where one seeks to filter an unobserved diffusion process (the signal) given access to an observation trajectory that is, in theory, continuous in time and itself follows a diffusion process. The nonlinear filter is the solution to the Kallianpur-Striebel formula (e.g. [1]) and typically has no analytical solution. This has led to a substantial literature on the numerical solution of the filtering problem; see for instance [1, 7].
In practice, one has access to very high-frequency observations, but not an entire trajectory, and this often means one has to time-discretize the functionals associated to the path of the observation and signal. This latter task can be achieved by using the approach in [18], which is the one used in this article, but improvements exist; see for instance [5, 6]. Even under such a time discretization, the filter is not available analytically for most problems of interest. From here one must often discretize the dynamics of the signal (e.g. by an Euler scheme), which in essence leads to a high-frequency discrete-time nonlinear filter. This latter object can be approximated using particle filters in discrete time, as in, for instance, [1]; this is the approach followed in this article. Alternatives exist, such as unbiased methods [9] and integration-by-parts, change-of-variables along with Feynman-Kac particle methods [7], but each of these schemes has its advantages and pitfalls versus the one followed in this paper. We refer to e.g. [6] for some discussion.
Particle filters generate samples (or particles) in parallel and sequentially approximate nonlinear filters using sampling and resampling. The algorithms are very well understood mathematically; see for instance [7] and the references therein. Given the particle filter approximation of the discretized filter, using an Euler method for the signal, one can expect to obtain a mean square error (MSE), relative to the true filter, of $\mathcal{O}(\epsilon^2)$, for $\epsilon>0$ arbitrary, such that the associated cost is $\mathcal{O}(\epsilon^{-3})$. This follows from standard results on discretizations and particle filters. In a related context of regular, discrete-time observations and dynamics, with the signal following a diffusion, [15] (see also [14]) show that when the MSE for a particle filter is $\mathcal{O}(\epsilon^2)$, the cost is $\mathcal{O}(\epsilon^{-3})$, and one can improve particle filters using the multilevel Monte Carlo (MLMC) method [11, 12], as we now explain.
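The $\mathcal{O}(\epsilon^{-3})$ cost claim follows from the usual bias-variance balance; a sketch, assuming a first-order weak error for the Euler scheme with step-size $\Delta$ and the standard $\mathcal{O}(N^{-1})$ Monte Carlo variance for $N$ particles (the constants $C_1,C_2$ are illustrative):

```latex
% Bias-variance balance for a particle filter with Euler step \Delta and N particles
% (squared bias O(\Delta^2) assumes first-order weak error; variance is O(N^{-1})):
\mathrm{MSE} \;\lesssim\; \underbrace{C_1\,\Delta^{2}}_{\text{squared bias}}
  \;+\; \underbrace{C_2\,N^{-1}}_{\text{variance}},
\qquad
\Delta \propto \epsilon,\quad N \propto \epsilon^{-2}
\;\Longrightarrow\;
\mathrm{MSE} = \mathcal{O}(\epsilon^{2}),\quad
\mathrm{Cost} \propto N\,\Delta^{-1} = \mathcal{O}(\epsilon^{-3}).
```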
MLMC is an approach which can help to approximate expectations w.r.t. probability measures that are induced by discretizations, such as an Euler method. The idea is to create a collapsing-sum representation of an expectation w.r.t. an accurate discretization, interpolating with differences of expectations of increasingly coarse (in terms of the discretization) probability measures. Then, if one can sample from appropriate couplings of the pairs of probability measures appearing in the differences of the expectations, one can reduce the computational effort to achieve a given MSE. In the case of
[15], one can achieve an MSE of $\mathcal{O}(\epsilon^2)$ for a cost of $\mathcal{O}(\epsilon^{-2.5})$ for a class of processes.

In this paper we apply the methodology of [15], which combines particle filters with the MLMC methodology (termed the multilevel particle filter), to the nonlinear filtering problem in continuous time. The main issue is that, in order to mathematically understand the application of this methodology to this new context, several new results are required. The main difference to the case of [15], other than the processes involved, is the fact that one averages over the data in the analysis of filters in continuous time. This requires one to analyze the properties of several time-discretized Feynman-Kac semigroups, in order to verify the mathematical improvements of the approach (see also [10]). Under assumptions, we prove that to achieve an MSE of $\mathcal{O}(\epsilon^2)$ one requires a cost of $\mathcal{O}(\epsilon^{-2.5})$. This is verified in several numerical examples. We remark that the mathematical results are of interest beyond the context of this article, for instance, unbiased estimation; see [2] for example.

2 Problem
2.1 Notations
Let $(\mathsf{E},\mathcal{E})$ be a measurable space. For $\varphi:\mathsf{E}\rightarrow\mathbb{R}$ we write $\mathcal{B}_b(\mathsf{E})$ as the collection of bounded measurable functions. Let $\varphi:\mathbb{R}^d\rightarrow\mathbb{R}$; $\textrm{Lip}(\mathbb{R}^d)$ denotes the collection of real-valued functions that are Lipschitz w.r.t. $\|\cdot\|_2$, where $\|x\|_2$ denotes the $L_2$-norm of a vector $x\in\mathbb{R}^d$. That is, $\varphi\in\textrm{Lip}(\mathbb{R}^d)$ if there exists a $C<+\infty$ such that for any $(x,y)\in\mathbb{R}^d\times\mathbb{R}^d$
$$|\varphi(x)-\varphi(y)|\leq C\|x-y\|_2.$$
We write $\|\varphi\|_{\textrm{Lip}}$ as the Lipschitz constant of a function $\varphi\in\textrm{Lip}(\mathbb{R}^d)$. For $\varphi\in\mathcal{B}_b(\mathsf{E})$, we write the supremum norm $\|\varphi\|=\sup_{x\in\mathsf{E}}|\varphi(x)|$. $\mathcal{P}(\mathsf{E})$ denotes the collection of probability measures on $(\mathsf{E},\mathcal{E})$. For a measure $\mu$ on $(\mathsf{E},\mathcal{E})$ and a $\varphi\in\mathcal{B}_b(\mathsf{E})$, the notation $\mu(\varphi)=\int_{\mathsf{E}}\varphi(x)\mu(dx)$ is used. $\mathcal{B}(\mathbb{R}^d)$ denotes the Borel sets on $\mathbb{R}^d$. $dx$ is used to denote the Lebesgue measure. For a measurable space $(\mathsf{E},\mathcal{E})$ and a nonnegative measure $\mu$ on this space, we use the tensor-product of functions notation $\mu(\varphi\otimes\psi)$ for $(\varphi,\psi)\in\mathcal{B}_b(\mathsf{E})^2$, where $(\varphi\otimes\psi)(x,y)=\varphi(x)\psi(y)$. Let $K:\mathsf{E}\times\mathcal{E}\rightarrow[0,\infty)$ be a nonnegative operator and $\mu$ be a measure; then we use the notations $\mu K(dy)=\int_{\mathsf{E}}\mu(dx)K(x,dy)$ and, for $\varphi\in\mathcal{B}_b(\mathsf{E})$, $K(\varphi)(x)=\int_{\mathsf{E}}\varphi(y)K(x,dy)$. For $A\in\mathcal{E}$ the indicator is written $\mathbb{I}_A(x)$. $\mathcal{N}_s(\mu,\Sigma)$ (resp. $\phi_s(x;\mu,\Sigma)$) denotes an $s$-dimensional Gaussian distribution (density evaluated at $x\in\mathbb{R}^s$) of mean $\mu$ and covariance $\Sigma$. If $s=1$ we omit the subscript $s$. For a vector/matrix $X$, $X^{*}$ is used to denote the transpose of $X$. For $x\in\mathsf{E}$, $\delta_x(dy)$ denotes the Dirac measure of $x$, and if $x=x_{1:N}$ with $x_i\in\mathsf{E}$, we write $\delta_{x_{1:N}}(dy)=\frac{1}{N}\sum_{i=1}^N\delta_{x_i}(dy)$. For a vector-valued function in $d$ dimensions (resp. $d$-dimensional vector), $\varphi(x)$ (resp. $x$) say, we write the $i$-th component ($i\in\{1,\dots,d\}$) as $\varphi^{(i)}(x)$ (resp. $x^{(i)}$). For a $d\times q$ matrix $A$ we write the $(i,j)$-th entry as $A^{(ij)}$. For $\mu\in\mathcal{P}(\mathsf{E})$ and a random variable $X$ on $\mathsf{E}$ with distribution associated to $\mu$, we use the notation $\mathbb{E}_{\mu}[X]$. For a finite set $S$, we write $\textrm{Card}(S)$ as the cardinality of $S$.

2.2 Model
Let $(\Omega,\mathcal{F})$ be a measurable space. On $(\Omega,\mathcal{F})$ consider the probability measure $\mathbb{P}$ and a pair of stochastic processes $\{Y_t\}_{t\geq 0}$, $\{X_t\}_{t\geq 0}$, with $Y_t\in\mathbb{R}^{d_y}$, $X_t\in\mathbb{R}^{d_x}$, $(d_y,d_x)\in\mathbb{N}^2$, with $X_0=x_0\in\mathbb{R}^{d_x}$ given:
(1) $$dY_t = h(X_t)\,dt + dB_t$$
(2) $$dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t$$
where $h:\mathbb{R}^{d_x}\rightarrow\mathbb{R}^{d_y}$, $b:\mathbb{R}^{d_x}\rightarrow\mathbb{R}^{d_x}$, $\sigma:\mathbb{R}^{d_x}\rightarrow\mathbb{R}^{d_x\times d_x}$, with $\sigma$ nonconstant and of full rank, and $\{B_t\}_{t\geq 0}$, $\{W_t\}_{t\geq 0}$ are independent standard Brownian motions of dimension $d_y$ and $d_x$ respectively. To minimize certain technical difficulties, the following assumption is made throughout the paper:

We have:

1. $\sigma$ is bounded with $\sigma^{(ij)}\in\textrm{Lip}(\mathbb{R}^{d_x})$ for each $(i,j)$, and $\sigma\sigma^{*}$ is uniformly elliptic.

2. $b$ and $h$ are bounded and $b^{(i)}\in\textrm{Lip}(\mathbb{R}^{d_x})$, $h^{(j)}\in\textrm{Lip}(\mathbb{R}^{d_x})$, for each $(i,j)$.
Now, we introduce the probability measure $\overline{\mathbb{P}}$, which is equivalent to $\mathbb{P}$, defined by the Radon-Nikodym derivative
$$Z_t = \frac{d\mathbb{P}}{d\overline{\mathbb{P}}}\Big|_{\mathcal{F}_t} = \exp\Big\{\int_0^t h(X_s)^{*}dY_s-\frac{1}{2}\int_0^t h(X_s)^{*}h(X_s)\,ds\Big\},$$
with, under $\overline{\mathbb{P}}$, $\{X_t\}_{t\geq 0}$ following the dynamics (2) and, independently, $\{Y_t\}_{t\geq 0}$ a standard Brownian motion. We have the solution to the Zakai equation: for $\varphi\in\mathcal{B}_b(\mathbb{R}^{d_x})$,
$$\gamma_t(\varphi) := \overline{\mathbb{E}}\big[Z_t\varphi(X_t)\,|\,\mathcal{Y}_t\big],$$
where $\{\mathcal{Y}_t\}_{t\geq 0}$ is the filtration generated by the process $\{Y_t\}_{t\geq 0}$. Our objective is to, recursively in time, estimate the filter: for $\varphi\in\mathcal{B}_b(\mathbb{R}^{d_x})$,
$$\pi_t(\varphi) := \frac{\gamma_t(\varphi)}{\gamma_t(1)}.$$
2.3 Discretized Model
In practice, we will have to work with a discretization of the model in (1)-(2), for several reasons:

One only has access to finite, but possibly very high-frequency, data.

The transition density of the signal in (2) is typically unavailable analytically.
We will assume access to a path of data which is observed at a high frequency, as mentioned above.
Let $l\in\mathbb{N}_0$ be given and consider an Euler discretization of step-size $\Delta_l=2^{-l}$, with $\check{X}_0=x_0$ and, for $k\in\mathbb{N}_0$:
(3) $$\check{X}_{(k+1)\Delta_l} = \check{X}_{k\Delta_l} + b\big(\check{X}_{k\Delta_l}\big)\Delta_l + \sigma\big(\check{X}_{k\Delta_l}\big)\big[W_{(k+1)\Delta_l}-W_{k\Delta_l}\big].$$
It should be noted that the Brownian motion in (3) is the same as in (2), under both $\mathbb{P}$ and $\overline{\mathbb{P}}$. Then, for $t\in\mathbb{N}$, define:
$$Z_t^l := \exp\Big\{\sum_{k=0}^{t\Delta_l^{-1}-1}\Big(h\big(\check{X}_{k\Delta_l}\big)^{*}\big[Y_{(k+1)\Delta_l}-Y_{k\Delta_l}\big]-\frac{\Delta_l}{2}h\big(\check{X}_{k\Delta_l}\big)^{*}h\big(\check{X}_{k\Delta_l}\big)\Big)\Big\}$$
and note that for any $t\in\mathbb{N}$, $Z_t^l$ is simply a discretization of $Z_t$ (of the type of [18]). Then set, for $\varphi\in\mathcal{B}_b(\mathbb{R}^{d_x})$ and $t\in\mathbb{N}$,
$$\gamma_t^l(\varphi) := \overline{\mathbb{E}}\big[Z_t^l\varphi\big(\check{X}_t\big)\,|\,\mathcal{Y}_t\big].$$
For notational convenience $\gamma_0^l(\varphi):=\varphi(x_0)$. For $t\in\mathbb{N}$ one can also set
$$\pi_t^l(\varphi) := \frac{\gamma_t^l(\varphi)}{\gamma_t^l(1)},$$
where we define $\pi_0^l(\varphi):=\varphi(x_0)$.
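To illustrate the discretized functional above, the following sketch (the helper name and array layout are ours, not the paper's) computes the logarithm of such a Riemann-sum discretization given skeletons of the signal and observation on a grid of step-size $\Delta$:

```python
import numpy as np

def log_girsanov_weight(x_path, y_path, h, delta):
    """Logarithm of a Riemann-sum discretization of the Girsanov-type
    functional (hypothetical helper):
      sum_k h(x_{k delta})^T [y_{(k+1) delta} - y_{k delta}]
            - (delta / 2) * ||h(x_{k delta})||^2.
    x_path: (n + 1, d_x) signal skeleton on the grid {0, delta, ..., n delta};
    y_path: (n + 1, d_y) observation skeleton on the same grid."""
    hx = np.array([h(x) for x in x_path[:-1]])  # h at left grid points
    dy = np.diff(y_path, axis=0)                # observation increments
    return float(np.sum(hx * dy) - 0.5 * delta * np.sum(hx ** 2))
```

For instance, with $h(x)=x$, a constant signal skeleton at $1$, a single observation increment of $0.5$ and $\Delta=0.1$, the log-weight is $0.5-0.05=0.45$.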
3 Approach
For notational convenience, throughout this Section, we omit the $\check{\cdot}$ notation from $\check{X}$ for the Euler discretization.
3.1 Particle Filter
Let $l\in\mathbb{N}_0$ be given; we consider approximating $\pi_t^l(\varphi)$ using a particle filter. For $k\in\mathbb{N}_0$ set
$$u_k := \big(x_k,x_{k+\Delta_l},\dots,x_{k+1}\big)\in\mathsf{E}_l:=\mathbb{R}^{d_x(\Delta_l^{-1}+1)}.$$
For $k\in\mathbb{N}_0$, we define, for any $u_k\in\mathsf{E}_l$,
$$G_k^l(u_k) := \exp\Big\{\sum_{j=0}^{\Delta_l^{-1}-1}\Big(h\big(x_{k+j\Delta_l}\big)^{*}\big[y_{k+(j+1)\Delta_l}-y_{k+j\Delta_l}\big]-\frac{\Delta_l}{2}h\big(x_{k+j\Delta_l}\big)^{*}h\big(x_{k+j\Delta_l}\big)\Big)\Big\}.$$
Set, with $t\in\mathbb{N}$,
$$Z_t^l = \prod_{k=0}^{t-1}G_k^l(u_k).$$
Denote by $M^l$ the joint Markov transition of $u_{k+1}$ given $u_k$, defined via the Euler discretization (3) and a Dirac on a point: for $u=(x_k,\dots,x_{k+1})\in\mathsf{E}_l$,
$$M^l(u,du') := \delta_{x_{k+1}}\big(du'^{(1)}\big)\prod_{j=2}^{\Delta_l^{-1}+1}P^l\big(u'^{(j-1)},du'^{(j)}\big),$$
where $P^l$ denotes one step of the Euler transition (3) and $u'^{(j)}$ the $j$-th (vector) coordinate of $u'$. For $k\in\mathbb{N}$, define the operator $\Phi_k^l:\mathcal{P}(\mathsf{E}_l)\rightarrow\mathcal{P}(\mathsf{E}_l)$ with $\mu\in\mathcal{P}(\mathsf{E}_l)$ as:
(4) $$\Phi_k^l(\mu)(du) := \frac{\mu\big(G_{k-1}^lM^l(\cdot,du)\big)}{\mu\big(G_{k-1}^l\big)}$$
where, to clarify, $\mu\big(G_{k-1}^lM^l(\cdot,du)\big)=\int_{\mathsf{E}_l}\mu(dv)G_{k-1}^l(v)M^l(v,du)$. Now, define, for $k\in\mathbb{N}$,
$$\eta_k^l := \Phi_k^l\big(\eta_{k-1}^l\big),\qquad \eta_0^l(du):=\delta_{x_0}\big(du^{(1)}\big)\prod_{j=2}^{\Delta_l^{-1}+1}P^l\big(u^{(j-1)},du^{(j)}\big).$$
Then one can establish that for $t\in\mathbb{N}$ and $\varphi\in\mathcal{B}_b(\mathbb{R}^{d_x})$,
$$\pi_t^l(\varphi) = \eta_t^l(\bar{\varphi}),$$
where $\bar{\varphi}(u):=\varphi\big(u^{(1)}\big)$ depends only upon the first coordinate of $u$. Moreover, for $t\in\mathbb{N}$,
(5) $$\gamma_t^l(1) = \prod_{k=0}^{t-1}\eta_k^l\big(G_k^l\big).$$
The objective of the PF is to provide an approximation of the formulae (4) and (5).
Let $N\in\mathbb{N}$ be given; then the particle filter generates a system of random variables $\big(u_k^1,\dots,u_k^N\big)$ on $\mathsf{E}_l^N$ at a time $k\in\mathbb{N}$ according to the probability measure
$$\prod_{i=1}^{N}\Phi_k^l\big(\eta_{k-1}^{l,N}\big)\big(du_k^i\big),$$
where for $k\in\mathbb{N}_0$
$$\eta_k^{l,N}(du) := \frac{1}{N}\sum_{i=1}^{N}\delta_{u_k^i}(du),$$
and at time $k=0$ the particles are sampled i.i.d. from $\eta_0^l$. The particle filter is summarized in Algorithm 1. For $t\in\mathbb{N}$ one can approximate $\gamma_t^l(1)$ via
(6) $$\gamma_t^{l,N}(1) := \prod_{k=0}^{t-1}\eta_k^{l,N}\big(G_k^l\big).$$
For $t\in\mathbb{N}$ one can also estimate the filter at time $t$, $\pi_t^l(\varphi)$, as
$$\pi_t^{l,N}(\varphi) := \eta_t^{l,N}(\bar{\varphi}) = \frac{1}{N}\sum_{i=1}^{N}\bar{\varphi}\big(u_t^i\big).$$
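The recursion above can be sketched as follows; this is a minimal one-dimensional bootstrap-type illustration with multinomial resampling at every Euler step (a simplification: it is not the paper's Algorithm 1, and the function names and the initial condition $X_0=0$ are our own choices):

```python
import numpy as np

def particle_filter(y_path, n_particles, delta, b, sigma, h, rng):
    # One-dimensional bootstrap-type particle filter for the Euler-discretized
    # model: Euler mutation, weighting by the discretized observation
    # functional, then multinomial resampling (done every step for simplicity).
    n_steps = y_path.shape[0] - 1
    x = np.zeros(n_particles)                 # X_0 = 0 assumed for the sketch
    means = []
    for k in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(delta), size=n_particles)
        x = x + b(x) * delta + sigma(x) * dW  # Euler mutation, cf. (3)
        dy = y_path[k + 1] - y_path[k]
        logw = h(x) * dy - 0.5 * delta * h(x) ** 2
        w = np.exp(logw - logw.max())         # numerically stabilized weights
        w /= w.sum()
        means.append(float(np.sum(w * x)))    # weighted filter-mean estimate
        idx = rng.choice(n_particles, size=n_particles, p=w)
        x = x[idx]                            # multinomial resampling
    return means
```

A call with a synthetic observation path returns the sequence of weighted filter-mean estimates, one per observation increment.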
3.2 Coupled Particle Filter
Let $L\in\mathbb{N}$ be given; in multilevel estimation, the basic idea is to approximate, for $\varphi\in\mathcal{B}_b(\mathbb{R}^{d_x})$,
$$\pi_t^L(\varphi) = \pi_t^0(\varphi)+\sum_{l=1}^{L}\big[\pi_t^l(\varphi)-\pi_t^{l-1}(\varphi)\big].$$
Normally $L$ is chosen to target a specific bias and this is the strategy considered here. There is a complication, as $L$ also determines the frequency of the data that are used; this is discussed below. We focus on the term $\pi_t^l(\varphi)-\pi_t^{l-1}(\varphi)$, $l\in\{1,\dots,L\}$, as one can use the PF above for approximating the term $\pi_t^0(\varphi)$ (see (6)).
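The collapsing-sum idea can be illustrated, outside the filtering context, on the toy problem of estimating $\mathbb{E}[\varphi(X_T)]$ for a scalar diffusion; the drift/volatility values, the payoff $\varphi$ and the per-level sample sizes below are illustrative only (a sketch of the MLMC mechanism, not the paper's estimator):

```python
import numpy as np

def euler_path(T, n_steps, n_paths, rng, dW=None):
    # Euler scheme for the scalar SDE dX_t = a X_t dt + s X_t dW_t, X_0 = 1
    # (illustrative drift/volatility values).
    a, s = 0.05, 0.2
    dt = T / n_steps
    if dW is None:
        dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    X = np.ones(n_paths)
    for k in range(n_steps):
        X = X + a * X * dt + s * X * dW[:, k]
    return X

def mlmc_estimate(T, L, N0, rng):
    # Collapsing-sum estimator: E[phi(X^0)] plus coupled level differences
    # E[phi(X^l) - phi(X^{l-1})], each pair driven by shared increments.
    phi = lambda x: np.maximum(x - 1.0, 0.0)
    est = np.mean(phi(euler_path(T, 1, N0, rng)))
    for l in range(1, L + 1):
        Nl = max(N0 // 2 ** l, 100)           # illustrative sample sizes
        n_f = 2 ** l                          # fine level: 2^l steps
        dt_f = T / n_f
        dW_f = rng.normal(0.0, np.sqrt(dt_f), size=(Nl, n_f))
        dW_c = dW_f[:, 0::2] + dW_f[:, 1::2]  # coarse = sums of fine pairs
        Xf = euler_path(T, n_f, Nl, rng, dW=dW_f)
        Xc = euler_path(T, n_f // 2, Nl, rng, dW=dW_c)
        est += np.mean(phi(Xf) - phi(Xc))
    return float(est)
```

The shared increments make each level difference small in variance, which is what permits the decreasing per-level sample sizes.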
For the level $l$ (resp. level $l-1$) discretization, we define $u_k^l:=(x_k^l,\dots,x_{k+1}^l)\in\mathsf{E}_l$ (resp. $u_k^{l-1}:=(x_k^{l-1},\dots,x_{k+1}^{l-1})\in\mathsf{E}_{l-1}$). The following exposition closely follows [13], with modifications to the context here. Let $\check{P}^l$ be a Markov kernel, for pairs of paths $(x^l,x^{l-1})$, constructed by using the same Brownian increments in the discretization (3) (see e.g. [11] or [16, Section 3.3]). Let $\check{M}^l$ be a Markov kernel defined for $(u^l,u^{l-1})\in\mathsf{E}_l\times\mathsf{E}_{l-1}$, analogously to $M^l$, with the Euler steps at levels $l$ and $l-1$ driven by $\check{P}^l$. Note that for any $(u^l,u^{l-1})\in\mathsf{E}_l\times\mathsf{E}_{l-1}$, $\varphi\in\mathcal{B}_b(\mathsf{E}_l)$ and $\psi\in\mathcal{B}_b(\mathsf{E}_{l-1})$,
$$\check{M}^l(\varphi\otimes 1)\big(u^l,u^{l-1}\big)=M^l(\varphi)\big(u^l\big),\qquad \check{M}^l(1\otimes\psi)\big(u^l,u^{l-1}\big)=M^{l-1}(\psi)\big(u^{l-1}\big);$$
that is, $\check{M}^l$ preserves the marginal transitions at each level.
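Sampling a pair of Euler paths with the same Brownian increments (the coarse increment being the sum of two consecutive fine increments) can be sketched as follows; the function is a hypothetical helper for scalar dynamics, not the paper's kernel:

```python
import numpy as np

def coupled_euler_step(xf, xc, b, sigma, delta_f, rng):
    # One coarse step of a synchronously coupled pair: two fine Euler steps of
    # size delta_f for xf, one coarse step of size 2 * delta_f for xc, driven
    # by the same Brownian increments (coarse increment = sum of the fine two).
    dW1 = rng.normal(0.0, np.sqrt(delta_f))
    dW2 = rng.normal(0.0, np.sqrt(delta_f))
    xf = xf + b(xf) * delta_f + sigma(xf) * dW1
    xf = xf + b(xf) * delta_f + sigma(xf) * dW2
    xc = xc + b(xc) * (2.0 * delta_f) + sigma(xc) * (dW1 + dW2)
    return xf, xc
```

Both marginals are exact Euler transitions at their respective step-sizes; the coupling only ties the driving noise together.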
Let $\check{\eta}\in\mathcal{P}(\mathsf{E}_l\times\mathsf{E}_{l-1})$ and define the probability measure:
(7) $$\check{\Phi}_k^l(\check{\eta})\big(d(u^l,u^{l-1})\big) := \int_{\mathsf{E}_l\times\mathsf{E}_{l-1}}\check{\eta}\big(d(v^l,v^{l-1})\big)\,\check{R}_k^l\big((v^l,v^{l-1}),d(w^l,w^{l-1})\big)\,\check{M}^l\big((w^l,w^{l-1}),d(u^l,u^{l-1})\big),$$
where, for $k\in\mathbb{N}_0$ and $(v^l,v^{l-1})\in\mathsf{E}_l\times\mathsf{E}_{l-1}$, $\check{R}_k^l$ is a coupled resampling kernel whose first marginal selects according to the potential $G_k^l$ (normalized by $\check{\eta}(G_k^l\otimes 1)$) and whose second marginal selects according to $G_k^{l-1}$ (normalized by $\check{\eta}(1\otimes G_k^{l-1})$); a maximal coupling of these two is used here (see Algorithm 2). Define, for $k\in\mathbb{N}$,
$$\check{\eta}_k^l := \check{\Phi}_k^l\big(\check{\eta}_{k-1}^l\big),$$
with $\check{\eta}_0^l$ the coupled analogue of $\eta_0^l$, and, for $\varphi\in\mathcal{B}_b(\mathbb{R}^{d_x})$,
(8) $$\check{\pi}_t^l(\varphi) := \check{\eta}_t^l\big(\bar{\varphi}\otimes 1\big)-\check{\eta}_t^l\big(1\otimes\bar{\varphi}\big).$$
Now it can be shown (see [15]) that for $t\in\mathbb{N}$ and $\varphi\in\mathcal{B}_b(\mathbb{R}^{d_x})$,
(9) $$\check{\pi}_t^l(\varphi) = \pi_t^l(\varphi)-\pi_t^{l-1}(\varphi).$$
The objective of the coupled particle filter (CPF) is to provide an approximation of the formulae (7) and (9).
For $N\in\mathbb{N}$, set $\check{\eta}_0^{l,N}$ as the empirical measure of $N$ i.i.d. samples from $\check{\eta}_0^l$. A CPF for sequentially approximating $\check{\eta}_k^l$ is then generated on $(\mathsf{E}_l\times\mathsf{E}_{l-1})^N$ at a time $k\in\mathbb{N}$ according to the probability measure
$$\prod_{i=1}^{N}\check{\Phi}_k^l\big(\check{\eta}_{k-1}^{l,N}\big)\big(d(u_k^{l,i},u_k^{l-1,i})\big),$$
where for $k\in\mathbb{N}_0$
$$\check{\eta}_k^{l,N}\big(d(u^l,u^{l-1})\big) := \frac{1}{N}\sum_{i=1}^{N}\delta_{(u_k^{l,i},u_k^{l-1,i})}\big(d(u^l,u^{l-1})\big).$$
To run a CPF, one must understand how to sample from $\check{R}_k^l$, which is detailed in Algorithm 2. The CPF is then described in Algorithm 3. Then one can approximate $\check{\pi}_t^l(\varphi)$, via (9), as
(10) $$\check{\pi}_t^{l,N}(\varphi) := \check{\eta}_t^{l,N}\big(\bar{\varphi}\otimes 1\big)-\check{\eta}_t^{l,N}\big(1\otimes\bar{\varphi}\big).$$
For $t\in\mathbb{N}$ one can also estimate the differences of the filter at time $t$, $\pi_t^l(\varphi)-\pi_t^{l-1}(\varphi)$, as
$$\frac{1}{N}\sum_{i=1}^{N}\Big[\bar{\varphi}\big(u_t^{l,i}\big)-\bar{\varphi}\big(u_t^{l-1,i}\big)\Big].$$
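One common choice of coupled resampling is a maximal coupling of the two normalized weight vectors; a sketch (the function name is ours, and this is only one possible choice of coupling):

```python
import numpy as np

def maximally_coupled_indices(w1, w2, rng):
    # Sample a pair (i, j) of resampling indices with marginals w1 and w2,
    # maximizing P(i == j): a maximal coupling of two categorical laws.
    overlap = np.minimum(w1, w2)
    alpha = overlap.sum()                 # probability of staying coupled
    if rng.uniform() < alpha:
        i = rng.choice(len(w1), p=overlap / alpha)
        return i, i                       # coupled: same index on both levels
    r1 = (w1 - overlap) / (1.0 - alpha)   # residual distributions
    r2 = (w2 - overlap) / (1.0 - alpha)
    return rng.choice(len(w1), p=r1), rng.choice(len(w2), p=r2)
```

Marginally each side is a standard multinomial resample, so the CPF retains the correct marginal particle filters while keeping pairs together as often as possible.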