## 1 Introduction

A stochastic hybrid system (SHS, hereafter), also known as a switching diffusion [3], is a continuous-time Markov process $Z = (\theta, X)$ with state space consisting of both discrete and continuous parts, namely $S$ and $\mathbb{R}^d$, respectively. The elements of $S$ are, without loss of generality, specified as the standard basis $\{e_1, \dots, e_N\}$ of $\mathbb{R}^N$ in this article. Denoting by $\langle \cdot, \cdot \rangle$ the inner product of $\mathbb{R}^N$ or $\mathbb{R}^d$, we have $\langle e_i, e_j \rangle = \delta_{ij}$, where $\delta_{ij}$ is Kronecker's delta. The discrete part of $Z$, denoted by $\theta$, can be seen as a continuous-time semi-Markov chain with state space $S$ and "Q matrix" of the form $Q(X_t) = [Q_{ij}(X_t)]$, where $X$ is the continuous part of $Z$. In other words,

$$
P(\theta_{t+h} = e_j \mid \theta_t = e_i, X_t = x) = \delta_{ij} + Q_{ij}(x)h + o(h) \tag{1}
$$

as $h \downarrow 0$. Here, $Q(x)$ is a Q matrix for each $x \in \mathbb{R}^d$, that is, $Q_{ij}(x) \geq 0$ for $i \neq j$ and $\sum_j Q_{ij}(x) = 0$ for all $i$. The continuous part $X$ is a semi-Markov process on $\mathbb{R}^d$ and defined as the solution of a stochastic differential equation

$$
\mathrm{d}X_t = \mu(\theta_t, X_t)\,\mathrm{d}t + \mathrm{d}W_t
$$

for some $\mathbb{R}^d$-valued function $\mu$ on $S \times \mathbb{R}^d$, where $W$ is a $d$-dimensional Brownian motion, so that $\dot{W}$ is a $d$-dimensional white noise. The generator $\mathcal{L}$ of this Markov process is given by

$$
\mathcal{L}f(e_i, x) = \frac{1}{2}\Delta_x f(e_i, x) + \langle \mu(e_i, x), \nabla_x f(e_i, x) \rangle + \sum_{j=1}^N Q_{ij}(x) f(e_j, x). \tag{2}
$$
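As a concrete illustration of the dynamics (1) and (2), the following sketch simulates a sample path of $(\theta, X)$ by an Euler scheme: the continuous part takes a Gaussian step with drift $\mu(\theta_t, X_t)$, and the discrete part makes a transition with probability $Q_{ij}(X_t)h$ per step, in line with (1). The functions `mu` and `Q` in the usage lines are invented toy choices, not from the text, and states are encoded as indices $i$ rather than basis vectors $e_i$.

```python
import numpy as np

def simulate_shs(mu, Q, x0, i0, T=1.0, n=1000, rng=None):
    """Euler-scheme sample path of an SHS (theta, X).

    mu(i, x): drift vector of the continuous part in discrete state i.
    Q(x):     N x N Q matrix (off-diagonal rates >= 0, rows sum to 0).
    Over a step of size h, theta moves from i to j != i with
    probability Q_ij(x) * h, as in (1); X takes a Gaussian step.
    """
    rng = np.random.default_rng(rng)
    h = T / n
    x = np.asarray(x0, dtype=float)
    i = int(i0)
    thetas, xs = [i], [x.copy()]
    for _ in range(n):
        # continuous part: unit diffusion, drift depending on (theta, X)
        x = x + mu(i, x) * h + np.sqrt(h) * rng.standard_normal(x.shape)
        # discrete part: one Euler step of the chain with rates Q(x)
        p = Q(x)[i] * h          # off-diagonal: jump probabilities
        p[i] = 1.0 + p[i]        # diagonal entry of Q is negative
        p = np.maximum(p, 0.0)
        i = int(rng.choice(len(p), p=p / p.sum()))
        thetas.append(i)
        xs.append(x.copy())
    return np.array(thetas), np.array(xs)

# Toy two-state example (illustrative parameter choices only).
mu = lambda i, x: np.array([1.0 if i == 0 else -1.0]) - x
Q = lambda x: (1.0 + x[0] ** 2) * np.array([[-1.0, 1.0], [1.0, -1.0]])
theta, X = simulate_shs(mu, Q, x0=[0.0], i0=0, T=1.0, n=500, rng=0)
```

Note that the Euler step of the chain is only a first-order approximation of (1); the exact conditional construction via exponential clocks is given in Section 2.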

There is a vast literature on the analysis and applications of SHS; see e.g. [3, 4, 7, 5] and the references therein. The author's motivation to study SHS is its potential application to the analysis of single-molecule dynamics with several unobservable switching states [2]. In this article, we consider parametric estimation of the Q matrix $Q$ and of the drift coefficient $\mu$ based on a partial observation where $X$ is monitored continuously in time, while $\theta$ is unobserved. When both $Q$ and $\mu$ do not depend on $X$, the system is a hidden Markov model studied in Elliott et al. [1]. Extending results developed in [1], we derive a finite-dimensional filter and the EM algorithm for the SHS. In Section 2, we describe the basic properties of SHS as the solution of a martingale problem. In Section 3, we derive the likelihood function under complete observations of both $\theta$ and $X$ on a time interval $[0, T]$. In Section 4, we consider the case where the discrete part $\theta$ is unobservable, and construct a finite-dimensional filter extending Elliott et al. [1]. In Section 5, again by extending Elliott et al. [1], we construct the EM algorithm for parametric estimation under the partial observation.

## 2 A construction as a weak solution

Here we construct a SHS as a weak solution; that is, we construct a distribution on the path space of càdlàg paths in $S \times \mathbb{R}^d$ which solves the martingale problem associated with the generator (2).

A direct application of Theorem (5.2) of Stroock [6] provides the following.

###### Theorem 2.1

Let $\mu$ be a bounded Borel function and $Q_{ij}$, $i \neq j$, be bounded continuous functions. Then, for any initial point $(e, x) \in S \times \mathbb{R}^d$, there exists a unique probability measure $P_{(e,x)}$ on the path space such that $P_{(e,x)}(Z_0 = (e, x)) = 1$ and

$$
f(Z_t) - f(Z_0) - \int_0^t \mathcal{L}f(Z_s)\,\mathrm{d}s
$$

is a martingale under $P_{(e,x)}$ for any $f$ in the domain of $\mathcal{L}$, where $Z = (\theta, X)$ is the canonical map on the path space. Moreover, $Z$ is a strong Markov process with generator (2).

The uniqueness part of Theorem 2.1 is important in this article. For the existence, we give below an explicit construction, which plays a key role in solving a filtering problem later.

First, we construct a SHS with $\mu = 0$ in a pathwise manner. Without loss of generality, assume $X_0 = 0$. Note that $X$ is then a $d$-dimensional Brownian motion. Let $(\Omega, \mathcal{F}, P)$ be a probability space on which a $d$-dimensional Brownian motion $W$ and an i.i.d. sequence $\{E_k\}_{k \geq 0}$ of unit-mean exponential random variables that is independent of $W$ are defined. Conditionally on $W$, a time-inhomogeneous continuous-time Markov chain $\theta$ with (1) is defined using the exponential variables. More specifically, given $\theta_0$, let

$$
\tau_1 = \inf\Big\{ t > 0 : \int_0^t \langle \theta_0, -Q(W_s)\theta_0 \rangle\,\mathrm{d}s \geq E_0 \Big\}
$$

and for $t < \tau_1$, $\theta_t = \theta_0$, with $\theta_{\tau_1} = e_j$, $j \neq i$, drawn with conditional probability $Q_{ij}(W_{\tau_1})/(-Q_{ii}(W_{\tau_1}))$ given $\theta_0 = e_i$. The construction goes in a recursive manner; given $(\tau_k, \theta_{\tau_k})$, let

$$
\tau_{k+1} = \inf\Big\{ t > \tau_k : \int_{\tau_k}^t \langle \theta_{\tau_k}, -Q(W_s)\theta_{\tau_k} \rangle\,\mathrm{d}s \geq E_k \Big\}
$$

and for $\tau_k \leq t < \tau_{k+1}$, $\theta_t = \theta_{\tau_k}$, with $\theta_{\tau_{k+1}}$ drawn in the same way. Properties of the exponential distribution verify the following lemma.
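The recursion above can be sketched numerically: freeze a discretized path of the Brownian motion, accumulate the exit rate $-Q_{ii}$ along it until it exhausts a unit-mean exponential variable, then draw the next state. The function below is a minimal sketch of this conditional construction (states as indices, hypothetical toy `Q`); it is an illustration, not from the text.

```python
import numpy as np

def chain_given_path(Q, w, h, i0, rng=None):
    """Sample theta on a time grid, conditionally on a frozen path w of X.

    The chain stays in state i until the integrated exit rate
    int -Q_ii(w_s) ds reaches an independent Exp(1) variable E_k,
    then jumps to j != i with probability Q_ij / (-Q_ii).
    """
    rng = np.random.default_rng(rng)
    theta = np.empty(len(w), dtype=int)
    i = int(i0)
    clock, acc = rng.exponential(), 0.0   # E_0 and the integrated rate
    for k in range(len(w)):
        theta[k] = i
        q = Q(w[k])
        acc += -q[i, i] * h
        if acc >= clock:                  # holding time expired: jump
            p = np.maximum(q[i], 0.0)
            p[i] = 0.0
            i = int(rng.choice(len(p), p=p / p.sum()))
            clock, acc = rng.exponential(), 0.0
    return theta

# Frozen Brownian path and a toy state-dependent Q matrix.
rng = np.random.default_rng(42)
h, n = 1e-3, 2000
w = np.concatenate([[0.0], np.cumsum(np.sqrt(h) * rng.standard_normal(n))])
Q = lambda x: (1.0 + x ** 2) * np.array([[-2.0, 2.0], [3.0, -3.0]])
theta = chain_given_path(Q, w, h, i0=0, rng=7)
```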

###### Lemma 2.1

By Itô's formula, for any bounded $f$ on $S \times \mathbb{R}^d$ that is smooth in the second argument, we have

$$
f(Z_t) = f(Z_0) + \int_0^t \langle \nabla_x f(Z_s), \mathrm{d}X_s \rangle + \frac{1}{2}\int_0^t \Delta_x f(Z_s)\,\mathrm{d}s + \sum_{0 < s \leq t} \big( f(\theta_s, X_s) - f(\theta_{s-}, X_s) \big),
$$

from which together with (1) it follows that

$$
E[f(Z_t)] = E[f(Z_0)] + \int_0^t E[\mathcal{L}_0 f(Z_s)]\,\mathrm{d}s,
$$

where $E$ is the expectation under $P$ and $\mathcal{L}_0$ is (2) with $\mu = 0$. Not only this, we have also that

$$
\theta_t - \theta_0 - \int_0^t Q(X_s)^\top \theta_s\,\mathrm{d}s
$$

is a martingale with respect to the filtration generated by $(\theta, W)$. Even more importantly, Lemma 2.1 implies the following.

###### Lemma 2.2

Under the same conditions of Lemma 2.1, for any $f : S \to \mathbb{R}$,

$$
f(\theta_t) - f(\theta_0) - \int_0^t \langle v_f, Q(X_s)^\top \theta_s \rangle\,\mathrm{d}s
$$

is a martingale with respect to the natural filtration of $\theta$ under the conditional probability measure $P(\cdot \mid W)$ given $W$, where $v_f = (f(e_1), \dots, f(e_N))$ with $f(\theta) = \langle v_f, \theta \rangle$. In particular,

$$
\theta_t - \theta_0 - \int_0^t Q(X_s)^\top \theta_s\,\mathrm{d}s
$$

is a martingale under the conditional probability measure $P(\cdot \mid W)$.

Now we construct a SHS for a general bounded Borel function $\mu$. Let

$$
\Lambda_t = \exp\Big( \int_0^t \langle \mu(\theta_s, X_s), \mathrm{d}X_s \rangle - \frac{1}{2}\int_0^t |\mu(\theta_s, X_s)|^2\,\mathrm{d}s \Big). \tag{4}
$$

By the boundedness of $\mu$, Novikov's condition is satisfied and so, $\Lambda$ is an $(\mathcal{F}_t)$-martingale under $P$. Therefore,

$$
\frac{\mathrm{d}P^\mu}{\mathrm{d}P} = \Lambda_T
$$

defines a probability space $(\Omega, \mathcal{F}, P^\mu)$.
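On a discrete grid, the exponent of (4) can be approximated by a left-point Itô sum. The helper below is a hypothetical illustration of that discretization (states encoded as indices), not part of the construction itself.

```python
import numpy as np

def log_girsanov(mu, theta, X, h):
    """Left-point discretization of the exponent of (4):
    int <mu(theta, X), dX> - (1/2) int |mu(theta, X)|^2 dt."""
    dX = np.diff(X, axis=0)
    m = np.array([np.atleast_1d(mu(int(i), x))
                  for i, x in zip(theta[:-1], X[:-1])])
    return float((m * dX).sum() - 0.5 * (m ** 2).sum() * h)

# With mu identically zero the density is 1, so the log is 0.
X = np.linspace(0.0, 1.0, 11).reshape(-1, 1)
theta = np.zeros(11, dtype=int)
zero_ll = log_girsanov(lambda i, x: np.zeros(1), theta, X, h=0.1)
```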

###### Theorem 2.2

Let $Q$ be a Q matrix-valued bounded continuous function and $\mu$ be an $\mathbb{R}^d$-valued bounded Borel function. Under $P^\mu$, $Z = (\theta, X)$ is a Markov process with generator (2). Further, for any $f$ in the domain of $\mathcal{L}$,

$$
f(Z_t) - f(Z_0) - \int_0^t \mathcal{L}f(Z_s)\,\mathrm{d}s
$$

is an $(\mathcal{F}_t)$-martingale.

Proof: By the Bayes formula,

$$
E^\mu[f(Z_t) \mid \mathcal{F}_s] = \frac{E[\Lambda_t f(Z_t) \mid \mathcal{F}_s]}{E[\Lambda_t \mid \mathcal{F}_s]} = E[\Lambda_s^{-1}\Lambda_t f(Z_t) \mid \mathcal{F}_s].
$$

Since $\Lambda_s^{-1}\Lambda_t$ is a functional of $(Z_u)_{u \in [s,t]}$ and $Z$ is Markov under $P$, $E^\mu[f(Z_t) \mid \mathcal{F}_s] = E^\mu[f(Z_t) \mid Z_s]$, meaning that it is Markov under $P^\mu$ as well. By Itô's formula and Lemma 2.2,

$$
\Lambda_t\Big( f(Z_t) - f(Z_0) - \int_0^t \mathcal{L}f(Z_s)\,\mathrm{d}s \Big)
$$

is a local martingale under $P$, and the boundedness assumptions imply that it is a martingale under $P$. The Bayes formula then implies that $f(Z_t) - f(Z_0) - \int_0^t \mathcal{L}f(Z_s)\,\mathrm{d}s$ is a martingale under $P^\mu$. In particular, the generator is given by (2). ////

###### Corollary 2.1

By the uniqueness result of Theorem 2.1, the law of $Z = (\theta, X)$ under $P^\mu$ coincides with $P_{(e,x)}$ with $(e, x) = (\theta_0, X_0)$.

## 3 The likelihood under complete observations

Here we consider a statistical model $\{P_\psi\}_{\psi \in \Psi}$ and derive the likelihood under complete observation of a sample path of $Z = (\theta, X)$ on a time interval $[0, T]$. For each $\psi \in \Psi$, $P_\psi$ denotes the distribution on the path space induced by a Markov process with generator

$$
\mathcal{L}_\psi f(e_i, x) = \frac{1}{2}\Delta_x f(e_i, x) + \langle \mu_\psi(e_i, x), \nabla_x f(e_i, x) \rangle + \sum_{j=1}^N Q_{\psi, ij}(x) f(e_j, x),
$$

where $\{\mu_\psi\}$ is a family of $\mathbb{R}^d$-valued bounded Borel functions and $\{Q_\psi\}$ is a family of Q matrix-valued bounded continuous functions. Note that the diffusion coefficient is almost surely identified from a path of $X$ by computing its quadratic variation. It is therefore assumed to be known hereafter. The initial distribution is also assumed to be known and not to depend on $\psi$.

###### Theorem 3.1

Let $\psi_0, \psi_1 \in \Psi$ and assume that

$$
\frac{Q_{\psi_1, ij}}{Q_{\psi_0, ij}}
$$

are bounded for each $i \neq j$. Then, $P_{\psi_1}$ is equivalent to $P_{\psi_0}$, and the log likelihood $\log(\mathrm{d}P_{\psi_1}/\mathrm{d}P_{\psi_0})$ is given by

$$
\int_0^T \langle (\mu_{\psi_1} - \mu_{\psi_0})(\theta_t, X_t), \mathrm{d}X_t \rangle - \frac{1}{2}\int_0^T \big( |\mu_{\psi_1}|^2 - |\mu_{\psi_0}|^2 \big)(\theta_t, X_t)\,\mathrm{d}t + \sum_{i \neq j} \Big( \int_0^T \log\frac{Q_{\psi_1, ij}}{Q_{\psi_0, ij}}(X_t)\,\mathrm{d}N^{ij}_t - \int_0^T \langle \theta_t, e_i \rangle (Q_{\psi_1, ij} - Q_{\psi_0, ij})(X_t)\,\mathrm{d}t \Big),
$$

where $N^{ij}$ is the counting process of the transitions from $e_i$ to $e_j$:

$$
N^{ij}_t = \#\{ s \in (0, t] : \theta_{s-} = e_i,\ \theta_s = e_j \}. \tag{5}
$$
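Along a discretized complete observation $(\theta_t, X_t)$, the two parts of such a log likelihood (a Girsanov term for the drift and a point-process term counting the transitions $N^{ij}$ against their integrated rates) can be evaluated as follows. This is a sketch under that standard two-term structure, with states as indices and all model functions supplied by the user as assumptions.

```python
import numpy as np

def complete_loglik(mu1, mu0, Q1, Q0, theta, X, h):
    """Discretized complete-data log likelihood of psi_1 against psi_0:
    drift (Girsanov) term plus jump (counting-process) term."""
    ll = 0.0
    for k in range(len(theta) - 1):
        i, x, dX = int(theta[k]), X[k], X[k + 1] - X[k]
        a = np.atleast_1d(mu1(i, x))
        b = np.atleast_1d(mu0(i, x))
        # drift term: (mu1 - mu0) . dX - (1/2)(|mu1|^2 - |mu0|^2) h
        ll += float((a - b) @ dX) - 0.5 * float(a @ a - b @ b) * h
        q1, q0 = Q1(x), Q0(x)
        j = int(theta[k + 1])
        if j != i:                          # an increment of N^{ij}
            ll += np.log(q1[i, j] / q0[i, j])
        # integrated rate difference: sum_{j != i} (Q1_ij - Q0_ij) h
        ll -= ((-q1[i, i]) - (-q0[i, i])) * h
    return ll

# Sanity check: identical parameters give zero log likelihood ratio.
theta = np.array([0, 0, 1, 1])
X = np.array([[0.0], [0.1], [-0.05], [0.0]])
mu = lambda i, x: np.full(1, float(i))
Qm = lambda x: np.array([[-1.0, 1.0], [1.0, -1.0]])
same = complete_loglik(mu, mu, Qm, Qm, theta, X, h=0.01)
```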

Proof: The proof is standard but given for the readers' convenience. Let

$$
U_t = \int_0^t \big\langle (\mu_{\psi_1} - \mu_{\psi_0})(\theta_s, X_s),\ \mathrm{d}X_s - \mu_{\psi_0}(\theta_s, X_s)\,\mathrm{d}s \big\rangle
$$

and

$$
V_t = \sum_{i \neq j} \int_0^t \Big( \frac{Q_{\psi_1, ij}}{Q_{\psi_0, ij}}(X_s) - 1 \Big) \Big( \mathrm{d}N^{ij}_s - \langle \theta_s, e_i \rangle Q_{\psi_0, ij}(X_s)\,\mathrm{d}s \Big).
$$

By Itô's formula, $X_t - \int_0^t \mu_{\psi_0}(\theta_s, X_s)\,\mathrm{d}s$ is a Brownian motion under $P_{\psi_0}$, and by (5),

$$
N^{ij}_t - \int_0^t \langle \theta_s, e_i \rangle Q_{\psi_0, ij}(X_s)\,\mathrm{d}s \tag{6}
$$

is a local martingale under $P_{\psi_0}$ for each $i \neq j$. Therefore, by Corollary 2.1, $U$ and $V$ are orthogonal local martingales under $P_{\psi_0}$. The assumed boundedness further implies that they are martingales. This implies that the Doléans-Dade exponential $\mathcal{E}(U + V) = \mathcal{E}(U)\mathcal{E}(V)$ is a martingale under $P_{\psi_0}$.

It only remains to show that $\log \mathcal{E}(U + V)_T$ coincides with the claimed log likelihood. By Itô's formula,

$$
\mathcal{E}(U)_t = \exp\Big( \int_0^t \langle (\mu_{\psi_1} - \mu_{\psi_0})(\theta_s, X_s), \mathrm{d}X_s \rangle - \frac{1}{2}\int_0^t \big( |\mu_{\psi_1}|^2 - |\mu_{\psi_0}|^2 \big)(\theta_s, X_s)\,\mathrm{d}s \Big)
$$

and

$$
\mathcal{E}(V)_t = \exp\Big( \sum_{i \neq j} \Big( \int_0^t \log\frac{Q_{\psi_1, ij}}{Q_{\psi_0, ij}}(X_s)\,\mathrm{d}N^{ij}_s - \int_0^t \langle \theta_s, e_i \rangle (Q_{\psi_1, ij} - Q_{\psi_0, ij})(X_s)\,\mathrm{d}s \Big) \Big).
$$

Consequently, the claimed log likelihood equals $\log \mathcal{E}(U)_T + \log \mathcal{E}(V)_T = \log \mathcal{E}(U + V)_T$, and by Corollary 2.1 the measure with density $\mathcal{E}(U + V)_T$ with respect to $P_{\psi_0}$ is $P_{\psi_1}$. ////

## 4 A finite-dimensional filter

Here we extend the filtering theory of hidden Markov models developed by Elliott et al. [1] to the SHS

$$
X_t = X_0 + \int_0^t \mu(\theta_s, X_s)\,\mathrm{d}s + W_t, \qquad \theta_t = \theta_0 + \int_0^t Q(X_s)^\top \theta_s\,\mathrm{d}s + M_t,
$$

where $M$ is a martingale (recall Corollary 2.1). In this section we assume that we observe only a continuous sample path of $X$ on a time interval $[0, T]$ while $\theta$ is hidden. The system is a hidden Markov model in the sense of [1] when both $\mu$ and $Q$ do not depend on $X$. By this dependence, $\theta$ is not independent of $W$ and so, the argument in [1] no longer applies directly. We however show in this and the next sections that the results in [1] remain valid. Namely, a finite-dimensional filter and the EM algorithm can be constructed for the SHS. A key for this is Lemma 2.2.

Denote by $\{\mathcal{G}_t\}$ the natural filtration of $X$. The filtering problem is to infer $\theta_t$ from the observation of $X$ up to time $t$, that is, to compute $E^\mu[\langle \theta_t, e_i \rangle \mid \mathcal{G}_t]$. The smoothing problem is to compute $E^\mu[\langle \theta_s, e_i \rangle \mid \mathcal{G}_t]$ for $s \leq t$. Denote $\sigma_t(H) = E[\Lambda_t H \mid \mathcal{G}_t]$ for a given integrable random variable $H$, where $E$ is the expectation under $P$ in Section 2. For a given process $H$, the Bayes formula gives

$$
E^\mu[H_t \mid \mathcal{G}_t] = \frac{\sigma_t(H_t)}{\sigma_t(1)}, \tag{7}
$$

where $\Lambda$ is defined by (4). Denoting $q^i_t = \sigma_t(\langle \theta_t, e_i \rangle)$, $i = 1, \dots, N$, we can write $\sigma_t(1) = \sum_{i=1}^N q^i_t$.

###### Theorem 4.1

Under the same conditions as in Theorem 2.2, if $H$ is of the form

(8) |

where the integrands in (8) are bounded predictable processes, then

(9) |

Proof: Apply Itô's formula, then take the conditional expectation under $P$ given $\mathcal{G}_t$ to get (9). Here, we have used the fact that $X$ is a $d$-dimensional Brownian motion under $P$, as well as Lemma 2.2. ////

###### Theorem 4.2

Note that $\sigma_t(1) = \sum_{i=1}^N q^i_t$ and so, (10) is a linear equation on the vector-valued process $q_t = (q^1_t, \dots, q^N_t)$ that is easy to solve. Then (11) is also solved, and the filter is obtained from (7).

###### Theorem 4.3

Under the same conditions as in Theorem 2.2, for each $s \leq t$ and for any $i$, the smoothed quantity $\sigma_t(\langle \theta_s, e_i \rangle)$ satisfies a closed linear equation.

Proof: Let $H_t = \langle \theta_s, e_i \rangle$ for $t \geq s$ in (9). ////

This is also a linear equation and so, the smoothing problem is easily solved via (7).
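Although (10) is not reproduced above, in the reference-probability framework of [1] the unnormalized probabilities $q^i_t = \sigma_t(\langle \theta_t, e_i \rangle)$ solve a linear Zakai-type equation; under the drift-observation structure of the present model one expects the form $\mathrm{d}q_t = Q(X_t)^\top q_t\,\mathrm{d}t + \mathrm{diag}\big(\langle \mu(e_i, X_t), \mathrm{d}X_t \rangle\big) q_t$. Taking that form as an assumption, an Euler discretization can be sketched as follows (states as indices, toy inputs):

```python
import numpy as np

def unnormalized_filter(mu, Q, X, h, q0):
    """Euler scheme for an assumed Zakai-type linear equation for q_t:
    dq = Q(X)^T q dt + q_i <mu(e_i, X), dX>,
    returning the normalized filter via the Bayes formula (7)."""
    q = np.asarray(q0, dtype=float)
    N = len(q)
    out = [q / q.sum()]
    for k in range(len(X) - 1):
        x, dX = X[k], X[k + 1] - X[k]
        gain = np.array([float(np.atleast_1d(mu(i, x)) @ np.atleast_1d(dX))
                         for i in range(N)])
        q = q + (Q(x).T @ q) * h + q * gain
        q = np.maximum(q, 1e-300)   # guard against discretization error
        out.append(q / q.sum())
    return np.array(out)

# Toy run: an increasing path favors the state with positive drift.
mu = lambda i, x: np.array([1.0 if i == 0 else -1.0])
Qm = lambda x: np.array([[-1.0, 1.0], [1.0, -1.0]])
X = np.linspace(0.0, 0.5, 51).reshape(-1, 1)
p = unnormalized_filter(mu, Qm, X, h=0.01, q0=[0.5, 0.5])
```

The recursion is linear in $q$, mirroring the remark above that the equation is easy to solve; only the final normalization step is nonlinear.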

## 5 The EM algorithm

Here we consider again the parametric family $\{P_\psi\}_{\psi \in \Psi}$ introduced in Section 3. We assume that a continuous sample path of $X$ is observed on a time interval $[0, T]$ while $\theta$ is hidden. We construct the EM algorithm to estimate $\psi$. Under the same assumptions as in Theorem 3.1, the law of $X$ under $P_\psi$ is equivalent to that under $P_{\psi_0}$ and the log likelihood function is given by

$$
\ell_T(\psi; \psi_0) = \log E_{\psi_0}\Big[ \frac{\mathrm{d}P_\psi}{\mathrm{d}P_{\psi_0}} \,\Big|\, \mathcal{G}_T \Big].
$$

The maximum likelihood estimator is therefore given by

$$
\hat{\psi} = \operatorname*{arg\,max}_{\psi \in \Psi} \ell_T(\psi; \psi_0).
$$

Note that $\hat{\psi}$ does not depend on the choice of $\psi_0$ because by the Bayes formula,

$$
\ell_T(\psi; \psi_1) = \ell_T(\psi; \psi_0) - \ell_T(\psi_1; \psi_0) \tag{12}
$$

for any $\psi_1 \in \Psi$. Now, we recall the idea of the EM algorithm. Let

$$
\mathcal{Q}(\psi; \psi') = E_{\psi'}\Big[ \log\frac{\mathrm{d}P_\psi}{\mathrm{d}P_{\psi'}} \,\Big|\, \mathcal{G}_T \Big].
$$

By Jensen's inequality and (12),

$$
\ell_T(\psi; \psi_0) - \ell_T(\psi'; \psi_0) = \ell_T(\psi; \psi') \geq \mathcal{Q}(\psi; \psi'),
$$

which means that the sequence $\{\psi_k\}$ defined by

$$
\psi_{k+1} = \operatorname*{arg\,max}_{\psi \in \Psi} \mathcal{Q}(\psi; \psi_k)
$$

makes $\{\ell_T(\psi_k; \psi_0)\}$ increasing.
Under an appropriate condition the sequence converges to
the maximum likelihood estimator $\hat{\psi}$, for which we refer to Wu [8].
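The iteration above can be sketched generically as follows; `smoothed_loglik` is a hypothetical placeholder standing for the E-step (the smoothed conditional expectation of the complete-data log likelihood), and a crude grid search stands in for the M-step.

```python
import numpy as np

def em_iterate(smoothed_loglik, psi0, n_iter=25, half_width=1.0, n_grid=201):
    """Generic EM loop: alternate the E-step and the M-step.

    smoothed_loglik(psi, psi_k) plays the role of the conditional
    expectation of the complete-data log likelihood given the observed
    path (computed by filtering/smoothing in practice).  The M-step
    here is a grid search around the current iterate, for illustration.
    """
    psi = float(psi0)
    for _ in range(n_iter):
        grid = np.linspace(psi - half_width, psi + half_width, n_grid)
        vals = [smoothed_loglik(p, psi) for p in grid]
        psi = float(grid[int(np.argmax(vals))])
    return psi

# Sanity check with a toy concave surrogate maximized at 3.
psi_hat = em_iterate(lambda p, pk: -(p - 3.0) ** 2, 0.0)
```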

The computation of this conditional expectation is a filtering problem to which we can apply the results in Section 4. Now we state the main result of this article.