# Contrast function estimation for the drift parameter of ergodic jump diffusion process

In this paper we consider an ergodic diffusion process with jumps whose drift coefficient depends on an unknown parameter θ. We suppose that the process is discretely observed at the instants (t_i^n)_{i=0,...,n} with Δ_n = sup_{i=0,...,n−1}(t_{i+1}^n − t_i^n) → 0. We introduce an estimator of θ, based on a contrast function, which is efficient without requiring any condition on the rate at which Δ_n → 0, and which allows the observed process to have non-summable jumps. This extends earlier results where the condition nΔ_n^3 → 0 was needed (see [10], [24]) and where the process was supposed to have summable jumps. Moreover, in the case of finite jump activity, we propose explicit approximations of the contrast function such that the efficient estimation of θ is feasible under the condition nΔ_n^k → 0, where k > 0 can be arbitrarily large. This extends the results obtained by Kessler [15] in the case of continuous processes.

Keywords: Lévy-driven SDE, efficient drift estimation, high frequency data, ergodic properties, thresholding methods.


## 1 Introduction

Diffusion processes with jumps have been widely used to describe the evolution of phenomena arising in various fields. In finance, jump processes were introduced to model the dynamics of asset prices ([21],[16]), exchange rates ([4]), or volatility processes ([3],[7]). Applications of jump processes in neuroscience can be found for instance in [6].

Practical applications of these models have led to the recent development of many statistical methods. In this work, our aim is to estimate the drift parameter from a discrete sampling of the process X^θ solution to

$$X^\theta_t = X^\theta_0 + \int_0^t b(\theta, X^\theta_s)\,ds + \int_0^t \sigma(X^\theta_s)\,dW_s + \int_0^t \int_{\mathbb{R}\setminus\{0\}} \gamma(X^\theta_{s^-})\,z\,\tilde{\mu}(ds,dz),$$

where W is a one-dimensional Brownian motion and μ̃ a compensated Poisson random measure, with a possibly infinite jump activity. We assume that the process is sampled at times (t_i)_{i=0,...,n}, where the sampling step Δ_n goes to zero. Due to the presence of a Gaussian component, we know that it is impossible to estimate the drift parameter on a finite horizon of time. Thus, we assume that nΔ_n → ∞ and that the process X^θ is ergodic.

Generally, the main difficulty in the statistical inference of discretely observed stochastic processes comes from the lack of an explicit expression for the likelihood. Indeed, the transition density of a jump-diffusion process is usually not known explicitly. Several methods have been developed to circumvent this difficulty. For instance, closed-form expansions of the transition density of jump-diffusions are studied in [1], [17]. In the context of high frequency observation, the asymptotic behaviour of estimating functions is studied in [14], and conditions are given to ensure rate optimality and efficiency. Another approach, fruitful in the case of high frequency observation, is to consider pseudo-likelihood methods, for instance based on the high frequency approximation of the dynamics of the process by those of the Euler scheme. This leads to explicit contrast functions with Gaussian structures (see e.g. [24],[23],[20]).

The validity of the approximation by the Euler pseudo-likelihood is justified by the high frequency assumption on the observations, and actually proving that the estimators are asymptotically normal usually necessitates some conditions on the rate at which Δ_n should tend to zero. For applications, it is important that the condition on Δ_n be as weak as possible.

In the case of continuous processes, Florens-Zmirou [8] proposes estimators of the drift and diffusion parameters under the fast sampling assumption nΔ_n^2 → 0. Yoshida [25] suggests a correction of the contrast function of [8] that leads to the condition nΔ_n^3 → 0. In Kessler [15], the author introduces an explicit modification of the Euler scheme contrast such that the associated estimators are asymptotically normal under the condition nΔ_n^k → 0, where k can be arbitrarily large. Hence, the result by Kessler allows for an arbitrarily slow polynomial decay to zero of the sampling step.
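To see concretely why replacing the Euler drift term by a higher-order expansion of the conditional mean helps, consider the hypothetical linear drift b(θ, x) = −θx with unit diffusion (our illustrative choice, not the paper's model), for which the first conditional moment is known in closed form. A minimal numerical sketch:

```python
import numpy as np

# Ornstein-Uhlenbeck drift b(theta, x) = -theta * x: the exact conditional mean
# E[X_{t+dt} | X_t = x] is exp(-theta*dt)*x. The Euler contrast uses its
# first-order expansion; a Kessler-type contrast uses a higher-order one.
theta, x, dt = 2.0, 1.0, 0.05
exact = np.exp(-theta * dt) * x                              # exact conditional mean
euler = x * (1.0 - theta * dt)                               # first-order (Euler) term
kessler2 = x * (1.0 - theta * dt + (theta * dt) ** 2 / 2.0)  # second-order correction

err_euler = abs(exact - euler)        # error of order dt^2
err_kessler = abs(exact - kessler2)   # error of order dt^3
```

The second-order error is smaller by roughly a factor of dt, which is the mechanism behind the weaker condition nΔ_n^k → 0 on the sampling step.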

In the case of jump-diffusions, Shimizu [23] proposes parametric estimation of the drift, diffusion and jump coefficients. The asymptotic normality of the estimators is obtained under some explicit conditions relating the sampling step and the jump intensity of the process. These conditions on Δ_n become more restrictive as the intensity of jumps near zero grows. In the situation where this jump intensity is finite, the conditions of [23] reduce to nΔ_n^2 → 0. In [10], the condition on the sampling step is relaxed to nΔ_n^3 → 0, when one estimates the drift parameter only.

In this paper, we focus on the estimation of the drift parameter, and our aim is to weaken the conditions on the decay of the sampling step in a way comparable to Kessler’s work [15], but in the framework of jump-diffusion processes.

One of the ideas in Kessler’s paper is to replace, in the Euler scheme contrast function, the contribution of the drift by the exact value of the first conditional moment

$$E[X^\theta_{t_{i+1}} \mid X^\theta_{t_i}],$$

or some explicit approximation of arbitrarily high order as Δ_n → 0. In the presence of jumps, the contrast functions in [24] (see also [23], [10]) resort to a filtering procedure in order to suppress the contribution of the jumps and recover the continuous part of the process. Based on those ideas, we introduce a contrast function (see Definition 1), whose expression relies on the quantity m_{θ,t_i,t_{i+1}}(x), a conditional expectation of X_{t_{i+1}} weighted by φ_{Δ^β_{n,i}}(X_{t_{i+1}} − X_{t_i}), where φ is some compactly supported function and φ_{Δ^β_{n,i}}(·) := φ(·/Δ^β_{n,i}). The function φ is such that φ_{Δ^β_{n,i}}(X_{t_{i+1}} − X_{t_i}) vanishes when the increments of the data are too large compared to the typical increments of a continuous diffusion process, and thus can be used to filter the contribution of the jumps.
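Such a filtering weight can be sketched numerically. The cosine taper between the plateaus and the cutoff levels 1 and 2 below are our illustrative choices; the construction only needs a smooth, compactly supported version of the indicator:

```python
import numpy as np

def phi(u):
    """Smooth-ish cutoff: equal to 1 on [-1, 1], 0 outside [-2, 2],
    with a cosine taper in between (illustrative profile)."""
    u = np.abs(np.asarray(u, dtype=float))
    taper = 0.5 * (1.0 + np.cos(np.pi * np.clip(u - 1.0, 0.0, 1.0)))
    return np.where(u <= 1.0, 1.0, np.where(u >= 2.0, 0.0, taper))

# Weight of an increment, as in phi((X_{t_{i+1}} - X_{t_i}) / Delta^beta):
dt, beta = 0.01, 0.3
small = float(phi(0.05 / dt**beta))  # diffusion-sized increment -> kept (weight 1)
large = float(phi(1.0 / dt**beta))   # jump-sized increment -> suppressed (weight 0)
```

Increments of diffusive size get weight 1 and jump-sized increments get weight 0, which is exactly the filtering role assigned to φ above.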

The main result of our paper is that the associated estimator converges at rate √(nΔ_n), with some explicit asymptotic variance, and is efficient. Compared to earlier results ([24], [23], [10]), the sampling step can be irregular, no condition is needed on the rate at which Δ_n → 0, and we have suppressed the assumption that the jumps of the process are summable. Let us stress that when the jump activity is so high that the jumps are not summable, we have to choose β appropriately (see the assumption stated in Section 3.2).

Moreover, in the case where the jump intensity is finite and with the specific choice of φ being an oscillating function, we show that we can approximate our contrast function by a completely explicit one, exactly as in the paper by Kessler [15]. This yields an efficient estimator under the condition nΔ_n^k → 0, where k is related to the oscillating properties of the function φ. As k can be chosen arbitrarily high, up to a proper choice of φ, our method allows to estimate the drift parameter efficiently, under the assumption that the sampling step tends to zero at some polynomial rate.

The outline of the paper is the following. In Section 2 we present the assumptions on the process X^θ. Section 3 contains the main results of the paper. In Section 3.1, we define the contrast function and state first order expansions for the quantity m_{θ,t_i,t_{i+1}}. The consistency and asymptotic normality of the estimator are stated in Section 3.2, and the explicit modification of the contrast function in the case of finite jump activity is presented in Section 3.3. Section 4 is devoted to the statement of limit theorems useful to study the asymptotic behavior of the contrast function. The proofs of the main statistical results are given in Section 5, while the proofs of the limit theorems and some technical results are presented in the Appendix.

## 2 Model, assumptions

Let Θ be a compact subset of ℝ and let X^θ be a solution to

$$X^\theta_t = X^\theta_0 + \int_0^t b(\theta, X^\theta_s)\,ds + \int_0^t a(X^\theta_s)\,dW_s + \int_0^t \int_{\mathbb{R}\setminus\{0\}} \gamma(X^\theta_{s^-})\,z\,\tilde{\mu}(ds,dz), \quad t \in \mathbb{R}_+, \tag{1}$$

where W = (W_t)_{t≥0} is a one-dimensional Brownian motion, μ is a Poisson random measure associated to the driving Lévy process, and μ̃ is the compensated measure. We denote by (Ω, 𝓕, (𝓕_t)_{t≥0}, P) the probability space on which W and μ are defined.
We suppose that the compensator has the following form: μ̄(ds, dz) = F(z) dz ds, where conditions on the Lévy measure F will be given later.
The initial condition X^θ_0, W and μ are independent.

### 2.1 Assumptions

We suppose that the functions b, a and γ satisfy the following assumptions:

ASSUMPTION 1: The functions a(x), γ(x) and, for all θ ∈ Θ, b(θ, x) are globally Lipschitz in x. Moreover, the Lipschitz constant of b is uniformly bounded on Θ.

Under Assumption 1, equation (1) admits a unique non-explosive càdlàg adapted solution possessing the strong Markov property, cf. [2] (Theorems 6.2.9 and 6.4.6).

ASSUMPTION 2: For all θ ∈ Θ and all t > 0, X^θ_t admits a density with respect to the Lebesgue measure on ℝ, bounded in t and in x on every compact set. Moreover, for every x and every open ball U ⊆ ℝ, there exists a point z = z(x, U) in the support of F such that γ(x)z ∈ U.

This last assumption was used in [18] to prove the irreducibility of the process X^θ. Other sets of conditions sufficient for irreducibility can be found in [18].

ASSUMPTION 3 (Ergodicity):

1. For all q > 0, ∫_{|z|>1} |z|^q F(z) dz < ∞.

2. For all θ ∈ Θ there exists a constant c > 0 such that x b(θ, x) ≤ −c|x|², if |x| is large enough.

3. |γ(x)|/|x| → 0 as |x| → ∞.

4. |a(x)|/|x| → 0 as |x| → ∞.

5. For all θ ∈ Θ and all q > 0, we have E[|X^θ_0|^q] < ∞.

Assumption 2 ensures, together with Assumption 3, the existence of a unique invariant distribution π_θ, as well as the ergodicity of the process X^θ, as stated in Lemma 2 below.

ASSUMPTION 4 (Jumps):

1. The jump coefficient γ is bounded from below, that is

$$\inf_{x\in\mathbb{R}} |\gamma(x)| := \gamma_{\min} > 0.$$

2. The Lévy measure ν is absolutely continuous with respect to the Lebesgue measure and we denote F(z) = ν(dz)/dz.

3. We suppose that there exist c > 0 and α ∈ (0, 2) such that, for all z ∈ ℝ∖{0}, F(z) ≤ c/|z|^{1+α}.

4. The jump coefficient γ is upper bounded, i.e. ‖γ‖_∞ := sup_{x∈ℝ}|γ(x)| < ∞.

Assumptions 4.1 and 4.4 are useful to compare the size of the jumps of X with those of the driving Lévy process. In the sequel we skip the specific case α = 1 for simplicity, as it is embedded in the case α > 1 with a choice of α arbitrarily close to 1.

ASSUMPTION 5 (Non-degeneracy): There exists some c > 0 such that a²(x) ≥ c for all x ∈ ℝ.

Assumption 5 ensures the existence of the contrast function defined in Section 3.

ASSUMPTION 6 (Identifiability): For all θ ≠ θ′, (θ, θ′) ∈ Θ²,

$$\int_{\mathbb{R}} \frac{\big(b(\theta,x)-b(\theta',x)\big)^2}{a^2(x)}\,d\pi_\theta(x) > 0.$$

We can see that this last assumption is equivalent to

$$\forall \theta \neq \theta',\ (\theta,\theta') \in \Theta^2, \qquad b(\theta,\cdot) \not\equiv b(\theta',\cdot). \tag{2}$$
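As a concrete illustration (the linear drift below is our own example, not the paper's model), for b(θ, x) = −θx the identifiability condition is immediate:

```latex
\int_{\mathbb{R}} \frac{\big(b(\theta,x)-b(\theta',x)\big)^2}{a^2(x)}\,d\pi_\theta(x)
  = (\theta-\theta')^2 \int_{\mathbb{R}} \frac{x^2}{a^2(x)}\,d\pi_\theta(x) > 0
  \quad \text{for } \theta \neq \theta',
```

since the invariant distribution π_θ is not concentrated at 0, so the integral of x²/a²(x) is strictly positive.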

We also need the following technical assumption:

ASSUMPTION 7:

1. The derivatives , with and , exist and they are bounded if . If , for each they have polynomial growth.

2. The derivatives exist and they are bounded for each .

3. The derivatives exist and they are bounded for each .

Define the asymptotic Fisher information by

$$I(\theta) = \int_{\mathbb{R}} \frac{(\dot{b}(\theta,x))^2}{a^2(x)}\, \pi_\theta(dx). \tag{3}$$

ASSUMPTION 8: For all θ ∈ Θ, I(θ) > 0.

###### Remark 1.

If α < 1, using Assumption 4.3 the stochastic differential equation (1) can be rewritten as follows:

$$X^\theta_t = X^\theta_0 + \int_0^t \bar{b}(\theta, X^\theta_s)\,ds + \int_0^t a(X^\theta_s)\,dW_s + \int_0^t \int_{\mathbb{R}\setminus\{0\}} \gamma(X^\theta_{s^-})\,z\,\mu(ds,dz), \quad t \in \mathbb{R}_+, \tag{4}$$

where \bar{b}(\theta,x) := b(\theta,x) - \gamma(x)\int_{\mathbb{R}\setminus\{0\}} z\,F(z)\,dz.
This expression implies that X^θ follows a continuous diffusion equation on each interval in which no jump occurred.

From now on we denote the true parameter value by θ_0, an interior point of the parameter space Θ, that we want to estimate. We shorten X for X^{θ_0}.
We will use some moment inequalities for jump diffusions, gathered in the following lemma:

###### Lemma 1.

Let X satisfy Assumptions 1-4 and let p ≥ 2.
Then, for all s ≥ 0,
1) sup_{t ≥ 0} E[|X_t|^p] < ∞;
2) for all h ∈ [0, 1], E[|X_{s+h} − X_s|^p | 𝓕_s] ≤ c|h|(1 + |X_s|^p);
3) sup_{h∈[0,1]} E[|X_{s+h}|^p | 𝓕_s] ≤ c(1 + |X_s|^p).

The first two points follow from Theorem 66 of [22] and Proposition 3.1 in [24]. The last point is a consequence of the second one: for h ∈ [0, 1],

$$E[|X_{s+h}|^p \mid \mathcal{F}_s] = E[|X_{s+h} - X_s + X_s|^p \mid \mathcal{F}_s] \le c\left(E[|X_{s+h} - X_s|^p \mid \mathcal{F}_s] + E[|X_s|^p \mid \mathcal{F}_s]\right),$$

where c may change value from line to line. Using the second point of Lemma 1 and the measurability of X_s with respect to 𝓕_s, the right-hand side is upper bounded by c|h|(1 + |X_s|^p) + c|X_s|^p. Therefore

$$\sup_{h\in[0,1]} E[|X_{s+h}|^p \mid \mathcal{F}_s] \le \sup_{h\in[0,1]} c|h|(1+|X_s|^p) + c|X_s|^p \le c(1+|X_s|^p).$$

### 2.2 Ergodic properties of solutions

An important role is played by the ergodic properties of the solution of equation (1).
The following lemma states that Assumptions 1-4 are sufficient for the existence of an invariant measure π_θ such that an ergodic theorem holds and moments of all orders exist.

###### Lemma 2.

Under Assumptions 1 to 4, for all θ ∈ Θ, X^θ admits a unique invariant distribution π_θ and the ergodic theorem holds:

1. For every measurable function g satisfying π_θ(|g|) < ∞, we have a.s.

$$\lim_{t\to\infty} \frac{1}{t}\int_0^t g(X^\theta_s)\,ds = \pi_\theta(g).$$

2. For all q > 0, π_θ(|x|^q) < ∞.

3. For all q > 0, sup_{t≥0} E[|X^θ_t|^q] < ∞.
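The ergodic theorem in point 1 can be illustrated numerically. All modelling choices below are ours, not the paper's: an Ornstein-Uhlenbeck drift −x, unit diffusion, rate-1 compound-Poisson jumps with standard normal sizes, and g(x) = x², for which the invariant second moment works out to (1 + λE[ξ²])/(2θ) = 1:

```python
import numpy as np

rng = np.random.default_rng(2)
dt, n, theta = 0.01, 100_000, 1.0
X = np.zeros(n + 1)
for i in range(n):
    # Euler step of dX = -theta*X dt + dW + dJ, J compound Poisson (rate 1, N(0,1) sizes)
    jump = rng.poisson(dt) * rng.standard_normal()
    X[i + 1] = X[i] - theta * X[i] * dt + np.sqrt(dt) * rng.standard_normal() + jump

# Running time average (1/t) * integral_0^t g(X_s) ds for g(x) = x^2:
times = np.arange(1, n + 1) * dt
time_avg = np.cumsum(X[:-1] ** 2 * dt) / times
```

By the lemma, the time average should stabilise near π_θ(g) as t grows; here it settles close to 1.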

A proof is given in [10] (Section 8 of the Supplement) in the case α < 1; the proof relies on [18]. In order to use it also in the case α ≥ 1 we have to show that, taking q even and f⋆(x) = x^q, f⋆ satisfies the drift condition A f⋆(x) ≤ −c₁ f⋆(x) + c₂, where A is the generator of X^θ and c₁, c₂ > 0.
Using Taylor’s formula up to second order we have

$$A_d f^\star(x) \le c \int_{\mathbb{R}} \int_0^1 |z|^2\, \|\gamma\|^2_\infty\, q(q-1)\, |x + s z \gamma(x)|^{q-2}\, F(z)\,ds\,dz = o(|x|^q), \tag{5}$$

where A_d denotes the discontinuous (jump) part of the generator.

Concerning the generator’s continuous part, we use the second point of Assumption 3 to get

$$A_c f^\star(x) = \frac{1}{2}\sigma^2(x)\, q(q-1)\, x^{q-2} + b(\theta,x)\, q\, x^{q-1} \le o(|x|^q) - c\, q\, |x|^2\, x^{q-2} \le o(|x|^q) - c f^\star(x). \tag{6}$$

By (5) and (6), the drift condition holds.

## 3 Construction of the estimator and main results

We exhibit a contrast function for the estimation of a parameter in the drift coefficient. We prove that the derived estimator is consistent and asymptotically normal.

### 3.1 Construction of the estimator

Let X be the solution to (1) with θ = θ_0. Suppose that we observe a finite sample

$$X_{t_0}, \dots, X_{t_n}; \qquad 0 = t_0 \le t_1 \le \dots \le t_n.$$

Every observation time t_i depends also on n, but to simplify the notation we suppress this index. We will be working in a high-frequency setting, i.e.

$$\Delta_n := \sup_{i=0,\dots,n-1} \Delta_{n,i} \longrightarrow 0, \qquad n \to \infty,$$

with Δ_{n,i} := t_{i+1} − t_i.
We assume that nΔ_n → ∞ as n → ∞.
We introduce a jump-filtered version of the Gaussian quasi-likelihood. This leads to the following contrast function:

###### Definition 1.

For k > 0 and β ∈ (0, 1/2), we define the contrast function U_n(θ) as follows:

$$U_n(\theta) := \sum_{i=0}^{n-1} \frac{\big(X_{t_{i+1}} - m_{\theta,t_i,t_{i+1}}(X_{t_i})\big)^2}{a^2(X_{t_i})(t_{i+1}-t_i)}\, \varphi_{\Delta^\beta_{n,i}}(X_{t_{i+1}} - X_{t_i})\, 1_{\{|X_{t_i}| \le \Delta^{-k}_{n,i}\}}, \tag{7}$$

where

$$m_{\theta,t_i,t_{i+1}}(x) := \frac{E\big[X^\theta_{t_{i+1}}\, \varphi_{\Delta^\beta_{n,i}}(X^\theta_{t_{i+1}} - X^\theta_{t_i}) \mid X^\theta_{t_i} = x\big]}{E\big[\varphi_{\Delta^\beta_{n,i}}(X^\theta_{t_{i+1}} - X^\theta_{t_i}) \mid X^\theta_{t_i} = x\big]} \tag{8}$$

and

$$\varphi_{\Delta^\beta_{n,i}}(X_{t_{i+1}} - X_{t_i}) = \varphi\left(\frac{X_{t_{i+1}} - X_{t_i}}{\Delta^\beta_{n,i}}\right),$$

with φ a smooth version of the indicator function, such that φ(ζ) = 1 for each ζ with |ζ| ≤ 1 and φ(ζ) = 0 for each ζ with |ζ| ≥ 2.
The last indicator in (7) aims to avoid the possibility that |X_{t_i}| is too big. The constant k is positive and will be chosen later, in relation to the development of m_{θ,t_i,t_{i+1}} (cf. Remark 2 below).
Moreover we define

$$m_{\theta,h}(x) := \frac{E\big[X^\theta_h\, \varphi_{h^\beta}(X^\theta_h - X^\theta_0) \mid X^\theta_0 = x\big]}{E\big[\varphi_{h^\beta}(X^\theta_h - X^\theta_0) \mid X^\theta_0 = x\big]}.$$

By the homogeneity of the equation we get that m_{θ,t_i,t_{i+1}} depends only on the difference t_{i+1} − t_i = Δ_{n,i}, so that we may denote it simply as m_{θ,Δ_{n,i}}, in order to make the notation easier.

We define an estimator θ̂_n of θ as

$$\hat{\theta}_n \in \arg\min_{\theta\in\Theta} U_n(\theta). \tag{9}$$

The idea, when the jump intensity is finite, is to use the size of the increment X_{t_{i+1}} − X_{t_i} in order to judge the existence of a jump in the interval (t_i, t_{i+1}]. The increment of a process with continuous transitions could hardly exceed the threshold Δ^β_{n,i} for β < 1/2. Therefore we can judge that a jump occurred if |X_{t_{i+1}} − X_{t_i}| > Δ^β_{n,i}. We keep this idea even when the intensity is no longer finite.
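A minimal sketch of this thresholding rule, on simulated Brownian increments with two hand-planted jumps (the values of Δ, β and the jump sizes are illustrative choices of ours):

```python
import numpy as np

rng = np.random.default_rng(1)
dt, n, beta = 0.001, 1000, 0.25               # beta in (0, 1/2)
incr = np.sqrt(dt) * rng.standard_normal(n)   # continuous (diffusive) increments
incr[[100, 500]] += 1.0                       # plant two unit-sized jumps

# Flag a jump in (t_i, t_{i+1}] whenever |X_{t_{i+1}} - X_{t_i}| > Delta^beta:
flagged = np.flatnonzero(np.abs(incr) > dt**beta)
```

Diffusive increments are of order √Δ ≈ 0.032, well below the threshold Δ^β ≈ 0.18, so only the two planted jumps are flagged.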
With m_{θ,t_i,t_{i+1}} so defined, using the true parameter value θ_0, we have that

$$E\big[(X_{t_{i+1}} - m_{\theta_0,t_i,t_{i+1}}(X_{t_i}))\,\varphi_{\Delta^\beta_{n,i}}(X_{t_{i+1}} - X_{t_i}) \mid X_{t_i} = x\big] = E\big[X_{t_{i+1}}\,\varphi_{\Delta^\beta_{n,i}}(X_{t_{i+1}} - x) \mid X_{t_i} = x\big] - \frac{E\big[X_{t_{i+1}}\,\varphi_{\Delta^\beta_{n,i}}(X_{t_{i+1}} - X_{t_i}) \mid X_{t_i} = x\big]}{E\big[\varphi_{\Delta^\beta_{n,i}}(X_{t_{i+1}} - X_{t_i}) \mid X_{t_i} = x\big]}\; E\big[\varphi_{\Delta^\beta_{n,i}}(X_{t_{i+1}} - X_{t_i}) \mid X_{t_i} = x\big] = 0,$$

where we have just used the definition of m_{θ_0,t_i,t_{i+1}} and the measurability of X_{t_i} with respect to 𝓕_{t_i}.
But, as the transition density is unknown, in general there is no closed expression for m_{θ,t_i,t_{i+1}}; hence the contrast is not explicit. However, in the proof of our results we will need an explicit development of (7).

In the sequel, for δ ≥ 0, we will denote by R(θ, Δ^δ_{n,i}, x) any function R(θ, Δ^δ_{n,i}, x) = R_{i,n}(θ, x), where R_{i,n} is such that

$$\exists c > 0 \qquad |R_{i,n}(\theta,x)| \le c\,(1+|x|^c)\,\Delta^\delta_{n,i} \tag{10}$$

uniformly in θ and with c independent of i, n.
The functions R represent the rest terms and have the following useful property, a consequence of the definition just given:

$$R(\theta, \Delta^\delta_{n,i}, x) = \Delta^\delta_{n,i}\, R(\theta, \Delta^0_{n,i}, x). \tag{11}$$

We point out that this does not involve the linearity of R, since the functions on the left- and right-hand sides are not necessarily the same, but only two functions for which the control (10) holds with Δ^δ_{n,i} and Δ^0_{n,i}, respectively.

We now state asymptotic expansions for m_{θ,Δ_{n,i}}(x). The two cases below yield rest terms of different magnitude.

First case:

###### Theorem 1.

Suppose that Assumptions 1 to 4 hold and that β and α are given in Definition 1 and the third point of Assumption 4, respectively. Then

$$E\big[\varphi_{\Delta^\beta_{n,i}}(X^\theta_{t_{i+1}} - X^\theta_{t_i}) \mid X^\theta_{t_i} = x\big] = 1 + R(\theta, \Delta^{(1-\alpha\beta)\wedge(2-3\beta)}_{n,i}, x). \tag{12}$$
###### Theorem 2.

Suppose that Assumptions 1 to 4 hold and that β and α are given in Definition 1 and the third point of Assumption 4, respectively. Then, for any i ∈ {0, …, n−1},

$$E\big[(X^\theta_{t_{i+1}} - x)\,\varphi_{\Delta^\beta_{n,i}}(X^\theta_{t_{i+1}} - X^\theta_{t_i}) \mid X^\theta_{t_i} = x\big] = \Delta_{n,i}\,b(x,\theta) - \Delta_{n,i}\int_{\mathbb{R}\setminus\{0\}} z\,\gamma(x)\,\big[1-\varphi_{\Delta^\beta_{n,i}}(\gamma(x)z)\big]\,F(z)\,dz + R(\theta, \Delta^{2-2\beta}_{n,i}, x). \tag{13}$$

There exists k_0 > 0 such that, for |x| ≤ Δ_{n,i}^{−k_0},

$$m_{\theta,\Delta_{n,i}}(x) = x + \Delta_{n,i}\,b(x,\theta) - \Delta_{n,i}\int_{\mathbb{R}\setminus\{0\}} z\,\gamma(x)\,\big[1-\varphi_{\Delta^\beta_{n,i}}(\gamma(x)z)\big]\,F(z)\,dz + R(\theta, \Delta^{2-2\beta}_{n,i}, x). \tag{14}$$
Second case:

###### Theorem 3.

Suppose that Assumptions 1 to 4 hold and that β and α are given in Definition 1 and the third point of Assumption 4, respectively. Then

$$E\big[\varphi_{\Delta^\beta_{n,i}}(X^\theta_{t_{i+1}} - X^\theta_{t_i}) \mid X^\theta_{t_i} = x\big] = 1 + R(\theta, \Delta^{(1-\alpha\beta)\wedge(2-4\beta)}_{n,i}, x). \tag{15}$$
###### Theorem 4.

Suppose that Assumptions 1 to 4 hold and that β and α are given in Definition 1 and the third point of Assumption 4, respectively. Then, for any i ∈ {0, …, n−1},

$$E\big[(X^\theta_{t_{i+1}} - x)\,\varphi_{\Delta^\beta_{n,i}}(X^\theta_{t_{i+1}} - X^\theta_{t_i}) \mid X^\theta_{t_i} = x\big] = \Delta_{n,i}\,b(x,\theta) - \Delta_{n,i}\int_{\mathbb{R}\setminus\{0\}} z\,\gamma(x)\,\big[1-\varphi_{\Delta^\beta_{n,i}}(\gamma(x)z)\big]\,F(z)\,dz + R(\theta, \Delta^{2-3\beta}_{n,i}, x). \tag{16}$$

There exists k_0 > 0 such that, for |x| ≤ Δ_{n,i}^{−k_0},

$$m_{\theta,\Delta_{n,i}}(x) = x + \Delta_{n,i}\,b(x,\theta) - \Delta_{n,i}\int_{\mathbb{R}\setminus\{0\}} z\,\gamma(x)\,\big[1-\varphi_{\Delta^\beta_{n,i}}(\gamma(x)z)\big]\,F(z)\,dz + R(\theta, \Delta^{2-3\beta}_{n,i}, x). \tag{17}$$
###### Remark 2.

The constant k in the definition (7) of the contrast function can be taken in the interval (0, k_0]. In this way k ≤ k_0, and so (14) or (17) holds for |x| smaller than Δ^{−k}_{n,i}.
If that is not the case, the contribution of the observation to the contrast function is just 0. However, we will see that suppressing the contribution of the observations that are too big does not affect the efficiency of our estimator.

###### Remark 3.

In the development (13) or (16), the integral term is independent of θ, hence it will disappear in the difference of the contrast at two parameter values, but it is not negligible compared to Δ_{n,i} b(x,θ), since its order is Δ_{n,i} in the first case and at most Δ^{1+β(1−α)}_{n,i} in the second. Indeed, by the definition of the function φ, we know that 1 − φ_{Δ^β_{n,i}}(γ(x)z) is supported on {z : |γ(x)z| ≥ Δ^β_{n,i}}. In the first case, using moreover the third point of Assumption 4 we get the following estimation:

$$\Big|\Delta_{n,i}\int_{\mathbb{R}\setminus\{0\}} z\,\gamma(x)\,\big[1-\varphi_{\Delta^\beta_{n,i}}(\gamma(x)z)\big]\,F(z)\,dz\Big| \le R(\theta_0, \Delta^1_{n,i}, X_{t_i}). \tag{18}$$

Otherwise, in the second case, we have

$$\Big|\Delta_{n,i}\int_{\mathbb{R}\setminus\{0\}} z\,\gamma(x)\,\big[1-\varphi_{\Delta^\beta_{n,i}}(\gamma(x)z)\big]\,F(z)\,dz\Big| \le c\,\Delta_{n,i}\int_{\{|z| \ge \Delta^\beta_{n,i}/\|\gamma\|_\infty\}} |z|^{-\alpha}\,dz = R(\theta, \Delta^{1+\beta(1-\alpha)}_{n,i}, x),$$

with α < 2 and β < 1/2; hence the exponent of Δ_{n,i} is always more than 1/2.
We can therefore write in the first case

$$m_{\theta,\Delta_{n,i}}(x) = x + R(\theta, \Delta_{n,i}, x) = R(\theta, \Delta^0_{n,i}, x) \tag{19}$$

and in the second

$$m_{\theta,\Delta_{n,i}}(x) = x + R(\theta, \Delta^{1+\beta(1-\alpha)}_{n,i}, x) = R(\theta, \Delta^0_{n,i}, x). \tag{20}$$
###### Remark 4.

In Theorem 3 we do not need conditions on β because, for each α and for each β, the exponent of Δ_{n,i} in the rest term is positive, and therefore the last term of (15) is negligible compared to 1. In Theorem 4, instead, the rest term R(θ, Δ^{2−3β}_{n,i}, x) is negligible if and only if its exponent is large enough, which imposes two conditions on β.
The choice of β below ensures that these two conditions are always respected.

### 3.2 Main results

Let us introduce the assumption on β that emerges from Theorems 1, 2, 3 and 4:

ASSUMPTION : We choose if . If on the contrary , then we take in .

The following theorems give a general consistency result and the asymptotic normality of the estimator θ̂_n; they hold without further assumptions on the sampling scheme.

###### Theorem 5.

(Consistency) Suppose that Assumptions 1 to 7 and the assumption above hold, and let the constant k in the definition of the contrast function (7) be chosen as in Remark 2. Then the estimator θ̂_n is consistent in probability:

$$\hat{\theta}_n \xrightarrow{\ \mathbb{P}\ } \theta_0, \qquad n \to \infty.$$

Recalling that the Fisher information I(θ) is given by (3), we state the following theorem.

###### Theorem 6.

(Asymptotic normality) Suppose that Assumptions 1 to 8 and the assumption above hold, and that nΔ_n → ∞.
Then the estimator θ̂_n is asymptotically normal:

$$\sqrt{n\Delta_n}\,(\hat{\theta}_n - \theta_0) \xrightarrow{\ \mathcal{L}\ } N\big(0, I^{-1}(\theta_0)\big), \qquad n \to \infty.$$
###### Remark 5.

Furthermore, the estimator is asymptotically efficient in the sense of the Hájek-Le Cam convolution theorem.
The Hájek-Le Cam convolution theorem states that any regular estimator in a parametric model satisfying the LAN property is asymptotically equivalent to a sum of two independent random variables, one of which is normal with asymptotic variance equal to the inverse of the Fisher information, and the other having an arbitrary distribution. The efficient estimators are those with the second component identically equal to zero.

The model (1) is LAN with Fisher information I(θ) (see [10]), and thus θ̂_n is efficient.

###### Remark 6.

We point out that, contrary to the papers [10] and [24], in this case there is no condition on the sampling scheme, which can be irregular and with Δ_n going to zero arbitrarily slowly.

### 3.3 Explicit contrast in the finite intensity case.

In the case of finite jump intensity it is possible to make the contrast explicit, using the development of m_{θ,Δ_{n,i}} proved in the next proposition. We need the following assumption:

ASSUMPTION :

1. We have , and is a function.

2. We assume that b, a and γ are smooth functions; they and their derivatives have at most polynomial growth, uniformly in θ.

Let us define , with and ; as in the Remark .

###### Proposition 1.

Assume that the assumption above holds, and let φ be a compactly supported function such that φ = 1 on [−1, 1] and satisfying the oscillation conditions described above, for some order M. Then, for |x| ≤ Δ_{n,i}^{−k_0} with some k_0 > 0,

$$m_{\theta,\Delta_{n,i}}(x) = x + \sum_{k=0}^{\lfloor 2(M+1)\beta \rfloor} A(k)$$