
Estimating Diffusion With Compound Poisson Jumps Based On Self-normalized Residuals

This paper considers the parametric estimation problem for the continuous part of a jump-diffusion model. The threshold-based method, previously proposed in various papers, enables us to distinguish whether observed increments contain jumps and to estimate the unknown parameters. However, a data-adapted and quantitative choice of the threshold parameter is a subtle and sensitive problem that remains difficult. In this paper, we propose a new and simple alternative based on the Jarque-Bera normality test, which achieves both goals without any sensitive fine-tuning. We show that, under suitable conditions, the proposed estimator is consistent. Some numerical experiments are conducted.



1. Introduction

Suppose that we are given discrete-time but high-frequency observations from a solution to the one-dimensional diffusion with jumps described by


where the ingredients are given as follows.

  • is a standard Wiener process and a compound Poisson process associated with the Lévy measure

    for some probability distribution

    . Throughout we assume that .

  • The sampling times fulfills that


    where the terminal sampling time ; hereafter, we will largely abbreviate “” from the notation like and .

A well-known approach to estimate

is the threshold-based method independently proposed in [5], [7], and [10]. In this method, we regard the increment

as containing a jump component if for a fixed jump-detection threshold , and estimate after removing such increments. It is shown that, for a well-chosen satisfying a suitable rate condition, the estimator of is asymptotically normal at the same rate as in diffusion models. Hence the method asymptotically achieves both the estimation of and the jump detection in the observed data, whereas its finite-sample performance depends strongly on the value of . Unfortunately, a data-adaptive and quantitative choice of the threshold in the jump-detection filter is a subtle and sensitive problem that remains troublesome in practice; see [8], [9], and the references therein. The same problem arises in other jump-detection methods such as [1].

The primary objective of this paper is to formulate an intuitively easy-to-understand strategy that can simultaneously estimate and detect jumps without any precise calibration of a jump-detection threshold. For this purpose, we utilize the approximate self-normalized residuals [6] based on the Gaussian quasi-maximum likelihood estimator (GQMLE), which adapt a classical Jarque-Bera type test [3] to our model. More specifically, the hypothesis test whose significance level is

is constructed in the following manner: let the null hypothesis be that of “no jump component”:

against the alternative hypothesis of “non-trivial jump component”:

Then, if the Jarque-Bera type statistic based on the self-normalized statistics introduced later is larger than a given percentile of the chi-square distribution with the degrees of freedom being

, we reject the null hypothesis ; otherwise, we accept . For such a test, we can intuitively regard the largest increment as containing at least one jump when the null hypothesis is rejected (and this will turn out to be true). Following this observation, our proposed method iteratively conducts the test, removing the largest increment at each step until is accepted, and then estimates the target parameter without the removed increments. Our method enables us not only to “pre-clean” a diffusion-like data sequence by removing large fluctuations that destroy the (approximate) Gaussianity of the self-normalized residuals, but also to approximately quantify jumps relative to continuous fluctuations in a natural way.

The rest of this paper is organized as follows. In Section 2, we give a brief summary of the GQMLE, the approximate self-normalized residuals, and the Jarque-Bera test for our model. Section 3 provides the specific recipe of our method and an alternative estimator to the GQMLE that reduces the computational load. Finally, we show some numerical experiments for our method.

2. Preliminaries

In this section, we briefly review the construction of the GQMLE, the self-normalized residuals, and the Jarque-Bera statistic together with its theoretical behavior. Given any function on , we write

We denote by the image measure of associated with the parameter value , and by the Poisson random measure associated with .

Suppose that the null hypothesis

is true for the moment; namely, the underlying model is a diffusion process. Then, for the estimation of

, we can make use of the Gaussian quasi-(log-)likelihood

where denotes the standard normal density and

This quasi-likelihood is constructed based on the local-Gauss approximation of the transition probability by under , and leads to the Gaussian quasi-maximum likelihood estimator (GQMLE) defined as any element

It is well known that the asymptotic normality holds true [4] under suitable regularity conditions:


Here denotes the invariant measure of .

To see whether a working model fits the data well, a diagnosis based on residual analysis is often performed. Based on the GQMLE, [6] formulated a Jarque-Bera normality test based on self-normalized residuals for our model. Define the self-normalized residual statistic by:

where and . Making use of , we define the Jarque-Bera type statistic by

The Jarque-Bera normality test for our model is then justified in the following sense:

Theorem 2.1.

(cf. [6, Theorem 3.1 and Theorem 4.1]) Under suitable regularity conditions, the following hold:

  • Under , we have ;

  • Under , we have .
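To make the role of the test concrete, the following is a minimal sketch of the classical Jarque-Bera statistic applied to a sample of residuals. It is not the self-normalized statistic of [6], whose exact form is not reproduced here. The function names, the sample size, the outlier size, and the significance level are all illustrative assumptions. The critical value uses the fact that the classical statistic is asymptotically chi-square with 2 degrees of freedom, for which the upper quantile has the closed form -2 log(alpha).

```python
import math
import random


def jarque_bera(residuals):
    """Classical Jarque-Bera statistic: n/6 * S^2 + n/24 * (K - 3)^2."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    m2 = sum(c ** 2 for c in centered) / n
    m3 = sum(c ** 3 for c in centered) / n
    m4 = sum(c ** 4 for c in centered) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * skewness ** 2 + n / 24.0 * (kurtosis - 3.0) ** 2


def chi2_two_dof_upper_quantile(alpha):
    """Solve P(X > q) = exp(-q / 2) = alpha for a chi-square(2) variable X."""
    return -2.0 * math.log(alpha)


rng = random.Random(0)  # fixed seed, chosen only for reproducibility
gaussian = [rng.gauss(0.0, 1.0) for _ in range(2000)]
# Replace one residual by a large value to mimic a single jump (hypothetical size).
contaminated = gaussian[:-1] + [15.0]

jb_clean = jarque_bera(gaussian)
jb_jump = jarque_bera(contaminated)
critical = chi2_two_dof_upper_quantile(0.01)
print(jb_clean, jb_jump, critical)
```

A single large residual inflates the sample kurtosis so strongly that the statistic exceeds the critical value by orders of magnitude, which is the mechanism the divergence under the alternative relies on.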

3. Proposed strategy

For brevity we write

Let be a small number. We propose the iterative jump detection procedure based on the Jarque-Bera type test below.

  • Step 0. Set , and let be the empty set.

  • Step 1. Calculate the modified GQMLE (MGQMLE, for short) by:

    where . Define the following statistics:

    Building on the MGQMLE and the above ingredients, (re-)construct the following modified self-normalized residuals and Jarque-Bera type statistics :

  • Step 2. If , then pick out the interval number

    add to , and return to Step 1; otherwise, set the number of jumps , and go to Step 3.

  • Step 3. If , regard that there is no jump; otherwise, the detected jumps are (they are in descending order). Finally, set as the estimator of .
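The loop described above can be sketched as follows for a toy model with constant drift and diffusion coefficients, which is an assumption made here for illustration only. In that special case the self-normalized residuals are an affine transformation of the retained increments, so the classical Jarque-Bera statistic can be computed from the increments directly. The rule of removing the increment with the largest absolute value, the function names, and all numerical values are likewise assumptions; the modified statistics of the paper are not reproduced.

```python
import math
import random


def jb_statistic(values):
    """Classical Jarque-Bera statistic of a sample."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return n / 6.0 * (m3 / m2 ** 1.5) ** 2 + n / 24.0 * (m4 / m2 ** 2 - 3.0) ** 2


def diffusion_estimate(values, h):
    """Estimate of the squared diffusion coefficient, assuming constant coefficients."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) * h)


def iterative_jump_removal(increments, alpha=0.01):
    """Remove the largest increment until the normality test accepts."""
    critical = -2.0 * math.log(alpha)  # chi-square(2) upper alpha quantile
    kept = list(range(len(increments)))
    removed = []  # indices in order of removal
    while len(kept) > 3:
        if jb_statistic([increments[j] for j in kept]) <= critical:
            break  # the null hypothesis of "no jump" is accepted
        largest = max(kept, key=lambda j: abs(increments[j]))
        kept.remove(largest)
        removed.append(largest)
    return kept, removed


rng = random.Random(1)  # fixed seed, chosen only for reproducibility
h = 0.01                # hypothetical sampling step
increments = [rng.gauss(0.0, math.sqrt(h)) for _ in range(1000)]
jump_positions = [100, 500, 900]  # hypothetical jump locations and sizes
for j, size in zip(jump_positions, [3.0, -3.0, 3.0]):
    increments[j] += size

kept, removed = iterative_jump_removal(increments)
sigma2_before = diffusion_estimate(increments, h)
sigma2_after = diffusion_estimate([increments[j] for j in kept], h)
print(sorted(removed[:3]), sigma2_before, sigma2_after)
```

In this sketch the three contaminated increments are the first ones to be removed, and the diffusion estimate computed from the retained increments is much smaller than the one computed from all increments.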

Remark 3.1.

By using its intensity parameter , the number of jumps of a compound Poisson process is expressed as . Thus, as the terminal time gets larger, the number of iterations of our proposed methodology should also become large. In such a case, or in the case where several jumps apparently do exist, we could instead start from the -th stage for some , which conveniently enables us to “skip” some redundant initial stages.

Remark 3.2.

In practice, the size of “last-removed” increment would be used as the threshold for detecting jumps for future observations: with the value in hand, for future observations we regard that a jump occurred over if
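As an illustration of this remark, the sketch below reuses the size of the last-removed increment as a threshold for new data. The comparison rule (absolute increment at least as large as the threshold), the function name, and the numbers are assumptions made for the example, not the condition stated in the paper.

```python
def flag_future_jumps(future_increments, threshold):
    """Return indices of future increments whose absolute size reaches the threshold."""
    return [j for j, dx in enumerate(future_increments) if abs(dx) >= threshold]


# Hypothetical increments removed by the iterative procedure, in order of removal.
removed_increments = [3.1, -2.7, 0.9]
threshold = abs(removed_increments[-1])  # size of the last-removed increment

future_increments = [0.05, -1.2, 0.3, 0.95]
flagged = flag_future_jumps(future_increments, threshold)
print(flagged)  # [1, 3]
```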

Remark 3.3.

Our method enables us to divide the set of the whole increments into the following two categories:

  • “One-jump” group , and

  • “No-jump” group .

Our method conducts the estimation of the drift and diffusion part of based on continuously joined up data computed from the no-jump group pairs:

Also, we may estimate the jump part from the members of the one-jump group; namely, we regard the sequence under as being i.i.d. with the common jump distribution of the compound Poisson process .

Remark 3.4.

To reduce the computational load of calculating the GQMLE, one can alternatively use the stepwise estimator defined by:

and its modified version can be defined similarly. When the null hypothesis is true, the limit distribution of is shown to be equivalent to that of (cf. [11]). Moreover, computation of the GQMLE and MGQMLE may become much less time-consuming when the coefficients are of certain tractable forms: let and be the dimensions of and , respectively, and suppose that the diffusion coefficient and the drift function can be written in terms of suitable functions and as

where denotes the -th element of

for every vector

. Then the stepwise estimator is given by

where and . What is important from these expressions is that the modified version of can be calculated simply by removing the corresponding indices from the sum without repetitive numerical optimizations, thus reducing the computational time to a large extent.
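The computational point can be illustrated in the simplest special case of constant drift and diffusion coefficients, which is an assumption made here and not the general linear-in-parameter form above. The estimators then depend on the data only through running sums, so removing an increment amounts to subtracting its contribution. The class and method names are hypothetical.

```python
class RunningMoments:
    """Sums of increments and squared increments, updated in O(1) per removal."""

    def __init__(self, increments, h):
        self.h = h
        self.count = len(increments)
        self.s1 = sum(increments)
        self.s2 = sum(dx * dx for dx in increments)

    def remove(self, dx):
        """Drop one increment from the sums without re-optimizing anything."""
        self.count -= 1
        self.s1 -= dx
        self.s2 -= dx * dx

    def drift(self):
        return self.s1 / (self.count * self.h)

    def diffusion_sq(self):
        return (self.s2 - self.s1 ** 2 / self.count) / (self.count * self.h)


h = 0.01  # hypothetical sampling step
increments = [0.1, -0.2, 0.05, 3.0, -0.1]  # the value 3.0 plays the role of a jump
moments = RunningMoments(increments, h)
moments.remove(3.0)

# Direct recomputation from the retained increments, for comparison.
retained = [0.1, -0.2, 0.05, -0.1]
mean = sum(retained) / len(retained)
direct_drift = sum(retained) / (len(retained) * h)
direct_diffusion_sq = sum((dx - mean) ** 2 for dx in retained) / (len(retained) * h)
print(moments.drift(), moments.diffusion_sq())
```

The updated sums give the same estimates as a full recomputation on the retained increments, up to floating-point rounding.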

4. Asymptotic property of the MGQMLE

In this section, we look at the asymptotic properties of the MGQMLE for the following toy model:


where is a compound Poisson process expressed as

In this expression, and denote a Poisson process whose intensity parameter is

and i.i.d. random variables, respectively. Recall that the observations

are obtained in , . To deduce the asymptotic properties of the MGQMLE, we introduce some assumptions below.

Assumption 4.1.
  • , and there exists a positive deterministic sequence satisfying the conditions

  • For any , we have

  • The number of jump removals .

The following theorem ensures a consistency property of the MGQMLE:

Theorem 4.2.

If Assumption 4.1 holds, then we have

for each and .

5. Numerical experiments

We consider the following SDE model:

where . Here we set the true values as and . The experiments are conducted under the following conditions:

  • ;

  • ;

  • ;

  • , and selected with equal probabilities.

Here, the number of jumps is fixed solely for the purpose of numerical comparison. The performance of our method is given in Table 1.

Table 1. The performance of the GQMLE and MGQMLE: the mean is given with the standard deviation in parentheses.

before jump removal     0.31 (0.06)     2.00 (0.49)
after jump removal      0.30 (0.01)     1.00 (0.01)

The transition of our estimators and the logarithmic values of the JB statistic in the last iteration are shown in Figures 1-3. As can be seen from Table 1, the estimation accuracy is drastically improved by our method. In this example, the jump distribution is symmetric, so the improvement in the estimation of the drift parameter is small compared with that of the diffusion parameter; the improvement is expected to be much more significant when the jump distribution is skewed.

Figure 1.
Figure 2.
Figure 3.


This work was supported by JST, CREST Grant Number JPMJCR14D7, Japan.

6. Appendix: proof of Theorem 4.2

Let Assumption 4.1 hold throughout this section. First, we prove two lemmas.

Lemma 6.1.

Let denote jump times of . Then we have


Since the increments of the jump times of a Poisson process are independent and obey the exponential distribution with mean

, it follows that

For convenience, we hereafter write

Thanks to Lemma 6.1, in proving Theorem 4.2 we may and do focus on the set .

Lemma 6.2.

We have


Hereafter we use the following notations:

For every , we write . Then we have

From extreme value theory (cf. [2, Table 3.4.4]), we have


Therefore it suffices to show that


Assumption 4.1 implies that

hence the claim follows. ∎

Lemma 6.2 implies that all increments containing jumps are correctly picked up as long as . Similarly, we can derive

Proof of Theorem 4.2

We introduce the following events:

Taking the lemmas into consideration, we can split as


Since , it suffices to show and . From now on, for an event we denote by the indicator function of :

First we focus on the estimate of . By virtue of the foregoing discussion, we have the following expression:

Hence it follows that

The law of large numbers for triangular sequences implies that

Again applying (6.1), we have

Thus .

Let us now move on to the estimates of . From the representation

and the central limit theorem, it follows that

Hence if , we have