1. Introduction
Suppose that we are given discrete-time but high-frequency observations from a solution to the one-dimensional diffusion with jumps described by
(1.1) 
where the ingredients are given as follows.

is a standard Wiener process and a compound Poisson process associated with the Lévy measure
for some probability distribution
. Throughout we assume that . 
The sampling times satisfy
(1.2) where the terminal sampling time ; hereafter, we will largely abbreviate “” from the notation like and .
A well-known approach to estimating
is the threshold-based method independently proposed in [5], [7], and [10]. In this method, we consider that the increment
contains the jump component if for a fixed jump-detection threshold , and estimate after removing such increments. It is shown that, for a good satisfying a suitable rate, the estimator of is asymptotically normal at the same rate as in diffusion models. Hence the method asymptotically achieves both the estimation of and the jump detection in observed data, while the finite-sample performance of the threshold method strongly depends on the value of . Unfortunately, a data-adaptive and quantitative choice of the threshold in the jump-detection filter is a subtle and sensitive problem that remains troublesome in practice; see [8], [9], and the references therein. The same problem arises in other jump detection methods such as [1].
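To make the sensitivity to the threshold concrete, the following is a minimal Python sketch of a threshold filter applied to simulated increments. The constant-coefficient model, the positions and sizes of the two artificial jumps, and the names `threshold_filter`, `h`, `sigma`, and `r` are illustrative assumptions, not the setting of [5], [7], or [10].

```python
import numpy as np

def threshold_filter(dx, r):
    """Return (kept increments, indices flagged as jumps) for a fixed threshold r."""
    flagged = np.abs(dx) > r
    return dx[~flagged], np.flatnonzero(flagged)

# Hypothetical data: Brownian increments (sigma = 0.5, step h) plus two artificial jumps.
rng = np.random.default_rng(0)
n, h, sigma = 1000, 0.01, 0.5
dx = sigma * np.sqrt(h) * rng.standard_normal(n)
dx[100] += 2.0
dx[700] -= 1.5

# A threshold that is too small discards genuine diffusion increments,
# while one that is too large keeps the jumps in the estimation sample.
for r in (0.05, 0.3, 3.0):
    kept, idx = threshold_filter(dx, r)
    sigma2_hat = np.sum(kept**2) / (len(kept) * h)  # realized-variance type estimate
    print(f"r={r}: flagged={len(idx)}, sigma^2 estimate={sigma2_hat:.3f}")
```

In this sketch, only the intermediate threshold isolates exactly the two contaminated increments, which illustrates why the choice of the threshold matters in finite samples.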
The primary objective of this paper is to formulate an intuitive, easy-to-understand strategy that can simultaneously estimate and detect jumps without any precise calibration of a jump-detection threshold. For this purpose, we utilize the approximate self-normalized residuals [6] based on the Gaussian quasi-maximum likelihood estimator (GQMLE), which allow a classical Jarque-Bera type test [3] to be adapted to our model. More specifically, the hypothesis test whose significance level is
is constructed in the following manner: let the null hypothesis be that of “no jump component”:
against the alternative hypothesis of “nontrivial jump component”:
Then, if the Jarque-Bera type statistic based on the self-normalized statistics introduced later is larger than a given percentile of the chi-square distribution with the degrees of freedom being
, we reject the null hypothesis ; otherwise, we accept . When the null hypothesis is rejected by such a test, it is intuitively natural to consider that the largest increment contains at least one jump (and this will turn out to be true). Following this observation, our proposed method iteratively conducts the test, removing the largest remaining increment at each step until is accepted; after that, we estimate the target parameter from the remaining increments. Our method not only enables a “pre-cleaning” of a diffusion-like data sequence by removing large fluctuations that destroy the (approximate) Gaussianity of the self-normalized residuals, but also approximately quantifies jumps relative to continuous fluctuations in a natural way.

The rest of this paper is organized as follows: in Section 2, we give a brief summary of the GQMLE, the approximate self-normalized residuals, and the Jarque-Bera test for our model. Section 3 provides our specific recipe and an alternative estimator to the GQMLE that reduces the computational load. Finally, we present some numerical experiments with our method.
2. Preliminaries
In this section, we briefly review the construction of the GQMLE, the self-normalized residuals, and the Jarque-Bera statistic together with its theoretical behavior. Given any function on , we write
We denote by the image measure of associated with the parameter value , and by the Poisson random measure associated with .
Suppose that the null hypothesis
is true for a moment; namely the underlying model is a diffusion process. Then, for the estimation of
, we can make use of the Gaussian quasi-(log-)likelihood
where denotes the standard normal density and
This quasi-likelihood is constructed based on the local-Gauss approximation of the transition probability by under , and leads to the Gaussian quasi-maximum likelihood estimator (GQMLE) defined by any element
It is well known that the asymptotic normality holds true [4] under suitable regularity conditions:
where
Here denotes the invariant measure of .
To check whether a working model fits the data well, a diagnosis based on residual analysis is often carried out. Based on the GQMLE, [6] formulated a Jarque-Bera normality test based on self-normalized residuals for our model. Define the self-normalized residual statistic by:
where and . Making use of , we define the Jarque-Bera type statistic by
Then, the Jarque-Bera normality test for our model is justified in the following sense:
Theorem 2.1.
(cf. [6, Theorem 3.1 and Theorem 4.1]) Under suitable regularity conditions, we have the following:

Under , we have ;

Under , we have .
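As a rough numerical illustration of the dichotomy in Theorem 2.1, the sketch below computes the classical Jarque-Bera statistic on already standardized residuals. It is an assumption-laden stand-in: the statistic of [6] is built from the GQMLE-based self-normalized residuals and may differ in detail, and the two degrees of freedom used for the critical value are those of the classical test. The function names `jarque_bera` and `chi2_df2_critical` are hypothetical.

```python
import numpy as np

def jarque_bera(resid):
    """Classical Jarque-Bera statistic n * (S^2 / 6 + (K - 3)^2 / 24)."""
    resid = np.asarray(resid, dtype=float)
    z = (resid - resid.mean()) / resid.std()  # standardize with the population std
    skew = np.mean(z**3)                      # sample skewness S
    kurt = np.mean(z**4)                      # sample kurtosis K
    return len(z) * (skew**2 / 6.0 + (kurt - 3.0) ** 2 / 24.0)

def chi2_df2_critical(alpha):
    """Upper-alpha quantile of chi-square(2), whose survival function is exp(-x / 2)."""
    return -2.0 * np.log(alpha)

# Hypothetical residuals: a Gaussian sample, and the same sample with one huge outlier.
rng = np.random.default_rng(1)
clean = rng.standard_normal(1000)
contaminated = np.append(clean[:-1], 50.0)

crit = chi2_df2_critical(0.05)
print("critical value:", crit)
print("JB (clean):", jarque_bera(clean))
print("JB (contaminated):", jarque_bera(contaminated))
```

A single outlier of jump-like size drives the kurtosis term far above the critical value, which is the behavior the test relies on under the alternative.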
3. Proposed strategy
For brevity we write
Let be a small number. We propose the following iterative jump detection procedure based on the Jarque-Bera type test.

Set , and let be the empty set.

Calculate the modified GQMLE (MGQMLE, for short) by:
where . Define the following statistics:
Building on the MGQMLE and the above ingredients, (re)construct the following modified self-normalized residuals and Jarque-Bera type statistics :

If , then pick out the interval number
add to , and return to Step 1; otherwise, set the number of jumps , and go to Step 3.

If , we consider that there is no jump; otherwise, the detected jumps are (in descending order). Finally, set as the estimator of .
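The steps above can be sketched in Python for a simplified setting. The sketch assumes a constant-coefficient model with drift `mu` and diffusion `sigma`, interprets "largest increment" as the largest absolute increment among those not yet removed, and uses the classical Jarque-Bera statistic with a chi-square(2) critical value in place of the statistic of this section. The function names and the simulated jump sizes are hypothetical.

```python
import numpy as np

def jb(z):
    """Classical Jarque-Bera statistic of standardized residuals z."""
    return len(z) * (np.mean(z**3) ** 2 / 6.0 + (np.mean(z**4) - 3.0) ** 2 / 24.0)

def iterative_jump_removal(dx, h, alpha=0.05):
    """Remove the largest remaining increment until the JB test accepts."""
    dx = np.asarray(dx, dtype=float)
    crit = -2.0 * np.log(alpha)            # upper-alpha quantile of chi-square(2)
    keep = np.ones(len(dx), dtype=bool)    # increments still in use
    removed = []                           # indices removed so far, in order
    while True:
        kept = dx[keep]
        mu = kept.mean() / h                        # drift estimate from kept increments
        sigma2 = np.mean((kept - mu * h) ** 2) / h  # diffusion estimate
        z = (kept - mu * h) / np.sqrt(sigma2 * h)   # standardized residuals
        if jb(z) <= crit or keep.sum() <= 3:
            return mu, sigma2, removed
        # Reject: drop the largest absolute increment among those still kept.
        j = int(np.argmax(np.where(keep, np.abs(dx), -np.inf)))
        keep[j] = False
        removed.append(j)

# Hypothetical data: mu = 0.3, sigma = 1, three artificial jumps.
rng = np.random.default_rng(2)
n, h = 1000, 0.01
dx = 0.3 * h + np.sqrt(h) * rng.standard_normal(n)
for j, size in ((100, 4.0), (400, -2.0), (700, 3.0)):
    dx[j] += size

mu_hat, sigma2_hat, removed = iterative_jump_removal(dx, h)
print("removed indices:", removed)
print("estimates:", mu_hat, sigma2_hat)
```

The list `removed` plays the role of the set of detected jump intervals, and the size of the last removed increment could serve as a data-driven threshold.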
Remark 3.1.
By using its intensity parameter , the number of jumps of a compound Poisson process is expressed as . Thus, as the terminal time gets larger, the number of iterations required by our proposed method should also become large. In such a case, or when several jumps seemingly exist, we could instead start from the th stage for some , which conveniently enables us to “skip” the first few redundant stages.
Remark 3.2.
In practice, the size of the “last-removed” increment can be used as the threshold for detecting jumps in future observations: with the value in hand, we consider that a jump occurred over if
Remark 3.3.
Our method enables us to divide the set of the whole increments into the following two categories:

“One-jump” group , and

“No-jump” group .
Our method estimates the drift and diffusion parts of based on the continuously joined-up data computed from the no-jump group pairs:
Also, we may estimate the jump part from the members of the one-jump group; namely, we regard the sequence under as being i.i.d. with the common jump distribution of the compound Poisson process .
Remark 3.4.
To reduce the computational load of calculating the GQMLE, one can alternatively use the stepwise estimator defined by:
and its modified version can be defined similarly. Under the null hypothesis, the limit distribution of is shown to be equivalent to that of (cf. [11]). Moreover, computation of the GQMLE and MGQMLE may become much less time-consuming when the coefficients are of certain tractable forms: let and be the dimensions of and , respectively, and suppose that the diffusion coefficient and the drift function can be written in terms of suitable functions and as
where denotes the th element of
for every vector
. Then the stepwise estimator is given by
where and . What is important in these expressions is that the modified version of can be calculated simply by removing the corresponding indices from the sums, without repeated numerical optimization, which greatly reduces the computation time.
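The computational point of Remark 3.4 can be illustrated with a small Python sketch. It assumes the simplest tractable case, constant drift `mu` and diffusion `sigma`, for which the two-step Gaussian quasi-likelihood maximizers reduce to sample moments of the increments; the class name `StepwiseSums` and the simulated data are hypothetical and do not reproduce the general expressions of the remark.

```python
import numpy as np

class StepwiseSums:
    """Closed-form stepwise estimates for dX = mu dt + sigma dW, kept as running sums."""

    def __init__(self, dx, h):
        self.dx = np.asarray(dx, dtype=float)
        self.h = h
        self.m = len(self.dx)            # number of increments currently in use
        self.s1 = self.dx.sum()          # sum of increments
        self.s2 = np.sum(self.dx ** 2)   # sum of squared increments

    def remove(self, j):
        """Drop increment j by subtracting its terms; no optimizer is rerun."""
        self.m -= 1
        self.s1 -= self.dx[j]
        self.s2 -= self.dx[j] ** 2

    def estimates(self):
        sigma2 = self.s2 / (self.m * self.h)  # first step: diffusion parameter
        mu = self.s1 / (self.m * self.h)      # second step: drift parameter
        return mu, sigma2

# Hypothetical data: 500 Gaussian increments with one artificial jump at index 10.
rng = np.random.default_rng(3)
h = 0.01
dx = np.sqrt(h) * rng.standard_normal(500)
dx[10] += 5.0

est = StepwiseSums(dx, h)
print("with jump:   ", est.estimates())
est.remove(10)
print("without jump:", est.estimates())
```

Each removal costs a constant number of operations here, whereas a generic GQMLE would require a fresh numerical optimization at every stage of the iteration.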
4. Asymptotic property of the MGQMLE
In this section, we look at the asymptotic properties of the MGQMLE for the following toy model:
(4.1) 
where is a compound Poisson process expressed as
In this expression, and denote a Poisson process whose intensity parameter is
and i.i.d. random variables, respectively. Recall that the observations
are obtained in , . To deduce the asymptotic properties of the MGQMLE, we introduce some assumptions below.
Assumption 4.1.

, and there exists a positive deterministic sequence satisfying the conditions

For any , we have

The number of jump removals .
The following theorem ensures a consistency property of the MGQMLE:
Theorem 4.2.
5. Numerical experiments
We consider the following SDE model:
where . Here we set the true values as and . Under the conditions where:

;

;

;

, and selected with equal probabilities.
Here, the number of jumps is fixed purely for the purpose of numerical comparison. The performance of our method is reported in Table 1.
before jump removal   0.31 (0.06)   2.00 (0.49)
after jump removal    0.30 (0.01)   1.00 (0.01)
Table 1. The performance of the GQMLE and MGQMLE: the mean is given with the standard deviation in parentheses.
The transition of our estimators and the logarithmic values of the JB statistic in the last iteration are shown in Figures 33. As can be seen from Table 1, the estimation accuracy is drastically improved by our method. In this example, the jump distribution is symmetric, so the improvement in the estimation of the drift parameter is small compared with that of the diffusion parameter; the improvement is expected to be much more significant when the jump distribution is skewed.
Acknowledgement
This work was supported by JST, CREST Grant Number JPMJCR14D7, Japan.
6. Appendix: proof of Theorem 4.2
Let Assumption 4.1 hold throughout this section. First, we prove two lemmas.
Lemma 6.1.
Let denote the jump times of . Then we have
Proof.
Since the inter-arrival times of the jumps of the Poisson process are independent and exponentially distributed with mean
, it follows that∎
For convenience, we hereafter write
Thanks to Lemma 6.1, in proving Theorem 4.2 we may and do focus on the set .
Lemma 6.2.
We have
Proof.
Lemma 6.2 implies that all increments containing jumps are correctly picked up as long as . Similarly, we can derive
Proof of Theorem 4.2
We introduce the following events:
Taking the lemmas into consideration, we can split as
where
Since , it suffices to show and . From now on, for an event we denote by the indicator function of :
First we focus on the estimate of . By virtue of the foregoing discussion, we have the following expression:
Hence it follows that
The law of large numbers for triangular sequences implies that
Again applying (6.1), we have
Thus .
Let us now move on to the estimates of . From the representation
and the central limit theorem, it follows that
Hence if , we have