 # Bias Correction Estimation for Continuous-Time Asset Return Model with Jumps

In this paper, local linear estimators are adapted to the unknown infinitesimal coefficients of a continuous-time asset return model with jumps; owing to their simple bias representation, these estimators correct the bias automatically. We mainly investigate integrated diffusion models with jumps, especially infinite activity jumps. Under mild conditions, weak consistency and asymptotic normality are established through the conditional Lindeberg theorem. Simulations show that the method has advantages in bias correction in both the finite activity case and the infinite activity case. Finally, the estimators are illustrated empirically on stock index returns sampled at five-minute frequency.


## 1 Introduction

Continuous-time models, especially continuous-time diffusion processes with jumps, are widely used in economics and finance, for example to model interest rates. A jump-diffusion process is represented by the following stochastic differential equation:

$$dX_t=\mu(X_{t-})\,dt+\sigma(X_{t-})\,dW_t+\int_E c(X_{t-},z)\,r(\omega,dt,dz), \tag{1.1}$$

which can accommodate the impact of sudden and large shocks to financial markets. Johannes joh examined the statistical and economic role of jumps in continuous-time interest rate models. However, in empirical finance the current observation often behaves as the accumulation of all past perturbations; for example, stock prices accumulate asset returns, see Nicolau n2 et al. Furthermore, although research on model (1.1) has focused on the price of the asset, most of it did not consider the returns of the asset. As mentioned in Campbell, Lo and MacKinlay clm , the return series of an asset is a complete and scale-free summary of the investment opportunity for most investors, and is easier to handle than the price series because of its more attractive statistical properties. The capital asset return model is the core of capital market theory, which describes the relationship between the returns and risks of individual securities or portfolios.

To characterize this integrated economic phenomenon, and the return series in particular, we consider the continuous-time integrated diffusion process with jumps (1.2), which is motivated by unit root processes in the discrete framework of Park and Phillips pp and by the continuous integrated diffusion process in Nicolau n . It satisfies the following second-order stochastic differential equation:

$$\begin{cases} dY_t=X_t\,dt,\\ dX_t=\mu(X_{t-})\,dt+\sigma(X_{t-})\,dW_t+\int_E c(X_{t-},z)\,r(\omega,dt,dz), \end{cases} \tag{1.2}$$

where $W_t$ is a standard Brownian motion and $r(\omega,dt,dz)$ is a compensated time-homogeneous Poisson random measure that is independent of $W_t$; its intensity measure has the form $f(z)\,dz\,dt$, where $f(z)$ is a Lévy density. For empirical financial data, $X_t$ in model (1.2) represents the continuously compounded return of the underlying asset, and $Y_t$ denotes the asset price, obtained as the accumulation of the returns plus the initial asset value. Furthermore, model (1.2) can accommodate nonstationarity and transform it into stationarity by differencing, which cannot be done with a univariate diffusion model because of the nondifferentiability of Brownian motion.

For model (1.2), estimators of the unknown coefficients have been studied based on low frequency or high frequency observations under various settings. For model (1.2) without jumps, Gloter g1 g2 and Ditlevsen and Sørensen ds developed parametric and semiparametric estimation, while Nicolau n and Comte, Genon-Catalot and Rozenholc cgr analyzed nonparametric estimators of the unknown quantities. Moreover, Wang and Lin wl , Wang, Zhang and Tang wzt , Hanif hm2 and Wang and Tang wt improved these nonparametric estimators. For model (1.2) with finite activity jumps, Song and Lin sl , Song, Lin and Wang slw , Chen and Zhang cz and Funke and Schmisser fs theoretically investigated nonparametric estimation of the drift or volatility coefficients. Song sy considered an empirical application of the proposed estimation to high frequency financial data.

In this paper, we adapt local linear estimators to the unknown coefficients of integrated diffusion models with jumps, especially infinite activity jumps. In nonparametric estimation with finite-dimensional auxiliary variables, local polynomial smoothing has become an effective smoothing method that does not assume a functional form for the unknown coefficients. Moreover, local linear estimators have excellent properties, such as full asymptotic minimax efficiency and automatic boundary bias correction; see Fan and Gijbels fg2 for a review.

Our contribution is threefold. First, in terms of the model, previous work mainly focused on the continuous case, as in Nicolau n , or the finite activity case, as in Song sy . We consider a more practical integrated diffusion model with infinite activity jumps for the asset return. The existence of infinite activity jumps in the high frequency financial data is tested with the statistic of Aït-Sahalia and Jacod aj in the empirical analysis.

Second, on the theoretical side, in contrast to the Nadaraya-Watson estimators, the conditional Lindeberg theorem may no longer be directly applicable to the local linear estimators, because their weights depend on the whole sample and therefore lose the adapted and predictable structure with respect to the underlying filtration.

We tackle this key technical problem by means of Slutsky's theorem and establish central limit theorems for the volatility functions in the second-order diffusion model with infinite activity jumps. More technical details of the proof can be found in Lemma 4 and Theorem 2.6.

Third, Chen and Zhang cz gave the large sample properties of local linear estimators for second-order diffusions with finite activity jumps, but they did not consider their finite-sample performance. On the practical side, we therefore consider two types of jumps (finite activity and infinite activity) to verify the better finite-sample performance of local linear estimators under various settings. Moreover, the estimators are illustrated empirically on the return of a stock index on the Shenzhen Stock Exchange, using five-minute high frequency data between Jan 2015 and Dec 2015. In summary, the integrated diffusion model with jumps, especially infinite activity jumps, may be an alternative model for describing the dynamics of the returns of financial assets.

The paper is organized as follows. The local linear estimators and their large sample properties are collected in Section 2. The finite-sample performance of the estimators is studied through Monte Carlo simulation in Section 3. The estimators are illustrated empirically in Section 4. Some technical lemmas for the main theorems are given in the Appendix.

## 2 Local Linear estimators and Large sample properties

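For model (1.2), we usually observe $\{Y_{i\Delta_n}\}$ rather than $\{X_{i\Delta_n}\}$. However, the value of $X_{i\Delta_n}$ cannot be recovered from $Y_{i\Delta_n}$ with a fixed sampling interval. Additionally, nonparametric estimators of the unknown quantities in model (1.2) cannot in principle be constructed directly on the observations $\{Y_{i\Delta_n}\}$, because the conditional distribution of $Y$ is unknown. As Nicolau n showed, with observations $\{Y_{i\Delta_n}\}$ and given that

$$Y_{i\Delta_n}-Y_{(i-1)\Delta_n}=\int_{(i-1)\Delta_n}^{i\Delta_n}X_u\,du,$$

we can obtain an approximate value of $X_{i\Delta_n}$ by

$$\tilde X_{i\Delta_n}=\frac{Y_{i\Delta_n}-Y_{(i-1)\Delta_n}}{\Delta_n}. \tag{2.1}$$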
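
The proxy (2.1) is a plain difference quotient of the observed integrated process. A minimal Python sketch follows; the function name `proxy_from_integrated` and the toy path are illustrative assumptions, not part of the paper.

```python
import numpy as np


def proxy_from_integrated(y, delta):
    """Difference quotient (Y_i - Y_{i-1}) / delta of equation (2.1).

    y     : observations Y_0, Y_delta, ..., Y_{n*delta} (hypothetical input name)
    delta : sampling interval Delta_n
    """
    y = np.asarray(y, dtype=float)
    if y.ndim != 1 or y.size < 2:
        raise ValueError("need a 1-D array with at least two observations")
    return np.diff(y) / delta


# Toy check with a deterministic path: X_u = 2u gives Y_t = t^2,
# so the proxy equals X evaluated at the midpoint of each interval.
delta = 0.1
t = np.arange(6) * delta
x_tilde = proxy_from_integrated(t**2, delta)
```

In this toy case the proxy recovers the average of $X$ over each sampling interval, which is what the integral representation above implies.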

Due to the Markov properties of model (1.2), we can build the following infinitesimal conditional expectations

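$$E\left[\frac{\tilde X_{(i+1)\Delta_n}-\tilde X_{i\Delta_n}}{\Delta_n}\,\Big|\,\mathcal F_{(i-1)\Delta_n}\right]=\mu(X_{(i-1)\Delta_n})+O_p(\Delta_n), \tag{2.2}$$

$$E\left[\frac{(\tilde X_{(i+1)\Delta_n}-\tilde X_{i\Delta_n})^2}{\Delta_n}\,\Big|\,\mathcal F_{(i-1)\Delta_n}\right]=\frac23\sigma^2(X_{(i-1)\Delta_n})+\frac23\int_{\mathbb R}c^2(X_{(i-1)\Delta_n},z)f(z)\,dz+O_p(\Delta_n), \tag{2.3}$$

where $\mathcal F_{(i-1)\Delta_n}$ denotes the information available up to time $(i-1)\Delta_n$. See Appendix A in Song, Lin and Wang slw for detailed calculations.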

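For a given $x$, the local linear estimators of $\mu(x)$ and $M(x)=\sigma^2(x)+\int_{\mathbb R}c^2(x,z)f(z)\,dz$ based on the infinitesimal conditional expectations (2.2) and (2.3) are defined as the solutions to the following weighted least squares problems: find $a_1$, $b_1$, $a_2$, $b_2$ to minimize

$$\sum_{i=1}^{n}\left(\frac{\tilde X_{(i+1)\Delta_n}-\tilde X_{i\Delta_n}}{\Delta_n}-a_1-b_1(\tilde X_{i\Delta_n}-x)\right)^2K\left(\frac{\tilde X_{(i-1)\Delta_n}-x}{h_n}\right), \tag{2.4}$$

$$\sum_{i=1}^{n}\left(\frac32\,\frac{(\tilde X_{(i+1)\Delta_n}-\tilde X_{i\Delta_n})^2}{\Delta_n}-a_2-b_2(\tilde X_{i\Delta_n}-x)\right)^2K\left(\frac{\tilde X_{(i-1)\Delta_n}-x}{h_n}\right), \tag{2.5}$$

where $K(\cdot)$ is the kernel function and $h_n$ is a sequence of positive numbers satisfying $h_n\to0$ as $n\to\infty$.

The solutions for $a_1$ and $a_2$ to (2.4) and (2.5) are, respectively, the local linear estimators of $\mu(x)$ and $M(x)$: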

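$$\hat\mu_n(x)=\frac{\sum_{i=1}^{n}\omega_{i-1}\left(\frac{\tilde X_{i+1}-\tilde X_i}{\Delta_n}\right)}{\sum_{i=1}^{n}\omega_{i-1}}, \tag{2.6}$$

$$\hat M_n(x)=\frac{\sum_{i=1}^{n}\omega_{i-1}\,\frac32\,\frac{(\tilde X_{i+1}-\tilde X_i)^2}{\Delta_n}}{\sum_{i=1}^{n}\omega_{i-1}}, \tag{2.7}$$

where

$$\omega_{i-1}=K\left(\frac{\tilde X_{i-1}-x}{h_n}\right)\left(\sum_{j=1}^{n}K\left(\frac{\tilde X_{j-1}-x}{h_n}\right)(\tilde X_j-x)^2-(\tilde X_i-x)\sum_{j=1}^{n}K\left(\frac{\tilde X_{j-1}-x}{h_n}\right)(\tilde X_j-x)\right).$$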
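
The estimators (2.6) and (2.7) can be sketched in a few lines of Python. This is an illustrative reading of the formulas, assuming a Gaussian kernel and an array `x_tilde` holding the proxy series (2.1); the function names and the deterministic test path are assumptions made for the example, not the authors' code.

```python
import numpy as np


def gaussian_kernel(u):
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)


def local_linear_estimates(x_tilde, x, h, delta, kernel=gaussian_kernel):
    """Local linear estimates of mu(x) and M(x) following (2.6) and (2.7)."""
    x_tilde = np.asarray(x_tilde, dtype=float)
    lag, cur, nxt = x_tilde[:-2], x_tilde[1:-1], x_tilde[2:]
    k = kernel((lag - x) / h)          # kernel evaluated at the lagged proxy
    d = cur - x                        # centred regressor
    s1, s2 = np.sum(k * d), np.sum(k * d**2)
    w = k * (s2 - d * s1)              # weights omega_{i-1}
    incr = nxt - cur
    mu_hat = np.sum(w * incr / delta) / np.sum(w)
    m_hat = np.sum(w * 1.5 * incr**2 / delta) / np.sum(w)
    return float(mu_hat), float(m_hat)


# Deterministic check: x_{i+1} = x_i - 10 * x_i * delta has drift exactly -10 x,
# and a local linear fit reproduces a linear drift without bias.
delta = 0.01
x_path = 0.9 ** np.arange(50)
mu_hat, m_hat = local_linear_estimates(x_path, x=0.5, h=0.2, delta=delta)
```

Because the weights satisfy $\sum_i\omega_{i-1}(\tilde X_i-x)=0$, any response that is exactly linear in $\tilde X_i$ is recovered without error, which is the mechanism behind the bias correction discussed below.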

The assumptions of this paper are listed below, which confirm the large sample properties of the constructed estimators based on (2.6) and (2.7).

###### Assumption 1

i) (Local Lipschitz continuity)  For each there exist a constant and a function : with such that, for any ,

$$|\mu(x)-\mu(y)|+|\sigma(x)-\sigma(y)|\le L_n|x-y|,\qquad |c(x,z)-c(y,z)|\le\zeta_n(z)|x-y|.$$

(ii) (Linear growth)  For each $n$, there exist $\zeta_n$ as above and $C$, such that for all ,

$$|\mu(x)|+|\sigma(x)|\le C(1+|x|),\qquad |c(x,z)|\le\zeta_n(z)(1+|x|).$$

###### Remark 2.1

This assumption guarantees the existence and uniqueness of a solution to stochastic differential equation in (1.2), see Jacod and Shiryaev js . For instance, Long, Ma and Shimizu lms and Long and Qian lq imposed similar conditions on the coefficients of the underlying stochastic differential equation.

###### Assumption 2

The process $X_t$ is ergodic and stationary with a finite invariant measure. For a given point $x$, the stationary probability measure $p(x)$ of the process is positive at $x$, that is, $p(x)>0$. Furthermore, the process is mixing with where

###### Remark 2.2

Assumption 2 implies that the process $X_t$ has a unique weak solution. The finite invariant measure implies that the process is positive Harris recurrent with a stationary probability measure. The hypothesis that $X_t$ is a stationary process is plausible because, for most integrated time series data, simple differencing generally ensures stationarity. A similar condition on the rate of decay of the mixing coefficients appears as Assumption 3 in Gugushvili and Spreij gs . For instance, a mixing process with exponentially decreasing mixing coefficients satisfies the condition, see Hansen and Scheinkman hs , Chen, Hansen and Carrasco chc .

###### Assumption 3

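The kernel $K(\cdot)$ is a positive and continuously differentiable function satisfying:

$$\int K(u)\,du=1,\qquad K_i^j:=\int K^i(u)\,u^j\,du<\infty.$$

Moreover, for ,

$$\lim_{h\to0}E\left[\frac1h\,|K'(\xi_{n,i})|^{\alpha}\left(\frac{\tilde X_{(i-1)\Delta_n}-x}{h}\right)^{m}\right]<\infty$$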

where or , or and = ,

###### Remark 2.3

In fact, any density function can be used as a kernel; even functions that are not necessarily positive can be used. For simplicity, we only consider the widely used positive and symmetric kernels. It is well known, both empirically and theoretically, that the choice of kernel function is not very important for the kernel estimator, see Gasser and Müller gm . As Nicolau n pointed out, this assumption is generally satisfied under very weak conditions. For instance, with a Gaussian kernel and a Cauchy stationary density (which has heavy tails) we still have . Notice that the expectation with respect to the distribution of $\xi_{n,i}$ depends on the stationary densities of and because $\xi_{n,i}$ is a convex linear combination of and

For every , and

###### Remark 2.4

This assumption guarantees that Lemma 1 can be used properly throughout the article. If $X$ is a Lévy process with bounded jumps (i.e., $\sup_t|\Delta X_t|\le C<\infty$ almost surely, where $C$ is a nonrandom constant), then $E|X_t|^n<\infty$ for every $n$, that is, $X_t$ has bounded moments of all orders, see Protter pr . This condition is widely used in the estimation of an ergodic diffusion or jump-diffusion from discrete observations, see Florens-Zmirou fz , Kessler ke , Shimizu and Yoshida syo .

###### Remark 2.5

The relationship between $h_n$ and $\Delta_n$ is similar to that in the stationary case in Bandi and Nguyen bn and to (b1), (b2) of A8 in Nicolau n .

We have the following asymptotic results for the local linear estimators such as (2.6) and (2.7) based on the assumptions above.

###### Theorem 2.6

Under Assumptions 1-5, as $n\to\infty$ we have

(i) $\hat\mu_n(x)\xrightarrow{p}\mu(x)$ and $\hat M_n(x)\xrightarrow{p}M(x)$.

(ii) Furthermore, if and then

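$$\sqrt{h_nn\Delta_n}\left(\hat\mu_n(x)-\mu(x)-\frac12h_n^2\mu''(x)\frac{(K_1^2)^2-K_1^3K_1^1}{K_1^2-(K_1^1)^2}\right)\Longrightarrow N\left(0,\,V\frac{M(x)}{p(x)}\right),$$

and

$$\sqrt{h_nn\Delta_n}\left(\hat M_n(x)-M(x)-\frac12h_n^2M''(x)\frac{(K_1^2)^2-K_1^3K_1^1}{K_1^2-(K_1^1)^2}\right)\Longrightarrow N\left(0,\,V\frac{\int_E c^4(x,y)f(y)\,dy}{p(x)}\right),$$

where

$$V=\frac{(K_1^2)^2K_2^0+(K_1^1)^2K_2^2-2K_1^1K_1^2K_2^1}{\left[K_1^2-(K_1^1)^2\right]^2}.$$
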
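
The kernel constants $K_i^j$ that enter the bias factor and $V$ are easy to evaluate numerically. The sketch below is an illustration only: it assumes a Gaussian kernel, approximates the integrals by the trapezoid rule on a truncated grid, and uses the hypothetical naming `kab` for $K_a^b$.

```python
import numpy as np


def gaussian_kernel(u):
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)


def kernel_constant(kernel, i, j, lo=-10.0, hi=10.0, num=200001):
    """Approximate K_i^j = integral of K(u)^i * u^j du by the trapezoid rule."""
    u = np.linspace(lo, hi, num)
    f = kernel(u) ** i * u**j
    return float(np.sum(0.5 * (f[:-1] + f[1:])) * (u[1] - u[0]))


def bias_and_variance_factors(kernel):
    """Return the bias factor and V of Theorem 2.6 for a given kernel."""
    k11, k12, k13 = (kernel_constant(kernel, 1, j) for j in (1, 2, 3))
    k20, k21, k22 = (kernel_constant(kernel, 2, j) for j in (0, 1, 2))
    denom = k12 - k11**2
    bias_factor = (k12**2 - k13 * k11) / denom
    v = (k12**2 * k20 + k11**2 * k22 - 2.0 * k11 * k12 * k21) / denom**2
    return bias_factor, v


bias_factor, v = bias_and_variance_factors(gaussian_kernel)
```

For a symmetric kernel the odd constants vanish, so the bias factor reduces to $K_1^2$ and $V$ reduces to $K_2^0$; for the Gaussian kernel these are $1$ and $1/(2\sqrt\pi)$.
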
###### Remark 2.7

For the finite activity jumps of model (1.2),

$$\int_E c(X_{t-},z)\,r(\omega,dt,dz):=\int_E c(X_{t-},z)\,N(dt,dz)-\lambda(X_{t-})\int_E c(X_{t-},z)\,\Pi(dz)\,dt,$$

where $N(dt,dz)$ is a Poisson counting measure, $c(X_{t-},z)$ reflects the conditional impact of a jump and $\Pi(dz)$ is the probability distribution function of a jump. For model (1.2), we can observe that so and

For the infinite activity jumps of model (1.2), we focus on the following diffusion process with jumps for such that

$$dX_t=\mu(X_{t-})\,dt+\sigma(X_{t-})\,dW_t+\xi(X_{t-})\,dJ_t,$$

where $J_t$ is a pure jump Lévy process, such as an infinite activity jump process, with the representation

$$dJ_t=\int_{\mathbb R}y\,\big(\mu(dt,dy)-\nu(dy)\,dt\big):=\int_{\mathbb R}y\,\bar\mu(dt,dy)$$

with $\mu(dt,dy)$ a Poisson random measure compensated by its intensity measure $\nu(dy)\,dt$. Hence and for model (1.2).

In contrast to the integrated diffusion model without jumps (Nicolau n ), the rate of convergence of the second infinitesimal moment estimator is the same as that of the first infinitesimal moment estimator. Apparently, this is due to the presence of discontinuous breaks that have an equal impact on all the functional estimates. As Johannes joh pointed out, for the conditional variance of interest rate changes, not only does diffusion play a role, but jumps also account for more than half of it at lower interest rate levels and almost two-thirds at higher interest rate levels, so that jumps dominate the conditional volatility of interest rate changes. Thus, it is extremely important to estimate the conditional variance as $\sigma^2(x)+\int_E c^2(x,z)f(z)\,dz$, which reflects the fluctuation of the return of the underlying asset.

###### Remark 2.8

Song sy showed that, under the assumptions of this paper, the following result holds for the Nadaraya-Watson estimator $\hat\mu_n^{NW}(x)$ with symmetric kernels:

$$\sqrt{h_nn\Delta_n}\left(\hat\mu_n^{NW}(x)-\mu(x)-h_n^2K_1^2\left(\frac12\mu''(x)+\mu'(x)\frac{\phi'(x)}{\phi(x)}\right)\right)\xrightarrow{d}N\left(0,\,K_2^0\frac{M(x)}{p(x)}\right).$$

When $K(\cdot)$ is symmetric, for the local linear estimator we obtain

$$\sqrt{h_nn\Delta_n}\left(\hat\mu_n(x)-\mu(x)-\frac12h_n^2\mu''(x)K_1^2\right)\Longrightarrow N\left(0,\,K_2^0\frac{M(x)}{p(x)}\right).$$

Comparing the two bias expressions, we can observe that the bias of $\hat\mu_n^{NW}(x)$ contains the additional design-dependent term $h_n^2K_1^2\mu'(x)\phi'(x)/\phi(x)$, which is absent from the bias of $\hat\mu_n(x)$. When $K(\cdot)$ is asymmetric, sketching the proof procedure in Song sy , we can prove that the bias for should subtract which confirms that the bias of the local linear estimator is smaller than that of the Nadaraya-Watson estimator. Hence, compared with the Nadaraya-Watson estimator, the local linear estimator possesses a simple bias representation and corrects the bias automatically whether $K(\cdot)$ is symmetric or not.

###### Remark 2.9

Similarly to Theorem 4 in Fan and Gijbels fg1 , we now investigate the behavior of the estimator (2.6) at left-boundary points (right-boundary points are analogous). Put , with . Assuming , the conditional MSE of the estimator (2.6) at the boundary point is . For details of the proof, see Fan and Gijbels fg1 ; they are somewhat tedious, so we omit them here. Indeed, the rate of convergence of the local linear estimator is not influenced by the position of the point under consideration. Hence the local linear smoother does not require modifications at the boundary, which is an additional advantage over other kernel-type estimators (see Gasser and Müller gm ). To some extent, the simulation in this paper confirms this result.
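
The design and boundary effects described in Remarks 2.8 and 2.9 can be seen in a small deterministic example. The sketch below is a generic kernel regression on noise-free data, not the estimator (2.6) itself; the design points, bandwidth and function names are assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel(u):
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)


def nadaraya_watson(xs, ys, x0, h):
    """Local constant (Nadaraya-Watson) fit at x0."""
    k = gaussian_kernel((xs - x0) / h)
    return float(np.sum(k * ys) / np.sum(k))


def local_linear(xs, ys, x0, h):
    """Local linear fit at x0 using the weights k * (s2 - d * s1)."""
    k = gaussian_kernel((xs - x0) / h)
    d = xs - x0
    s1, s2 = np.sum(k * d), np.sum(k * d**2)
    w = k * (s2 - d * s1)
    return float(np.sum(w * ys) / np.sum(w))


# Non-uniform design on [0, 1] with a linear target m(x) = -10 x,
# evaluated at the left boundary point x0 = 0.
xs = np.linspace(0.0, 1.0, 201) ** 2
ys = -10.0 * xs
x0, h = 0.0, 0.1
nw_bias = nadaraya_watson(xs, ys, x0, h) - (-10.0 * x0)
ll_bias = local_linear(xs, ys, x0, h) - (-10.0 * x0)
```

With all design points on one side of `x0`, the local constant fit averages responses that are all below the target and is visibly biased, while the local linear fit reproduces the linear target up to rounding error.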

###### Remark 2.10

The choice of the bandwidth is very important in nonparametric estimation. Here we select the optimal bandwidth based on the mean squared error (MSE) and the asymptotic theory in Theorem 2.6. Taking $\mu(x)$ as an example, the optimal smoothing parameter for the local linear estimator of $\mu(x)$ is given by

$$h_{n,opt,fi}=\left(\frac{1}{n\Delta_n}\cdot\frac{4VM(x)\left(K_1^2-(K_1^1)^2\right)^2}{p(x)\,(\mu''(x))^2\left((K_1^2)^2-K_1^3K_1^1\right)^2}\right)^{1/5}=O_p\left(\left(\frac{1}{n\Delta_n}\right)^{1/5}\right),$$

which differs from the continuous case with . The bandwidth constructed above relies on consistent estimators of these unknown quantities, which are difficult to obtain and may introduce bias. Here we mention two rules of thumb for selecting the bandwidth. In practice, for simplicity one can use the empirical bandwidth selector , where denotes the standard deviation of the data. Alternatively, one can apply the cross-validation method, which assesses the performance of an estimator by estimating its prediction error. The main idea is to minimize the following expression: , where is the local linear estimator (2.6) with bandwidth , but computed without using the th observation.
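
The cross-validation idea can be sketched as follows. This is a generic leave-one-out version for a local linear regression of responses `ys` on regressors `xs`, with a squared prediction error criterion assumed for illustration; the candidate grid, the synthetic data and all function names are hypothetical, and the exact criterion used in the paper is not reproduced here.

```python
import numpy as np


def local_linear(xs, ys, x0, h):
    """Local linear fit at x0 with a Gaussian kernel."""
    k = np.exp(-0.5 * ((xs - x0) / h) ** 2)
    d = xs - x0
    s1, s2 = np.sum(k * d), np.sum(k * d**2)
    w = k * (s2 - d * s1)
    return float(np.sum(w * ys) / np.sum(w))


def loo_cv_score(xs, ys, h):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    n = len(xs)
    keep = np.ones(n, dtype=bool)
    errors = np.empty(n)
    for i in range(n):
        keep[i] = False                      # drop the i-th observation
        errors[i] = ys[i] - local_linear(xs[keep], ys[keep], xs[i], h)
        keep[i] = True
    return float(np.mean(errors**2))


def select_bandwidth(xs, ys, candidates):
    scores = [loo_cv_score(xs, ys, h) for h in candidates]
    return candidates[int(np.argmin(scores))], scores


# Synthetic data for illustration only.
rng = np.random.default_rng(0)
xs = rng.uniform(-1.0, 1.0, 120)
ys = np.sin(3.0 * xs) + 0.1 * rng.normal(size=120)
candidates = [0.05, 0.1, 0.2, 0.4, 0.8]
h_cv, scores = select_bandwidth(xs, ys, candidates)
```

A grid search is used here for transparency; in practice the grid would be centred on a rule-of-thumb bandwidth.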

###### Remark 2.11

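If the smoothing parameter $h_n$ is chosen so that Theorem 2.6 (ii) applies, the normal confidence interval for $\mu(x)$ based on the local linear estimators at the significance level $\alpha$ is constructed as follows,

$$I_{\mu,\alpha}=\left[\hat\mu_n(x)-h_n^2\cdot\frac12\hat\mu_n''(x)\frac{(K_1^2)^2-K_1^3K_1^1}{K_1^2-(K_1^1)^2}-z_{1-\alpha/2}\cdot\frac{1}{\sqrt{n\Delta_nh_n}}\cdot\sqrt{V\frac{\hat M_n(x)}{\hat p_n(x)}},\ \ \hat\mu_n(x)-h_n^2\cdot\frac12\hat\mu_n''(x)\frac{(K_1^2)^2-K_1^3K_1^1}{K_1^2-(K_1^1)^2}+z_{1-\alpha/2}\cdot\frac{1}{\sqrt{n\Delta_nh_n}}\cdot\sqrt{V\frac{\hat M_n(x)}{\hat p_n(x)}}\right],$$

where $z_{1-\alpha/2}$ is the inverse CDF of the standard normal distribution evaluated at $1-\alpha/2$.

To facilitate statistical inference for $\mu(x)$ based on Theorem 2.6, we need consistent estimators of the unknown quantities in the normal approximation. Let $\hat\mu_n(x)$ and $\hat M_n(x)$ denote the local linear estimators in (2.6) and (2.7), respectively. As Fan and Gijbels fg2 showed, the derivative $\mu''(x)$ can be estimated by taking the second derivative of the local linear estimator of $\mu(x)$ in (2.6). The consistent estimator of $p(x)$ is

$$\hat p_n(x)=\frac{1}{nh_n}\sum_{i=1}^{n}K\left(\frac{\tilde X_{(i-1)\Delta_n}-x}{h_n}\right).$$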
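
A minimal sketch of the interval $I_{\mu,\alpha}$ and of $\hat p_n(x)$ is given below. It assumes a Gaussian kernel, for which the bias factor equals $1$ and $V=1/(2\sqrt\pi)$; the numerical inputs in the usage lines are made-up placeholders, not estimates from the paper.

```python
import math
from statistics import NormalDist

import numpy as np


def gaussian_kernel(u):
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)


def density_estimate(x_lag, x, h):
    """Kernel density estimate p_hat(x) = (1 / (n h)) * sum K((X_lag - x) / h)."""
    x_lag = np.asarray(x_lag, dtype=float)
    return float(np.mean(gaussian_kernel((x_lag - x) / h)) / h)


def drift_confidence_interval(mu_hat, mu_dd_hat, m_hat, p_hat, h, n, delta, alpha=0.05):
    """Bias-corrected normal interval for mu(x), Gaussian-kernel constants assumed."""
    bias_factor = 1.0                          # K_1^1 = K_1^3 = 0 and K_1^2 = 1
    v = 1.0 / (2.0 * math.sqrt(math.pi))       # V = K_2^0 for the Gaussian kernel
    center = mu_hat - 0.5 * h**2 * mu_dd_hat * bias_factor
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * math.sqrt(v * m_hat / p_hat) / math.sqrt(n * delta * h)
    return center - half_width, center + half_width


# Placeholder inputs for illustration only.
p0 = density_estimate(np.zeros(5), x=0.0, h=1.0)
lower, upper = drift_confidence_interval(
    mu_hat=-1.0, mu_dd_hat=0.0, m_hat=0.2, p_hat=1.5, h=0.1, n=1000, delta=0.01
)
```

The interval is centred at the bias-corrected estimate, so a plug-in estimate of $\mu''(x)$ is needed whenever the drift is not locally linear.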

## 3 Monte Carlo Simulation Study

In this section, a simple Monte Carlo simulation experiment is conducted to compare the finite-sample performance of the local linear estimators (2.6) and (2.7), denoted LL, with that of the Nadaraya-Watson estimators constructed in Song (2017), denoted NW. Throughout this section, various lengths of the observation time interval $T$ and sample sizes $n$ with $\Delta_n=T/n$ will be considered. We use the classical Gaussian kernel and the common bandwidth , where denotes the standard deviation of the data. We consider two types of jumps to verify the better finite-sample performance of local linear estimators under various settings. We take the estimator of $\mu(x)$ as an example to show the small-sample performance; similar results can be found for the estimator of $M(x)$.

Example 1 (Finite Activity case). Our experiment is based on the following second-order diffusion model with finite activity jump:

$$\begin{cases} dY_t=X_{t-}\,dt,\\ dX_t=-10X_{t-}\,dt+\sqrt{0.1+0.1X_{t-}^2}\,dW_t+dJ_t, \end{cases} \tag{3.1}$$

where the coefficients of the continuous part are equal to the ones used in Nicolau n and $J_t$ is a compound Poisson jump process with arrival intensity $\lambda$ and jump size $Z_n$, corresponding to Bandi and Nguyen bn .
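
A sample path of model (3.1) can be generated with an Euler scheme on a grid finer than the observation grid. The sketch below is one possible implementation under stated assumptions: the defaults $T=10$, $n=1000$, $\lambda=2$ and jump standard deviation $0.036$ are taken from the figure captions of this section, while the number of sub-steps, the starting value and the function name are choices made for the example.

```python
import numpy as np


def simulate_model_31(T=10.0, n=1000, lam=2.0, jump_sd=0.036,
                      substeps=10, x0=0.0, seed=0):
    """Euler scheme for (3.1) with compound Poisson jumps; returns Y, X, delta."""
    rng = np.random.default_rng(seed)
    delta = T / n                    # observation interval Delta_n
    dt = delta / substeps            # finer simulation step (assumption)
    steps = n * substeps
    x = np.empty(steps + 1)
    y = np.empty(steps + 1)
    x[0], y[0] = x0, 0.0
    for k in range(steps):
        dw = rng.normal(0.0, np.sqrt(dt))
        n_jumps = rng.poisson(lam * dt)
        dj = rng.normal(0.0, jump_sd, size=n_jumps).sum()
        x[k + 1] = x[k] - 10.0 * x[k] * dt + np.sqrt(0.1 + 0.1 * x[k] ** 2) * dw + dj
        y[k + 1] = y[k] + x[k] * dt  # Riemann sum for dY = X dt
    return y[::substeps], x[::substeps], delta


y_obs, x_obs, delta = simulate_model_31(seed=1)
x_proxy = np.diff(y_obs) / delta     # proxy (2.1) built from the observed Y
```

Only `y_obs` would be available in practice; `x_obs` is returned so that the proxy can be compared with the latent path.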

The finite-sample performance of the local linear estimator and the NW estimator of $\mu(x)$ with finite activity jumps is demonstrated in Figure 1. We can observe that the local linear estimator performs a little better than the NW estimator, especially at the boundary points. Table 1 shows the biases of the local linear estimator and the NW estimator at various quantile points of

In addition, 95% Monte Carlo confidence intervals of the estimators of $\mu(x)$ are depicted in Figure 2. They show that the true value of $\mu(x)$ does not fall within the 95% Monte Carlo confidence intervals of the NW estimator at the sparse design boundary points. These findings confirm that the local linear estimator corrects the boundary bias automatically, owing to its simple bias representation as shown in Remark 2.8. Figure 3 gives the QQ plots for the local linear estimator of the drift function with finite activity jumps, which reveal the normality of the local linear estimator in finite samples and confirm the results in Theorem 2.6.

Figure 1: Various Nonparametric Estimators for $\mu(x)=-10x$ with $T=10$, $n=1000$, $\lambda=2$ and jump size $Z_n\sim N(0,0.036^2)$

Figure 2: 95% Monte Carlo Confidence Interval for $\mu(x)=-10x$ with $T=10$, $n=1000$, $\lambda=2$ and jump size $Z_n\sim N(0,0.036^2)$

Figure 3: QQ plot of local linear estimator for $\mu(x)=-10x$ with $T=10$, $n=1000$, $\lambda=2$ and jump size $Z_n\sim N(0,0.036^2)$

Next, we assess the global performance of the local linear estimator and the NW estimator via the root mean square error (RMSE)

$$\mathrm{RMSE}=\sqrt{\frac1m\sum_{k=0}^{m}\{\hat\mu(x_k)-\mu(x_k)\}^2}, \tag{3.2}$$

where $\hat\mu(x_k)$ is the estimator of $\mu(x_k)$ and $\{x_k\}$ are chosen uniformly to cover the range of the sample path of $X_t$. Tables 2, 3 and 4 report the RMSE-LL and RMSE-NW for the drift function with different time spans, sample sizes, jump intensities and jump sizes over 100 replicates.
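
The criterion (3.2) translates directly into code. The short sketch below follows the normalisation of (3.2) as written, dividing by $m$ with grid points $x_0,\dots,x_m$; the grid and the constant estimation error are made up for illustration.

```python
import numpy as np


def rmse(estimates, truth):
    """RMSE of (3.2): grid points x_0..x_m, squared errors summed and divided by m."""
    est = np.asarray(estimates, dtype=float)
    tru = np.asarray(truth, dtype=float)
    m = est.size - 1
    return float(np.sqrt(np.sum((est - tru) ** 2) / m))


# Illustration: a constant error of 0.1 at five grid points (m = 4).
grid = np.linspace(-0.2, 0.2, 5)
truth = -10.0 * grid
value = rmse(truth + 0.1, truth)
```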

We notice that the local linear estimator achieves a smaller RMSE than the NW estimator (reduced by approximately half) under different time spans, sample sizes, jump intensities and jump sizes. This confirms that the local linear estimator possesses the property of bias correction. Table 2 yields two further findings. First, for the same time interval $T$, as the sample size $n$ grows, the performance of the estimators improves because more information is used. Second, for the same sample size $n$, as the time interval $T$ expands, the performance of the estimators deteriorates because more jumps occur in a longer time interval. From Tables 3 and 4, for the same jump size or the same jump arrival intensity, as the sample size grows, the performance of the estimators of $\mu(x)$ also improves, because more jump information is collected for the estimation procedure as $\Delta_n\to0$ when $n\to\infty$. However, for the same sample size $n$, as the amplitude or frequency of jumps increases, the RMSE of the estimators gradually increases.

Example 2 (Infinite Activity case). The infinite activity jump in second-order diffusion model (3.1) is a Variance Gamma (VG) jump process, that is with , where is an independent Gamma process subject to Gamma(t/b,b) with as that in Madan mad . As is known, VG process is an infinite activity jump process with finite variation.
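
A Variance Gamma path can be simulated through its standard representation as a Brownian motion with drift evaluated at a Gamma time change. The sketch below follows the Gamma(t/b, b) parametrisation mentioned above, read as shape `t/b` and scale `b`; the parameter values `b`, `theta` and `sigma` are hypothetical placeholders, since the values used in the paper are not reproduced here.

```python
import numpy as np


def simulate_variance_gamma(T=10.0, n=1000, b=0.2, theta=0.0, sigma=0.1, seed=0):
    """Variance Gamma path on an equally spaced grid via a Gamma time change.

    b, theta, sigma are illustrative values, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    delta = T / n
    # Gamma subordinator increments: shape delta / b, scale b, so E[G_t] = t.
    dg = rng.gamma(shape=delta / b, scale=b, size=n)
    # Brownian motion with drift theta and volatility sigma run on Gamma time.
    dj = theta * dg + sigma * np.sqrt(dg) * rng.normal(size=n)
    path = np.concatenate(([0.0], np.cumsum(dj)))
    return path, dg


vg_path, gamma_increments = simulate_variance_gamma(seed=0)
```

The increments `np.diff(vg_path)` can replace the compound Poisson term `dj` in an Euler scheme for (3.1) to obtain the infinite activity case.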

For the infinite activity case, Figures 4, 5, 6 and Tables 5, 6 lead to findings similar to those in the finite activity case, which confirms the smaller bias and the normality in Theorem 2.6. This also shows that the methodology proposed in this paper is robust to the presence of infinite activity jumps.

Figure 4: Various Nonparametric Estimators for $\mu(x)=-10x$ with $T=10$, $n=1000$ and Variance Gamma jump process $J_t$

Figure 5: 95% Monte Carlo Confidence Interval for $\mu(x)=-10x$ with $T=10$, $n=1000$ and Variance Gamma jump process $J_t$

Figure 6: QQ plot of local linear estimator for $\mu(x)=-10x$ with $T=10$, $n=1000$ and Variance Gamma jump process $J_t$

## 4 Empirical Analysis

In this section, we apply the integrated diffusion with jump to model the return of stock index from Shenzhen Stock Exchange in China under five-minute high frequency data, that is (t = 1 meaning one day), and then apply the local linear estimators to estimate the unknown coefficients in model (1.2).

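We assume that

$$\begin{cases} d\log Y_t=X_t\,dt,\\ dX_t=\mu(X_{t-})\,dt+\sigma(X_{t-})\,dW_t+\int_E c(X_{t-},z)\,r(\omega,dt,dz), \end{cases} \tag{4.1}$$

where $\log Y_t$ is the log integrated process for the stock index or commodity price and $X_t$ is the latent process for the log-returns. According to (2.1), we can obtain the proxy of the latent process, that is, the return of $Y$,

$$\tilde X_{i\Delta_n}=\frac{\log Y_{i\Delta_n}-\log Y_{(i-1)\Delta_n}}{\Delta_n}. \tag{4.2}$$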

The plots of Shenzhen Composite Index and its proxy (4.2) under five-minute frequency data from Jan 5, 2015 to Dec 31, 2015 are depicted in Figures 7 and 8. Figure 8 indicates the existence of jumps.

Through the Augmented Dickey-Fuller test statistic in Table 7, we observe that the null hypothesis of non-stationarity cannot be rejected for the logarithm of the stock index $\log Y_{i\Delta_n}$ at the 5% significance level, but is rejected for the difference sequence proxy $\tilde X_{i\Delta_n}$, which supports the stationarity in Assumption 2. Furthermore, based on the statistic proposed in Aït-Sahalia and Jacod aj , the degree of jump activity is 0.3487, which indicates the existence of infinite activity jumps and supports the validity of model (4.1) with infinite activity jumps for the Shenzhen Composite Index.

Here we use Gaussian kernels and the empirical bandwidth for the estimation procedure. The local linear estimators based on (4.2) for the unknown coefficients in model (4.1) are displayed in Figures 9 and 10. The drift estimator in Figure 9 has a linear shape with a negative slope, which reveals the economic phenomenon of mean reversion. The volatility estimator in Figure 10 has a quadratic form with a positive leading coefficient and a minimum at 0.07, which coincides with the economic phenomenon of the volatility smile. The shapes of the estimated curves for these unknown coefficients coincide with those in Nicolau n2 .

However, there is an interesting phenomenon in the volatility curve of the return series: unlike the symmetric curve in Nicolau n2 , this curve is asymmetric, and the volatility responds at different rates to positive and negative returns. This finding coincides with the conclusion of most empirical analyses that volatility responds asymmetrically to positive perturbations (good news) and negative perturbations (bad news). Statistically, this is due to the stronger clustering of jumps for the Shenzhen stock index in 2015. As Johannes joh pointed out, jumps account for more than half of the conditional variance. Furthermore, as Chen and Sun cs concluded, jump behavior has a significant and asymmetric feedback effect on the expected volatility, and jumps exacerbate the asymmetry of the volatility of the stock index. Using the statistics proposed in Aït-Sahalia and Jacod aj , we observe that jumps occurred at approximately 75.50% of the points where the stock index return exceeds 0.07, and at about 58.90% of the points where the return is below 0.07. In conclusion, different frequencies of jumps at different levels of the stock index return lead to the asymmetric volatility curve.

An economic explanation for this phenomenon is the special situation of the stock market in 2015. The Shenzhen index performed quite well during the first half of 2015, so more shares changed hands in pursuit of high yields, which leads to a steeper slope of the volatility at larger positive values of the log-return increments. However, the Shenzhen index performed very poorly in the second half of the year. Facing this sharp reversal, investors were still immersed in the past prosperity and believed that the stock market crisis was temporary and that the market would rebound after hitting rock bottom. Hence fewer shares changed hands, which leads to a milder slope of the volatility at lower values of the log-return increments. From Figure 11, we can observe that more and larger trading volumes occurred at positive and large return rates. As a result, investors respond differently to negative and positive returns, which leads to the asymmetric volatility curve.

Figure 9: local linear estimator and its 95% confidence bands of the drift coefficient

Figure 10: local linear estimator and its 95% confidence bands of the volatility coefficient

Figure 11: Scatter between the asset return rate and volume

## 5 Appendix

In this section, we first present some technical lemmas and the proofs for the main theorems.

### 5.1 Some Technical Lemmas with Proofs

###### Lemma 1

(Shimizu and Yoshida syo ) Let $Z$ be a $d$-dimensional solution-process to the stochastic differential equation

$$Z_t=Z_0+\int_0^t\mu(Z_{s-})\,ds+\int_0^t\sigma(Z_{s-})\,dW_s+\int_0^t\int_E c(Z_{s-},z)\,r(\omega,ds,dz),$$

where $Z_0$ is a random variable, $\mu$ and $c$ are $d$-dimensional vectors, $\sigma$ is a diagonal matrix, and $W_s$ is a $d$-dimensional vector of independent Brownian motions.

Let $g$ be a -class function whose derivatives up to 2th are of polynomial growth. Assume that the coefficients and are -class functions whose derivatives with respect to up to 2th are of polynomial growth. Under Assumption 6, the following expansion holds

$$E[g(Z_t)\mid\mathcal F_s]=\sum_{j=0}^{l}L^jg(Z_s)\frac{\Delta_n^j}{j!}+R, \tag{5.1}$$

for $t>s$ and $\Delta_n=t-s$, where $R$ is a stochastic function of order $\Delta_n^{l+1}$.

###### Remark 5.1

Consider a particularly important model:

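$$\begin{cases} dY_t=X_{t-}\,dt,\\ dX_t=\mu(X_{t-})\,dt+\sigma(X_{t-})\,dW_t+\int_E c(X_{t-},z)\,r(\omega,dt,dz). \end{cases}$$

As $d=2$, we have

$$Lg(x,y)=x\frac{\partial g}{\partial y}+\mu(x)\frac{\partial g}{\partial x}+\frac12\sigma^2(x)\frac{\partial^2g}{\partial x^2}+\int_E\left\{g(x+c(x,z),y)-g(x,y)-\frac{\partial g}{\partial x}\cdot c(x,z)\right\}f(z)\,dz. \tag{5.2}$$

Based on the operator (5.2), one can obtain equations such as (2.2) and (2.3); see Appendix A in Song, Lin and Wang slw for detailed calculations.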
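
As a small worked illustration of how (5.2) is used (a sketch for intuition, not the full calculation in Song, Lin and Wang slw ), applying $L$ to the coordinate functions $g(x,y)=y$, $g(x,y)=x$ and $g(x,y)=x^2$ gives

$$\begin{aligned} L\,y&=x,\\ L\,x&=\mu(x)+\int_E\{(x+c(x,z))-x-c(x,z)\}f(z)\,dz=\mu(x),\\ L\,x^2&=2x\mu(x)+\sigma^2(x)+\int_E\{(x+c(x,z))^2-x^2-2x\,c(x,z)\}f(z)\,dz=2x\mu(x)+\sigma^2(x)+\int_E c^2(x,z)f(z)\,dz. \end{aligned}$$

The jump integral vanishes for $g=x$ because the jump measure is compensated, and for $g=x^2$ it contributes $\int_E c^2(x,z)f(z)\,dz$, which is the jump term that appears alongside $\sigma^2$ in (2.3).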

###### Lemma 2

(Jacod ja12 ) A sequence of valued variables defined on the filtered probability space is measurable for all Assume there exists a continuous adapted valued process of finite variation and a continuous adapted and increasing process , for any we have

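$$\sup_{0\le s\le t}\left|\sum_{i=1}^{[s/\Delta_n]}E\left[\zeta_{n,i}\mid\mathcal F_{(i-1)\Delta_n}\right]-B_s\right|\stackrel{P}{\longrightarrow}0, \tag{5.3}$$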