Statistical inference for fractional diffusion process with random effects at discrete observations

by El Omari Mohamed, et al.

This paper deals with the problem of inference for a linear fractional diffusion process with random effects in the drift. In particular, we are concerned with the maximum likelihood estimators (MLE) of the random effect parameters. First, we estimate the Hurst parameter H from a single subject. Second, assuming the Hurst index H is known, we derive the MLE and examine their asymptotic behavior as the number of subjects under study becomes large, with normally distributed random effects.



1 Introduction

Parametric and nonparametric estimation in the context of random effects models has recently been investigated by many authors (e.g. [1, 4, 5, 6, 19, 20]). In these models, the noise is represented by a Brownian motion, which is characterized by the independence of its increments. This property does not hold for long-memory phenomena arising in a variety of scientific fields, including hydrology [17], biology [3], medicine [14], economics [9] and traffic networks [24]. As a result, self-similar processes have been used successfully to model data exhibiting long-range dependence. Among the simplest models that display long-range dependence is the fractional Brownian motion (fBm), introduced to the statistics community by Mandelbrot and Van Ness [16]. A normalized fBm $W^H$ with Hurst index $H \in (0,1)$ is a centered Gaussian process with covariance

$$E\left[W^H_t W^H_s\right] = \frac{1}{2}\left(t^{2H} + s^{2H} - |t-s|^{2H}\right), \qquad s, t \ge 0.$$
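As a minimal numerical sketch (not part of the original analysis), the snippet below evaluates the standard covariance of a normalized fBm, $\frac{1}{2}(t^{2H} + s^{2H} - |t-s|^{2H})$, and the correlation between two adjacent unit increments. The function names `fbm_covariance` and `increment_correlation` are illustrative choices. For $H = 1/2$ the covariance reduces to $\min(s,t)$ and the increments are uncorrelated, while for $H > 1/2$ they are positively correlated, which is the long-memory regime.

```python
import numpy as np


def fbm_covariance(times, hurst):
    """Covariance matrix of a normalized fBm observed at the given times."""
    t = np.asarray(times, dtype=float)
    s, u = np.meshgrid(t, t, indexing="ij")
    return 0.5 * (s ** (2 * hurst) + u ** (2 * hurst)
                  - np.abs(s - u) ** (2 * hurst))


def increment_correlation(hurst):
    """Correlation between the increments over [0, 1] and [1, 2].

    Both increments have unit variance, so the correlation equals
    Cov(W_1, W_2 - W_1) = Cov(W_1, W_2) - Var(W_1).
    """
    cov = fbm_covariance([1.0, 2.0], hurst)
    return cov[0, 1] - cov[0, 0]


grid = np.array([0.5, 1.0, 1.5, 2.0])
brownian_cov = fbm_covariance(grid, 0.5)   # should equal min(s, t)
for h in (0.3, 0.5, 0.8):
    print(h, increment_correlation(h))
```

The sign of the printed correlation changes at $H = 1/2$, which is why results established for Brownian noise do not carry over directly.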

In modeling, the problems of statistical estimation of model parameters are of particular importance, so the growing number of papers devoted to statistical methods for equations with fractional noise is not surprising. We cite only a few of them; further references can be found in [18, 22]. In [12] the authors proposed and studied maximum likelihood estimators for the fractional Ornstein-Uhlenbeck process. Related results were obtained by Prakasa [21], where a more general model was considered. In [10] the author proposed a least squares (LS) estimator for the fractional Ornstein-Uhlenbeck process and proved its asymptotic normality. More recently, analogous results were obtained with the same LS approach in [25] for the fractional Vasicek model with long memory.

It is worth mentioning the papers [11, 23], which deal with the whole range of the Hurst parameter $H \in (0,1)$. The other papers cited above treat only the case $H > 1/2$ (which corresponds to long-range dependence); recall that for $H = 1/2$ we get a classical diffusion, treated extensively in the literature [15].
This paper deals with the statistical estimation of population parameters for fractional SDEs with random effects. To our knowledge, this problem has not yet been investigated. Precisely, we consider only fractional diffusion processes of the form



is a random variable relying on parameter

to be estimated, and is a normalized fBm with Hurst parameter to be estimated. We study the additive linear case, , when . The estimators , and of , and , respectively, are constructed and their asymptotic behavior is investigated. Model (1) is simple enough to yield explicit estimators, and it generalizes the model considered in [11]. Moreover, the techniques used here to investigate asymptotic properties are elementary: thanks to the incorporation of the random effects, we avoid Malliavin calculus techniques. This gives us a first reason to choose it. The second reason is that (1) is widely applied in various fields; in fact, the Vasicek model is an example of type (1). The third reason is that estimating the population parameters requires only a few observations per subject, which matches several natural phenomena where repeated measurements are rarely available, if not impossible to obtain. Finally, note that we recently carried out nonparametric estimation for a similar model [8].

The rest of the paper is organized as follows. In Section 2, we introduce the model and some preliminaries about the likelihood function. In Section 3 we derive the parameter estimators and establish their consistency and asymptotic normality. The simulations are presented in Section 4, while Section 5 contains some concluding remarks and gives directions for further research.
Throughout the paper the notations , and mean, respectively, simple convergence, almost sure convergence with respect to the probability measure and convergence in distribution.

2 Model and Preliminary results

Before introducing our estimation techniques, we first state some basic facts about fractional Brownian motion and the likelihood function. Let be a stochastic basis satisfying the usual conditions. The natural filtration of a stochastic process is understood as the -completion of the filtration generated by this process. Let , be independent normalized fractional Brownian motions (fBm) with a common Hurst parameter . Let be independent and identically distributed (i.i.d.) -valued random variables on the common probability space, independent of . Consider subjects with dynamics ruled by the following general linear stochastic differential equations:


where and are supposed to be known in their own spaces. Let the random effects be -measurable with common density , where is some dominating measure on and is an unknown parameter. Set , where is an open set in . Sufficient conditions for the existence and uniqueness of solutions to (2) can be found in [18, p. 197] and references therein.

Let denote the space of real continuous functions defined on endowed with the -field . The -field is associated with the topology of uniform convergence on . We introduce the distribution on of the process . On , denotes the joint distribution of . Let be the marginal distribution of on . Since the subjects are independent (this is inherited from the independence of and ), the distribution of the whole sample on is defined by . Thus the likelihood can be defined as

where and , provided that for some fixed . It is well known that coincides with the distribution of the process defined by:

when $H = 1/2$, since in this case the process is Markovian (e.g. [7]); hence, the Girsanov formula can be applied to get the derivative . When $H \neq 1/2$, the non-Markovian property of the coupled process makes the construction of the likelihood very difficult. But in our case, the process is transformed into a for which the law of coincides with the distribution of a parametric fractional diffusion process .

3 Construction of estimators and their asymptotic properties

Consider the following process


Since and are independent, the process is a Gaussian process. Furthermore, for each , we have and . For each subject , we consider observations , where is a subdivision of . The density of given is expressed as

where and is the common covariance matrix of the subjects , . The log-likelihood of the whole sample is defined as


For a specific distribution (say ), we can solve the integrals given in (5) explicitly. Indeed,


3.1 Estimation of the Hurst parameter

Using data induced by one single subject (without loss of generality, say with ), we may construct a class of estimators of the Hurst index . More precisely, for all and for any filter of order , that is,


Let us consider the following quantities: , where , and , where is the usual gamma function. For the invertibility of the function , we refer to [2, p. 7].

Theorem 3.1.

The following statements hold true, as the number of observations ,



, where


Following Coeurjolly [2], we set , for . From (7) we see that the filter is of order , . Therefore, substituting by , we obtain

Hence, our estimators coincide with estimators based on -variations of the fBm (see [2, Proposition 2]) and the proof is complete. ∎
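To illustrate the filtering idea behind Theorem 3.1, here is a hedged sketch of a simpler, related estimator rather than the estimator of the theorem itself. It compares quadratic variations of the path filtered with the second-order filter (1, -2, 1) and with its dilation (1, 0, -2, 0, 1). By self-similarity the expected ratio is $2^{2H}$, so half its base-2 logarithm estimates $H$. A second-order filter also removes any linear trend on an equally spaced grid. The fBm is simulated by a Cholesky factorization of the fractional Gaussian noise covariance, which is our choice and not necessarily the method of [13]; the sample size, seed, drift slope and function names are assumptions.

```python
import numpy as np


def simulate_fbm(n, hurst, rng):
    """Sample a normalized fBm at times 1/n, 2/n, ..., 1 (exact, O(n^3))."""
    k = np.arange(n)
    # Autocovariance of unit-step fractional Gaussian noise.
    gamma = 0.5 * (np.abs(k + 1) ** (2 * hurst)
                   - 2 * np.abs(k) ** (2 * hurst)
                   + np.abs(k - 1) ** (2 * hurst))
    cov = gamma[np.abs(k[:, None] - k[None, :])]      # Toeplitz matrix
    noise = np.linalg.cholesky(cov) @ rng.standard_normal(n)
    # Cumulate the noise and rescale to the grid step 1/n (self-similarity).
    return np.cumsum(noise) / n ** hurst


def hurst_from_second_differences(path):
    """Two-scale quadratic-variation estimate of H from one path."""
    x = np.asarray(path, dtype=float)
    d1 = x[2:] - 2 * x[1:-1] + x[:-2]     # filter (1, -2, 1)
    d2 = x[4:] - 2 * x[2:-2] + x[:-4]     # dilated filter (1, 0, -2, 0, 1)
    return 0.5 * np.log2(np.mean(d2 ** 2) / np.mean(d1 ** 2))


rng = np.random.default_rng(0)
n, true_hurst = 2000, 0.7                  # hypothetical settings
t = np.arange(1, n + 1) / n
path = simulate_fbm(n, true_hurst, rng)
h_plain = hurst_from_second_differences(path)
# Adding a linear drift (hypothetical slope -2) leaves the estimate unchanged.
h_drift = hurst_from_second_differences(path - 2.0 * t)
print(h_plain, h_drift)
```

Because the drift is filtered out, a single subject suffices for this step and the random effect does not need to be known.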

3.2 Estimation of the population parameter

Now, assume that is known. From the log-likelihood given by (5) and (6), we derive an estimator given by


For the parameter , deriving an estimator appears very difficult. However, we can construct an alternative estimator and study its asymptotic behavior. Observing that is a sample mean drawn from a sequence of i.i.d. random variables, one might think that the sample variance could also be used to estimate . Unfortunately, simple computations show that such a sample variance is not consistent. Thus, as an alternative we propose the following estimator for :

Theorem 3.2.

The estimator is unbiased, and as .


Set . Substituting by , we have so . For the second statement, we consider the random variables defined by


Clearly, are i.i.d. random variables with ; then, by the strong law of large numbers, converges almost surely to as . Setting , we have

Before establishing the bias of the estimator of , we first give the following result:

Lemma 3.3.

where are random variables given by (10).


Substituting by , the independence of and gives

For the last equality we used the same techniques as in the proof of Theorem 3.2. For the second statement, using the random variables defined previously, we have

Theorem 3.4.

The estimator is asymptotically unbiased, and as .


By virtue of Lemma 3.3, we get

Applying the strong law of large numbers and the continuous mapping theorem for almost sure convergence, we get

Similar computations lead to

where . In the last equality we used the fact that is a centered Gaussian random variable with variance . ∎
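The role of the correction term can be sketched under a simplifying assumption that is ours and may differ from model (2): each subject follows $X^i(t_j) = \phi_i t_j + W^{H,i}(t_j)$ with $\phi_i \sim N(\mu, \omega^2)$. Writing $\Sigma_H$ for the fBm covariance matrix at the observation times $t = (t_1, \dots, t_n)$, the per-subject projection $\hat\phi_i = t^\top \Sigma_H^{-1} X^i / t^\top \Sigma_H^{-1} t$ has variance $\omega^2 + (t^\top \Sigma_H^{-1} t)^{-1}$. The plain sample variance of the $\hat\phi_i$ therefore overshoots $\omega^2$ by a fixed amount that does not vanish as the number of subjects grows, and subtracting that amount restores consistency. All numerical values and names below are hypothetical.

```python
import numpy as np


def fbm_covariance(times, hurst):
    """Covariance matrix of a normalized fBm observed at the given times."""
    t = np.asarray(times, dtype=float)
    s, u = np.meshgrid(t, t, indexing="ij")
    return 0.5 * (s ** (2 * hurst) + u ** (2 * hurst)
                  - np.abs(s - u) ** (2 * hurst))


def population_estimates(obs, times, hurst):
    """Return (mean estimate, corrected variance, naive variance).

    obs has shape (N, n): one row of n observations per subject.
    """
    t = np.asarray(times, dtype=float)
    weights = np.linalg.solve(fbm_covariance(t, hurst), t)  # Sigma^{-1} t
    info = t @ weights                                      # t' Sigma^{-1} t
    phi_hat = obs @ weights / info          # per-subject projection
    mu_hat = phi_hat.mean()
    naive_var = np.mean((phi_hat - mu_hat) ** 2)
    return mu_hat, naive_var - 1.0 / info, naive_var


# Simulate the assumed model: X^i(t_j) = phi_i * t_j + W^{H,i}(t_j).
rng = np.random.default_rng(1)
times = np.linspace(0.2, 1.0, 5)
hurst, mu, omega2, n_subjects = 0.7, -2.0, 1.0, 2000
chol = np.linalg.cholesky(fbm_covariance(times, hurst))
phi = mu + np.sqrt(omega2) * rng.standard_normal(n_subjects)
noise = rng.standard_normal((n_subjects, times.size)) @ chol.T
obs = phi[:, None] * times + noise

mu_hat, omega2_hat, naive_var = population_estimates(obs, times, hurst)
print(mu_hat, omega2_hat, naive_var)
```

With only a few observations per subject the correction is not negligible, which is consistent with the remark that the uncorrected sample variance is not a consistent estimator.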


For the case of continuous observation with horizon , we propose the following estimator defined by

It is easy to see that as and is consistent when . We choose this double asymptotic framework because we proceed in two steps: in the first step we estimate the random effects as the horizon increases to infinity, then we use the empirical mean and variance to estimate , where the random effects are replaced by their estimators.

Theorem 3.5.

The estimators and are asymptotically normal, i.e.




Since is the average of i.i.d. random variables with finite mean and finite variance, (11) follows immediately from the central limit theorem. In order to show (12) we consider the following random variables , and set . We see that is a centered Gaussian process, with variance , and . So, using the strong law of large numbers, we have as . Furthermore, the central limit theorem leads to as . Since , Slutsky's theorem yields the convergence in (12). ∎
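The two limits can be written out explicitly under a simplifying assumption that is ours and not necessarily model (2): $X^i(t_j) = \phi_i t_j + W^{H,i}(t_j)$ with $\phi_i \sim N(\mu, \omega^2)$ and known $H$. This is a worked sketch, and its constants need not match (11) and (12) for the general model.

```latex
% Assumed notation: t = (t_1, ..., t_n)^T, \Sigma_H the fBm covariance matrix
% at these times, and \kappa = (t^T \Sigma_H^{-1} t)^{-1}.
\begin{align*}
\hat\phi_i &= \frac{t^\top \Sigma_H^{-1} X^i}{t^\top \Sigma_H^{-1} t}
  \sim N\!\left(\mu,\ \omega^2 + \kappa\right), \qquad i = 1, \dots, N
  \quad \text{(i.i.d.)}, \\
\hat\mu &= \frac{1}{N} \sum_{i=1}^{N} \hat\phi_i, \qquad
\sqrt{N}\,(\hat\mu - \mu) \sim N\!\left(0,\ \omega^2 + \kappa\right)
  \quad \text{for every } N, \\
\hat\omega^2 &= \frac{1}{N} \sum_{i=1}^{N} (\hat\phi_i - \hat\mu)^2 - \kappa
  = \frac{1}{N} \sum_{i=1}^{N} (\hat\phi_i - \mu)^2 - (\hat\mu - \mu)^2 - \kappa, \\
\sqrt{N}\,(\hat\omega^2 - \omega^2)
  &\xrightarrow{\ d\ } N\!\left(0,\ 2(\omega^2 + \kappa)^2\right),
  \quad \text{since } \operatorname{Var}\!\big[(\hat\phi_i - \mu)^2\big]
  = 2(\omega^2 + \kappa)^2
  \text{ and } \sqrt{N}\,(\hat\mu - \mu)^2 \to 0 \text{ in probability.}
\end{align*}
```

The last line uses the same two ingredients as the proof above: the central limit theorem for the squared centered variables and Slutsky's theorem for the remainder term.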

4 Simulations

We implement the two population parameter estimators for the model studied above to show their empirical behavior. We simulate the observed vectors using (4) for two numbers of subjects and with different numbers of observations per subject: , and . The fractional Brownian motions are simulated as in [13]. The experiment is as follows: we set equal to , and . For each case, replications involving samples are obtained by resampling trajectories of .

The averages of the estimators, together with their exact and empirical standard deviations, are reported in Tables 1-3. The tables show that the parameter estimates are generally much closer to their true values as the number of subjects increases. Figures 1-3 display the histograms of the estimators, which reveal convergence toward a limit distribution when is sufficiently large; this confirms what was established above. Looking at Table 1, we see that the estimate of is not really close to the true value when there are very few observations () per subject and ; this was observed every time becomes larger than . In this situation, for real cases where the true value of is not available, it is better to choose as large as possible (), but this leads to a huge computational cost for large values of . Yet, to keep the balance between computational cost and goodness of fit, small values of and sufficiently large values of should be considered.
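A quick arithmetic check of the $1/\sqrt{N}$ rate can be made from Table 1 alone. The sketch below takes the first value in each parenthesis of the rows for the first estimator, assuming this is the exact standard deviation and that the upper block of rows corresponds to the smaller number of subjects. It then compares the upper block with the lower one. Ratios close to $\sqrt{10} \approx 3.162$ would be consistent with the rate in Theorem 3.5 and a tenfold increase in the number of subjects; the variable names are illustrative.

```python
import math

# Exact standard deviations of the first estimator, copied from Table 1
# (upper block of rows, then lower block of rows).
exact_std_upper = [0.1456, 0.1549, 0.1795]
exact_std_lower = [0.0460, 0.0490, 0.0568]

ratios = [u / l for u, l in zip(exact_std_upper, exact_std_lower)]
target = math.sqrt(10)

for r in ratios:
    print(round(r, 4), round(r - target, 4))
```

All three ratios fall within 0.01 of $\sqrt{10}$, so the reported exact standard deviations scale as expected between the two blocks.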

Mean and Std. dev.'s

Mean (Std. dev.)            Mean (Std. dev.)
-1.9902 (0.1456, 0.1430)    0.9744 (0.2099, 0.1942)
-1.9964 (0.1549, 0.1594)    1.0303 (0.2376, 0.2494)
-1.9820 (0.1795, 0.2009)    1.3314 (0.3191, 0.3891)

-2.0009 (0.0460, 0.0441)    0.9964 (0.0670, 0.0689)
-1.9986 (0.0490, 0.0515)    1.0442 (0.0758, 0.0836)
-1.9985 (0.0568, 0.0634)    1.2022 (0.1018, 0.1228)

Table 1: The means with exact (red) and empirical (blue) standard deviations of estimators , based on samples, with true values , , and different values of (). (For interpretation of the references to colour in this table, the reader is referred to the electronic version of this article.)
Figure 1: Frequency histograms of population parameter estimates based on samples for different values of . In each box of the two rows (top and bottom ) histograms of (pink) and (gray) are given for fixed parameters . (For interpretation of the references to colour in the legend of this figure, the reader is referred to the electronic version of this article)
Mean and Std. dev.'s

Mean (Std. dev.)            Mean (Std. dev.)
-2.0050 (0.1449, 0.1427)    0.9713 (0.2077, 0.2075)
-2.0146 (0.1549, 0.1518)    1.0028 (0.2376, 0.2247)
-1.9824 (0.1793, 0.1920)    1.0871 (0.3181, 0.3391)

-2.0057 (0.0458, 0.0434)    1.0005 (0.0663, 0.0671)
-1.9979 (0.0490, 0.0498)    1.0021 (0.0758, 0.0758)
-2.0038 (0.0567, 0.0596)    1.0849 (0.1015, 0.1011)

Table 2: The means with exact (red) and empirical (blue) standard deviations of estimators , based on samples, with true values , , and different values of (). (For interpretation of the references to colour in this table, the reader is referred to the electronic version of this article.)
Figure 2: Frequency histograms of population parameter estimates based on samples for different values of . In each box of the two rows (top