# On Cusp Location Estimation for Perturbed Dynamical Systems

We consider the problem of parameter estimation from a continuously observed trajectory of a diffusion process. We suppose that the drift coefficient has a cusp-type singularity and that the unknown parameter is the location of the cusp. The asymptotic properties of the maximum likelihood estimator and of Bayesian estimators are described in the asymptotics of small noise, i.e., as the diffusion coefficient tends to zero. The consistency, the limit distributions and the convergence of moments of these estimators are established.

11/13/2017


## 1 Introduction

Let us consider the following problem. The observed continuous-time trajectory of a diffusion process satisfies the stochastic differential equation

$$dX_t=S(\vartheta,X_t)\,dt+\varepsilon\,dW_t,\qquad X_0=x_0,\quad 0\le t\le T, \tag{1}$$

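where $W_t$, $0\le t\le T$, is the standard Wiener process and the drift coefficient $S(\vartheta,x)$ has a cusp-type singularity, i.e., in the vicinity of the point $x=\vartheta$ we have $S(\vartheta,x)=a|x-\vartheta|^{\kappa}+h(x)$ with a smooth function $h(\cdot)$, where $\kappa\in(0,1/2)$. The parameter $\vartheta$ is unknown and we have to estimate it from the observations $X^\varepsilon=(X_t,\ 0\le t\le T)$. We are interested in the asymptotic properties of the estimators of this parameter in the asymptotics of small noise: $\varepsilon\to0$.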

Such stochastic models, sometimes called dynamical systems with small noise or perturbed dynamical systems, attract the attention of probabilists and statisticians (see, for example, Freidlin and Wentzell [7], Kutoyants [12] and references therein). The interest in these stochastic models can be explained as follows. Suppose that we have a dynamical system described by the ordinary differential equation

$$\frac{dx_t}{dt}=S(\vartheta,x_t),\qquad x_0,\quad 0\le t\le T. \tag{2}$$

The right-hand side of this system depends on some parameter $\vartheta$, and therefore the state of the dynamical system depends on the value of this parameter, i.e., $x_t=x_t(\vartheta)$. If we know $\vartheta$, then we know the trajectory $x_t(\vartheta)$, $0\le t\le T$. For many real systems it is natural to suppose that the right-hand side contains some small noise (perturbations)

$$\frac{dX_t}{dt}=S(\vartheta,X_t)+\varepsilon\,n_t,\qquad x_0,\quad 0\le t\le T. \tag{3}$$

The most “popular” noise considered in the corresponding literature is the so-called white Gaussian noise (WGN), i.e., $n_t$ is a Gaussian process with the properties $\mathbf{E}n_t=0$, $\mathbf{E}n_tn_s=\delta(t-s)$. Here $\delta(\cdot)$ is the Dirac delta-function. In this case the observations of the system (3) can be written as the solution of the stochastic differential equation (1); that is, we replace $n_t$ by the derivative of the standard Wiener process. Of course, the Wiener process is not differentiable, and equation (1) is just shorthand for the corresponding integral equation

$$X_t=x_0+\int_0^tS(\vartheta,X_s)\,ds+\varepsilon W_t,\qquad 0\le t\le T.$$
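The integral equation above can be approximated on a time grid by the Euler-Maruyama scheme. The following is a minimal sketch, not the author's procedure: the choice $h(x)=1$, the values `a=1.0`, `kappa=0.25`, and the helper names `cusp_drift` and `simulate_path` are assumptions made only for illustration.

```python
import math
import random


def cusp_drift(theta, x, a=1.0, kappa=0.25):
    """Drift S(theta, x) = a*|x - theta|**kappa + h(x), with the assumed h(x) = 1."""
    return a * abs(x - theta) ** kappa + 1.0


def simulate_path(theta, eps, x0=0.0, T=1.0, n_steps=1000, seed=0):
    """Euler-Maruyama approximation of X_t = x0 + int_0^t S(theta, X_s) ds + eps * W_t."""
    rng = random.Random(seed)
    dt = T / n_steps
    path = [x0]
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment over one time step
        x = path[-1]
        path.append(x + cusp_drift(theta, x) * dt + eps * dW)
    return path


# With eps = 0 the scheme reduces to Euler's method for the unperturbed equation (2).
limit_path = simulate_path(theta=0.5, eps=0.0)
noisy_path = simulate_path(theta=0.5, eps=0.05)
```

Because the assumed drift is bounded below by 1, the unperturbed path is strictly increasing and crosses the cusp location `theta=0.5` before time `T=1`.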

A wide class of estimation problems (parameter estimation and nonparametric estimation) was considered in [12]. The properties of estimators (maximum likelihood, Bayesian, minimum distance) are well studied in regular (smooth with respect to the unknown parameter) and non-regular (change point, delay estimation) cases. The smooth case corresponds to a trend coefficient that is continuously differentiable w.r.t. $\vartheta$ and to finite Fisher information. The change-point problem can be described by the following example

 dXt=h(Xt)1I{ϑ

i.e., we have a switching diffusion process with unknown threshold $\vartheta$. Such models are called threshold diffusion processes, by analogy with threshold autoregressive (TAR) time series [1], and the statistical problems related to this model are singular [14]. If we have a cusp-type singularity as

where $\kappa\in(0,1/2)$, then for $\kappa$ close to zero we have cusp-type switching similar to a change point, but without a jump. Usually the characteristics of real systems cannot “make jumps”, and cusp-type switching sometimes fits real systems better.

In the present work we are interested in the properties of these estimators when the trend coefficient has a cusp-type singularity. This case is in some sense intermediate between the regular case and the change-point (discontinuous drift) case. Statistical problems for models with cusp-type singularities have been studied since 1968, when Prakasa Rao [19] described the asymptotic distribution of the MLE in the case of i.i.d. observations whose density function has a cusp-type representation in the vicinity of the point $\vartheta$. It was shown that the suitably normalized estimation error of the MLE converges in distribution to $c\,\hat u$, where $c$ is some constant and the random variable $\hat u$ will be described later. Note that in this case the Fisher information does not exist and the study of estimators requires special techniques. An exhaustive treatment of singular estimation problems (including cusp-type singularity) can be found in Chapter VI of the fundamental work by Ibragimov and Khasminskii [9]. There one can find general results concerning the asymptotic behavior of the MLE and Bayesian estimators in situations including cusp-type singularity. In particular, they described the asymptotic distribution of the MLE and BE and showed that the BE are asymptotically efficient in the minimax sense. For inhomogeneous Poisson processes with intensity functions having a cusp-type singularity, the properties of the MLE and BE were described in [3]. For ergodic diffusion processes with a drift coefficient having a cusp-type singularity, similar results were obtained in [4]. Cusp-type singularities in regression models were treated in [20] and in [6]. For the model of a signal in WGN, where the signal has a cusp-type singularity, such results were obtained in [2]. Note that the case was considered in [8] (ergodic diffusion) and in [10]. A survey of the properties of estimators for different models of stochastic processes with cusp-type singularities can be found in [5].

The method of the study of estimators through the properties of the normalized likelihood ratio developed in the work [9] is in some sense of universal nature. It was applied in the study of estimators for a wide class of models of observations and is applied in the present work too. In particular, we check the conditions of two general theorems (Theorem 1.10.1 and Theorem 1.10.2) in [9] concerning the behavior of estimators.

We show that the MLE and Bayesian estimators are consistent, have different limit distributions

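$$\varepsilon^{-\frac{1}{\kappa+\frac12}}\bigl(\hat\vartheta_\varepsilon-\vartheta\bigr)\Longrightarrow c\,\hat u,\qquad \varepsilon^{-\frac{1}{\kappa+\frac12}}\bigl(\tilde\vartheta_\varepsilon-\vartheta\bigr)\Longrightarrow c\,\tilde u,$$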

with the same constant $c$, that the polynomial moments of these estimators converge, and that the BE are asymptotically efficient. The random variables $\hat u$ and $\tilde u$ are defined in the next section.

## 2 Main result

We suppose that the following condition is fulfilled:
Condition . The drift coefficient

$$S(\vartheta,x)=a|x-\vartheta|^{\kappa}+h(x),$$

where $\kappa\in(0,1/2)$. The function $h(\cdot)$ is bounded, has a continuous bounded derivative w.r.t. $x$: $|h'(x)|\le C$, and is separated from zero (for all $x$). The parameter $\vartheta\in\Theta=(\alpha,\beta)$.

The limit of $X_t$ as $\varepsilon\to0$ is $x_t$, the solution of the deterministic equation

$$\frac{dx_t}{dt}=a|x_t-\vartheta_0|^{\kappa}+h(x_t),\qquad x_0,\quad 0\le t\le T. \tag{4}$$

Note that by this condition we have the estimate

$$|S(\vartheta,x)|\le L\,(1+|x|^{\kappa}) \tag{5}$$

with some $L>0$. Here and in the sequel $\vartheta_0$ denotes the true value. Let us denote

 0

where .

The properties of the maximum likelihood and Bayesian estimators are described with the help of the limit likelihood ratio. Recall that the likelihood ratio in this problem is (see Liptser and Shiryaev [15])

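$$L(\vartheta,X^\varepsilon)=\exp\left\{\int_0^T\frac{S(\vartheta,X_t)}{\varepsilon^2}\,dX_t-\int_0^T\frac{S(\vartheta,X_t)^2}{2\varepsilon^2}\,dt\right\}.$$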

The maximum likelihood estimator (MLE) $\hat\vartheta_\varepsilon$ is defined as a solution of the equation

$$L(\hat\vartheta_\varepsilon,X^\varepsilon)=\sup_{\theta\in\Theta}L(\theta,X^\varepsilon).$$

If this equation has more than one solution, then we can take any one of them as the MLE. Note that we cannot use the maximum likelihood equation

$$\dot L(\theta,X^\varepsilon)=0,\qquad\theta\in\Theta,$$

where the dot means the derivative w.r.t. $\theta$, because the likelihood ratio function is not differentiable.
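Since the likelihood equation is not available, one practical way to approximate the MLE from a discretely sampled path is a direct search over a grid of candidate values. The sketch below is only an illustration under stated assumptions: the integrals in the log-likelihood ratio are replaced by Ito/Riemann sums, the toy drift uses $h(x)=1$, `a=1.0`, `kappa=0.25`, and the names `drift`, `simulate_path`, `log_likelihood` and `grid_mle` are hypothetical helpers rather than anything defined in the text.

```python
import math
import random


def drift(theta, x, a=1.0, kappa=0.25):
    # Assumed toy drift: S(theta, x) = a*|x - theta|**kappa + h(x) with h(x) = 1.
    return a * abs(x - theta) ** kappa + 1.0


def simulate_path(theta, eps, x0=0.0, T=1.0, n_steps=2000, seed=1):
    """Euler-Maruyama path used here as a stand-in for the observations."""
    rng = random.Random(seed)
    dt = T / n_steps
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        path.append(x + drift(theta, x) * dt + eps * rng.gauss(0.0, math.sqrt(dt)))
    return path, dt


def log_likelihood(theta, path, dt, eps):
    """Discretized ln L(theta, X^eps): sums replace the stochastic and ordinary integrals."""
    total = 0.0
    for x, x_next in zip(path, path[1:]):
        s = drift(theta, x)
        total += s * (x_next - x) - 0.5 * s * s * dt
    return total / eps ** 2


def grid_mle(path, dt, eps, grid):
    """Maximize ln L over a finite grid; no derivative in theta is used."""
    return max(grid, key=lambda th: log_likelihood(th, path, dt, eps))


eps, theta_true = 0.01, 0.5
path, dt = simulate_path(theta_true, eps)
grid = [0.2 + 0.01 * i for i in range(61)]  # candidate values in [0.2, 0.8]
theta_hat = grid_mle(path, dt, eps, grid)
```

The grid resolution limits the accuracy of `theta_hat`; a finer grid near the current maximizer would be the natural refinement.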

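The Bayesian estimator (BE) $\tilde\vartheta_\varepsilon$ for the quadratic loss function and prior density $p(\theta)$, $\theta\in\Theta$ (a continuous positive function), is defined by the expression

$$\tilde\vartheta_\varepsilon=\int_\alpha^\beta\theta\,p(\theta|X^\varepsilon)\,d\theta=\frac{\int_\alpha^\beta\theta\,p(\theta)L(\theta,X^\varepsilon)\,d\theta}{\int_\alpha^\beta p(\theta)L(\theta,X^\varepsilon)\,d\theta}.$$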

We take the quadratic loss function for simplicity of exposition. The properties of the likelihood ratio established in this work make it possible to describe the behavior of the BE for an essentially wider class of loss functions (see Theorem 1.10.2 in [9]).
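Numerically, the ratio of integrals defining the BE can be approximated by a ratio of Riemann sums over a grid on $(\alpha,\beta)$. Because $\ln L$ is of order $\varepsilon^{-2}$, exponentiating it directly overflows easily, so the sketch below subtracts the maximal log-weight first. The function name `bayes_estimate` and the toy log-likelihood used to exercise it are assumptions for illustration only; the toy function is not the diffusion likelihood.

```python
import math


def bayes_estimate(grid, log_lik, prior):
    """Posterior mean on a uniform grid: ratio of two Riemann sums (the grid step cancels)."""
    log_w = [log_lik(th) + math.log(prior(th)) for th in grid]
    shift = max(log_w)  # subtract the maximum so that exp() cannot overflow
    w = [math.exp(lw - shift) for lw in log_w]
    return sum(th * wi for th, wi in zip(grid, w)) / sum(w)


# Toy check with a hypothetical log-likelihood sharply peaked at 0.4.
alpha, beta, n = 0.0, 1.0, 1000
grid = [alpha + (beta - alpha) * i / n for i in range(n + 1)]
toy_log_lik = lambda th: -((th - 0.4) ** 2) / (2 * 0.01 ** 2)
uniform_prior = lambda th: 1.0 / (beta - alpha)
theta_tilde = bayes_estimate(grid, toy_log_lik, uniform_prior)
```

In an actual computation `log_lik` would be a discretized version of $\ln L(\theta,X^\varepsilon)$ evaluated on the observed path.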

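The limit behavior of the MLE and BE is described with the help of two random variables $\hat u$ and $\tilde u$ defined as follows. Let us introduce the random function

$$Z(u)=\exp\left\{W^H(u)-\frac{|u|^{2H}}{2}\right\},\qquad u\in\mathbb{R}, \tag{6}$$

and put

$$Z(\hat u)=\sup_{u\in\mathbb{R}}Z(u),\qquad \tilde u=\frac{\int_{\mathbb{R}}u\,Z(u)\,du}{\int_{\mathbb{R}}Z(u)\,du}. \tag{7}$$

Here $W^H(\cdot)$ is a two-sided fractional Brownian motion with Hurst parameter $H=\kappa+\frac12$. The random variable $\hat u$ is well defined [18]. We also need the definitions

$$\Gamma_\vartheta^2=\frac{a^2}{h(\vartheta)}\int_{-\infty}^{\infty}\bigl(|s-1|^{\kappa}-|s|^{\kappa}\bigr)^2ds,\qquad \gamma_\vartheta=\Gamma_\vartheta^{1/H},$$

$$\hat u_{\vartheta_0}=\frac{\hat u}{\gamma_{\vartheta_0}},\qquad \tilde u_{\vartheta_0}=\frac{\tilde u}{\gamma_{\vartheta_0}},\qquad \hat W=\sup_{0\le t\le T}|W_t|.$$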

As usual in such problems, we can introduce the lower minimax bound on the risks of all estimators:

###### Proposition 1

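Let the condition be fulfilled. Then for all $\vartheta_0\in\Theta$ and all estimators $\bar\vartheta_\varepsilon$ we have

$$\lim_{\delta\to0}\liminf_{\varepsilon\to0}\sup_{|\vartheta-\vartheta_0|\le\delta}\varepsilon^{-\frac{4}{2\kappa+1}}\,\mathbf{E}_\vartheta\bigl(\bar\vartheta_\varepsilon-\vartheta\bigr)^2\ge\frac{\mathbf{E}(\tilde u)^2}{\gamma_{\vartheta_0}^2}. \tag{8}$$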

We discuss the proof of this proposition after the proof of Theorem 1 below.

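According to this bound we call an estimator $\vartheta^*_\varepsilon$ asymptotically efficient if for all $\vartheta_0\in\Theta$ we have the equality

$$\lim_{\delta\to0}\lim_{\varepsilon\to0}\sup_{|\vartheta-\vartheta_0|\le\delta}\varepsilon^{-\frac{4}{2\kappa+1}}\,\mathbf{E}_\vartheta\bigl(\vartheta^*_\varepsilon-\vartheta\bigr)^2=\frac{\mathbf{E}(\tilde u)^2}{\gamma_{\vartheta_0}^2}.$$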

The main result of this work is the following theorem.

###### Theorem 1

Let the condition be fulfilled, then the MLE and the BE are uniformly on compacts consistent, have different limit distributions

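$$\varepsilon^{-1/H}\bigl(\hat\vartheta_\varepsilon-\vartheta_0\bigr)\Longrightarrow\hat u_{\vartheta_0},\qquad \varepsilon^{-1/H}\bigl(\tilde\vartheta_\varepsilon-\vartheta_0\bigr)\Longrightarrow\tilde u_{\vartheta_0},$$

the moments converge (uniformly on compacts): for any $p>0$

$$\mathbf{E}_{\vartheta_0}\left|\frac{\hat\vartheta_\varepsilon-\vartheta_0}{\varepsilon^{1/H}}\right|^p\longrightarrow\mathbf{E}_{\vartheta_0}\bigl|\hat u_{\vartheta_0}\bigr|^p,\qquad \mathbf{E}_{\vartheta_0}\left|\frac{\tilde\vartheta_\varepsilon-\vartheta_0}{\varepsilon^{1/H}}\right|^p\longrightarrow\mathbf{E}_{\vartheta_0}\bigl|\tilde u_{\vartheta_0}\bigr|^p,$$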

and the Bayesian estimators are asymptotically efficient.
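To get a feeling for the normalization $\varepsilon^{-1/H}$, recall that $2H=2\kappa+1$, so the exponent is $1/H=2/(2\kappa+1)$. The short sketch below evaluates it for a few values of $\kappa$; the function name `rate_exponent` is a hypothetical helper. For comparison, the exponents 1 and 2 are the normalizations usually quoted for the smooth and change-point small-noise problems; that comparison is background knowledge and not a statement taken from this text.

```python
def rate_exponent(kappa):
    """Exponent 1/H = 2/(2*kappa + 1) in the normalization eps**(-1/H)."""
    if not 0.0 < kappa < 0.5:
        raise ValueError("kappa is assumed to lie in (0, 1/2)")
    return 2.0 / (2.0 * kappa + 1.0)


# The exponent decreases from 2 (kappa near 0) towards 1 (kappa near 1/2).
for kappa in (0.05, 0.25, 0.45):
    print(f"kappa={kappa:.2f}  1/H={rate_exponent(kappa):.4f}")
```

For example, with `kappa=0.25` the estimation error is of order $\varepsilon^{4/3}$.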

Proof. Let us introduce the normalized likelihood ratio

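$$Z_\varepsilon(u)=\frac{L(\vartheta_0+\varepsilon^{1/H}u,X^\varepsilon)}{L(\vartheta_0,X^\varepsilon)},\qquad u\in\mathbb{U}_\varepsilon=\left(\frac{\alpha-\vartheta_0}{\varepsilon^{1/H}},\frac{\beta-\vartheta_0}{\varepsilon^{1/H}}\right).$$

It has the representation

$$Z_\varepsilon(u)=\exp\Biggl\{\int_0^T\frac{S(\vartheta_0+\varepsilon^{1/H}u,X_t)-S(\vartheta_0,X_t)}{\varepsilon}\,dW_t-\int_0^T\frac{\bigl(S(\vartheta_0+\varepsilon^{1/H}u,X_t)-S(\vartheta_0,X_t)\bigr)^2}{2\varepsilon^2}\,dt\Biggr\}.$$

We show below that $Z_\varepsilon(\cdot)$ converges in distribution to the random function $\exp\bigl\{\Gamma_{\vartheta_0}W^H(u)-\frac{\Gamma_{\vartheta_0}^2}{2}|u|^{2H}\bigr\}$, $u\in\mathbb{R}$.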

The first result we are going to prove is the uniform convergence of the random process $X_t$, $0\le t\le T$, to the deterministic solution $x_t$ of the ordinary equation (4). To prove it we need the following estimate.

###### Lemma 1

(N.V. Krylov [11]) Let the conditions be fulfilled, then there exists a constant $L_*>0$ such that with probability 1

$$\sup_{0\le t\le T}|X_t-x_t|\le L_*\bigl(\varepsilon^{\kappa}\hat W^{\kappa}+\varepsilon\hat W\bigr). \tag{9}$$

Proof. Let us denote by $F(x_t)$ the right-hand side of equation (4). Then we can write

$$\frac{dx_t}{F(x_t)}=dt,\qquad\text{and}\qquad\int_{x_0}^{x_t}\frac{dy}{F(y)}=t. \tag{10}$$

If we put $Y_t=X_t-\varepsilon W_t$, then the equation

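$$dX_t=a|X_t-\vartheta_0|^{\kappa}\,dt+h(X_t)\,dt+\varepsilon\,dW_t,\qquad X_0=x_0$$

can be written as

$$dY_t=a|Y_t-\vartheta_0+\varepsilon W_t|^{\kappa}\,dt+h(Y_t+\varepsilon W_t)\,dt,\qquad Y_0=x_0,$$

or

$$\frac{dY_t}{dt}=a|Y_t-\vartheta_0+\varepsilon W_t|^{\kappa}+h(Y_t+\varepsilon W_t),\qquad Y_0=x_0.$$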

Using the smoothness of $h(\cdot)$ and the elementary inequalities

$$|a+b|^{\kappa}\le|a|^{\kappa}+|b|^{\kappa},\qquad|a+b|^{\kappa}\ge|a|^{\kappa}-|b|^{\kappa},$$

we write two estimates

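$$\begin{aligned}
\frac{dY_t}{dt}&\le a|Y_t-\vartheta_0|^{\kappa}+h(Y_t)+\varepsilon^{\kappa}|W_t|^{\kappa}+\varepsilon C|W_t|,\\
\frac{dY_t}{dt}&\ge a|Y_t-\vartheta_0|^{\kappa}+h(Y_t)-\varepsilon^{\kappa}|W_t|^{\kappa}-\varepsilon C|W_t|.
\end{aligned}$$

Hence we have

$$\begin{aligned}
\frac{dY_t}{dt}&\le F(Y_t)+\varepsilon^{\kappa}\hat W^{\kappa}+\varepsilon C\hat W,\\
\frac{dY_t}{dt}&\ge F(Y_t)-\varepsilon^{\kappa}\hat W^{\kappa}-\varepsilon C\hat W
\end{aligned}$$

and (recall that $F(y)\ge b>0$)

$$\begin{aligned}
\int_{x_0}^{Y_t}\frac{dy}{F(y)}&\le t+b^{-1}T\varepsilon^{\kappa}\hat W^{\kappa}+b^{-1}\varepsilon CT\hat W,\\
\int_{x_0}^{Y_t}\frac{dy}{F(y)}&\ge t-b^{-1}T\varepsilon^{\kappa}\hat W^{\kappa}-b^{-1}\varepsilon CT\hat W.
\end{aligned}$$

The equality (10) allows us to write

$$\begin{aligned}
\int_{x_t}^{Y_t}\frac{dy}{F(y)}&\le b^{-1}T\varepsilon^{\kappa}\hat W^{\kappa}+b^{-1}\varepsilon CT\hat W,\\
\int_{x_t}^{Y_t}\frac{dy}{F(y)}&\ge-b^{-1}T\varepsilon^{\kappa}\hat W^{\kappa}-b^{-1}\varepsilon CT\hat W.
\end{aligned}$$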

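As the function $F(\cdot)$ is continuous we have

$$-b^{-1}T\varepsilon^{\kappa}\hat W^{\kappa}-b^{-1}\varepsilon CT\hat W\le\frac{Y_t-x_t}{F(\tilde y)}\le b^{-1}T\varepsilon^{\kappa}\hat W^{\kappa}+b^{-1}\varepsilon CT\hat W,$$

where $\tilde y$ is a point between $x_t$ and $Y_t$. Hence

$$\left|\frac{Y_t-x_t}{F(\tilde y)}\right|\le b^{-1}T\varepsilon^{\kappa}\hat W^{\kappa}+b^{-1}\varepsilon CT\hat W.$$

Recall that $F(\cdot)$ is bounded and separated from zero by a positive constant which does not depend on $\vartheta_0$. Further, there exist constants $c_1>0$ and $c>0$ such that

$$\left|\frac{Y_t-x_t}{F(\tilde y)}\right|\ge c_1|X_t-x_t-\varepsilon W_t|\ge c_1|X_t-x_t|-c\,\varepsilon\hat W.$$

Therefore

$$|X_t-x_t|\le L_*\bigl(\varepsilon^{\kappa}\hat W^{\kappa}+\varepsilon\hat W\bigr),$$

where one can take the constant $L_*=c_1^{-1}\max\bigl(b^{-1}T,\ b^{-1}CT+c\bigr)$.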

###### Lemma 2

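Let the condition be fulfilled, then for any $\kappa_1\in(0,\kappa)$ there exist constants $c_*>0$ and $\nu>0$ such that

$$\sup_{\vartheta\in\Theta}\mathbf{P}_\vartheta\Bigl\{\sup_{0\le t\le T}|X_t-x_t|>\varepsilon^{\kappa_1}\Bigr\}\le e^{-c_*\varepsilon^{-\nu}} \tag{11}$$

for all $\varepsilon\in(0,\varepsilon_0]$ with some $\varepsilon_0>0$.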

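Proof. Recall that for any $N>0$

$$\mathbf{P}\{\hat W>N\}=\mathbf{P}\Bigl\{\sup_{0\le t\le T}|W_t|>N\Bigr\}\le4\,\mathbf{P}\{W_T>N\}\le\frac{4}{N}\sqrt{\frac{T}{2\pi}}\,e^{-\frac{N^2}{2T}}.$$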
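The last inequality in this chain is the classical Gaussian tail estimate, and it can be checked numerically. In the sketch below, the quantity $4\,\mathbf{P}\{W_T>N\}$ is evaluated exactly through the complementary error function and compared with the closed-form bound; the function names `reflection_bound` and `closed_form_bound` are hypothetical and chosen only for this illustration.

```python
import math


def reflection_bound(N, T):
    """4 * P{W_T > N} for W_T ~ N(0, T), computed exactly via erfc."""
    return 4.0 * 0.5 * math.erfc(N / math.sqrt(2.0 * T))


def closed_form_bound(N, T):
    """(4 / N) * sqrt(T / (2*pi)) * exp(-N**2 / (2*T))."""
    return (4.0 / N) * math.sqrt(T / (2.0 * math.pi)) * math.exp(-N * N / (2.0 * T))


for N in (0.5, 1.0, 2.0, 5.0):
    exact, bound = reflection_bound(N, 1.0), closed_form_bound(N, 1.0)
    print(f"N={N}: 4P(W_T>N)={exact:.3e} <= bound={bound:.3e}")
```

The two quantities become close for large `N`, which is why the closed-form expression is convenient for the exponential estimates that follow.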

Hence we can write

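$$\begin{aligned}
\mathbf{P}_{\vartheta_0}\Bigl\{\sup_{0\le t\le T}|X_t-x_t|>\varepsilon^{\kappa_1}\Bigr\}
&\le\mathbf{P}\bigl\{L_*\bigl(\varepsilon^{\kappa}\hat W^{\kappa}+\varepsilon\hat W\bigr)>\varepsilon^{\kappa_1}\bigr\}\\
&=\mathbf{P}\bigl\{\hat W^{\kappa}+\varepsilon^{1-\kappa}\hat W>L_*^{-1}\varepsilon^{\kappa_1-\kappa}\bigr\}\\
&\le\mathbf{P}\bigl\{2\hat W^{\kappa}>L_*^{-1}\varepsilon^{\kappa_1-\kappa}\bigr\}+\mathbf{P}\bigl\{\varepsilon^{1-\kappa}\hat W>\hat W^{\kappa}\bigr\}\\
&\le\mathbf{P}\bigl\{\hat W>(2L_*)^{-1/\kappa}\varepsilon^{\frac{\kappa_1-\kappa}{\kappa}}\bigr\}+\mathbf{P}\bigl\{\hat W>\varepsilon^{-1}\bigr\}\\
&\le4\,\varepsilon^{\frac{\kappa-\kappa_1}{\kappa}}(2L_*)^{\frac{1}{\kappa}}\sqrt{\frac{T}{2\pi}}\exp\left\{-\frac{\varepsilon^{-\frac{2(\kappa-\kappa_1)}{\kappa}}}{2T(2L_*)^{\frac{2}{\kappa}}}\right\}+4\,\varepsilon\sqrt{\frac{T}{2\pi}}\,e^{-\frac{\varepsilon^{-2}}{2T}}.
\end{aligned}$$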

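The last expression allows us to take $\varepsilon_0>0$ such that for all $\varepsilon\in(0,\varepsilon_0]$ we have the estimate (11) with, for example, $\nu=2(\kappa-\kappa_1)/\kappa$ and a suitable constant $c_*>0$.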

###### Lemma 3

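Let the condition be fulfilled. Then the finite-dimensional distributions of the stochastic process $Z_\varepsilon(u)$ converge to the finite-dimensional distributions of $\exp\bigl\{\Gamma_{\vartheta_0}W^H(u)-\frac{\Gamma_{\vartheta_0}^2}{2}|u|^{2H}\bigr\}$, and this convergence is uniform in $\vartheta_0$ on compacts $\mathbb{K}\subset\Theta$.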

Proof. Consider the stochastic integral

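$$\begin{aligned}
I_\varepsilon(u,X^0)&=\frac{1}{\varepsilon}\int_0^T\bigl(S(\vartheta_0+\varepsilon^{1/H}u,x_t)-S(\vartheta_0,x_t)\bigr)\,dW_t\\
&=\frac{a}{\varepsilon}\int_0^T\bigl(|x_t-\vartheta_0-\varepsilon^{1/H}u|^{\kappa}-|x_t-\vartheta_0|^{\kappa}\bigr)\,dW_t.
\end{aligned}$$

Note that $I_\varepsilon(u,X^0)$ is a Gaussian process. By the condition, the solution $x_t$ is a strictly increasing function. Therefore we can define $t=t(x)$ by the relation

$$t=\int_{x_0}^{x}\frac{dy}{S(\vartheta_0,y)},\qquad x\in[x_0,x_T].$$

This provides us the equality ($x_1<x_2$)

$$\mathbf{E}_{\vartheta_0}\bigl(W_{t(x_1)}-W_{t(x_2)}\bigr)^2=\int_{x_1}^{x_2}\frac{dy}{S(\vartheta_0,y)}.$$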

Hence if we put

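$$w(x)=\int_{x_0}^{x}\sqrt{S(\vartheta_0,y)}\,dW_{t(y)},\qquad x_0\le x\le x_T,$$

then $w(x)$ is a Gaussian process with independent increments,

$$\mathbf{E}_{\vartheta_0}\bigl(w(x_1)-w(x_2)\bigr)^2=x_2-x_1,$$

and

$$\int_0^T\bigl(|x_t-\vartheta_0-\varepsilon^{1/H}u|^{\kappa}-|x_t-\vartheta_0|^{\kappa}\bigr)\,dW_t=\int_{x_0}^{x_T}\frac{|x-\vartheta_0-\varepsilon^{1/H}u|^{\kappa}-|x-\vartheta_0|^{\kappa}}{\sqrt{S(\vartheta_0,x)}}\,dw(x).$$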

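Further, let us change the variables $x=\vartheta_0+s\varepsilon^{1/H}$. Then

$$\begin{aligned}
I_\varepsilon(u,X^0)&=a\int_{\frac{x_0-\vartheta_0}{\varepsilon^{1/H}}}^{\frac{x_T-\vartheta_0}{\varepsilon^{1/H}}}\frac{|s-u|^{\kappa}-|s|^{\kappa}}{\sqrt{S(\vartheta_0,\vartheta_0+s\varepsilon^{1/H})}}\,dW(s)\\
&=\frac{a}{\sqrt{h(\vartheta_0)}}\int_{\frac{x_0-\vartheta_0}{\varepsilon^{1/H}}}^{\frac{x_T-\vartheta_0}{\varepsilon^{1/H}}}\bigl(|s-u|^{\kappa}-|s|^{\kappa}\bigr)\,dW(s)\,(1+o(1))
\end{aligned}$$

with the corresponding two-sided Wiener process

$$W(s)=W_1(s)\mathbf{1}_{\{s\ge0\}}+W_2(-s)\mathbf{1}_{\{s\le0\}},\qquad s\in\left[\frac{x_0-\vartheta_0}{\varepsilon^{1/H}},\frac{x_T-\vartheta_0}{\varepsilon^{1/H}}\right].$$

Here $W_1(\cdot)$ and $W_2(\cdot)$ are two independent standard Wiener processes. We used here the relation

$$S(\vartheta_0,\vartheta_0+s\varepsilon^{1/H}u)=a\,\varepsilon^{\kappa/H}|su|^{\kappa}+h(\vartheta_0+s\varepsilon^{1/H}u)=h(\vartheta_0)+o(1).$$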

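Therefore for any fixed value $u$ we have the following representation of the limit process

$$I_0(u)=\frac{a}{\sqrt{h(\vartheta_0)}}\int_{-\infty}^{\infty}\bigl(|s-u|^{\kappa}-|s|^{\kappa}\bigr)\,dW(s).$$

It has the following properties: $\mathbf{E}I_0(u)=0$ and

$$\mathbf{E}I_0(u)^2=\frac{a^2}{h(\vartheta_0)}\int_{-\infty}^{\infty}\bigl(|s-u|^{\kappa}-|s|^{\kappa}\bigr)^2ds=|u|^{2\kappa+1}\,\Gamma_{\vartheta_0}^2.$$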
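The power $|u|^{2\kappa+1}$ in the last equality comes from a change of variables. A short worked version of this step, assuming $u\neq0$, is given below; the remark on convergence of the integral is a standard observation added here for completeness.

```latex
% Substitute s = u v with u \neq 0; after orienting the limits, ds = |u| dv.
\int_{-\infty}^{\infty}\bigl(|s-u|^{\kappa}-|s|^{\kappa}\bigr)^{2}\,ds
  = \int_{-\infty}^{\infty}\bigl(|u|^{\kappa}|v-1|^{\kappa}-|u|^{\kappa}|v|^{\kappa}\bigr)^{2}\,|u|\,dv
  = |u|^{2\kappa+1}\int_{-\infty}^{\infty}\bigl(|v-1|^{\kappa}-|v|^{\kappa}\bigr)^{2}\,dv .
% As |v| \to \infty the integrand behaves like \kappa^{2}|v|^{2\kappa-2},
% which is integrable at infinity exactly when \kappa < 1/2.
```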

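The process

$$W^H(u)=\frac{a}{\sqrt{h(\vartheta_0)}\,\Gamma_{\vartheta_0}}\int_{-\infty}^{\infty}\bigl(|s-u|^{\kappa}-|s|^{\kappa}\bigr)\,dW(s),\qquad u\in\mathbb{R},$$

is known as a representation of the two-sided fractional Brownian motion, because $W^H(u)$ is a Gaussian process with the properties

$$\mathbf{E}W^H(u)=0,\qquad\mathbf{E}\bigl[W^H(u)\bigr]^2=|u|^{2\kappa+1}=|u|^{2H}.$$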
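Together with the stationarity of increments, which is a standard property of fractional Brownian motion and is assumed here rather than derived in the text, the variance $|u|^{2H}$ determines the whole covariance function. The sketch below writes it out; the function name `fbm_cov` is a hypothetical helper.

```python
def fbm_cov(u, v, hurst):
    """Covariance E[W^H(u) W^H(v)] of a two-sided fractional Brownian motion.

    Follows from Var(W^H(u)) = |u|**(2H) and stationary increments (assumed).
    """
    p = 2.0 * hurst
    return 0.5 * (abs(u) ** p + abs(v) ** p - abs(u - v) ** p)


kappa = 0.25           # illustrative value of the cusp exponent
hurst = kappa + 0.5    # H = kappa + 1/2, so that |u|**(2*kappa + 1) = |u|**(2*H)
variance_at_2 = fbm_cov(2.0, 2.0, hurst)  # equals 2**1.5
```

For `hurst=0.5` the formula reduces to the covariance of the two-sided Wiener process: `min(u, v)` for points on the same side of zero and zero for points on opposite sides.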

Hence using the standard arguments we obtain the convergence of the finite-dimensional distributions

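$$\bigl(I_\varepsilon(u_1,X^0),\ldots,I_\varepsilon(u_k,X^0)\bigr)\Longrightarrow\bigl(I_0(u_1),\ldots,I_0(u_k)\bigr),$$

and this convergence is uniform in $\vartheta_0$ on compacts $\mathbb{K}\subset\Theta$.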

Let us consider the ordinary integral

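$$\begin{aligned}
J_\varepsilon(u,X^\varepsilon)&=\int_0^T\left(\frac{S(\vartheta_0+\varepsilon^{1/H}u,X_t)-S(\vartheta_0,X_t)}{\varepsilon}\right)^2dt\\
&=\frac{a^2}{\varepsilon^2}\int_0^T\bigl(|X_t-\vartheta_0-\varepsilon^{1/H}u|^{\kappa}-|X_t-\vartheta_0|^{\kappa}\bigr)^2dt.
\end{aligned}$$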

If we show the convergence in probability

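$$J_\varepsilon(u,X^\varepsilon)-J_\varepsilon(u,X^0)\longrightarrow0, \tag{12}$$

then we obtain the convergence

$$I_\varepsilon(u,X^\varepsilon)\Longrightarrow I_0(u)=\Gamma_{\vartheta_0}W^H(u).$$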

We can write

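$$J_\varepsilon(u,X^\varepsilon)-J_\varepsilon(u,X^0)=\frac{a^2}{\varepsilon^2}\int_0^T\bigl(\Delta(u,X_t)^2-\Delta(u,x_t)^2\bigr)\,dt,$$

where we denoted

$$\Delta(u,X_t)=|X_t-\vartheta_0-\varepsilon^{1/H}u|^{\kappa}-|X_t-\vartheta_0|^{\kappa}.$$

Let us denote by $\ell_\varepsilon(x)$ the normalized local time of the diffusion process and recall that for any function $g(\cdot)$ we have the occupation time formula

$$\int_0^Tg(X_t)\,dt=\int_{-\infty}^{\infty}g(x)\,\ell_\varepsilon(x)\,dx. \tag{13}$$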

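Moreover, according to (9), we know that

$$\int_0^Tg(X_t)\,dt\longrightarrow\int_0^Tg(x_t)\,dt=\int_0^T\frac{g(x_t)}{S(\vartheta_0,x_t)}\,dx_t=\int_{x_0}^{x_T}\frac{g(x)}{S(\vartheta_0,x)}\,dx=\int_{-\infty}^{\infty}g(x)\,\ell_0(x)\,dx, \tag{14}$$

where we denoted $\ell_0(x)=S(\vartheta_0,x)^{-1}\mathbf{1}_{\{x_0\le x\le x_T\}}$. Hence for any continuous function $g(\cdot)$ we have the convergence

$$\int_{-\infty}^{\infty}g(x)\,\ell_\varepsilon(x)\,dx\longrightarrow\int_{x_0}^{x_T}g(x)\,\ell_0(x)\,dx,\qquad\int_{-\infty}^{\infty}g(x)\,\mathbf{E}_{\vartheta_0}\ell_\varepsilon(x)\,dx\longrightarrow\int_{x_0}^{x_T}g(x)\,\ell_0(x)\,dx$$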

(see details in [13]). For example, for any small and

 Eϑ0∫T01I{y−δ

We can write

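$$\begin{aligned}
J_\varepsilon(u,X^\varepsilon)&=a^2\int_{-\infty}^{\infty}\bigl(|v-u|^{\kappa}-|v|^{\kappa}\bigr)^2\,\ell_\varepsilon(\vartheta_0+v\varepsilon^{1/H})\,dv\\
&\longrightarrow a^2\ell_0(\vartheta_0)\int_{-\infty}^{\infty}\bigl(|v-u|^{\kappa}-|v|^{\kappa}\bigr)^2dv\\
&=a^2\ell_0(\vartheta_0)\,|u|^{2\kappa+1}\int_{-\infty}^{\infty}\bigl(|s-1|^{\kappa}-|s|^{\kappa}\bigr)^2ds=\Gamma_{\vartheta_0}^2|u|^{2\kappa+1},
\end{aligned}$$

where we put $x=\vartheta_0+v\varepsilon^{1/H}$ and $v=su$.

For $J_\varepsilon(u,X^0)$ we have similar relations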

 Jε(u,X0) =a2ε2∫T0(∣∣xt−ϑ0−ε1/Hu∣∣κ−|xt−ϑ0|κ)2dt =a2ε2∫xTx0(∣∣x−ϑ0−ε1/Hu∣∣κ−|x−ϑ0|κ)2S(ϑ0,x)dx =a2ε2ε2κ+1H∫xT−ϑ0ε1/Hx0−ϑ0ε1/H(|v−u|κ−|v|κ)