# Drift Estimation for Stochastic Reaction-Diffusion Systems

A parameter estimation problem for a class of semilinear stochastic evolution equations is considered. Conditions for consistency and asymptotic normality are given in terms of growth and continuity properties of the nonlinear part. Emphasis is put on the case of stochastic reaction-diffusion systems. Robustness results for statistical inference under model uncertainty are provided.

## Authors

• 4 publications
• 4 publications
05/15/2020

### Diffusivity Estimation for Activator-Inhibitor Models: Theory and Application to Intracellular Dynamics of the Actin Cytoskeleton

A theory for diffusivity estimation for spatially extended activator-inh...
04/30/2020

### Parameter estimation for semilinear SPDEs from local measurements

This work contributes to the limited literature on estimating the diffus...
01/08/2016

### Cox process representation and inference for stochastic reaction-diffusion processes

Complex behaviour in many systems arises from the stochastic interaction...
07/17/2015

### Classification of Complex Wishart Matrices with a Diffusion-Reaction System guided by Stochastic Distances

We propose a new method for PolSAR (Polarimetric Synthetic Aperture Rada...
02/13/2020

### M-estimation in a diffusion model with application to biosensor transdermal blood alcohol monitoring

With the goal of well-founded statistical inference on an individual's b...
10/12/2021

### Optimal rate of convergence for approximations of SPDEs with non-regular drift

A fully discrete finite difference scheme for stochastic reaction-diffus...
02/22/2019

### AReS and MaRS - Adversarial and MMD-Minimizing Regression for SDEs

Stochastic differential equations are an important modeling class in man...
##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## 1. The Model

### 1.1. General Form of the Equation

Throughout this work we fix a final time . Let be a Hilbert space with inner product . Let be some negative definite self-adjoint operator on with compact resolvent and domain . We write . The general model we are interested in is given by the following equation in :

 (3) dXt=(θAXt+F(t,Xt))dt+BdWt,

together with initial condition . Here, is a (possibly nonlinear) measurable operator, is a cylindrical Wiener process on , and is of Hilbert-Schmidt type. As we need weak solutions only, the stochastic basis as well as the cylindrical Wiener process need not to be determined in advance. The number is the unknown parameter to be estimated.

For simplicity, we restrict ourselves to the case . For later use, we introduce some notations. Let

be an ONB of eigenvectors of

such that the corresponding eigenvalues (taking into account multiplicity)

are ordered increasingly. For , the projection onto the span of is called . The Sobolev norms on the spaces will be denoted by . The following Poincaré-type inequalities hold for :

 |PNx|ρ2 ≤ λρ2−ρ1N|PNx|ρ1, |x−PNx|ρ1 ≤ λρ1−ρ2N+1|x−PNx|ρ2.

For our analysis, the regularity spaces

 (4) R(ρ):=C([0,T];D((−A)ρ))∩L2([0,T];D((−A)ρ+12))

will be crucial. Let . We say that (3) has a weak solution111More precisely, this solution is weak in the probabilistic sense as well as in the sense of usual PDE theory. in on if there is a stochastic basis together with a cylindrical Wiener process on and an -adapted process such that

 (5) Xt=X0+∫t0(θAXs+F(s,Xs))ds+∫t0BdWs

a.s. for . We say that ”is” a weak solution to (3) if a stochastic basis and a cylindrical Wiener process can be found such that (5) holds. We need the following class of assumptions, parametrized by :

1. The observed process is a weak solution to (3) on with a.s.

Sufficient conditions for the existence of solutions to (3) can be derived e.g. with the help of [14] or [20]. For , the projected process satisfies

 (6) dXNt=(θAXNt+PNF(t,Xt))dt+PNBdWt.

Throughout this work we assume that the eigenvalues of have polynomial growth, i.e. there exist such that

 (7) λk≍Λkr.

In particular, . Here, denotes asymptotic equivalence of two sequences of positive numbers , in the sense that . Similarly, means for a constant independent of .

### 1.2. Statistical Inference

We describe three estimators for (see [3]), which correspond to different levels of knowledge about the solution trajectory . All estimators depend on a contrast parameter .

1. Given continuous-time observation of the full solution , the heuristic derivation of the maximum likelihood estimator (see [3]) yields the following term:

 (8) ^θfullN:=−∫T0⟨(−A)1+2αXNt,dXNt⟩∫T0|(−A)1+αXNt|2Hdt+biasN(X),

where

 (9) biasN(U):=∫T0V⟨(−A)1+2αXNt,PNF(t,Ut)⟩V∗dt∫T0|(−A)1+αXNt|2Hdt.

This estimator depends on the whole of via the bias term. Note that for this is precisely the estimator given in (2).

2. Assume we observe just the projected solution . In this case, we need to replace the term by and consider the estimator:

 (10) ^θpartialN:=−∫T0⟨(−A)1+2αXNt,dXNt⟩∫T0|(−A)1+αXNt|2Hdt+biasN(XN).
3. In any of the preceding observation schemes, we may leave out the nonlinear term completely:

 (11) ^θlinearN:=−∫T0⟨(−A)1+2αXNt,dXNt⟩∫T0|(−A)1+αXNt|2Hdt

For notational convenience, we suppress the dependence on of all estimators.

### 1.3. The Main Result

In order to state the main theorem of this paper, let us introduce some further conditions on the nonlinearity , indexed by :

1. There is , an integrable function and a continuous function such that

 (12) |F(t,v)|2ρ−12+ϵρ≤(fρ(t)+C|v|2ρ+12)gρ(|v|ρ)

for any and .

Equivalently, we may choose to be just locally bounded, because in this case there is a continuous with . We call the excess regularity of .222Of course, the choice of is not unique.

1. There is , and a continuous function such that

 (13) |F(t,u)−F(t,v)|2ρ−12≤hρ(|u|ρ,|v|ρ)|u−v|Bρρ+12−δρ

for and .

Condition is sufficient to carry out a perturbation argument with respect to the linear case. Condition is sufficient to formalize the intuition that should not be worse than , given that the nonlinear behavior is taken into account at least partially in the bias term.

The regularity must be chosen maximally, i.e. is the maximal value such that holds with probability one. Under the standing assumption , we have the following result:

###### Theorem 1.

Assume and hold with maximal . Let .

1. The estimators , , are consistent for .

2. is asymptotically normal. More precisely,

 (14) Nr+12(^θfullN−θ)→N(0,2θ(r(2α−2γ+1)+1)2TΛ(r(4α−4γ+1)+1))

in distribution as .

3. Assume holds with parameters and . If , then (14) holds with replaced by . Otherwise, converges to with rate bounded from below by .

4. If , where is as in , then (14) holds with replaced by either or . Otherwise, the rate of convergence of and to is bounded from below by .

###### Remark.
• If is a solution to the two-dimensional stochastic Navier-Stokes equations with additive noise and periodic or Dirichlet boundary conditions, we reobtain the results from [3].

• Note that the convergence rate and the asymptotic variance do not depend on properties of

. In this regard, our results are compatible with previous results on linear (see e.g. [7], [17]) for .

• While the conditions and are natural conditions satisfied by a big class of examples, we do not claim that they are necessary for the conclusions of Theorem 1 to hold. Indeed, if and belong to a class of linear differential operators, [7] and subsequent works prove that an estimator of the type is consistent and asymptotically normal for if and only if

 (15) order(A)≥12(order(θA+F)−d),

or equivalently, , where is the dimension of state space. In particular, the degree of may exceed the degree of .

• Elementary considerations show that the asymptotic variance in (14) is minimal for , whereas the convergence rate is not affected by the choice of . In the ideal setting of full information that we study in this work, it is possible to reconstruct (and hence also the regularity ) from the observed trajectory via its quadratic variation process, so we may set right from the beginning. In the case , this corresponds to the true maximum likelihood estimator. In the case of incomplete information on , for example time-discrete observations, which will be studied in future work, the parameter can be used to ensure the divergence of the denominator of the estimators (whose expected value corresponds to the Fisher information).

• Note that the asymptotic variance depends itself on the unknown parameter

. This means that in order to construct confidence intervals it is necessary to modify

(14) in a suitable way. This can be done by means of a variance-stabilizing transform (see e.g. [23, Section 3.2]). Alternatively, Slutsky’s lemma can be used together with any of the consistent estimators for , e.g.

 (16) Nr+12√^θfullN(^θfullN−θ)→N(0,2(r(2α−2γ+1)+1)2TΛ(r(4α−4γ+1)+1)).
• In general, it is not to be expected that holds, whereas with is valid for a broad class of examples.

• Instead of a.s. we may just assume a.s. for any , where is maximal with this property. In this case, the results are still true up to minor technical modifications. For instance, we have to assume instead of that holds for some with in order to to ensure additional regularity for the nonlinear part. The proof of Theorem 1 remains valid up to obvious notational changes.

• It is possible to allow for -dependent nonlinearities . In this case, it suffices to assume that and hold almost surely in such a way that and are deterministic, while , and are allowed to depend on . In particular, it is possible to extend the result to solutions of non-Markovian functional SDEs whose nonlinearity depends on the whole solution trajectory .

## 2. Applications

We now illustrate the general theory by means of some examples. More precisely, we show that and/or hold. We write whenever the nonlinearity in these examples does not depend on time explicitly.

### 2.1. The Linear Case

For completeness, we restate the result for the purely linear case . In this case we can drop condition . All estimators coincide, i.e. , and Theorem 1 reads as follows:

###### Corollary 2.

If , then

 (17) Nr+12(^θfullN−θ)→N(0,2θ(r(2α−2γ+1)+1)2TΛ(r(4α−4γ+1)+1))

in distribution as .

### 2.2. Reaction-Diffusion-Systems

In this section, we consider a bounded domain , , with Dirichlet boundary conditions.333The argument does not depend on the boundary conditions, so Neumann- or Robin-type conditions may be used instead. Set , where is the number of coupled equations. is the Laplacian with domain . Let be a Nemytskii-type operator on , i.e. , whose components are polynomials in variables. The highest degree of the component polynomials of will be denoted by . We assume that .

###### Example 3.

One may choose the Allen-Cahn-type nonlinearity .

The corresponding SPDE

 (18) dXt=(θΔXt+F(Xt))dt+BdWt

is assumed to satisfy for a suitable (maximal) .

###### Proposition 4.
1. If , then holds with .

2. If , then holds with .

###### Proof.
1. We have to control the term . Note that in order to control the norm , it suffices to control its one-dimensional components, so w.l.o.g. we assume . Taking into account the triangle inequality, it suffices to control for some integer . The case is trivial, so let . Now is a closed subspace, and given that , the Sobolev space is closed under multiplication [1]. Thus, for :

 (19) |ul|ρ+1mF≤C|ul|ρ+1l≤C|u|lH2ρ+2l≤C|u|2H2ρ+1|u|l−2H2ρ≤C|u|2ρ+12(1+|u|mF−2ρ),

where we used the interpolation property of Sobolev spaces.

2. As before, we can restrict ourselves to the case with . For , the estimate from is trivial, so assume . Again using the algebra property of the Sobolev spaces , we have for :

 |ul−vl|ρ−12≤C|u−v|ρ−12(l−1∑i=0|u|iρ−12|v|l−1−iρ−12)≤C|u−v|ρ−12(l−1∑i=0|u|iρ|v|l−1−iρ),

and the claim follows with and .

###### Remark.

Note that the same proof allows to cover the more general case of polynomial nonlinearities whose coefficients depend on , as long as these coefficients are regular enough.

Taking into account that the growth rate of the eigenvalues of the Laplacian is given by (see [24], or e.g. [22, Section 13.4]), we get the following result:

###### Corollary 5.

Let .

1. If is arbitrary and , then the estimator is asymptotically normal with rate and asymptotic variance given by

 (20) V=2θ(4α−4γ+n+2)2TΛn(8α−8γ+n+2).
2. If and , the same is true for .

Said another way, and are asymptotically normal if is sufficiently regular and the contrast parameter is sufficiently high.

###### Remark.

Consider the important special case and . In this case, is true for any . Furthermore, , so part of Theorem 1 applies. All three estimators are asymptotically normal without further assumptions on the regularity of .

### 2.3. Burgers’ Equation

We point out that the validity of this example has been conjectured in [2]. Consider the stochastic viscous Burgers’ equation

 (21) dXt=(νΔXt−Xt∂xXt)dt+BdWt

on , , with Dirichlet boundary conditions.444As before, the argument does not depend on the boundary conditions being of Dirichlet type. Here,

 (22) F(v)=−v∂xv=∂x(−12v2).

In this setting we have , .

We follow the convention to denote the viscosity parameter by instead of . Likewise, the estimators will be called , and . Existence and uniqueness of a solution to (21) can be shown as in [14]. We need just slightly more regularity, i.e. for some , in order to infer .

###### Proposition 6.

Property holds for any with .

###### Proof.

In one spatial dimension, the Sobolev space is an algebra for . So,

 |F(v)|2ρ−14 = 14|∂x(v2)|2H2ρ−12(D)≤14|v2|2H2ρ+12(D) ≤ C|v|4H2ρ+12(D)=C|v|4ρ+14≤C|v|2ρ|v|2ρ+12.

###### Corollary 7.

Let be the maximal regularity of and . Then the estimator is asymptotically normal with rate and asymptotic variance given by

 (23) V=2θ(4α−4γ+3)2TΛ(8α−8γ+3).

Similar calculations show that holds with , which is not sufficient to transfer asymptotic normality to but yields consistency with rate at least .

### 2.4. Robustness under Model Uncertainty

In the preceding examples we assumed that the dynamical law of the process we are interested in is perfectly known. However, it may be reasonable to consider the case when this is not true. We may formalize such a partially unknown model as

 (24)

where is an unknown perturbation. We assume that the model is well-posed (i.e. holds for suitable ) and that satisfies . Let , and be given by the same terms as before, i.e. and include knowledge on but not on .

###### Proposition 8.

If satisfies , then , and are consistent.

This follows directly from the discussion in Subsection 4.2, taking into account the decomposition

 (25) ^θfullN−θ=−∫T0⟨(−A)1+2αXNt,PNBdWt⟩∫T0|(−A)1+αXNt|2Hdt−biasGN(X)

with

 (26) biasGN(X)=∫T0V⟨(−A)1+2αXNt,PNG(t,Xt)⟩V∗dt∫T0|(−A)1+αXNt|2Hdt

and similar decompositions for and .

It is easy to verify that if holds for and separately with excess regularity resp. , then a version of holds for as well, with excess regularity . However, in general the excess regularity of can be chosen higher due to cancellation effects of and . A lower bound for the rate of convergence of the estimators is given by .

###### Corollary 9.
1. If , then is asymptotically normal with rate .

2. If and satisfies with , then is asymptotically normal with rate .

3. If , then asymptotic normality with rate carries over to all estimators.

Said another way, the excess regularity of determines essentially to what extent the results from Theorem 1 remain valid. A high value for corresponds to a small perturbation.

###### Remarks.

• In applications it is common to approximate a complicated nonlinear system by its linearization. From this point of view, the case that itself is linear in (24) becomes relevant. Of course, it is desirable to maintain the statistical properties of the linear model under a broad class of nonlinear perturbations.

• It is possible to interpret the nonlinear perturbation as follows: Assume there is a true nonlinearity describing the model precisely. Assume further that we either do not know the form of or we do not want to handle it directly due to its complexity. Instead, we approximate by some nonlinearity which we can control. If our approximation is good (in the sense that holds for with suitable excess regularity), then the quality of the estimators which are merely based on the approximating model can be guaranteed, i.e. they are consistent or even asymptotically normal. The approximating quality of is measured by the excess regularity of .

• As is unknown, no knowledge of can be incorporated into the estimators, and condition need not be required to hold for .

• The previous examples show that is fulfilled for a broad class of nonlinearities (assuming that is sufficiently high if necessary).

## 3. Numerical Simulation

We simulate the Allen-Cahn equation

 dXt=(θΔXt+Xt−X3t)dt+(−Δ)−γdWt

on with Dirichlet boundary conditions and initial condition . We discretize the equation in Fourier space and simulate modes with a linear-implicit Euler scheme with temporal stepsize up to time . The spatial grid is uniform with mesh . The true parameter is . We have run Monte-Carlo simulations for each of the choices and . In any case, we have set . Remember that in this setting all estimators are asymptotically normal.

Figure 1 illustrates consistency, the convergence rate and the asymptotic distribution from Theorem 1. As expected, the values of and are closer to each other than to . Note that the quality of in this simulation depends on the level of noise given by , with decreasing accuracy under smooth noise. Our interpretation is that the nonlinearity becomes more highlighted if the noise is less rough.

We mention that for simulations with even higher values of (take ), the values of are mostly negative and therefore not related to the true parameter, while and stay consistent. Of course, this effect may be influenced by the number of Fourier modes used for the simulation.

## 4. Proof of Theorem 1

We follow closely the arguments which have been given in [3]

for the special case of the Navier-Stokes equations in two dimensions. Using a slightly different version of the central limit theorem (CLT) for local martingales, we obtain a direct proof of the asymptotic normality for

.

### 4.1. Asymptotic Estimates for the Linear Case

First, we recall briefly some results for the case . Consider the linear equation

 (27) d¯¯¯¯¯Xt=θA¯¯¯¯¯Xtdt+BdWt,¯¯¯¯¯X0=0,

where . We define . Then the are independent one-dimensional Ornstein-Uhlenbeck processes

 (28) d¯¯¯xkt=−θλk¯¯¯xktdt+λ−γkdWkt,¯¯¯xk0=0,

where are independent one-dimensional Brownian motions, and the solutions have the explicit representation

 (29) ¯¯¯xkt=λ−γk∫t0e−θλk(t−s)dWks.

It holds

1. and

as .

###### Sketch of proof.

Use that and , , are jointly Gaussian with mean zero and

 E[¯¯¯xks¯¯¯xkt]=λ−(2γ+1)k2θ(e−θλk(t−s)−e−θλk(t+s)).

Now follows with the help of . For , use

 E[(¯¯¯xks)2(¯¯¯xkt)2]=Var[¯¯¯xks]Var[¯¯¯xkt]+2Cov(¯¯¯xks,¯¯¯xkt)2

and . ∎

We write . By multiplying the asymptotic representations from Lemma 10 with resp. and summing up to index , we obtain the following cumulative version if :

 (30) E∫T0|(−A)1+α¯¯¯¯¯XNt|2Hdt≍CEαNr(2α−2γ+1)+1,

and if :

 (31) Var∫T0|(−A)1+α¯¯¯¯¯XNt|2Hdt≍CVarαNr(4α−4γ+1)+1,

where

 CEα = CE(θ,T,