# L2 convergence of smooth approximations of Stochastic Differential Equations with unbounded coefficients

The aim of this paper is to obtain convergence in mean in the uniform topology of piecewise linear approximations of Stochastic Differential Equations (SDEs) with C^1 drift and C^2 diffusion coefficients with uniformly bounded derivatives. Convergence analyses for such Wong-Zakai approximations most often assume that the coefficients of the SDE are uniformly bounded. Almost sure convergence in the unbounded case can be obtained using now-standard rough path techniques, although L^q convergence appears yet to be established and is of importance for several applications involving Monte Carlo approximations. We consider L^2 convergence in the unbounded case using a combination of traditional stochastic analysis and rough path techniques. We expect our proof technique to extend to more general piecewise smooth approximations.

## 1 Introduction

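Given a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$ satisfying the usual conditions and a time interval $[0,T]$, we are interested in a stochastic process $X$ given by

$$X_t = x + \int_0^t b(X_s)\,ds + \int_0^t \sigma(X_s)\,dB_s, \tag{1.1}$$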

where $x$ is the initial condition, $b$ and $\sigma$ satisfy appropriate regularity conditions and $B$ is an $r$-dimensional Wiener process. The stochastic integral on the right-hand side of (1.1) is interpreted in the Ito sense. In many applications it is necessary to utilise smooth approximations of the Brownian motion and to consider the corresponding solutions to (1.1). To this end, consider a partition of the time interval $[0,T]$, indexed by $d$, together with its associated filtration. In the case where a sequence of approximations $B^d$ are martingales with respect to this filtration and satisfy a uniform tightness condition [JMP89], it is known that the solution of (1.1) with $B^d$ in place of $B$ converges in distribution to the solution of (1.1) [KP91]. (This holds for more general semimartingale drivers, but for the purposes of this paper we focus on the case of Wiener processes.) Many piecewise smooth approximations do not satisfy these conditions, so that the limiting process involves additional correction terms. More precisely, consider a piecewise smooth approximation $B^d$ of the Brownian motion $B$, with $B$ as its limiting process, and the ODE

$$dX^d_t = b(X^d_t)\,dt + \sigma(X^d_t)\,\dot{B}^d_t\,dt. \tag{1.2}$$

In the above and throughout the article, we suppress some of the notational dependence for ease of reading. It is now well known that, under certain regularity conditions on the coefficients and conditions on the construction of $B^d$, solutions of (1.2) do not converge to solutions of (1.1) as the mesh size tends to zero, but rather to a Stratonovich SDE plus an anti-symmetric drift term that depends on how the approximation is constructed, i.e.

$$dX_t = b(X_t)\,dt + \sigma(X_t)\circ dB_t + \sum_{n,m=1}^{v} S_{n,m}\,[A_n X_t, A_m X_t]\,dt, \tag{1.3}$$

where $S_{n,m}$ denotes the $(n,m)$-th entry of a matrix $S$, $[\cdot,\cdot]$ denotes the Lie bracket and $A_n$ denotes the $n$-th vector field associated with $\sigma$. One important condition for the previous statement is the existence of a skew-symmetric matrix $S$ as the limit of the area process

$$\frac{1}{2\delta}\,\mathbb{E}\left[\int_0^{\delta} \left(B^d_s \otimes \dot{B}^d_s - \dot{B}^d_s \otimes B^d_s\right) ds\right].$$

For reference, see Theorem 7.2 in [IW89], which captures much of the pioneering work on this matter for the multi-dimensional case by e.g. [Sus91], [McS72], [SV72], [NY76], [IYN77]. Such piecewise smooth approximation models are relevant in many areas of mathematical modelling where it is necessary to work with ODEs with random fluctuations, e.g. for extracting macroscopic models from detailed microscopic stochastic dynamics ([HSS03], [PS08]). Secondly, such approximations are useful because they are naturally tied to the Stratonovich interpretation of stochastic integrals, in particular when the anti-symmetric term in (1.3) is zero. This holds in the scalar case, when the vector fields $A_n$ commute, or when $S = 0$, which occurs for piecewise linear approximations or mollifiers [IW89]. Continuity of the solutions of ODEs driven by such smooth approximations with respect to the driving path in the uniform topology is ensured by the Doss-Sussmann theorem. This property is advantageous for various applications, in particular for robust filtering [CC05]. From a numerical approximation point of view, it is possible to utilise the many techniques available for ODEs (rather than SDEs) to construct time-discrete approximations to (1.3). This is in contrast to more direct numerical approximation schemes applied to Ito SDEs (1.1), like the Euler-Maruyama or Milstein schemes, which converge strongly to (1.1) as the time step tends to zero, i.e. without any additional correction term. The convergence analysis of Wong-Zakai approximations is complicated by the presence of anticipative integrands, so that the traditional martingale techniques cannot be directly applied as in the analysis of Euler-Maruyama and related schemes.
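The claim that the antisymmetric correction vanishes for piecewise linear approximations can be checked by hand on a single partition interval: on a linear segment, the path and its velocity are proportional, so the integrand of the area process cancels. The following is a minimal numerical sketch of that cancellation, assuming a two-dimensional driver and a midpoint quadrature rule; the function name and the sample increments are hypothetical and chosen only for illustration.

```python
def antisymmetric_area(dB1, dB2, delta, n_quad=1000):
    """Midpoint-rule value of
        (1 / (2 delta)) * int_0^delta (B1 * dB2/ds - dB1/ds * B2) ds
    for the linear segment B_i(s) = (s / delta) * dB_i on [0, delta].
    This is the (1, 2) entry of the area integrand for one sample path."""
    h = delta / n_quad
    total = 0.0
    for k in range(n_quad):
        s = (k + 0.5) * h
        b1, b2 = s / delta * dB1, s / delta * dB2   # path values on the segment
        v1, v2 = dB1 / delta, dB2 / delta           # constant velocities
        total += (b1 * v2 - v1 * b2) * h
    return total / (2.0 * delta)


# Hypothetical increments over one interval of length 0.25.
print(antisymmetric_area(0.3, -0.7, 0.25))
```

Since the integrand vanishes for every realisation of the increments, its expectation vanishes as well, which is consistent with $S = 0$ in this case.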

The literature on convergence analyses of piecewise smooth approximations for both SDEs and SPDEs (e.g. [HP15], [GS06], [DFS14], [TZ06], [HKX02], [BF95]) is enormous, starting from the seminal work of [EM65], where convergence in probability of (1.2) to (1.3) with vanishing antisymmetric term in the scalar case was established. Much of the initial work in this direction for SDEs relies on standard stochastic analysis techniques, loosely speaking, by introducing time shifts so that martingale inequalities can be applied (e.g. [GM04], [IW89]). In these works, uniform boundedness of the coefficients is crucial in order to control the extra terms introduced by the time shift. More recently, [LH20] showed convergence in probability in the Skorokhod topology of Wong-Zakai approximations of scalar SDEs driven by a general semimartingale (thereby extending the earlier work of [Pro85] and [KPP95]), again for uniformly bounded coefficients. Over the last couple of decades, the theory of rough paths pioneered by T. Lyons [Lyo98] has been utilised to obtain convergence results for a broader class of drivers and smooth approximations (see e.g. [FO09], [Lej06], [CFV07], [FH14]). After defining an appropriate lift of the driving path (e.g. the Stratonovich lift in the case of piecewise linear approximations), and by ensuring that the pathwise solution of the rough differential equation (RDE) coincides with the limiting SDE, convergence is easily established using continuity of the Ito-Lyons map (i.e. the solution map corresponding to the RDE, as a function of the initial condition and driver) (see e.g. Theorem 9.3 in [FH14]). Generally, weak convergence of smooth approximations of the driver then implies weak convergence of the (random) ODE solutions. Strong convergence results can be obtained in specific situations, e.g. under regularity conditions on the coefficients in the sense of Stein, for dyadic piecewise linear approximations of Brownian motion [CFV07] (see also [FH14] for the non-dyadic case). A common feature of these works is a Lipschitz assumption in the sense of Stein [FO09], [CFV07], which requires uniform boundedness of the coefficients. [KM16] use rough path techniques to obtain weak convergence under much weaker conditions on the drift and diffusion coefficients, and for a broad range of smooth approximations satisfying a weak invariance principle. As noted in [FH14], it is in general not straightforward to use rough path techniques for $L^q$ convergence, although Chapter 10 of [FH14] and also [FV10] present some analysis in this direction in the case of Gaussian processes. To summarise, the aforementioned results apply to the case of bounded coefficients and/or to weak or almost sure convergence.

Our aim is to establish $L^2$ convergence of piecewise smooth approximations of (1.1) with unbounded coefficients. $L^q$ convergence of numerical approximations of SDEs is often of central importance in many Monte Carlo sampling methods, such as Markov Chain Monte Carlo [HJK12] and control-type sequential Monte Carlo methods which rely on empirical approximations of McKean-Vlasov SDEs [PRS20]. Here we present a simple approach that combines elements of stochastic analysis and rough path techniques to achieve this result, focusing specifically on nested piecewise linear approximations of the Brownian path. That is, we consider a sequence of nested partitions $\{t_i\}$ of the time interval $[0,T]$ with associated mesh size $\delta_d$ such that $\delta_d \to 0$ as $d \to \infty$, and

$$B^d_t = B_{t_i} + \frac{t - t_i}{\delta_d}\left(B_{t_{i+1}} - B_{t_i}\right), \quad \text{for } t \in [t_i, t_{i+1}).$$

We expect that our result can be extended to more general approximations using the same proof structure.
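The piecewise linear approximation above can be sketched on a discrete grid as follows. This is only an illustration under stated assumptions: the Brownian path is sampled on a uniform fine grid, the coarse partition consists of every `stride`-th fine grid point, and the helper names `brownian_path` and `piecewise_linear` are hypothetical.

```python
import math
import random


def brownian_path(n_steps, T=1.0, seed=0):
    """Sample a scalar Brownian path on a uniform grid with n_steps increments."""
    rng = random.Random(seed)
    dt = T / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path


def piecewise_linear(path, stride, k):
    """Value at fine-grid index k of the linear interpolant through every
    `stride`-th point of `path` (the coarse partition nodes t_i)."""
    i = (k // stride) * stride           # index of the left node t_i
    j = min(i + stride, len(path) - 1)   # index of the right node t_{i+1}
    if j == i:                           # k is the final grid point
        return path[i]
    w = (k - i) / (j - i)                # (t - t_i) / delta_d
    return path[i] + w * (path[j] - path[i])


B = brownian_path(64)
Bd = [piecewise_linear(B, 8, k) for k in range(len(B))]
```

Halving `stride` repeatedly produces nested partitions, since every node of a coarser partition is also a node of the finer one.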

### 1.1 Statement of Main Result

Before stating our main result, we state the main assumptions utilised throughout the paper.

###### Assumption 1.1.

and and there exists some constant such that

where denotes the th column of the matrix .

###### Assumption 1.2.

The vector function $\Sigma$ with $i$-th entry given by

$$\Sigma_i(x) := \sum_{k,m=1}^{r} \sum_{l=1}^{v} \sigma_{l,k}(x)\,\frac{\partial}{\partial x_l}\sigma_{i,m}(x)$$

is globally Lipschitz continuous.

These assumptions ensure well-posedness of the ODE and limiting SDE, as well as the RDE corresponding to the Stratonovich lift of . Assumption 1.2 in particular plays a crucial role in obtaining a pathwise control on solutions to the RDE, see Lemma 3.3.

###### Theorem 1.1.

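Let Assumptions 1.1 and 1.2 hold. Suppose $X^d$ is the unique solution of

$$dX^d_t = b(X^d_t)\,dt + \sigma(X^d_t)\,\dot{B}^d_t\,dt \tag{1.4}$$

and $X$ is the unique strong solution of

$$dX_t = b(X_t)\,dt + \sigma(X_t)\circ dB_t, \tag{1.5}$$

with $X_0 = X^d_0 = x_0$ for some fixed but arbitrary $x_0$. It holds that for every $T > 0$,

$$\lim_{d\to\infty}\mathbb{E}\left[\sup_{0\le t\le T}\left|X_t - X^d_t\right|^2\right] = 0.$$
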
###### Remark 1.3.

Almost sure convergence under Assumptions 1.1 and 1.2 can be obtained using continuity of the Ito-Lyons map established in [Lej12] and standard rough path arguments.
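To make the statement of Theorem 1.1 concrete, the following sketch considers a scalar linear equation with unbounded coefficients, chosen because both solutions are available in closed form. It assumes $b(x) = a x$ and $\sigma(x) = s x$ (hypothetical parameters `a` and `s`), for which the Stratonovich solution is $x_0 \exp(a t + s B_t)$ and the solution of the ODE driven by $B^d$ is $x_0 \exp(a t + s B^d_t)$. The supremum over $t$ is approximated by a maximum over a fine grid, and the expectation by an average over sample paths, so this is an illustration rather than a verification of the theorem.

```python
import math
import random


def sup_sq_error(n_fine, stride, a=0.5, s=1.0, x0=1.0, T=1.0, seed=0):
    """Max over the fine grid of |X_t - X^d_t|^2 for one Brownian sample path,
    where B^d interpolates linearly through every `stride`-th grid point."""
    rng = random.Random(seed)
    dt = T / n_fine
    B = [0.0]
    for _ in range(n_fine):
        B.append(B[-1] + rng.gauss(0.0, math.sqrt(dt)))
    worst = 0.0
    for k in range(n_fine + 1):
        i = (k // stride) * stride
        j = min(i + stride, n_fine)
        Bd = B[i] if j == i else B[i] + (k - i) / (j - i) * (B[j] - B[i])
        t = k * dt
        X = x0 * math.exp(a * t + s * B[k])   # Stratonovich solution
        Xd = x0 * math.exp(a * t + s * Bd)    # ODE solution driven by B^d
        worst = max(worst, (X - Xd) ** 2)
    return worst


def mean_sup_sq_error(n_fine, stride, n_paths=50):
    """Monte Carlo average of the pathwise squared sup error."""
    return sum(sup_sq_error(n_fine, stride, seed=m) for m in range(n_paths)) / n_paths


for stride in (256, 64, 16, 4):
    print(stride, mean_sup_sq_error(1024, stride))
```

The same seeds are reused for every value of `stride`, so the different partitions are compared on the same Brownian paths.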

## 2 Notation and Background on rough path theory

We briefly summarise some of the necessary background on rough path theory that will be utilised in Lemma 3.3. This summary is by no means exhaustive, and many of the algebraic and geometric technicalities are left out as they are not needed for understanding the main results. We refer to the extensive literature on rough paths (e.g. [FH14], [FV09], [LQ02]) for further details.

A rough path framework allows for developing pathwise solutions to SDEs by considering a fixed realisation of the driver (i.e. a fixed Brownian sample path, in our case), so that the differential equation is seen rather as a controlled ODE. The irregularity of the driver means that special care is needed to define solutions to such ODEs. The fundamental insight from rough path theory is that, for a driver of $\alpha$-Hoelder regularity, the iterated integrals of the driver up to order $\lfloor 1/\alpha \rfloor$ (together with an a priori interpretation of these iterated integrals) are sufficient to make these differential equations well-defined in a deterministic sense. The iterated integrals or "signature" can be seen heuristically as arising from a Taylor expansion [Lej03]. Since Brownian paths are $\alpha$-Hoelder regular with $\alpha \in (1/3, 1/2)$, it is only necessary to consider the second order iterated integral, which takes the form $\int B \otimes dB$. There is still some ambiguity as to how this stochastic integral should be interpreted; a Stratonovich (Ito) interpretation corresponds to the so-called Stratonovich (Ito) lift or enhancement. We use the notation $\mathbf{X} = (X^1, X^2)$ to denote the rough path lift of a continuous path $X$, where $X^1_{s,t} := X_t - X_s$ denotes the increment and $X^2$ denotes the second order process. We are concerned with Brownian rough paths; these paths belong to the space of $\alpha$-Hoelder rough paths such that

 \normXα:=sup0≤s

is finite and Chen’s relation is satisfied. For an -Hoelder continuous path taking values in , we define

 \normXα:=sup0≤s
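For intuition, the Hoelder norm of a path can be approximated from samples by taking the largest ratio of increment to time difference over all pairs of grid points. The sketch below assumes the standard definition as a supremum of $|X_t - X_s| / |t - s|^\alpha$ and a uniformly sampled scalar path; the function name is hypothetical, and a discrete maximum only gives a lower bound for the true supremum.

```python
def hoelder_seminorm(path, dt, alpha):
    """Largest value of |x_j - x_i| / ((j - i) * dt)**alpha over all grid pairs i < j."""
    n = len(path)
    best = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            ratio = abs(path[j] - path[i]) / ((j - i) * dt) ** alpha
            best = max(best, ratio)
    return best


# A straight line x_t = 2 t on [0, 1]: the ratio 2 |t - s|^(1 - alpha) is largest at |t - s| = 1.
line = [2.0 * k * 0.125 for k in range(9)]
print(hoelder_seminorm(line, 0.125, 0.5))
```

The double loop costs a quadratic number of operations in the number of samples, which is acceptable for short illustrative paths.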

Throughout the article, we denote by $\mathbf{B}$ the Stratonovich lift of $B$, which is a geometric rough path. Geometric rough paths have the important property that there exists a sequence of smooth (or piecewise smooth) rough paths converging uniformly on $[0,T]$ in the rough path topology to the lifted path (see e.g. Proposition 2.5 in [FH14]). Likewise, we denote by $\mathbf{B}^d$ the Stratonovich lift of a piecewise smooth approximation $B^d$.

It is classical that the solution of (1.2) coincides with the solution (when it exists and is unique) of the following Rough Differential Equation (RDE) driven by $\mathbf{B}^d$,

$$\hat{X}^d_t = \hat{X}_0 + \int_0^t b(\hat{X}^d_s)\,ds + \int_0^t \sigma(\hat{X}^d_s)\,d\mathbf{B}^d_s, \tag{2.1}$$

(see e.g. Section 9 in [Lej12]). Roughly speaking, the integrand must be "smooth enough" to counteract the irregularity of the driver [Unt12]. Several concepts of solutions to RDEs have been developed, e.g. solutions in the sense of Lyons [Lyo98], Gubinelli's controlled paths formulation [Gub04], Davie's formulation in terms of a Taylor expansion or Euler discretisation [Dav08] and, more recently, the coordinate independent definition from [Bai15]. There exist equivalences between various solution concepts [BC18], e.g. solutions in the sense of Lyons and Davie are equivalent under suitable Hoelder regularity conditions, see [Lej12]. Here we consider solutions in the sense of Davie [Dav10], i.e. $\hat{X}^d$ is a continuous path of finite $p$-variation such that, for some constant $L$,

$$\left|\hat{X}^d_t - \hat{X}^d_s - \int_s^t b(\hat{X}^d_u)\,du - \sigma(\hat{X}^d_s)\,B^{d,1}_{s,t} - \Sigma(\hat{X}^d_s)\,B^{d,2}_{s,t}\right| \le L\,|t-s|^{\theta} \quad \forall\, 0 \le s < t \le T,$$

where . With a slight abuse of notation, we denote by a linear vector field from and is also a linear vector field from .
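The Davie-type expansion above suggests a step-by-step approximation in which each increment is replaced by its first and second level terms. The sketch below is a scalar illustration only, assuming the Stratonovich lift of a sampled Brownian path, for which the second level is $B^2_{s,t} = (B_t - B_s)^2 / 2$ and $\Sigma = \sigma \sigma'$; the helper names, the step count and the linear test coefficients are hypothetical. In this scalar setting the step coincides with the Milstein scheme for a Stratonovich SDE.

```python
import math
import random


def davie_step(x, h, dB, b, sigma, dsigma):
    """One step x + b(x) h + sigma(x) B1 + Sigma(x) B2 with the scalar
    Stratonovich lift B1 = dB, B2 = dB**2 / 2 and Sigma = sigma * sigma'."""
    return x + b(x) * h + sigma(x) * dB + sigma(x) * dsigma(x) * 0.5 * dB * dB


def solve(x0, T, n_steps, b, sigma, dsigma, seed=0):
    """Iterate davie_step along one sampled Brownian path; returns (X_T, B_T)."""
    rng = random.Random(seed)
    h = T / n_steps
    x, B = x0, 0.0
    for _ in range(n_steps):
        dB = rng.gauss(0.0, math.sqrt(h))
        x = davie_step(x, h, dB, b, sigma, dsigma)
        B += dB
    return x, B


# Linear test case: dX = a X dt + s X o dB has solution x0 * exp(a T + s B_T).
a, s = 0.5, 1.0
x_T, B_T = solve(1.0, 1.0, 4096, lambda x: a * x, lambda x: s * x, lambda x: s)
print(x_T, math.exp(a + s * B_T))
```

The linear coefficients are unbounded but have bounded derivatives, and $\Sigma(x) = s^2 x$ is globally Lipschitz, which matches the setting of Assumptions 1.1 and 1.2 in one dimension.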

## 3 Proof of Theorem 1.1

The proof involves utilising localisation arguments to extend well-known convergence results in the case of $C_b^2$ coefficients (i.e. the space of twice continuously differentiable uniformly bounded functions with bounded derivatives) as in [IW89] to unbounded coefficients. The main goal is to close the commutative diagram in Figure 3.1. We do so by establishing convergence results that are uniform in the approximation parameter $d$ and pointwise in the localisation parameter $n$, which thereby permit the interchange of limits in $d$ and $n$ using the Moore-Osgood theorem. We also develop pathwise moment bounds on $X^d$ that are uniform in $d$ in Lemma 3.3 using rough path techniques, which play a crucial role in establishing uniform convergence.

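We start by considering a localised form of the limiting SDE in Ito form. For all $n \in \mathbb{N}$, let $S_n$ denote the set on which $|x|^2 \le n$. Consider $X^n$, which is the unique strong solution of

$$dX^n_t = b_n(X^n_t)\,dt + \Sigma_n(X^n_t)\,dt + \sigma_n(X^n_t)\,dB_t \tag{3.1}$$

where

$$b_n(x) = \begin{cases} b(x), & x \in S_n \\ f_b(x), & x \notin S_n \end{cases}$$

where $f_b$ is chosen such that $b_n$ is uniformly bounded with bounded derivatives. The functions $\sigma_n$ and $\Sigma_n$ are constructed in a similar manner. Consider also the following ODE utilising the smooth approximation of the Brownian motion as described above,

$$dX^{d,n}_t = b_n(X^{d,n}_t)\,dt + \sigma_n(X^{d,n}_t)\,\dot{B}^d_t\,dt. \tag{3.2}$$

We will assume throughout that $X^{d,n}_0 = X^n_0 = x_0$, where $x_0$ is fixed but arbitrary. The following lemma utilises standard results to establish convergence along the clockwise route starting from $X^{d,n}$ in Figure 3.1.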

###### Lemma 3.1.

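For every $T > 0$, $X^{d,n}$ satisfying (3.2) and $X$ satisfying (1.5),

$$\lim_{n\to\infty}\lim_{d\to\infty}\mathbb{E}\left[\sup_{0\le t\le T}|X^{d,n}_t - X_t|^2\right] = 0. \tag{3.3}$$

###### Proof.

Since the coefficients of (3.1) and (3.2) are continuously differentiable and uniformly bounded with bounded derivatives, we have from Theorem 7.2 in [IW89] that for every $T > 0$,

$$\lim_{d\to\infty}\mathbb{E}\left[\sup_{0\le t\le T}|X^{d,n}_t - X^n_t|^2\right] = 0, \quad \forall n \in \mathbb{N}. \tag{3.4}$$

Since the coefficients are globally Lipschitz continuous, we have from standard localisation arguments that

$$\lim_{n\to\infty}\mathbb{E}\left[\sup_{0\le t\le T}|X^n_t - X_t|^2\right] = 0,$$

which gives the desired result. ∎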

As discussed in Section 2, the solution of the RDE (2.1) coincides with $X^d$, so that we can now work with the rich toolbox offered by rough path theory to obtain a pathwise moment bound on $X^d$, uniformly in $d$ (Lemma 3.3). An analogous result to Lemma 3.3 using stochastic analysis techniques in the case of bounded coefficients can be found in Lemma 7.2 in [IW89], where the now-standard approach of applying a time shift to obtain martingales has proven difficult to adapt to the case of unbounded coefficients. The rough path framework allows us to sidestep this difficulty, particularly since dealing with RDE solutions in the sense of Davie means we can avoid having to directly control the integrals driven by $B^d$.

Firstly, existence and uniqueness of solutions (in the sense of Davie, and equivalently in the sense of Lyons) to (2.1) under Assumption 1.1-1.2 follows in a straightforward manner from Proposition 3 and Theorem 1 for driftless RDEs in [Lej12]. We obtain the following bound on the solution to (2.1) in the small time horizon by a straightforward extension of the boundedness of solutions to driftless RDEs in Proposition 2 of [Lej12], where the crucial point is to utilise the sewing lemma to obtain an expression for . The proof of the following Lemma can be found in the Appendix.

###### Lemma 3.2.

Boundedness of solution to (2.1) in the short time horizon. Under Assumptions 1.1 and 1.2, and with for Brownian motion and recalling the notation , , we have that

$$\|\hat{X}^d\|_\alpha \le C_{15}(\mu) + C_{16}(\mu)\,|x_0|, \tag{3.5}$$

where

 C15(μ) :=C13(C14μ2(\norm∇Σ∞((1+μ2)\normBd2α+\normBdα)+\norm∇σ2∞\normBd2α|σ(0)|) C16(μ) :=C13\norm∇σ∞(C14μ2\norm∇σ2∞\normBd2α+\normBdα+μ\norm∇σ∞\normBdα+μ1α−1) C13 :=1(1−3M)(1−K2) C14 :=K1−K2

for a small enough time horizon satisfying

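$$\begin{aligned}
\|\mathbf{B}^d\|_\alpha \|\nabla\Sigma\|_\infty\, \mu^2 &\le M && \text{(3.6a)} \\
\|\mathbf{B}^d\|_\alpha \|\nabla\sigma\|_\infty\, \mu &\le M && \text{(3.6b)} \\
\|\nabla b\|_\infty\, \mu^{1/\alpha} &\le M && \text{(3.6c)} \\
\|\nabla\sigma\|_\infty \|\mathbf{B}^d\|_\alpha\, \mu &\le \frac{K_2}{K} && \text{(3.6d)} \\
\mu^2\left(C_4(\mu) + C_3(\mu)\right) &\le \frac{K_2(1-3M)(1-K_2)}{K} && \text{(3.6e)}
\end{aligned}$$

with

$$\begin{aligned}
C_3(\mu) &:= \|\nabla\Sigma\|_\infty\left((1+\mu^2)\|\mathbf{B}^d\|_\alpha^2 + \|\mathbf{B}^d\|_\alpha\right) \\
C_4(\mu) &:= \|\nabla\sigma\|_\infty^2\, \|\mathbf{B}^d\|_\alpha^2\, \mu
\end{aligned}$$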

and where , and .

###### Lemma 3.3.

Uniform estimates on the smooth approximation. Under Assumptions 1.1 and 1.2, and when $x_0$ is a fixed value, it holds that

$$\sup_{d\in\mathbb{N}}\mathbb{E}\left[\sup_{t\in[0,T]}|X^d_t|^2\right] < \infty.$$

###### Proof.

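Before extending to the large time horizon, we require an explicit choice of $\mu$ based on the conditions in (3.6). In particular,

$$\mu \le \min\left(\frac{1}{\|\mathbf{B}^d\|_\alpha^{1/2}},\ \frac{1}{\|\mathbf{B}^d\|_\alpha},\ 1\right) C(M, \|\nabla\Sigma\|_\infty, \|\nabla b\|_\infty, \|\nabla\sigma\|_\infty, \alpha) \tag{3.7}$$

ensures that $\mu$ satisfies (3.6a)-(3.6d). To analyse the final condition (3.6e), let $a := \|\nabla\sigma\|_\infty$, $\beta := \|\nabla\Sigma\|_\infty$ and $K_3 := K_2(1-3M)(1-K_2)/K$. We then require

$$\mu^3 a^2 \|\mathbf{B}^d\|_\alpha^2 + \beta\left(\|\mathbf{B}^d\|_\alpha^2 + \|\mathbf{B}^d\|_\alpha\right)\mu^2 + \beta \|\mathbf{B}^d\|_\alpha^2\, \mu^4 \le K_3. \tag{3.8}$$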

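It is not difficult to see that (3.8) holds in the case $\mu \ge 1$ when

$$\left(a^2 \|\mathbf{B}^d\|_\alpha^2 + \beta\left(2\|\mathbf{B}^d\|_\alpha^2 + \|\mathbf{B}^d\|_\alpha\right)\right)\mu^4 \le K_3$$

and in the case $\mu < 1$ when

$$\left(a^2 \|\mathbf{B}^d\|_\alpha^2 + \beta\left(2\|\mathbf{B}^d\|_\alpha^2 + \|\mathbf{B}^d\|_\alpha\right)\right)\mu^2 \le K_3.$$

Together, these conditions imply that we require

$$\mu \le \min\left(\left(\frac{K_3}{(a^2+2\beta)\|\mathbf{B}^d\|_\alpha^2 + \|\mathbf{B}^d\|_\alpha}\right)^{1/2},\ \left(\frac{K_3}{(a^2+2\beta)\|\mathbf{B}^d\|_\alpha^2 + \|\mathbf{B}^d\|_\alpha}\right)^{1/4}\right).$$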

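Combining the above with (3.7) means that (3.6a)-(3.6e) are satisfied when

$$\begin{aligned}
\mu \le{}& \min\left(\frac{1}{\left[(a^2+2\beta)\|\mathbf{B}^d\|_\alpha^2 + \|\mathbf{B}^d\|_\alpha\right]^{1/2}},\ \frac{1}{\left[(a^2+2\beta)\|\mathbf{B}^d\|_\alpha^2 + \|\mathbf{B}^d\|_\alpha\right]^{1/4}},\ \frac{1}{\|\mathbf{B}^d\|_\alpha^{1/2}},\ \frac{1}{\|\mathbf{B}^d\|_\alpha},\ 1\right) \\
&\times C(M, \|\nabla\Sigma\|_\infty, \|\nabla b\|_\infty, \|\nabla\sigma\|_\infty, K, K_2, \alpha).
\end{aligned}$$

Again, considering separately the cases where $\|\mathbf{B}^d\|_\alpha$ is larger or smaller than one, we obtain the following expression

$$K^* = \frac{1}{1+\|\mathbf{B}^d\|_\alpha}\, C^*(M, \|\nabla\Sigma\|_\infty, \|\nabla b\|_\infty, \|\nabla\sigma\|_\infty, K, K_2, \alpha). \tag{3.9}$$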

Furthermore, (3.6b) and (3.6c) imply that

$$C_{16}(\mu) \le C_{13}\|\nabla\sigma\|_\infty\left(\|\mathbf{B}^d\|_\alpha + C_{14}M^2 + M + C_{17}\right) =: C_{18}\|\mathbf{B}^d\|_\alpha + C_{19}$$

where . Likewise, (3.6a), (3.6b) and (3.6c) imply

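$$\begin{aligned}
C_{15}(\mu) &\le C_{13}\Big(\left(C_{14}M + |\sigma(0)|\right)\|\mathbf{B}^d\|_\alpha + 1 + C_{14}M\left(1 + M\|\nabla\Sigma\|_\infty\right) + C_{14}|\sigma(0)|M^2 + |\sigma(0)|\left(M + C_{17}\right)\Big) \\
&=: C_{20}\|\mathbf{B}^d\|_\alpha + C_{21}.
\end{aligned}$$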

Therefore, when $\mu \le K^*$, we have that

$$\|\hat{X}^d\|_\alpha \le C_{20}\|\mathbf{B}^d\|_\alpha + C_{21} + |x_0|\left(C_{18}\|\mathbf{B}^d\|_\alpha + C_{19}\right).$$

Then by Proposition 7 in [Lej12],

$$\sup_{t\in[0,T]}|\hat{X}^d_t| \le R(T)\,|x_0| + R(T)\,\frac{C_{20}\|\mathbf{B}^d\|_\alpha + C_{21}}{C_{18}\|\mathbf{B}^d\|_\alpha + C_{19}}, \tag{3.10}$$

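with

$$R(T) = \exp\left(C_{16}(\mu)\left(1 + \frac{1}{K^*}\right)^{1-\alpha}\right)\exp\left(\max\{T,\mu\}\right) \le \exp\left(C_{16}(\mu)\left(1 + \frac{1}{K^*}\right)\right)\exp\left(\max\{T,\mu\}\right),$$

since $1 + 1/K^* \ge 1$ and $1 - \alpha \le 1$. Furthermore,

$$\frac{C_{20}\|\mathbf{B}^d\|_\alpha + C_{21}}{C_{18}\|\mathbf{B}^d\|_\alpha + C_{19}} \le \frac{C_{20}}{C_{18}} + \frac{C_{21}}{C_{19}} =: C_{22}$$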

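and also using (3.9) we have

$$\exp\left(C_{16}(\mu)\left(1 + \frac{1}{K^*}\right)\right) \le \exp\left(C_2^* C_{19}\right)\cdot\exp\left(C_2^* C_{18}\|\mathbf{B}^d\|_\alpha\right),$$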

where . Finally, we can conclude using (3.10) that

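$$\begin{aligned}
\sup_{t\in[0,T]}|\hat{X}^d_t|^2 &\le \exp\left(2C_2^* C_{19}\right)\exp\left(2\max\{T,\mu\}\right)\left(|x_0| + C_{22}\right)^2\exp\left(2C_2^* C_{18}\|\mathbf{B}^d\|_\alpha\right) \\
&\le \exp\left(2C_2^* C_{19} + 2\max\{T,\mu\} + 2C_2^* C_{18}\|\mathbf{B}^d\|_\alpha\right)\left(|x_0| + C_{22}\right)^2 \\
&=: C_{24}\exp\left(2C_2^* C_{18}\|\mathbf{B}^d\|_\alpha\right)
\end{aligned}$$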

where are positive constants depending only on . Finally, for nested piecewise linear approximations, Theorem 13.19 in [FV09] gives us that

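$$\exp\left(C_2^* C_{18}\|\mathbf{B}^d\|_\alpha\right) \le \exp\left(C_2^* C_{18} M_g\right) < \infty \quad \forall d \in \mathbb{N},$$

where $M_g$ is a positive random variable with Gaussian tails, independent of $d$, such that $\|\mathbf{B}^d\|_\alpha \le M_g$ almost surely. Taking expectations of both sides gives the desired result. ∎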

We are now ready to show that solutions of localised ODEs converge to the solution of the ODE of interest, uniformly in the approximation parameter.

###### Lemma 3.4.

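Uniform convergence in $d$. For every $T > 0$, $X^{d,n}$ satisfying (3.2) and $X^d$ satisfying (1.4),

$$\lim_{n\to\infty}\sup_{d\in\mathbb{N}}\mathbb{E}\left[\sup_{0\le t\le T}|X^{d,n}_t - X^d_t|^2\right] = 0. \tag{3.11}$$
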
###### Proof.

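For each fixed $d$ and $n$, define

$$\tau^d_n := \inf\left\{t \ge \delta_d \,:\, |X^{d,n}_{t-\delta_d}|^2 \notin S_n \ \text{or}\ |X^d_{t-\delta_d}|^2 \notin S_n\right\} - \delta_d$$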

which is a stopping time with respect to . Define

$$S^{d,n}_T := \sup_{0\le t\le T}|X^{d,n}_t - X^d_t|^2.$$

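It holds that

$$\mathbb{E}\left[S^{d,n}_T\right] = \mathbb{E}\left[\mathbb{1}_{\tau^d_n \le T}\, S^{d,n}_T\right] + \mathbb{E}\left[\mathbb{1}_{\tau^d_n > T}\, S^{d,n}_T\right]$$

and by Hoelder's inequality,

$$\mathbb{E}\left[\mathbb{1}_{\tau^d_n \le T}\, S^{d,n}_T\right] \le \mathbb{E}\left[\left(\mathbb{1}_{\tau^d_n \le T}\right)^p\right]^{1/p}\mathbb{E}\left[\left(S^{d,n}_T\right)^q\right]^{1/q} = \left(\mathbb{P}(\tau^d_n \le T)\right)^{1/p}\mathbb{E}\left[\left(S^{d,n}_T\right)^q\right]^{1/q},$$

where $1/p + 1/q = 1$. For the case $\tau^d_n > T$, we start by considering the Stratonovich step 2 lift of $B^d$ as in Lemma 3.3. We can again work with the RDEs whose solutions coincide with the ODEs. Continuity of the Ito-Lyons map under Assumptions 1.1 and 1.2 is established in Theorem 1 in [Lej12], which implies that for all $t \le \tau^d_n$,

$$|X^{d,n}_t - X^d_t| = 0 \quad \text{a.s.}$$

since both $X^{d,n}$ and $X^d$ are driven by the same $\mathbf{B}^d$ and have the same coefficients on $S_n$. Furthermore, it follows directly that $S^{d,n}_T = 0$ when $\tau^d_n > T$, for every fixed $n$ (and uniformly in $d$). Therefore, we have that for fixed $n$,

$$\sup_{d\in\mathbb{N}}\mathbb{E}\left[S^{d,n}_T\right] \le \sup_{d\in\mathbb{N}}\left(\mathbb{P}(\tau^d_n \le T)\right)^{1/p}\mathbb{E}\left[\left(S^{d,n}_T\right)^q\right]^{1/q}.$$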

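Lemma 3.3 clearly holds for $X^{d,n}$ uniformly in $n$, therefore we have that

$$\mathbb{E}\left[\left(S^{d,n}_T\right)^q\right]^{1/q} \le \left(2^{2q-1}\left(\mathbb{E}\left[\sup_{0\le t\le T}|X^{d,n}_t|^{2q}\right] + \mathbb{E}\left[\sup_{0\le t\le T}|X^d_t|^{2q}\right]\right)\right)^{1/q} < \infty$$

uniformly in $d$ and $n$. Furthermore, by Markov's inequality we have that

$$\mathbb{P}(\tau^d_n \le T) = \mathbb{P}\left(\sup_{t\in[0,T]}|X^d_t|^2 > n\right) = \mathbb{E}\left[\mathbb{1}_{\left\{\sup_{t\in[0,T]}|X^d_t|^2 > n\right\}}\right] \le \frac{1}{n}\,\mathbb{E}\left[\sup_{t\in[0,T]}|X^d_t|^2\right].$$

Then taking the supremum over $d$ on both sides in the above, and using the bound $\sup_{d\in\mathbb{N}}\mathbb{E}\left[\sup_{t\in[0,T]}|X^d_t|^2\right] < \infty$ from Lemma 3.3, implies

$$\lim_{n\to\infty}\sup_{d\in\mathbb{N}}\left(\mathbb{P}(\tau^d_n \le T)\right)^{1/p} = 0.$$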

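Combining all gives

$$\lim_{n\to\infty}\sup_{d\in\mathbb{N}}\mathbb{E}\left[S^{d,n}_T\right] \le \left(\lim_{n\to\infty}\sup_{d\in\mathbb{N}}\left(\mathbb{P}(\tau^d_n \le T)\right)^{1/p}\right)\left(\lim_{n\to\infty}\sup_{d\in\mathbb{N}}\mathbb{E}\left[\left(S^{d,n}_T\right)^q\right]^{1/q}\right) = 0,$$

since both limits are finite. ∎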

The final ingredient is the pointwise convergence statement, as stated in the following lemma.

###### Lemma 3.5.

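Pointwise convergence in $n$. For every $T > 0$, $X^{d,n}$ satisfying (3.2) and $X^n$ satisfying (3.1),

$$\lim_{d\to\infty}\mathbb{E}\left[\sup_{0\le t\le T}|X^{d,n}_t - X^n_t|^2\right] = 0 \quad \forall n \in \mathbb{N}. \tag{3.12}$$

###### Proof.

This follows directly from Theorem 7.2 in [IW89] since, for every fixed $n$, $b_n$ and $\sigma_n$ are uniformly bounded. ∎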

Since the distance induced by the $L^2$ norm of the supremum over $[0,T]$ is a metric on the space of integrable processes (here we are working with equivalence classes), to which all processes considered here belong, we can apply the Moore-Osgood Theorem (see Theorem A.1) together with Lemmas 3.5, 3.4 and 3.1 to obtain

 =limd→∞limn→∞E[sup0≤t≤T|Xd,nt−Xdt|2+sup0≤t≤T|Xdt−Xt|2] =0.

Furthermore, since uniform convergence implies pointwise convergence, we have from Lemma 3.4 that for any fixed $d$,

$$\lim_{n\to\infty}\mathbb{E}\left[\sup_{0\le t\le T}|X^{d,n}_t - X^d_t|^2\right] = 0,$$

which then gives the desired result. This concludes the proof of Theorem 1.1.

## Acknowledgements

This research has been partially funded by Deutsche Forschungsgemeinschaft (DFG)- SFB1294/1 - 318763901. The author is grateful to Wilhelm Stannat and Sebastian Reich for helpful feedback on this work.

## Appendix A

###### Theorem A.1.

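Moore-Osgood Theorem. Let $(M, \rho)$ be a metric space and $(m_{d,n})$ be a double sequence in $M$, where $d, n \in \mathbb{N}$. If

• $\lim_{n\to\infty} m_{d,n} = m_{d,\infty}$ uniformly in $d$, and

• $\lim_{d\to\infty} m_{d,n} = m_{\infty,n}$ for all $n$,

then the joint limit exists. In particular, it holds that

$$\lim_{d,n\to\infty} m_{d,n} = \lim_{n\to\infty} m_{\infty,n} = \lim_{d\to\infty} m_{d,\infty}.$$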
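The role of the uniformity hypothesis can be seen on two simple real-valued double sequences. The sketch below is a crude numerical illustration, not part of the proof: the two example sequences and the helper `iterated_limits` are hypothetical, and each iterated limit is approximated by evaluating the sequence with one index much larger than the other.

```python
def uniform_example(d, n):
    """m_{d,n} = 1/d + 1/n: converges as n -> infinity uniformly in d."""
    return 1.0 / d + 1.0 / n


def nonuniform_example(d, n):
    """m_{d,n} = d / (d + n): converges in n for each d, but not uniformly in d."""
    return d / (d + n)


def iterated_limits(m, big=10**6, huge=10**12):
    """Approximate lim_d lim_n m and lim_n lim_d m by taking the index of the
    inner limit much larger than the index of the outer limit."""
    inner_n_first = m(big, huge)   # n -> infinity first, then d
    inner_d_first = m(huge, big)   # d -> infinity first, then n
    return inner_n_first, inner_d_first


print(iterated_limits(uniform_example))
print(iterated_limits(nonuniform_example))
```

For the first sequence both iterated limits agree (both are close to zero), whereas for the second they are close to zero and one respectively, so the limits cannot be interchanged without uniformity.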

### Proof of Lemma 3.2

Define

 D(s,t):=∫tsb(^Xdu)du−σ(^Xds)Bd,1s,t−