 # Uniformly accurate numerical schemes for a class of dissipative systems

We consider a class of relaxation problems mixing slow and fast variations which can describe population dynamics models or hyperbolic systems, with varying stiffness (from non-stiff to strongly dissipative), and develop a multi-scale method by decomposing this problem into a micro-macro system where the original stiffness is broken. We show that this new problem can therefore be simulated with a uniform order of accuracy using standard explicit numerical schemes. In other words, it is possible to solve the micro-macro problem with a cost independent of the stiffness (a.k.a. uniform cost), such that the error is also uniform. This method is successfully applied to two hyperbolic systems with and without non-linearities, and is shown to circumvent the phenomenon of order reduction.

## Authors

##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## 1 Introduction

We are interested in problems of the form, for and ,

 ⎧⎨⎩˙xε=a(xε,zε),xε(0)=x0,˙zε=−1εAzε+b(xε,zε),zε(0)=z0, (1.1)

with a small parameter, a diagonal positive matrix with integer coefficients, and where are respectively the -component and the -component of an analytic map which smoothly depends on . In the sequel we shall more often write this problem as

 ˙uε=−1εΛuε+f(uε),uε(0)=u0, (1.2)

where , and . We set the dimension of such that . In particular, the dimension of can be zero without impacting our results. The map is assumed to be smooth. Our theorems do not consider the case where

involves a differential operator in space (i.e. the case of partial differential equations). Nonetheless, two of our examples are discretized hyperbolic partial differential equations (PDEs) for which the method is successfully applied, even though a special treatment is required.

Systems of this kind appear in population dynamics (see [greiner1994singular, auger1996emergence, sanchez2000singular, castella2015analysis]), where accounts for migration (in space and/or age) and account for both the demographic and inter-population dynamics. The migration dynamics is quantifiably faster than the other dynamics involved, which explains the rescaling by in the model. When solving this kind of system numerically, problems arise due to the large range of values that can take.

Considering a numerical scheme of order , by definition, for all , there exists a constant and a time-step such that for all , the error when solving (1.2) is bounded by

 Eε(Δt)≤C(ε)Δtq.

Assume now that there exists such that this scheme is stable for all and .444 In particular, the scheme cannot be any usual explicit scheme since it would require a stability condition of the form with independent of . The order reduction phenomenon manifests itself through the existence of and , both independent of such that the uniform error satisfies

 supε∈(0,1]Eε(Δt)≤CΔts. (1.3)

Note that in general is much smaller than . This behaviour is documented for instance in [hairer1996stiff, Section IV.15] or in [hundsdorfer2007imex]. In order to ensure a given error bound, one must either accept this order reduction (if ), as is done for asymptotic-preserving (AP) schemes [jin1999efficient] by taking a modified time-step , or use an -dependent time-step for some . In practice, both approaches cause the computational cost of the simulation to increase greatly, often prohibitively so.

Another common approach to circumvent this is to invoke the center manifold theorem (see [vasil1963asymptotic, carr1982, sakamoto1990invariant]) which dictates the long-time behaviour of the system and presents useful characteristics for numerical simulations: the dimension is reduced and the dynamics on the manifold is non-stiff. However, this approach does not capture the transient solution of the problem, i.e. the solution in short time before it reaches the stable manifold. This is troublesome when one wishes to describe the system out of equilibrium. Furthermore, even if the solution is close to the manifold, these approximations are accurate up to a certain order , rendering them useless if is of the order of .

We first provide a systematic way to compute asymptotic models at any order in that approach the solution even in short time. Then we use the defect of this approximation to compute the solution with usual explicit numerical schemes and uniform accuracy (i.e. the cost and error of the scheme must be independent of ). This approach automatically overcomes the challenges posed by both extremes and .

In order to achieve this goal, for any non-negative integer we construct a change of variable for the dissipative problem (1.2), , and a non-stiffvector field , such that

 uε(t)=Ω[n]t/ε(v[n](t))+w[n](t) (1.4)

where is the macro component with dynamics dictated by , and is the micro component of size . The main result we prove is that from this decomposition, it is possible to compute with uniform accuracy when using explicit exponential Runge-Kutta schemes of order (which can be found for instance in [cite:exp_erk_schm]), i.e. it is possible to take in (1.3). In other words, if is a discretisation of time-step , and and are computed numerically using such a scheme, then there exists independent of such that

 max0≤i≤N∣∣uε(ti)−Ω[n]ti/ε(vi)−wi∣∣≤CΔtn+1

where is the usual Euclidian norm on . Furthermore, using a scheme of order generates an error proportional to on the -component of the solution. This is interesting as is of size after a time . IMEX methods such as CNLF and SBDF (see [ascher1995implicit, akrivis1999implicit, hu2019uniform]), which mix implicit and explicit solving (for the stiff and non-stiff part respectively) are not the focus of the article, but their use is briefly discussed in Remark 3.4.

Recently in [cite:asym_bseries], asymptotic expansions of the solution of (1.1) were constructed in the case allowing an approximation of the solution of (1.1) with an error of size . This method could be considered to compute the change of variable . However it involves elementary differentials and manipulations on trees which are impractical to implement, especially for higher-orders. For highly-oscillatory problems, another approach, developed in [cite:mic_mac], involves a recurrence relation which could later be computed automatically for high orders [chartier2020high_order]. We start by considering the following problem

 ˙yε=−ie−itεΛf(eitεΛyε),yε(0)=y0:=u0 (1.5)

on which we apply averaging methods detailed in [cite:strobo] that are in the vein of those initiated by [perko1969higher] in order to approach the solution with the composition of a near-identity periodic map and a flow following a vector field : for all , where is of size and can be computed numerically with a uniform error. The change of variable and the vector field are then deduced from and using Fourier series. From this, the micro-macro problem defining and in (1.4) for the dissipative problem (1.2) is deduced.

The rest of the paper is organized as follows. In Section 2, we construct the change of variable and smooth vector field used to obtain the macro part in (1.4) for Problem (1.2). These maps are constructed using averaging methods on (1.5) and properties similar to those of averaging are proven, ensuring the well-posedness of the micro-macro equations on as defined in (1.4). In Section 3, we study the micro-macro problems associated with this new decomposition (1.4), and prove that the micro part is indeed of size , and that the problem is not stiff. We then state the result of uniform accuracy when using exponential RK schemes. In Section 4, we present some techniques to adapt our method to discretized hyperbolic PDEs. Namely, we study a relaxed conservation law and the telegraph equation, which can be respectively found for instance in [jin1995relaxation] and [lemou2008asymptotic]. In Section 5, we verify our theoretical result of uniform accuracy by successfully obtaining uniform convergence when numerically solving micro-macro problems obtained from a toy ODE and from the two aforementioned PDEs.

## 2 Derivation of asymptotic models with error estimates

In this section, we construct the change of variable and vector field used in the micro-macro decomposition (1.4). In Subsection 2.1, assumptions on the vector field and on the solution  of (1.2) are stated. In Subsection 2.2, we define a highly-oscillatory problem and construct an asymptotic approximation of the solution of this problem as in [cite:mic_mac]. We finish the subsection by summarizing the error bounds associated to this approximation. In Subsection 2.3, we finally define and , and state results on error bounds akin to those in the highly-oscillatory case. While these are asymptotic expansions, the error bounds are valid for all values of , so that the micro-macro decomposition (1.4) is always valid.

### 2.1 Definitions and assumptions

In order for the highly-oscillatory problem (1.5) to be well-defined, we first make the following assumption.

###### Assumption 2.1.

Let us set the dimension of Problem (1.2). There exists a compact set and a radius such that for every in , the map can be developed as a Taylor series around , and the series converges with a radius not smaller than .

It is therefore possible to naturally extend to closed subsets of defined by

 Uρ:={u∈Cd;∃x∈X1,∣∣∣u−(x0dz)∣∣∣≤ρ},

for all as it is represented by a Taylor series in on these sets. Here is the natural extension of the Euclidian norm on to .

It may seem particularly restrictive to assume that the -component of the solution of (1.2) stays in a neighborhood of , however this is somewhat ensured by the center manifold theorem. This theorem states that there exists a map smooth in and , such that the manifold defined by

 M={(x,z)∈Rdx×Rdz:z=εhε(x)}

is a stable invariant for (1.1). It also states that all solutions of (1.1) converge towards it exponentially quickly, i.e. there exists independent of such that

 (2.1)

This means that the growth of is bounded by that of , and that after a time , is of size . Therefore it is credible to assume that stays somewhat close to . This is translated into a second assumption.

###### Assumption 2.2.

There exist two radii  and a closed subset such that the initial condition satisfies

 minx∈X0∣∣∣u0−(x0dz)∣∣∣≤ρ0,

and for all , Problem (1.2) is well-posed on with its solution in .

Note that this is different to assuming that the initial data is close to the center manifold. For we define the sets

 Kρ:=Uρ1+ρ={u∈Cd;∃x∈X1,∣∣∣u−(x0)∣∣∣≤ρ1+ρ} (2.2)

which help quantify the distance to the solution . By Assumption 2.2, the solution of (1.2) is in at all time.

### 2.2 Constructing an approximation of the periodic problem

Writing , we define a map by

 gθ(u):=−ie−iθΛf(eiθΛu). (2.3)

Thanks to Assumption 2.1, is well-defined and is analytic w.r.t. both  and . In this subsection, we consider the highly-oscillatory problem

 ˙yε=gt/ε(yε),yε(0)=y0:=u0, (2.4)

of which we approach the solution using averaging techniques based on a recurrence relation from [cite:strobo]. The following construction and results are taken from [cite:mic_mac], where they are described in (much) more detail and where they serve to construct the macro-part of a micro-macro decomposition of . We start by writing the solution of (2.4) as a composition

 yε(t)=Φ[n]t/ε∘Ψ[n]t∘(Φ[n]0)−1(y0)+O(εn+1) (2.5)

where is a change of variable and is the flow map of an autonomous differential equation

 ddtΨ[n]t(u)=G[n](Ψ[n]t(u)),Ψ[n]0=id

where is a smooth map which must be determined.

The idea behind this composition is that captures the slow drift while captures rapid oscillations. In this work, we focus on standard averaging, meaning the change of variable is of identity average, i.e. . The average is defined by

 ⟨φ⟩(u):=12π∫π−πφθ(u)dθ. (2.6)

The change of variable is computed iteratively using the relation

 Φ[n+1]θ=id+ε∫θ0T(Φ[n])σdσ−ε⟨∫∙0T(Φ[n])σdσ⟩ (2.7)

with initial condition . The operator is defined for maps with identity average as

 T(φ)σ=gσ∘φσ−∂uφσ⋅⟨g∘φ⟩. (2.8)

From these changes of variable we define vector fields and defects by

 G[n]:=⟨g∘Φ[n]⟩,δ[n]:=1ε∂θΦ[n]+∂uΦ[n]G[n]−g∘Φ[n]. (2.9)

Note that by definition, .

Given a radius and a map analytic in and -times continuously differentiable in , we define the norms

 ∥φ∥T,ρ:=sup(θ,u)∈T×Kρ|φθ(u)|,∥φ∥T,ρ,ν:=sup0≤β≤ν∥∂βθφ∥T,ρ. (2.10)

We later use these norms to state error bounds on maps and .

###### Property 2.3.

Assumptions 2.1 and 2.2 ensure the following properties:

1. There exists a final time such that for all , Problem (2.4) is well-posed on with its solution  in .

2. There exists a radius such that for all , the function is analytic from to .

3. As the function is analytic w.r.t. , we fix an arbitrary rank and set a constant such that for all ,

 ∀ 0≤ν≤p+2,σνν!∥∥∂νθg∥∥T,2R≤M, (2.11)

This allows us to get averaging results which can be summed up in the following theorem:

###### Theorem 2.4 (from [cite:mic_mac] and [cite:strobo]).

For , let us denote and with and defined in Property 2.3. For all such that , the maps and are well-defined by (2.7) and (2.9). The change of variable and the defect are both -times continuously differentiable w.r.t. , and is invertible with analytic inverse on . Moreover, the following bounds are satisfied for ,

 (i) ∥Φ[n]−id∥T,R≤4εM≤rn4, (ii) ∥∂νθΦ[n]∥T,R≤8εMν! (iii) ∥G[n]∥T,R≤2M (iv) ∥δ[n]∥T,R,p+1≤2M(2Qpεεn)n

where is a -dependent constant.

These properties ensure that the micro-macro problem is well-posed in [cite:mic_mac]. We now use these maps , and in order to define a decomposition for the dissipative problem (1.2), and show that similar properties are satisfied.

### 2.3 A new decomposition in the dissipative case

A map which is continuously differentiable w.r.t. coincides everywhere with its Fourier series, i.e.

 (2.12)

We define the shifted map by

 ˜φθ(u)=eiθΛφθ(u). (2.13)

Using these Fourier coefficients , we consider new maps by setting the change of variable and the defect , for ,

 Ω[n]τ(u):=∑j∈Ze−jτcj(˜Φ[n])(u),η[n]τ(u):=i∑j∈Ze−jτcj(˜δ[n])(u). (2.14)

These series are purely formal for now, and their convergence is demonstrated at the end of this subsection. Here and are respectively the shifted change of variable and the shifted defect, with the shift given by (2.13). If there exists an index and a vector such that , then cannot be bounded uniformly for all . We also define the flow by setting

 ddtΓ[n]t(u)=F[n](Γ[n]t(u)),whereF[n]=iG[n]. (2.15)

Note that we do not know the lifetime of any particular solution of the Cauchy problem , yet.

###### Remark 2.5.

From the identity , one can obtain the relations on the Fourier coefficients

 cj(Φ[n])(¯¯¯u)=¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯cj(Φ[n])(u)andcj(δ[n])(¯¯¯u)=−¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯cj(δ[n])(u). (2.16)

The same holds for and , as the tilde operator simply shifts the indices of these coefficients component by component. This ensures that if is in then so are and . Similarly, if is in then is in .

For , for instance, it is equivalent to , and can be shown by induction using (2.7). Indeed, for , (making the result straightforward), and for the change of variable satisfies with defined by (2.8). The following calculation is also valid for ,

 cj(g∘Φ[n])(¯¯¯u) =12π∫π−πe−ijθgθ(Φ[n]θ(¯¯¯u))dθ=12π∫π−πe−ijθgθ(¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯Φ[n]−θ(u))dθ =−12π∫π−π¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯eijθg−θ(Φ[n]−θ(u))dθ=−¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯cj(g∘Φ[n])(u).

From this, one gets , yielding the desired result.

The micro part of the decomposition is the difference between the solution of (1.2) and the asymptotic approximation . Assuming that and are well-defined (this is proved it in Theorem 2.8), it is necessary to show that the map can be characterized as a defect (similarly to ). Being a defect means that characterises the error of the approximation . A straightforward computation yields

 η[n]τ=∑j∈Ze−jτcj(iε˜∂θΦ[n]+˜∂uΦ[n]⋅(iG[n])−i˜g∘Φ[n]) (2.17)

where we can recognize , and . The characterization as a defect requires the following result:

###### Lemma 2.6.

Let and be two radii such that and let be a positive integer. We set a periodic map that is near-identity in the sense

 ∀(θ,u)∈T×Kρ,|φθ(u)−u|≤r−ρ

and that is continuously differentiable w.r.t. for all . With the definitions of (2.12) and (2.13), assume that all the Fourier coefficients of negative index of the shifted map vanish. Then, setting and its closure, the map is well-defined with values in , -times continuously differentiable. Furthermore, for all , when composing with the vector field from (1.2) (satisfying Assumption 2.1), the following identity is met

 f(∑j≥0ξjcj(˜φ)(u))=∑j∈Zξjcj(f∘˜φ)(u)

and for all , is identically zero. In particular the map is well-defined with values in .

###### Proof.

Let us work at fixed . By product is continuously differentiable w.r.t. , therefore the series of its Fourier coefficients is absolutely convergent. Furthermore, only has nonnegative modes by assumption, meaning the indices can be restricted to nonnegative values in the definition

 ζ:ξ∈¯¯¯¯¯D↦ζ(ξ):=∑j∈Zξjcj(˜φ)(u).

As such, the function is well-defined on and is holomorphic on .

Let us now show that it has values in . Because is in , by (2.2), we set such that Using a triangle inequality in the definition of  yields

 ∣∣∣ζ(ξ)−(x0)∣∣∣≤∣∣∣ξΛu−(x0)∣∣∣+∣∣ ∣∣∑j≥0ξjcj(˜φ−id)(u)∣∣ ∣∣ (2.18)

where if and by convention for all . Because for all ,

 ∣∣∣ξΛu−(x0)∣∣∣≤∣∣∣u−(x0)∣∣∣≤ρ1+ρ,

and according to the maximum modulus principle

 ∣∣ ∣∣∑jξjcj(˜φ−id)(u)∣∣ ∣∣≤supθ∈T|φθ(u)−u|≤r−ρ.

The bound follows from (2.18), therefore by (2.2), is in .

In turn, the function is well-defined for all , is continuous on this set, and is holomorphic on . As such, it can be developed as a power series around . We write the coefficients of this power series such that for in a neighborhood of , . By Cauchy formula,

 βj=12iπ∮|ξ|=1ξ−(j+1)f(ζ(ξ))dξ=12π∫π−πe−ijθf∘˜φθ(u)dθ=cj(f∘˜φ)(u),

therefore For , Cauchy’s integral theorem ensures that vanishes, i.e. that vanishes. ∎

Assuming now that satisfies the assumptions of Lemma 2.6 (this will be proved in Theorem 2.8), from (2.17) we get

 η[n]τ=1ε(∂τ+Λ)Ω[n]τ+∂uΩ[n]τ⋅F[n]−f∘Ω[n]. (2.19)

This means that is indeed a defect, and this relation will later serve to prove that is of size .

Before proceeding, given a radius and a map , let us introduce the norm

 ∥ψ∥ρ:=sup(τ,u)∈R+×Kρ|ψτ(u)|. (2.20)
###### Lemma 2.7.

Given a radius and an integer , let be a periodic map that is analytic w.r.t. , that is -times continuously differentiable w.r.t. and that has vanishing Fourier coefficients for negative indices, i.e. for all , is identically zero. Then the associated dissipative map defined by

 ψτ(u):=∑j∈Ze−jτcj(φ)(u)

is well defined for , analytic w.r.t. and -times continuously differentiable w.r.t. . Furthermore it respects the following bounds for ,

 ∥∥∂kτψτ∥∥ρ≤∥∂kθφ∥T,ρ

where the norm on and its derivatives is defined by (2.20).

###### Proof.

It is well-known that the Fourier series of and of its derivatives for are absolutely convergent. Therefore and are well-defined for in by

 ∂kτψτ(u)=∑j≥0(−j)ke−jτcj(φ)(u)=ik⋅∑j≥0e−jτcj(∂kθφ)(u).

The absolute convergence also ensures that analyticity w.r.t. is preserved, as an absolutely convergent series of holomorphic functions is holomorphic. We define the power series defined for all and all  by

 ζk(ξ,u)=∑j≥0ξjcj(∂kθφ)(u)

such that . The maximum modulus principle ensures

 supτ∈R+|∂kτψτ(u)|≤sup|ξ|≤1|ζk(ξ,u)|=sup|ξ|=1|ζk(ξ,u)|≤∥∂kθφ∥T,ρ

which is the desired result. ∎

Using the lemma’s notations and assumptions, since is -times continuously differentiable, we may also define the norm

 ∥ψ∥ρ,ν:=max0≤k≤ν∥∂kτψ∥ρ (2.21)

with defined by (2.20). This result can now be applied to maps and , after checking that the Fourier coefficients of the shifted maps and vanish for negative indices. The shift is given by (2.13).

###### Theorem 2.8.

For in , let us denote and with and defined in Property 2.3. For all such that , the maps , and given by (2.14) and (2.15) are well-defined on and are analytic w.r.t. . The change of variable and the residue are both -times continuously differentiable w.r.t. . Moreover, with and given by (2.20) and (2.21), the following bounds are satisfied for all ,

 (i) ∥∥Ω[n]−e−τΛ∥∥R≤4εM, (ii) (iii) ∥F[n]∥R≤2M (iv)

where is the induced norm from to , and is a -dependent constant.

###### Proof.

We show by induction that and only have non-negative Fourier modes. To start off the induction, notice , therefore . Since only has coefficients in , only positive modes are generated. Assuming for that has vanishing Fourier coefficients for negative indices, let us prove that does as well. By definition (2.7),

 ˜Φ[k+1]θ(u)=eiθΛu+ε∫θ0eiθΛT(Φ[k])σ(u)dσ