 # Adaptive Krylov-Type Time Integration Methods

The Rosenbrock-Krylov family of time integration schemes is an extension of Rosenbrock-W methods that employs a specific Krylov based approximation of the linear system solutions arising within each stage of the integrator. This work proposes an extension of Rosenbrock-Krylov methods to address stability questions which arise for methods making use of inexact linear system solution strategies. Two approaches for improving the stability and efficiency of Rosenbrock-Krylov methods are proposed, one through direct control of linear system residuals and the second through a novel extension of the underlying Krylov space to include stage right hand side vectors. Rosenbrock-Krylov methods employing the new approaches show a substantial improvement in computational efficiency relative to prior implementations.


## 1 Introduction

This work is concerned with solving initial value problems:

$$ \frac{dy}{dt} = f(t,y), \quad t_0 \le t \le t_F, \quad y(t_0) = y_0; \qquad y(t) \in \mathbb{R}^N, \quad f:\mathbb{R}\times\mathbb{R}^N \to \mathbb{R}^N. \tag{1} $$

We consider numerical discretizations of (1) of the following general form:

$$ \begin{aligned} k_i &= \varphi(h\gamma A_n)\Big(h F_i + h A_n \sum_{j=1}^{i-1}\gamma_{i,j}\,k_j\Big), &\text{(2a)}\\ F_i &= f\Big(y_n + \sum_{j=1}^{i-1}\alpha_{i,j}\,k_j\Big), &\text{(2b)}\\ y_{n+1} &= y_n + \sum_{i=1}^{s} b_i\,k_i, &\text{(2c)} \end{aligned} $$

where $A_n$ is either the exact Jacobian matrix of the right-hand-side function,

$$ J_n := \left.\frac{\partial f}{\partial y}\right|_{t=t_n}, \tag{3} $$

or an approximation of it.

The general form (2) encapsulates explicit Runge-Kutta methods when $\gamma_{i,j} = 0$, Rosenbrock methods when $\varphi(z) = (1-z)^{-1}$, and exponential-Rosenbrock methods when $\varphi$ is built from the exponential function. When the coefficients $\alpha_{i,j}$, $\gamma_{i,j}$, and $b_i$ are chosen such that $A_n$ must equal the Jacobian matrix $J_n$, the scheme (2) is known as either a classical Rosenbrock or an exponential-Rosenbrock method, depending on the choice of $\varphi$. Alternatively, the coefficients can be chosen such that the matrix $A_n$ can be any arbitrary approximation of the Jacobian $J_n$, in which case the scheme (2) is known as either a Rosenbrock-W or an exponential-Rosenbrock-W method, depending once again on the choice of $\varphi$.
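As a concrete instance of the family (2), taking $\varphi(z) = (1-z)^{-1}$, $s = 1$, and $A_n = J_n$ gives the linearly implicit (Rosenbrock) Euler step $(I - h\gamma J_n)\,k_1 = h f(y_n)$, $y_{n+1} = y_n + k_1$. The NumPy sketch below illustrates this on a stiff linear problem; the test problem, step size, and $\gamma = 1$ are illustrative choices, not taken from the paper.

```python
import numpy as np

# One step of the simplest member of family (2): phi(z) = (1 - z)^(-1),
# s = 1, A_n = J_n, i.e. the linearly implicit Euler scheme
#   (I - h*gamma*J_n) k_1 = h f(y_n),   y_{n+1} = y_n + k_1.
def rosenbrock_euler_step(f, J, y, h, gamma=1.0):
    N = y.size
    k1 = np.linalg.solve(np.eye(N) - h * gamma * J(y), h * f(y))
    return y + k1

L = np.array([[-100.0, 1.0],
              [0.0, -0.5]])            # stiff linear test problem y' = L y
f = lambda y: L @ y
J = lambda y: L

y = np.array([1.0, 1.0])
h = 0.01
for _ in range(100):                   # integrate from t = 0 to t = 1
    y = rosenbrock_euler_step(f, J, y, h)

# Exact solution y(1) = expm(L) y0 via the eigendecomposition of L
w, W = np.linalg.eig(L)
y_exact = (W @ np.diag(np.exp(w)) @ np.linalg.inv(W)) @ np.array([1.0, 1.0])
err = np.linalg.norm(y - y_exact)
```

Despite the eigenvalue $-100$, the step size $h = 0.01$ poses no stability problem because the stage linear solve makes the scheme implicit in character.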

The use of W-methods is convenient when the exact Jacobian is difficult, or particularly expensive, to compute, or when iterative linear system solution methods are preferred Tranquilli2017 . The drawback of this approach is a dramatic increase in the number of order conditions imposed on the coefficients $\alpha_{i,j}$, $\gamma_{i,j}$, and $b_i$ in order to achieve the desired accuracy Hairer_book_II . For this reason, among others, we focus our attention on an alternative formulation of (2), called Rosenbrock-Krylov Tranquilli_2014_ROK or exponential-Krylov Tranquilli_2014_expK methods, in which the coefficients are chosen to permit a specific, Krylov-based approximation of the Jacobian matrix $J_n$. These methods have been shown to be efficient for a broad spectrum of problems through comparisons of relative performance against an assortment of standard integrators Tranquilli_2014_ROK ; Sandu_2017_EPIRK ; Tranquilli_2014_expK ; Tranquilli2017 ; Wu_2016_ROK4E , and have been used effectively to study engineering problems in Bravo_2018_transcritical ; wu_2018_efficient ; wu_2018_application ; Sarshar_2017_numerical .

Stability analysis for W-methods with arbitrary matrices $A_n$ is still an open problem. Stability questions for the case when the deviation $A_n - J_n$ is small are considered in (Hairer_book_II, Section IV.11). Similarly, exponential schemes are trivially $A$-stable when the $\varphi$ functions are evaluated exactly; however, when these functions are evaluated using approximate methods this property is lost. Quantifying the impact of the approximation of $\varphi$ on the stability of exponential methods is closely related to the stability questions for W-methods and also remains open.

K-methods are similarly plagued by stability questions, though they have the benefit of making use of a very specific formulation for the approximate Jacobian $A_n$. In Buttner_1995_partitioning , stability for W-methods making use of a Krylov-based approximation, similar to that used for K-methods, is examined through the lens of contractivity for semilinear problems. Here we examine the linear stability of K-methods and present two practical approaches for improving it.

The remainder of the paper is organized as follows. Section 2 provides a brief overview of K-methods. Section 3 performs a linear stability analysis for K-methods, and Section 4 introduces two new adaptive strategies for improving their stability and efficiency, one through the direct control of stage residuals and one through extension of the underlying Krylov space. Numerical results are presented in Section 5, and concluding remarks are given in Section 6.

## 2 K-type Time Integration Methods

Rosenbrock-Krylov methods have the same form as Rosenbrock-W methods (2), but use a particular approximation of the Jacobian. Without loss of generality we restrict the presentation to the case of autonomous systems. Let $f_n := f(y_n)$ and consider the $M$-dimensional Krylov space:

$$ \mathcal{K}_M(J_n, f_n) = \operatorname{span}\left\{ f_n,\ J_n f_n,\ \dots,\ J_n^{M-1} f_n \right\}. \tag{4} $$

An orthogonal basis for $\mathcal{K}_M$ is constructed using a modified Arnoldi process vanderVorst . The Arnoldi iteration returns two valuable pieces of information: a matrix $V$ whose columns are the orthonormal basis vectors of $\mathcal{K}_M$, and an upper Hessenberg matrix $H$, such that

$$ V = [v_1, \dots, v_M] \in \mathbb{R}^{N\times M}; \qquad H = V^T J_n V \in \mathbb{R}^{M\times M}. \tag{5} $$

The Krylov approximation matrix $A_n$ is the restriction of the full ODE Jacobian to the Krylov space:

$$ A_n = V H V^T = V V^T J_n V V^T. \tag{6} $$

A Rosenbrock-Krylov method leverages the Krylov approximation matrix (6) within the context of a Rosenbrock-W framework Tranquilli_2014_ROK . Similarly, an exponential-Krylov method leverages the Krylov approximation matrix (6) within the exponential-W framework Tranquilli_2014_expK ; Sandu_2017_EPIRK . The special structure of the Krylov approximation matrix leads to the reduced space form of the method as summarized in Algorithm 1. The Krylov approximation matrix has the additional feature that $A_n^k f_n = J_n^k f_n$ when $k$ is smaller than the dimension of the Krylov subspace ($k < M$), leading to a significant reduction in the number of order conditions, and thus the number of required stages, compared to a W-type method. A full presentation of the order condition theory for K-methods can be found in Tranquilli_2014_ROK ; Tranquilli_2014_expK ; Sandu_2017_EPIRK .
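The construction (4)-(6) can be sketched in a few lines of NumPy. The block below is a minimal modified Arnoldi process (production codes add breakdown checks and reorthogonalization); the random test matrix and the dimensions are illustrative choices. It checks the key property quoted above: the Krylov approximation matrix reproduces the action of powers of the Jacobian on $f_n$, $A_n^k f_n = J_n^k f_n$ for $k < M$.

```python
import numpy as np

# Modified Arnoldi process behind (4)-(6): build an orthonormal basis V of
# K_M(J, f) and the upper Hessenberg matrix H = V^T J V.
def arnoldi(J, f, M):
    N = f.size
    V = np.zeros((N, M + 1))
    Hbar = np.zeros((M + 1, M))
    V[:, 0] = f / np.linalg.norm(f)
    for j in range(M):
        w = J @ V[:, j]
        for i in range(j + 1):                 # modified Gram-Schmidt
            Hbar[i, j] = V[:, i] @ w
            w = w - Hbar[i, j] * V[:, i]
        Hbar[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / Hbar[j + 1, j]
    # returns V_M, H_M, h_{M+1,M}, v_{M+1}
    return V[:, :M], Hbar[:M, :], Hbar[M, M - 1], V[:, M]

rng = np.random.default_rng(0)
N, M = 50, 8
J = rng.standard_normal((N, N))
f = rng.standard_normal(N)
V, H, h_next, v_next = arnoldi(J, f, M)
A = V @ H @ V.T                                # Krylov approximation matrix (6)

# A^k f = J^k f for k < M: relative error over all such powers
rel_err = max(
    np.linalg.norm(np.linalg.matrix_power(A, k) @ f - np.linalg.matrix_power(J, k) @ f)
    / np.linalg.norm(np.linalg.matrix_power(J, k) @ f)
    for k in range(M)
)
```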

## 3 Stability Analysis

The standard approach for investigating the linear stability of a time integration scheme is through application of the method to the linear Dahlquist test equation, so that a local recurrence relation for the solution at $t_{n+1}$ in terms of the solution at $t_n$ can be constructed:

$$ y' = \lambda y, \quad y_0 = 1 \quad\Rightarrow\quad y_{n+1} = R(h\lambda)\, y_n. $$

The set

$$ S = \{ z \in \mathbb{C} \; ; \; |R(z)| \le 1 \} $$

is called the stability region of the method Hairer_book_II .
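A tiny concrete example of these definitions (not from the paper): the explicit Euler method has stability function $R(z) = 1 + z$, so its stability region $S$ is the closed unit disk centered at $z = -1$.

```python
def R(z):
    """Stability function of explicit Euler: R(z) = 1 + z."""
    return 1.0 + z

in_S = abs(R(-1.0 + 0.5j)) <= 1.0     # z = -1 + 0.5i lies in S
out_S = abs(R(0.5)) > 1.0             # z = 0.5 lies outside S
```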

Unfortunately, this approach is not sufficient for examining the stability of Rosenbrock methods where $A \ne J$, such as W-methods and K-methods, or for exponential schemes where the $\varphi$ functions are computed approximately. For these schemes, since in general $J$ and $A$ are not simultaneously diagonalizable, it is necessary to consider the vector-valued linear test problem:

$$ y' = J y, \qquad y_0 = \mathbf{1}. \tag{7} $$

Application of a W- or K-method to equation (7) leads to the local recurrence

$$ y_{n+1} = R(hJ, hA)\, y_n, $$

so that the stability region is determined by those step sizes $h$ for which the spectral radius of the transfer matrix is bounded: $\rho\big(R(hJ, hA)\big) \le 1$.

Specifically, the $s$-stage Rosenbrock method (2) applied to the linear test problem (7) reads:

$$ \begin{aligned} k_i &= hJ\Big(y_n + \sum_{j=1}^{i-1}\alpha_{i,j}\, k_j\Big) + hA\sum_{j=1}^{i}\gamma_{i,j}\, k_j, \qquad i = 1, \dots, s, &\text{(8a)}\\ y_{n+1} &= y_n + \sum_{i=1}^{s} b_i\, k_i. &\text{(8b)} \end{aligned} $$

We define the coefficient matrices:

$$ \alpha = \begin{bmatrix} 0 & \cdots & \cdots & 0\\ \alpha_{2,1} & 0 & & \vdots\\ \vdots & \ddots & \ddots & \vdots\\ \alpha_{s,1} & \cdots & \alpha_{s,s-1} & 0 \end{bmatrix}, \qquad \gamma = \begin{bmatrix} \gamma & 0 & \cdots & 0\\ \gamma_{2,1} & \gamma & & \vdots\\ \vdots & \ddots & \ddots & 0\\ \gamma_{s,1} & \cdots & \gamma_{s,s-1} & \gamma \end{bmatrix}, \qquad \beta = \alpha + \gamma, $$

and the supervectors:

$$ K_n = \begin{bmatrix} k_1\\ \vdots\\ k_s \end{bmatrix} \in \mathbb{R}^{Ns}, \qquad Y_n = \begin{bmatrix} y_n\\ \vdots\\ y_n \end{bmatrix} = (\mathbf{1}_s \otimes I_N)\, y_n \in \mathbb{R}^{Ns}. $$

The method (8) can be written in matrix notation as follows:

$$ \begin{aligned} K_n &= h\,(I_s \otimes J)\, Y_n + h\,\big[(\alpha \otimes J) + (\gamma \otimes A)\big]\, K_n &\text{(9a)}\\ &= h\,(I_s \otimes J)\, Y_n + h\,(\beta \otimes J)\, K_n + \rho_n, &\text{(9b)}\\ y_{n+1} &= y_n + (b^T \otimes I_N)\, K_n. &\text{(9c)} \end{aligned} $$

The stage equation (9b) makes exclusive use of the exact Jacobian $J$ by adding the residual supervector

$$ \rho_n := h\,\big[\gamma \otimes (A - J)\big]\, K_n = \begin{bmatrix} r_1\\ \vdots\\ r_s \end{bmatrix} \in \mathbb{R}^{Ns}, \tag{10} $$

where $r_i$ is the $i$-th stage residual vector, so as to obtain the same intermediate stage values $k_i$.

Isolating $K_n$ in both (9a) and (9b) leads to

$$ \begin{aligned} K_n &= \big[I_{Ns} - (\alpha \otimes hJ) - (\gamma \otimes hA)\big]^{-1}(I_s \otimes hJ)\, Y_n\\ &= \big[I_{Ns} - (\beta \otimes hJ)\big]^{-1}(I_s \otimes hJ)\, Y_n + \big[I_{Ns} - (\beta \otimes hJ)\big]^{-1}\rho_n. \end{aligned} $$

From here we construct an explicit formulation of the residual vector $\rho_n$, and apply the useful linear algebra identity

$$ \big[I_{Ns} - (\alpha \otimes hJ) - (\gamma \otimes hA)\big]^{-1} - \big[I_{Ns} - (\beta \otimes hJ)\big]^{-1} = -\big[I_{Ns} - (\beta \otimes hJ)\big]^{-1}\big[\gamma \otimes (hJ - hA)\big]\big[I_{Ns} - (\alpha \otimes hJ) - (\gamma \otimes hA)\big]^{-1}, \tag{11} $$

to obtain

$$ \begin{aligned} \rho_n &= \big[I_{Ns} - (\beta \otimes hJ)\big]\big[I_{Ns} - (\alpha \otimes hJ) - (\gamma \otimes hA)\big]^{-1}(I_s \otimes hJ)\, Y_n - (I_s \otimes hJ)\, Y_n,\\ \big[I_{Ns} - (\beta \otimes hJ)\big]^{-1}\rho_n &= \Big( \big[I_{Ns} - (\alpha \otimes hJ) - (\gamma \otimes hA)\big]^{-1} - \big[I_{Ns} - (\beta \otimes hJ)\big]^{-1} \Big)(I_s \otimes hJ)\, Y_n. \end{aligned} $$

The stability of the Rosenbrock-K method is then given by:

$$ \begin{aligned} y_{n+1} &= y_n + (b^T \otimes I_N)\, K_n\\ &= R(hJ)\, y_n + (b^T \otimes I_N)\Big( \big[I_{Ns} - (\alpha \otimes hJ) - (\gamma \otimes hA)\big]^{-1} - \big[I_{Ns} - (\beta \otimes hJ)\big]^{-1} \Big)(\mathbf{1}_s \otimes hJ)\, y_n\\ &= \widetilde{R}(hJ, hA)\, y_n. \end{aligned} $$

Here $R(hJ)$ is the stability function of the traditional Rosenbrock method with the same coefficients. The effective stability function $\widetilde{R}(hJ, hA)$ of the Rosenbrock-K method is the Rosenbrock stability function plus an additional term that depends on both $J$ and $A$. We call this additional term the stage stability function $S(hJ, hA)$.

The linear stability of the underlying Rosenbrock method,

$$ \rho\big(R(hJ)\big) \le 1, $$

does not guarantee the stability of the Rosenbrock-Krylov method,

$$ \rho\big(\widetilde{R}(hJ, hA)\big) \le 1, $$

unless the stage stability function is small in some sense. This function can be expressed explicitly as:

$$ \begin{aligned} S(hJ, hA)\, y_n &= -(b^T \otimes I_N)\big[I_{Ns} - (\beta \otimes hJ)\big]^{-1}\big[\gamma \otimes (hJ - hA)\big]\big[I_{Ns} - (\alpha \otimes hJ) - (\gamma \otimes hA)\big]^{-1}(\mathbf{1}_s \otimes hJ)\, y_n\\ &= -(b^T \otimes I_N)\big[I_{Ns} - (\beta \otimes hJ)\big]^{-1}\big[\gamma \otimes (hJ - hA)\big]\, K_n, \end{aligned} $$

or more generally as:

$$ S(hJ, hA)\, y_n = -(b^T \otimes I_N)\big[I_{Ns} - (\beta \otimes hJ)\big]^{-1}\big[\gamma \otimes (hJ - hA)\big]\big[I_{Ns} - (\gamma \otimes hA)\big]^{-1} h F_n, \tag{12} $$

where $F_n$ collects the stage function values (2b) for all stages:

$$ F_n = \begin{bmatrix} F_1\\ \vdots\\ F_s \end{bmatrix}. $$
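The decomposition of the effective stability function into the Rosenbrock part plus the stage stability term can also be verified directly: assembling the one-step transfer matrices with Kronecker products gives $\widetilde{R}(hJ,hA) = R(hJ) + S(hJ,hA)$. The coefficients below are random stand-ins, not an actual method's tableau.

```python
import numpy as np

# Verify the split R_tilde(hJ, hA) = R(hJ) + S(hJ, hA) for the linear
# test problem, using the first (K_n-free) form of S above.
rng = np.random.default_rng(2)
s, N, h = 3, 4, 0.1
alpha = np.tril(rng.standard_normal((s, s)), -1)
gamma = np.tril(rng.standard_normal((s, s)), -1) + 0.25 * np.eye(s)
beta = alpha + gamma
b = rng.standard_normal(s)
J = rng.standard_normal((N, N))
A = J + 0.05 * rng.standard_normal((N, N))

IN = np.eye(N)
I_big = np.eye(s * N)
bT = np.kron(b.reshape(1, -1), IN)               # (b^T x I_N)
E = np.kron(np.ones((s, 1)), h * J)              # (1_s x hJ)

Mw = np.linalg.inv(I_big - np.kron(alpha, h * J) - np.kron(gamma, h * A))
Mr = np.linalg.inv(I_big - np.kron(beta, h * J))

R_tilde = IN + bT @ Mw @ E                       # W-/K-method transfer matrix
R_rosen = IN + bT @ Mr @ E                       # underlying Rosenbrock R(hJ)
S_mat = -bT @ Mr @ np.kron(gamma, h * (J - A)) @ Mw @ E
```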

In later sections we attempt to keep (12) under control through two different strategies. The most obvious of these approaches, presented in Section 4.1, directly controls the residual of the linear system solutions. This can be accomplished by simply increasing the dimension of the underlying space on which the Krylov approximation matrix is based, which has the effect of decreasing the projection error

$$ \|J - A\| = \|J - V V^T J V V^T\|, $$

and directly reducing the corresponding term in (12). The alternative approach, based on the specific form of the residuals presented later, attempts to control (12) by extending the Krylov space with the right-hand-side values at each internal stage (2b).

## 4 Adaptive Methods to Increase Rosenbrock-K Stability

### 4.1 Controlling the residual magnitude

The most direct strategy for improving the stability of a Rosenbrock-Krylov method is to control the residual magnitude at each stage of the time integration scheme.

###### Theorem 1 (Stage residual of a Rosenbrock-Krylov method)

When the time integration scheme (2) is implemented as in Algorithm 1, the residual of the $i$-th stage is given by:

$$ r_{M;i} = -h^2 J_n \sum_{j=1}^{i}\gamma_{i,j}\,(F_j - V\psi_j) - h\, h_{M+1,M}\, v_{M+1}\, e_M^T \sum_{j=1}^{i}\gamma_{i,j}\,\lambda_j, \tag{13} $$

where $r_{M;i}$ denotes $r_i$ in (10) with the dimension $M$ of the underlying Krylov subspace made explicit.

Proof. We begin with the formulation of the method in Algorithm 1 as:

$$ \begin{aligned} k_i &= h F_i + h J_n \sum_{j=1}^{i}\gamma_{i,j}\, k_j + r_{M;i}, &\text{(14a)}\\ \lambda_i &= h\psi_i + h H \sum_{j=1}^{i}\gamma_{i,j}\, \lambda_j, &\text{(14b)}\\ \psi_i &= V^T F_i, &\text{(14c)}\\ k_i &= V\lambda_i + h\,(F_i - V\psi_i). &\text{(14d)} \end{aligned} $$

Inserting (14d) into (14a) and expanding leads to:

$$ r_{M;i} = V\lambda_i - hV\psi_i - h^2 J_n \sum_{j=1}^{i}\gamma_{i,j}\,(F_j - V\psi_j) - h J_n V \sum_{j=1}^{i}\gamma_{i,j}\,\lambda_j. $$

Next, apply the Arnoldi relationship

$$ J_n V = V H + h_{M+1,M}\, v_{M+1}\, e_M^T, \tag{15} $$

to the terms containing $J_n V$:

$$ r_{M;i} = V\lambda_i - hV\psi_i - h^2 J_n \sum_{j=1}^{i}\gamma_{i,j}\,(F_j - V\psi_j) - hVH\sum_{j=1}^{i}\gamma_{i,j}\,\lambda_j - h\, h_{M+1,M}\, v_{M+1}\, e_M^T \sum_{j=1}^{i}\gamma_{i,j}\,\lambda_j. $$

Collecting terms with (14b) in mind, we obtain

$$ r_{M;i} = \underbrace{V\Big(\lambda_i - h\psi_i - hH\sum_{j=1}^{i}\gamma_{i,j}\,\lambda_j\Big)}_{=\,0} - h^2 J_n \sum_{j=1}^{i}\gamma_{i,j}\,(F_j - V\psi_j) - h\, h_{M+1,M}\, v_{M+1}\, e_M^T \sum_{j=1}^{i}\gamma_{i,j}\,\lambda_j, $$

and arrive at the final result (13). ∎

Unfortunately, the residuals depend not only on the reduced stage solutions $\lambda_i$, but also on the full space matrix-vector products $J_n(F_j - V\psi_j)$. The dependence on the $F_j$'s means that verifying the size of the residual in the final stage requires essentially computing the full space timestep, and so any strategy attempting to bound this residual will be computationally infeasible. An alternative approach is to bound the residual only in the first stage, so that only $\lambda_1$ is required. This has the computational benefit that, since $F_1 = f_n \in \mathcal{K}_M(J_n, f_n)$, we have $F_1 - V\psi_1 = 0$, and so no full space matrix-vector products are required.

###### Corollary 1 (Residual of the first stage of a Rosenbrock-Krylov method)

When the time integration scheme (2) is implemented as in Algorithm 1, then the residual of the first stage is given by:

$$ r_{M;1} = -h\gamma\, h_{M+1,M}\, v_{M+1}\, e_M^T \lambda_1, $$

with norm

$$ \| r_{M;1} \| = |h\gamma\, h_{M+1,M}| \cdot \big|e_M^T \lambda_1\big|. $$
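Corollary 1 can be checked directly: build an Arnoldi basis, solve the first reduced stage system (14b), and compare the full-space residual of (14a) against the closed-form expression. The step size and $\gamma$ below are illustrative choices.

```python
import numpy as np

# Check of Corollary 1: the residual of k_1 = h F_1 + h*gamma*J k_1 equals
# -h*gamma*h_{M+1,M} * v_{M+1} * (e_M^T lambda_1).
def arnoldi(J, f, M):
    N = f.size
    V = np.zeros((N, M + 1))
    Hbar = np.zeros((M + 1, M))
    V[:, 0] = f / np.linalg.norm(f)
    for j in range(M):
        w = J @ V[:, j]
        for i in range(j + 1):
            Hbar[i, j] = V[:, i] @ w
            w = w - Hbar[i, j] * V[:, i]
        Hbar[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / Hbar[j + 1, j]
    return V[:, :M], Hbar[:M, :], Hbar[M, M - 1], V[:, M]

rng = np.random.default_rng(3)
N, M, h, gam = 40, 6, 0.05, 0.25
J = rng.standard_normal((N, N))
f = rng.standard_normal(N)
V, H, h_next, v_next = arnoldi(J, f, M)

psi1 = V.T @ f                                            # psi_1 = V^T F_1, F_1 = f_n
lam1 = np.linalg.solve(np.eye(M) - h * gam * H, h * psi1) # reduced system (14b), i = 1
k1 = V @ lam1                                             # (14d); F_1 - V psi_1 = 0 here
res = k1 - h * f - h * gam * (J @ k1)                     # residual of full-space (14a)
res_formula = -h * gam * h_next * v_next * lam1[-1]
```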

It is then possible to implement a Rosenbrock-Krylov method which automatically determines the dimension of the Krylov subspace so as to control the size of the residual in the first stage, through the use of an adaptive Arnoldi process. This method is summarized in Algorithm 2. The computational cost of computing the residuals can be reduced by computing them only for a subset of iterations; for example, it was proposed in Hochbruck_1998_exp to compute the residuals only for a small set of test indices.
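The adaptive idea can be sketched as follows (Algorithm 2 itself is not reproduced in this text): grow the Krylov basis one vector at a time and stop once the first-stage residual bound of Corollary 1 falls below a tolerance. The test problem, $h$, $\gamma$, and tolerance are illustrative choices; a production code would also cap the cost by checking only at a few test indices, as noted above.

```python
import numpy as np

# Adaptive Arnoldi: enlarge M until |h*gamma*h_{M+1,M}| * |e_M^T lambda_1| < tol.
def adaptive_arnoldi(J, f, h, gam, tol, M_max):
    N = f.size
    V = np.zeros((N, M_max + 1))
    Hbar = np.zeros((M_max + 1, M_max))
    V[:, 0] = f / np.linalg.norm(f)
    for M in range(1, M_max + 1):
        j = M - 1
        w = J @ V[:, j]
        for i in range(M):                     # orthogonalize against all previous
            Hbar[i, j] = V[:, i] @ w
            w = w - Hbar[i, j] * V[:, i]
        Hbar[M, j] = np.linalg.norm(w)
        V[:, M] = w / Hbar[M, j]
        # first-stage residual norm estimate from Corollary 1
        H = Hbar[:M, :M]
        lam1 = np.linalg.solve(np.eye(M) - h * gam * H, h * (V[:, :M].T @ f))
        if abs(h * gam * Hbar[M, j] * lam1[-1]) < tol:
            return V[:, :M], H, M
    return V[:, :M_max], Hbar[:M_max, :M_max], M_max

N = 50
J = 10.0 * (-2.0 * np.eye(N) + np.eye(N, k=1) + np.eye(N, k=-1))  # stiff 1-D Laplacian
rng = np.random.default_rng(6)
f = rng.standard_normal(N)
V, H, M = adaptive_arnoldi(J, f, h=0.05, gam=0.25, tol=1e-6, M_max=40)
```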

### 4.2 Adding vectors from outside the Krylov space

An alternative to increasing the size of the Krylov subspace at the very beginning is to extend the subspace with vectors from other sources, not necessarily from the same Arnoldi sequence.

Consider the Krylov space $\mathcal{K}_M$ with basis $V_M$ and the upper Hessenberg matrix $H_M$. In order to extend the Krylov basis with arbitrary vectors $\bar{v}_1, \dots, \bar{v}_r$, one forms an extended basis set through the same orthonormalization procedure present in the Arnoldi iteration:

$$ V_{M+r} = [v_1, v_2, \dots, v_M, \bar{v}_1, \dots, \bar{v}_r], \qquad V_{M+r}^T V_{M+r} = I_{M+r}. $$

The extended matrix $H_{M+r}$ is constructed so as to maintain an Arnoldi-like relationship with the extended basis (Lemma 1 below). We have that:

$$ H_{M+r} = \begin{bmatrix} \begin{bmatrix} H_M \\ 0_{r\times M} \end{bmatrix} & h_{M+1} & \dots & h_{M+r} \end{bmatrix}, $$

with the additional columns given by:

$$ h_{M+i} = V_{M+r}^T\,(J_n \bar{v}_i) \qquad \text{for } i = 1, \dots, r. $$
###### Remark 1

In previous sections, the Arnoldi relationship (15) is written in terms of $v_{M+1}$, the next vector which would be added to the Krylov basis $V_M$, and a coefficient $h_{M+1,M}$ corresponding to the entry which would be added to the upper-Hessenberg matrix $H_M$. Using the definition of the Arnoldi process we have that

$$ h_{M+1,M}\, v_{M+1} = (I - V_M V_M^T)\, J_n v_M, $$

where $v_M$ is the last column of the Krylov basis matrix $V_M$. With this, one can rewrite the Arnoldi relationship (15) as follows:

$$ J_n V_M = V_M H_M + (I - V_M V_M^T)\, J_n v_M\, e_M^T. \tag{16} $$
###### Lemma 1 (Arnoldi-like relationship for extended basis)

When the Krylov basis matrix $V_M$ is extended with orthonormal (but otherwise arbitrary) vectors $\bar{v}_1, \dots, \bar{v}_r$, the Arnoldi relationship (16) extends to the new basis as follows:

$$ J_n V_{M+r} = V_{M+r} H_{M+r} + (I - V_M V_M^T)\, J_n v_M\, e_{M+r;M}^T + \sum_{i=1}^{r} (I - V_{M+r} V_{M+r}^T)\, J_n \bar{v}_i\, e_{M+r;M+i}^T, \tag{17} $$

where $e_{M+r;j}$ is the $j$'th canonical basis vector of $\mathbb{R}^{M+r}$.

Proof. Recall that $V_{M+r}$ and $H_{M+r}$ have the following structure:

$$ V_{M+r} = \begin{bmatrix} V_M & \bar{V} \end{bmatrix}, \qquad H_{M+r} = \begin{bmatrix} \begin{matrix} H_M \\ 0_{r\times M} \end{matrix} & \bar{H} \end{bmatrix}, $$

where $\bar{V} = [\bar{v}_1, \dots, \bar{v}_r] \in \mathbb{R}^{N\times r}$ and $\bar{H} = [h_{M+1}, \dots, h_{M+r}] \in \mathbb{R}^{(M+r)\times r}$. Computing $J V_{M+r}$ gives

$$ J V_{M+r} = \begin{bmatrix} J V_M & J\bar{V} \end{bmatrix} = \begin{bmatrix} V_M H_M + (I - V_M V_M^T)\, J v_M\, e_{M;M}^T & J\bar{V} \end{bmatrix}, $$

which follows from the Arnoldi relationship (16) (with $e_{M;M}$ the last canonical basis vector of $\mathbb{R}^M$). Similarly, computing $V_{M+r} H_{M+r}$ gives

$$ V_{M+r} H_{M+r} = \begin{bmatrix} V_M & \bar{V} \end{bmatrix} \begin{bmatrix} \begin{matrix} H_M \\ 0_{r\times M} \end{matrix} & \bar{H} \end{bmatrix} = \begin{bmatrix} V_M H_M & V_{M+r}\bar{H} \end{bmatrix}. $$

Subtracting yields:

$$ J V_{M+r} - V_{M+r} H_{M+r} = \begin{bmatrix} (I - V_M V_M^T)\, J v_M\, e_{M;M}^T & J\bar{V} - V_{M+r}\bar{H} \end{bmatrix}. $$

Substituting in the definition of $\bar{H}$, so that $V_{M+r}\bar{H} = V_{M+r} V_{M+r}^T J \bar{V}$, and rearranging leads to:

$$ J V_{M+r} - V_{M+r} H_{M+r} = \begin{bmatrix} (I - V_M V_M^T)\, J v_M\, e_{M;M}^T & (I - V_{M+r} V_{M+r}^T)\, J \bar{V} \end{bmatrix}. $$

Finally, we rewrite the last matrix as a sum of rank-1 matrices using canonical basis vectors $e_{M+r;j} \in \mathbb{R}^{M+r}$:

$$ \begin{bmatrix} (I - V_M V_M^T)\, J v_M\, e_{M;M}^T & (I - V_{M+r} V_{M+r}^T)\, J \bar{V} \end{bmatrix} = (I - V_M V_M^T)\, J v_M\, e_{M+r;M}^T + \sum_{i=1}^{r} (I - V_{M+r} V_{M+r}^T)\, J \bar{v}_i\, e_{M+r;M+i}^T. $$

With this substitution and after some rearranging we obtain equation (17). ∎
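Lemma 1 is straightforward to check numerically: extend an Arnoldi basis with a few orthonormalized random vectors, build $H_{M+r}$ as described above, and assemble both sides of (17). The dimensions and random data below are illustrative.

```python
import numpy as np

# Numerical check of the Arnoldi-like relationship (17) for an extended basis.
rng = np.random.default_rng(4)
N, M, r = 30, 5, 2
J = rng.standard_normal((N, N))
f = rng.standard_normal(N)

# Arnoldi basis for K_M(J, f); H_M = V_M^T J V_M as in (5)
V = np.zeros((N, M))
V[:, 0] = f / np.linalg.norm(f)
for j in range(M - 1):
    w = J @ V[:, j]
    w = w - V[:, :j + 1] @ (V[:, :j + 1].T @ w)
    V[:, j + 1] = w / np.linalg.norm(w)
H = V.T @ J @ V

# Extend with r arbitrary vectors, orthonormalized against the basis so far
Vext = V.copy()
for _ in range(r):
    v = rng.standard_normal(N)
    v = v - Vext @ (Vext.T @ v)
    v = v - Vext @ (Vext.T @ v)          # second pass guards against roundoff
    Vext = np.column_stack([Vext, v / np.linalg.norm(v)])

# H_{M+r}: H_M on top of a zero block; new columns h_{M+i} = V_{M+r}^T J vbar_i
Hext = np.zeros((M + r, M + r))
Hext[:M, :M] = H
Hext[:, M:] = Vext.T @ (J @ Vext[:, M:])

# Assemble the right-hand side of (17) column by column
rhs = Vext @ Hext
rhs[:, M - 1] += (np.eye(N) - V @ V.T) @ (J @ V[:, M - 1])       # e_{M+r;M} term
rhs[:, M:] += (np.eye(N) - Vext @ Vext.T) @ (J @ Vext[:, M:])    # e_{M+r;M+i} terms
lhs = J @ Vext
```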

With the ability to extend the basis with arbitrary vectors, the question now is: which vectors should we add? Intuitively, we seek to add those vectors that provide improved stability or, relatedly, improved control of the residuals of later stages of the Rosenbrock-Krylov method.

###### Remark 2 (Extending the Krylov space)

When the Krylov space changes at each stage, so does the Jacobian projection:

$$ A_i = V_{M+i} V_{M+i}^T\, J_n\, V_{M+i} V_{M+i}^T. $$

In this case the additional stability term defined by equation (12) changes to:

$$ S(hJ, hA_1, \dots, hA_s)\, y_n = -(b^T \otimes I)\big[I - (\beta \otimes hJ)\big]^{-1}\big[\gamma \otimes hJ - h\widetilde{A}_\gamma\big]\big[I - h\widetilde{A}_\gamma\big]^{-1} h F_n, \tag{18} $$

where the block matrix $\gamma \otimes A$ in (12) has been replaced by:

$$ \widetilde{A}_\gamma = \big[\, \gamma_{i,j}\, A_j \,\big]_{1 \le i,j \le s}. $$

Since the coefficient matrices $\alpha$ and $\gamma$ are lower triangular, the matrices $\beta \otimes hJ$ and $\gamma \otimes hJ$, as well as $\widetilde{A}_\gamma$, are all block lower triangular. The $i$-th stage component vector in (18) consists of two parts: the first part corresponds to the action of the diagonal block, and the remaining part contains linear combinations of off-diagonal blocks times the previous stage function values $F_j$. The first part is the error in the solution of the $i$-th stage linear system when solved in the subspace spanned by $V_{M+i}$ (with $J$ replaced by $A_i$). A strategy to reduce this error is to extend the Krylov space at stage $i$ so as to reduce the linear system solution error. In the next section we discuss extending the subspace with the right-hand side of the stage system. Note that the off-diagonal components may remain large even after extending the space, as the extension reduces only the first part of the vector.

### 4.3 Exploiting the form of the residual

We see from equation (13) in Theorem 1 that residuals for stages beyond the first contain terms which are difficult to guarantee bounded, even when the first stage residuals are controlled. We attempt to overcome this by extending the basis at the $i$'th stage to include the intermediate right-hand-side values $F_i$, so that

$$ F_i - V_{M;i}\,\psi_i = 0 \qquad \text{for } i = 1, \dots, s. $$

The matrices $V_{M;i}$ and $H_{M;i}$ can be constructed iteratively, extending the basis with $F_i$ at each stage as outlined in the previous section. One difficulty of extending the basis in this way is that the $LU$-decomposition of the left-hand side of (2a) cannot be reused. This problem can be overcome through an update of the $L$ and $U$ matrices. The $LU$-decomposition is constructed to solve

$$ (I - h\gamma H)\,x = b \quad\Rightarrow\quad P\,(I - h\gamma H)\,x = P b \quad\Rightarrow\quad L U x = P b, $$

so that

$$ L U = P\,(I - h\gamma H), $$

and so, exploiting the upper-Hessenberg structure of $H$, the $LU$-decomposition can be updated as

$$ \begin{bmatrix} u_{1,M+1}\\ \vdots\\ u_{M,M+1} \end{bmatrix} = -h\gamma\, L_M^{-1} P_M \begin{bmatrix} h_{1,M+1}\\ \vdots\\ h_{M,M+1} \end{bmatrix}, \qquad u_{M+1,M+1} = 1 - h\gamma\, h_{M+1,M+1}, \qquad l_{M+1,M+1} = 1. $$
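The update above can be sketched as follows. For simplicity the sketch factors without pivoting ($P = I$), which is reasonable here since $I - h\gamma H$ is close to the identity; the formulas in the text carry the permutation $P_M$ in the general case. The appended row of $H$ is taken to be zero left of its diagonal entry, matching the block structure of $H_{M+r}$ from Section 4.2.

```python
import numpy as np

# Bordered LU update for I - h*gamma*H after one basis extension (P = I).
def lu_nopivot(A):
    n = A.shape[0]
    L = np.eye(n)
    U = A.astype(float).copy()
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]
            U[i, :] = U[i, :] - L[i, k] * U[k, :]
    return L, np.triu(U)

rng = np.random.default_rng(5)
M, h, gam = 6, 0.05, 0.25
H = np.triu(rng.standard_normal((M + 1, M + 1)), -1)   # upper Hessenberg
H[M, :M] = 0.0      # appended row is zero left of its diagonal entry
L_old, U_old = lu_nopivot(np.eye(M) - h * gam * H[:M, :M])

# Update: new U column from one triangular solve, new diagonal entries directly
u_col = np.linalg.solve(L_old, -h * gam * H[:M, M])
u_diag = 1.0 - h * gam * H[M, M]
L_new = np.block([[L_old, np.zeros((M, 1))],
                  [np.zeros((1, M)), np.eye(1)]])
U_new = np.block([[U_old, u_col[:, None]],
                  [np.zeros((1, M)), u_diag * np.eye(1)]])
```

Only one triangular solve of size $M$ is needed per extension, instead of refactoring the whole $(M+1)\times(M+1)$ system.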
###### Theorem 2 (Residual of the ith stage with extended basis)

When the time integration scheme (2) is implemented with the basis extended (via the process described in Section 4.2) at each stage to include the current right-hand-side vector $F_i$, then the residual of the $i$'th stage when using an $M$-dimensional initial Krylov space is given by:

$$ r_{M;i} = -h\,(I - V_M V_M^T)\, J v_M\, e_{M+i-1;M}^T \sum_{j=1}^{i}\gamma_{i,j}\,\hat\lambda_j - h\,(I - V_{M;i} V_{M;i}^T)\, J \sum_{k=1}^{i-1} v_{M+k}\, e_{M+i-1;M+k}^T \sum_{j=1}^{i}\gamma_{i,j}\,\hat\lambda_j, \tag{19} $$

where $e_{M+i-1;j}$ are canonical basis vectors of $\mathbb{R}^{M+i-1}$, $V_{M;i}$ denotes the basis extended through stage $i$ (of dimension $M+i-1$), and $v_{M+k}$ is the basis vector added at stage $k$.

Proof. The general stage equations for the method are:

$$ \begin{aligned} k_i &= h F_i + h J \sum_{j=1}^{i}\gamma_{i,j}\, k_j + r_{M;i},\\ \hat\lambda_i &= h\psi_i + h H_{M;i} \sum_{j=1}^{i}\gamma_{i,j}\,\hat\lambda_j,\\ \psi_i &= V_{M;i}^T F_i,\\ k_i &= V_{M;i}\hat\lambda_i + h\,(F_i - V_{M;i}\psi_i) = V_{M;i}\hat\lambda_i, \end{aligned} $$

with

$$ \hat\lambda_j = \big[\lambda_j^T, 0, \dots, 0\big]^T \qquad \text{for } j < i, $$

where reduced space solutions from previous stages are extended with zeros to match the dimension of the current stage system.

Starting from the full space equation, we make substitutions for $k_i$ and the $k_j$'s:

$$ r_{M;i} = V_{M;i}\hat\lambda_i - h V_{M;i}\psi_i - h J V_{M;i} \sum_{j=1}^{i}\gamma_{i,j}\,\hat\lambda_j. $$

Making use of Lemma 1, we substitute for appearances of $J V_{M;i}$:

$$ r_{M;i} = V_{M;i}\hat\lambda_i - h V_{M;i}\psi_i - h V_{M;i} H_{M;i}\sum_{j=1}^{i}\gamma_{i,j}\,\hat\lambda_j - h\,(I - V_M V_M^T)\, J v_M\, e_{M+i-1;M}^T \sum_{j=1}^{i}\gamma_{i,j}\,\hat\lambda_j - h\,(I - V_{M;i} V_{M;i}^T)\, J \sum_{k=1}^{i-1} v_{M+k}\, e_{M+i-1;M+k}^T \sum_{j=1}^{i}\gamma_{i,j}\,\hat\lambda_j. $$

Collecting terms and rearranging allows us to find and cancel the terms from the reduced space equation:

$$ r_{M;i} = \underbrace{V_{M;i}\Big[\hat\lambda_i - h\psi_i - h H_{M;i}\sum_{j=1}^{i}\gamma_{i,j}\,\hat\lambda_j\Big]}_{=\,0} - h\,(I - V_M V_M^T)\, J v_M\, e_{M+i-1;M}^T \sum_{j=1}^{i}\gamma_{i,j}\,\hat\lambda_j - h\,(I - V_{M;i} V_{M;i}^T)\, J \sum_{k=1}^{i-1} v_{M+k}\, e_{M+i-1;M+k}^T \sum_{j=1}^{i}\gamma_{i,j}\,\hat\lambda_j, $$

which yields (19). ∎

The new residual (19) arising from the extended basis appears to be a substantial improvement over the residual of the vanilla method, since here the residual consists exclusively of terms projected out of the span of $V_M$ and $V_{M;i}$.

## 5 Numerical Results

In this section, we evaluate the behavior of the two presented modifications to the Rosenbrock-K method on a selection of test problems. As both the adaptive basis size modification and the basis extension variant are intended to increase the stability of the method, testing primarily focused on problems that demonstrate stiff behavior: CBM-IV as a stiff system of densely-coupled ODEs, and the Allen-Cahn equation as a PDE with variable stiffness based on coefficient selection. The shallow water equations were also tested in order to demonstrate behavior in a non-stiff case. All tests were performed in Matlab on a single workstation, using an implementation of the ROK method with an adaptive timestep-size error controller, and results were compared against reference solutions computed by Matlab's ode15s integrator. Figures are produced by sweeping over the same set of relative error tolerances.

All the figures in this section use the same labeling scheme, as follows. Data sets labeled with a fixed basis size use the stated fixed number of basis vectors, while those labeled with a residual tolerance use an adaptive number of basis vectors based on that tolerance. Dashed lines indicate basis extension, which can apply to either fixed or adaptive basis size selection and is indicated by a suffix in the label. Finally, lines labeled with a variable tolerance are experiments where the Arnoldi residual tolerance is set to the same value as the tolerance of the adaptive timestep-size error controller.

### 5.1 Non-stiff case: shallow water equations

First, we examine the relative performance of the methods on the shallow water equations. The shallow water equations in spherical coordinates are

$$ \begin{aligned} \frac{\partial u}{\partial t} + \frac{1}{a\cos\theta}\left(u\frac{\partial u}{\partial \lambda} + v\cos\theta\,\frac{\partial u}{\partial \theta}\right) - \left(f + \frac{u\tan\theta}{a}\right)v + \frac{g}{a\cos\theta}\,\frac{\partial h}{\partial \lambda} &= 0, &\text{(20a)}\\ \frac{\partial v}{\partial t} + \frac{1}{a\cos\theta}\left(u\frac{\partial v}{\partial \lambda} + v\cos\theta\,\frac{\partial v}{\partial \theta}\right) + \left(f + \frac{u\tan\theta}{a}\right)u + \frac{g}{a}\,\frac{\partial h}{\partial \theta} &= 0, &\text{(20b)}\\ \frac{\partial h}{\partial t} + \frac{1}{a\cos\theta}\left(\frac{\partial (hu)}{\partial \lambda} + \frac{\partial (hv\cos\theta)}{\partial \theta}\right) &= 0. &\text{(20c)} \end{aligned} $$

Here $h$ is the height of the atmosphere, $u$ is the zonal wind component, $v$ is the meridional wind component, $\theta$ and $\lambda$ are the latitudinal and longitudinal directions, $a$ is the radius of the earth, $\Omega$ is the rotational velocity of the earth (with Coriolis parameter $f$), and $g$ is the gravitational constant. The space discretization is performed using the unstaggered Turkel-Zwas scheme Navon_1987_swe ; Navon_1991_swe , with 72 nodes in the longitudinal direction and 36 nodes in the latitudinal direction.

Figure 1: Work-precision diagram for a Rosenbrock-Krylov method applied to the shallow water equations (20).

Figure 1 shows the relative performance of a fourth order ROK method with a variety of basis selection criteria, applied to the shallow water equations. For non-stiff problems, the primary limiter for stepsize is the error controller, rather than method stability, and making each timestep as cheap as possible will result in the fastest method of a given order. This can be seen in the figure, with the fixed basis