DeepAI

# ParaDIAG: Parallel-in-Time Algorithms Based on the Diagonalization Technique

08/14/2021

### Global space-time Trefftz DG schemes for the time-dependent linear nonhomogeneous and anisotropic wave equation

In this paper we are concerned with Trefftz discretizations of the time-...
07/16/2020

### Fast Iterative Solver for the Optimal Control of Time-Dependent PDEs with Crank-Nicolson Discretization in Time

In this article, we derive a new, fast, and robust preconditioned iterat...
12/06/2019

### Time-parallel simulation of the Schrödinger Equation

The numerical simulation of the time-dependent Schrödinger equation for ...
03/16/2020

### A note on parallel preconditioning for the all-at-once solution of Riesz fractional diffusion equations

The p-step backwards difference formula (BDF) for solving the system of ...
09/05/2018

### Parallel numerical method for nonlocal-in-time Schrödinger equation

We present a new parallel numerical method for solving the non-stationar...
12/07/2019

### Scalable Algorithms for High Order Approximations on Compact Stencils

The recent development of parallel technologies on modern desktop comput...
03/09/2021

### Exploiting Asynchronous Priority Scheduling in Parallel Eikonal Solvers

Numerical solutions to the Eikonal equation are computed using variants ...

## 1 Basic idea of ParaDIAG

We start with a basic introduction to ParaDIAG algorithms. Suppose we need to solve in parallel the system of ODEs with initial value arising from the semi-discretization of a time-dependent PDE, where . For finite element discretizations, is the mass matrix and is the stiffness matrix. For finite difference discretizations,

is just an identity matrix. The classical approach for solving such systems of ODEs is to apply a time-integrator, and then solve the resulting difference equation step-by-step in time. Instead, ParaDIAG tries to solve these difference equations

all-at-once. For linear multi-step methods, the all-at-once system is of the form

 Au=b, A:=B1⊗M+B2⊗K, (1.1)

where are Toeplitz matrices specified by the time-integrator and is the number of time steps111For Runge-Kutta methods, the all-at-once system is different and will be treated in a forthcoming update of this note.. All ParaDIAG algorithms focus on treating the matrices and , while keeping and unchanged. There are mainly two approaches: first, using different step sizes , e.g., a geometrically increasing sequence with , which makes the time-discretization matrices diagonalizable. This yields ParaDIAG algorithms which are direct solvers in the ParaDIAG-I group [12, 5, 7].

The second treatment is to use a uniform step size and solve the all-at-once system (1.1) iteratively, which leads to ParaDIAG algorithms in the ParaDIAG-II group. There are several variants, but the common point is to introduce the -circulant block matrix

 Pα:=C(α)1⊗M+C(α)2⊗K, (1.2)

where and are Strang type -circulant matrices constructed from and , and is a free parameter. One can then either solve (1.1) via the stationary iteration 

 Pαuk=(Pα−A)uk−1+b, (1.3)

where is the iteration index, or via Krylov subspace methods (e.g., GMRES, MINRES) by solving the preconditioned system 

 P−1αAu=P−1αb, (1.4)

which is nothing else than the stationary iteration (1.3) written at its fixed point, i.e. at convergence.

The algorithms proposed in  and  are essentially ParaDIAG-II algorithms as well, but they are derived from a different point of view. For example, in  the authors introduced a Waveform Relaxation (WR) iteration , , and after a time-discretization one can show that at each iteration the all-at-once system is , where . The algorithm in  can be understood similarly.

For each variant of ParaDIAG-II we need to compute with

being an input vector. The reason for using

is twofold: first, since and are Strang type -circulant matrices constructed from the Toeplitz matrices and , it naturally holds that converges to as goes to zero. This implies that by using a relatively small , the ParaDIAG-II algorithms converge rapidly. The second point lies in the fact that and can be diagonalized simultaneously, as is shown in the following Lemma.

###### Lemma 1 (see )

Let (with and ) be the discrete Fourier matrix and define for any given parameter the diagonal matrix

 Γα=⎡⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢⎣1α1Nt⋱αNt−1Nt⎤⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥⎦.

Then the two -circulant matrices can be simultaneously diagonalized as

 C(α)j=VDjV−1, Dj=diag(√NtFΓαC(α)j(:,1)), j=1,2,

where and represents the first column of , .

Due to the property of the Kronecker product, we can factor and thus we can compute by performing the following three steps:

 Step-(a)  S1=(V−1⊗Ix)r,Step-(b)  S2,n=(λ1,nM+λ2,nA)−1S1,n, n=1,2,…,Nt,Step-(c)  u=(V⊗Ix)S2, (1.5)

where and . Since and are given by FFT techniques, Step-(a) and Step-(c) can be computed efficiently with operations. Step-(b) can be computed in parallel since all linear systems are completely independent from each other at different time points. These three steps represent the key steps of ParaDIAG algorithms and will appear frequently in this note, although the details differ in the various cases.

For nonlinear problems with , the basic idea for applying ParaDIAG algorithms is as follows: for linear multi-step methods, the non-linear all-at-once system is

 (B1⊗M)u+(B2×Ix)F(u)=b, (1.6)

where . The Jacobian matrix of (1.6) is

 B1⊗M+(B2⊗Ix)∇F(u), (1.7)

where . To apply ParaDIAG, we approximate the Jacobian matrix (1.7) by

 Pα(u):=C(α)1⊗M+C(α)2⊗¯¯¯¯¯¯¯¯∇f(u),

where is constructed from the values by some averaging , e.g., or . Then, we can solve (1.6) by the following simplified Newton iteration:

 Pα(uk−1)Δuk−1=−((B1⊗M)uk−1+(B2×Ix)F(uk−1)−b), uk=uk−1+Δuk−1, (1.8)

where for each iteration the increment can be obtained using a ParaDIAG algorithm performing the three steps in (1.5). If we use different step sizes as in , then and are already diagonalizable, and we can replace by in (1.8).

In practice, the ParaDIAG algorithms can be combined with a windowing technique: after a certain number of time steps computed in parallel in the current time window, the computation can be restarted for the next time window in a sequential way. This permits the use of a certain adaptivity in time and space.

To illustrate the ParaDIAG-I and ParaDIAG-II algorithms, we now use the concrete example of the advection-diffusion equation with periodic boundary conditions222We use periodic boundary condition make the advection dominated situation harder for PinT algorithms, see .

 ⎧⎪⎨⎪⎩ut−νuxx+ux=0,(x,t)∈(−1,1)×(0,T),u(−1,t)=u(1,t),t∈(0,T),u(x,0)=e−30x2,x∈(−1,1), (2.1)

where . Using the method of lines and a centered finite difference scheme for the spatial derivatives, we get the system of ODEs

 ˙U(t)+AU(t)=0, U(0)=U0, (2.2a) where the matrix A∈RNx×Nx is A=νΔx2⎡⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢⎣2−1−1−12−1⋱⋱⋱−12−1−1−12⎤⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥⎦+12Δx⎡⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢⎣01−1−101⋱⋱⋱−1011−10⎤⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥⎦. (2.2b) Here Nx=2Δx, and the periodic boundary conditions cause a zero eigenvalue in the matrix A.

To use ParaDIAG as a direct solver, one has to use all different time steps to make the time stepping matrix diagonalizable, and one possibility is to use geometrically increasing time step sizes to discretize (2.2a) as proposed in ,

 Δtn=Δt1τn−1,n≥1, (2.3)

where is free parameter and is the first step size. We use the linear -method,

 Un+1−UnΔtn+1+A[θUn+1+(1−θ)Un]=0, n=0,1,…,Nt−1. (2.4)

We will only consider and , which corresponds to the Backward-Euler method and the Trapezoidal rule. For , the method is also called the Crank-Nicolson scheme. The difference equations (2.4) can be combined into the all-at-once system

 (B1⊗Ix+B2⊗A)u=b, (2.5a) where u=(U⊤1,…,U⊤Nt)⊤, Ix∈RNx×Nx is an identity matrix and B1,B2∈RNt×Nt are matrices representing the time-discretization, namely B1=⎡⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢⎣1Δt1−1Δt21Δt2⋱⋱−1ΔtNt1ΔtNt⎤⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥⎦, B2=⎡⎢ ⎢ ⎢ ⎢ ⎢⎣θ1−θθ⋱⋱1−θθ⎤⎥ ⎥ ⎥ ⎥ ⎥⎦. (2.5b)

The right hand-side is given by with .

Let and . Then, we can rewrite (2.5a) as

 (B⊗Ix+It⊗A)u=~b, (2.6)

where is an identity matrix. The diagonalization of for and can be found in  and  respectively, but for the reader’s convenience, we show the details here:

###### Theorem 2.1 (see [5, 7])

For the geometrically increasing step sizes given by (2.3) with , the matrix can be diagonalized as , where

. The eigenvector matrix

and its inverse are Toeplitz matrices of the form

 V=⎡⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢⎣1p11p2p11⋮⋱⋱⋱pNt−1…p2p11⎤⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥⎦, V−1=⎡⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢⎣1q11q2q11⋮⋱⋱⋱qNt−1…q2q11⎤⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥⎦,

where

 ⎧⎪ ⎪⎨⎪ ⎪⎩pn=1∏nj=1(1−τj), qn=(−1)nτn(n−1)2pn,θ=1,pn=∏nj=11+τj1−τj, qn=q−n∏nj=11+τ−j+21−τ−j,,θ=12.

Now using the typical ParaDIAG factorization

 B⊗Ix+It⊗A=(V⊗Ix)(D⊗Ix+It⊗A)(V−1⊗Ix),

we can solve (2.6) by performing the three steps

 Step-(a)  S1=(V−1⊗Ix)~b,Step-(b)  S2,n=(1θΔtn+A)−1S1,n, n=1,2,…,Nt,Step-(c)  u=(V⊗Ix)S2, (2.7)

where and . Since and are given in closed form, we only have to do matrix vector multiplications for Step-(a) and Step-(c), or one could use a fast Toeplitz solver based on Fourier techniques. For Step-(b), the linear systems can be solved simultaneously in parallel. There is however an important issue with this direct time parallel solver ParaDIAG-I: if the time steps are very different, the truncation error of the time stepping scheme becomes worse, and if they are very close to each other, ParaDIAG-I suffers from roundoff error in the diagonalization used in Step-(a) and Step-(c). The best one can do is to balance the two errors, as a detailed analysis in [5, 7] shows, and this limits the applicability of ParaDIAG-I to shorter time intervals and few time steps: the roundoff error is proportional to the condition number of , i.e.,

 roundoff error∝Cond2(V).

If is an eigenvector matrix of , the scaled matrix with any invertible diagonal matrix is an eigenvector matrix of as well. From [5, 7], the matrix is a good choice.

To illustrate the limitations of ParaDIAG-I, we provide the Matlab code ParaDIAG_V1_for_ADE, to test it for the advection-diffusion equation. For given , and the constraint determines uniquely the value of the geometric time steps as

 Δtn=τn∑Ntn=1τnT. (2.8)

For the space discretization, we fix . To study the accuracy of ParaDIAG-I, we use a reference solution obtained from the Matlab ODE solver ode45 with a very small absolute and relative tolerance, AbsTol= and RelTol=, see Figure 2.1 for an illustration. Figure 2.1: Left: numerical solution uParaDIAG−I obtained with Nt=64 and τ=1.2. Right: reference solution uode45. Here, ν=10−2 and the Trapezoidal rule is used.

The results of ParaDIAG-I compared to a sequential time integration are shown in Figure 2.2 Figure 2.2: Left: using the geometric time steps (2.8) with 7 different τ the errors measured at the finial time point t=T for two numerical solutions: usbs obtained step by step (dash-dot lines) and uParaDIAG−I obtained by (2.7) (solid lines). For uParaDIAG−I with τ=τ0=1.051, the error is larger than 103 and thus the corresponding solid line is not shown. Right: for Nt=64 the error for the two numerical solutions as τ varies. Here, ν=10−2 and the Trapezoidal rule is used.

We clearly see that using the geometric time steps (2.8) degrades the accuracy of the numerical solution, and when the time steps are too similar, the roundoff error problem sets in. This phenomenon was carefully studied in [5, 7], and the best possible geometrically stretched grid was determined, which leads to precise limits of time window length and number of time steps within which ParaDIAG-I can be reliably used.

Instead of using ParaDIAG as a direct solver with all different time steps to make the time stepping matrix diagonalizable, we can use it iteratively and solve a nearby problem in each iteration chosen such that the time stepping matrix of the nearby problem with uniform time step size can still be diagonalized. This idea leads to ParaDIAG algorithms in the ParaDIAG-II group. Among this group, we can use ParaDIAG within a stationary iteration or a Krylov subspace method. There are so far two very different ways to use ParaDIAG within a stationary iteration, proposed in  and . The use of ParaDIAG within a Krylov subspace method can be found in [13, 18].

#### 2.2.1 ParaDIAG-II – Waveform Relaxation (WR) Variant

The ParaDIAG algorithm introduced in  is based on the Waveform Relaxation iteration

 ˙Uk(t)+AUk(t)=0, Uk(0)=U0+α(Uk(T)−Uk−1(T)), t∈(0,T), (2.9)

where is the iteration index and is a free parameter. Upon convergence, the tail term is canceled and thus the converged solution is the solution of (2.2a). Applying the linear -method with a uniform step size to (2.9) gives

 ⎧⎨⎩Ukn−Ukn−1Δt+A(θUkn+(1−θ)Ukn−1)=0, n=1,2,…,Nt,Uk0=αUkNt−αUk−1Nt+U0, (2.10)

where . We rewrite (2.10) as an all-at-once system,

 (C(α)1⊗Ix+C(α)2⊗A)uk=bk−1, (2.11a) where uk=(Uk1,…,UkNt)⊤, C(α)1,C(α)2∈RNt×Nt and bk−1∈RNtNx are given by C(α)1=1Δt⎡⎢ ⎢ ⎢ ⎢ ⎢⎣1−α−11⋱⋱−11⎤⎥ ⎥ ⎥ ⎥ ⎥⎦, C(α)2=⎡⎢ ⎢ ⎢ ⎢ ⎢⎣θ(1−θ)α1−θθ⋱⋱1−θθ⎤⎥ ⎥ ⎥ ⎥ ⎥⎦,bk−1=((U0−αUk−1Nt)(1ΔtIx−(1−θ)A),0,…,0)⊤. (2.11b)

The matrices are so-called -circulant matrices and can be diagonalized as stated in Lemma 1, and we can again use the typical ParaDIAG factorization . Hence, similar to (2.7) we can solve (2.11a) performing the three steps

 Step-(a)  S1=(F⊗Ix)(Γα⊗Ix)bk−1,Step-(b)  S2,n=(λ1,nIx+λ2,nA)−1S1,n, n=1,2,…,Nt,Step-(c)  uk=(Γ−1α⊗Ix)(F∗⊗Ix)S2, (2.12)

where and . In (2.7), Step-(a) and Step-(c) can be computed efficiently via FFT and Step-(b) is again highly parallel. The eigenvector matrix satisfies

 Cond2(V)=Cond2(Γ−1αF∗)≤Cond2(Γ−1α)Cond2(F∗)=Cond2(Γ−1α)≤1α, (2.13)

and thus the conditioning is depending on the choice of . The convergence properties of this ParaDIAG-II algorithm are summarized in the following theorem.

###### Theorem 2.2 (see )

For the linear system of ODEs , suppose with being an arbitrary eigenvalue of . Let be the -th iterate of the ParaDIAG-II algorithm (2.10) with and

be the reference solution obtained by directly applying the same time-integrator to the system of ODEs. Then the linear convergence estimate

holds, where

 ρ≤⎧⎨⎩αe−Tr1−αe−Tr,\rm Backward% -Euler,α1−α,\rm Trapezoidal~{}rule.

This shows that the ParaDIAG-II algorithm (2.10) converges with a rate independent of the spectrum of the matrix and the step size of the time-discretization. The convergence factor becomes smaller when decreases, but the condition number of (cf. (2.13)) implies that can not be arbitrarily small (e.g., not of the size ), because in this case the roundoff error will pollute the accuracy. The best parameter is again the value balancing the roundoff error and the discretization error, like for the direct solver ParaDIAG-I, see  for more discussions. In practice, and are good choices.

We provide a Matlab code, namely ParaDIAG_V2_WR_for_ADE, to test the ParaDIAG-II algorithm (2.10). In the code, we use the fft command to obtain by just using the first columns of , instead of the entire matrices. To implement Step-(a) in (2.12), we use the fft command as follows:

 b=reshape(b,Nx,Nt);  sol_stepA=fft(Gam.*(b.’)).’;

where b is the vector . Similarly, to implement Step-(c) we use the inverse FFT command ifft,

 Uk=(invGam.*ifft(sol_stepB.’)).’;

Here, Gam and invGam. With an initial guess chosen randomly as random(’unif’,-20,20,, ), the first 2 iterates of this ParaDIAG-II algorithm are shown in Figure 2.3. Figure 2.3: Initial guess and the first two iterates generated by the ParaDIAG-II algorithm (2.10) for ν=10−4 and Δx=Δt=164.

The maximum error at each iteration is shown in Figure 2.4. Figure 2.4: The convergence of the ParaDIAG-II algorithm (2.10) is robust with respect to ν, Δx and Δt. Here, α=10−2 and the Trapezoidal rule is used as the time-integrator.

We next present some parallel speedup results for the ParaDIAG-II algorithm (2.10) based on Waveform Relaxation for a time-dependent advection-diffusion problem with periodic boundary conditions in 2D,

 {∂tu(x,t)−νΔu(x,t)+∇u(x,t)=0,in (0,T)×Ω,u(x,0)=u0(x),in Ω, (2.14)

where and . The results were obtained on the China Tianhe-1 supercomputer , which is a multi-array, configurable and cooperative parallel system with a theoretical peak performance of 1.372 petaflops, composed of high performance general-purpose microprocessors and a high-speed Infiniband network. We used the parallel Fortran library MUMPS (MUltifrontal Massively Parallel sparse direct Solver [2, 1]) version 4.10.0 to solve the linear systems in Step-(b) of (2.12). For Step-(a) and Step-(c), the fft and ifft commands are dissected into complex arithmetic operations. For the particular case when the source term is zero as shown in (2.11b), Step-(a) can be implemented in an economical way: only the first column of is needed to compute .

We provided the parallel codes in Fortran, which are zipped by a document labeled as ’Parallel Codes.zip’. For reader’s convenience, the zip document contains a README file, in which we briefly introduce how to use these codes. Note that the modification of the Fortran library MUMPS (serving the purpose of serializing it in a distributed memory environment) is due to the fact that the relevant spatial solves are currently executed only in serial. In addition, a number of note statements are appended to Fortran’s functions and subroutines. All the Fortran codes are compiled with mpich-3.1.3 using the icc compiler version 11.1.059 and -O2 optimization level.

We denote by ParaDIAG-II (B-E) the algorithm (2.10) using Backward-Euler, and by ParaDIAG-II (TR) the one using the Trapezoidal rule, and set . For comparison, we also apply the parareal algorithm and MGRiT to (2.14). The parareal algorithm is implemented using the two-level XBraid solver with F-relaxation, and MGRiT is the multilevel XBraid solver with FCF-relaxation (i.e., an initial F-relaxation followed by a C-relaxation and then a second F-relaxation). Furthermore, we skip the unnecessary work during the first XBraid down cycle for both the parareal and MGRiT algorithms, and fix the coarsening factor to 8. As shown in Table 2.1,

ParaDIAG-II (B-E), ParaDIAG-II (TR), parareal and MGRiT converge robustly with respect to the number of processors. ParaDIAG-II (B-E) and ParaDIAG-II (TR) lead to parameter-robust convergence, while for parareal and MGRiT the required iteration counts increase dramatically as changes from 1 to . The tolerance tol for all experiments here is set to ). In Figure 2.5 Figure 2.5: Comparison of the overall time-to-solution in a strong scaling study, where Δx=Δy=Δt=1/128 and Nt=512. The coarsening factor is cf=8 for both parareal and MGRiT.

we compare the measured CPU times for these PinT algorithms. Clearly, ParaDIAG-II (B-E) and ParaDIAG-II (TR) are two optimally scaling PinT algorithms, while for MGRiT and parareal the scaling is a little bit worse (this is because of the sequential coarse-grid-correction as we will see in the next subsection). The corresponding data is given in Table 2.2.

Regarding the parallel efficiency measured by  ( is the wall-clock time using processors), the average parallel efficiency for ParaDIAG-II (B-E) is 92.06%, for ParaDIAG-II (TR) it is 90.70%, while it is only 22.82% for MGRiT and 8.75% for parareal.

#### 2.2.2 ParaDIAG-II – Parareal Variant

The second way to use ParaDIAG within a stationary iteration is based on formulating the coarse-grid-correction (CGC) procedure of the parareal algorithm[10, 9] as an all-at-once system and applying ParaDIAG to it. The parareal algorithm is an iterative PinT algorithm, based on the updating formula

 Ukn+1=FJ(Δt,Uk−1n)+G(ΔT,Ukn)−G(ΔT,Uk−1n), n=0,1,…,Nt−1, (2.15)

where and are called coarse and fine propagator, specified by two time-integrators. The quantity denotes a value calculated by applying successively steps of the fine propagator to the differential equations with initial value and the fine step size . The integer is called the coarsening ratio. Let

 bk−1n+1:=FJ(Δt,Uk−1n)−G(ΔT,Uk−1n).

Then, the parareal algorithm is . This is the so called CGC, which is a sequential procedure and is often the bottleneck of the parallel efficiency. In , the author proposed an idea to parallelize the CGC: supposing we have to solve an initial-value problem

 ˙U(t)+f(U(t))=0,U(0)=U0,

we apply to a slightly wrong problem, namely

 ˙U(t)+f(U(t))=0, U(0)=αU(T),

where is a free parameter. We use the linear case to illustrate the details of ParaDIAG-II based on the parareal algorithm (for the nonlinear case, see ). We also use for simplicity Backward-Euler for . Let . The quantity computed from the previous iteration is

 G(ΔT,Uk−1n)={α(Ix+ΔTA)−1Uk−1Nt,n=0,(Ix+ΔTA)−1Uk−1n,n=1,2,…,Nt−1.

Note that all the quantities and can be computed simultaneously in parallel. Hence, . The parareal algorithm (2.15) can be rewritten as

 (Ix+ΔTA)Ukn+1=Ukn+(Ix+ΔTA)bk−1n+1⟹Ukn+1−UknΔT+AUkn+1=(ΔT−1Ix+A)bk−1n+1,

where , which can be represented as

 ⎛⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜⎝1ΔT⎡⎢ ⎢ ⎢ ⎢ ⎢⎣1−α−11⋱⋱−11⎤⎥ ⎥ ⎥ ⎥ ⎥⎦⊗Ix=C(α)1⊗Ix+⎡⎢ ⎢ ⎢ ⎢ ⎢⎣AA⋱A⎤⎥ ⎥ ⎥ ⎥ ⎥⎦=It⊗A⎞⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟⎠⎡⎢ ⎢ ⎢ ⎢ ⎢ ⎢⎣Uk1Uk2⋮UkNt⎤⎥ ⎥ ⎥ ⎥ ⎥ ⎥⎦=uk=⎡⎢ ⎢ ⎢ ⎢ ⎢ ⎢⎣(ΔT−1Ix+A)˜U1−αΔT−1Uk−1Nt(ΔT−1Ix+A)˜U2−ΔT−1Uk−11⋮(ΔT−1Ix+A)˜UNt−ΔT−1Uk−1Nt−1⎤⎥ ⎥ ⎥ ⎥ ⎥ ⎥⎦=bk.

This problem is now precisely of the form (2.10) for , i.e. the Backward-Euler method, and the solution can be obtained using ParaDIAG-II (cf. 2.12). The convergence rate of this ParaDIAG-II parareal variant is summarized in the following theorem.

###### Theorem 2.3 (see )

Let be the convergence factor of the parareal algorithm with sequential-in-time CGC (i.e., the classical parareal algorithm) and be the convergence factor with parallel-in-time CGC. Then, there exists some threshold of the parameter , such that

 ρPinT−CGC=ρSinT−CGC, if ~{}0<α≤α∗.

In particular, for linear systems of ODEs with