The purpose of this work is the study of structure preserving time-marching schemes for a class of evolution problems in Banach spaces; which essentially are used to describe possibly degenerate parabolic and hyperbolic initial boundary value problems. At this stage, it is enough for our purposes to say that, given a Banach space , a final time , and a mapping , we seek for that solves
As usual, denotes the dual of , and is the duality pairing; see Appendix A for a, far from exhaustive, list of example problems of interest. Solutions of problem (1) satisfy the following energy-balance law:
where the notation is to be specified. Our goal in this work is to find numerical schemes that mimic this law. For instance, given a time partition with being the local timestep size, an approximate solution to (1) could be computed by the backward Euler scheme. Ignoring solvability issues, we obtain and it is immediate to see that
where, once again, the notation is to be specified.
There are several reasons why discrete energy laws such as (3) are important. From the practical point of view, these are a non-perturbative form of stability: they do not rely on any smallness assumption, linearization, asymptotic argument, or proximity to some equilibrium state. For complex PDE problems, where a thorough quantitative and qualitative analysis of the solution, and corresponding numerical scheme, are far out of reach, satisfaction of a discrete energy-balance is usually an excellent surrogate to stability.
On the other hand, from a theoretical point of view, it is often the case that with the help of discrete energy-balance laws one can assert compactness of families of numerical solutions. Once again, for complex PDE problems, (weak) convergence of discrete solutions via compactness is all one can hope to achieve without introducing additional assumptions. Examples of a successful application of this approach are numerous. The interested reader is referred to the following references: elliptic problems [Gidas1981, Evans1987], hyperbolic [Tartar1979, Necas1996], and parabolic [Lions1969]; where one of the main tools is the Aubin-Lions compactness lemma [Lions1969], or some refinement of it [Simon1987, Jungel2012, Jungel2014, Andrei2011, Gllou2012].
The backward Euler scheme is one of the simplest time-stepping schemes, yet it is often possible to prove that it possesses suitable energy-balance laws; see (3). Its major drawback, however, is that it is only first order accurate. In order to remedy this issue, higher order generalizations, like Galerkin-in-time schemes, have been developed and analyzed; see for instance [Lubich1995, Makri2006, Akri2011, Walk2014, Hochbruck2018, Walk2010, Ern2019] and references therein. These schemes can also be shown to possess energy-balance laws, and are of arbitrary high order. However, their practical impact beyond academic examples has been rather limited; see [Pazner2017, Crockatt2019] for a few exceptions. This is due to the fact that Galerkin-in-time methods are algebraically equivalent to full-tableau Runge-Kutta (RK) methods [Akri2011]. As such, they require the solution of linear algebraic systems where the system matrix is of size for each Newton iteration: here is the number of stages of the RK method and
is the number of degrees of freedom of the spatial discretization, see for instance[Smears2017].
Diagonally implicit Runge-Kutta (DIRK) methods lie between these two extreme possibilities (backward Euler and Galerkin-in-time discretizations). DIRK methods offer significantly added computational benefit, since they only require the solution of linear systems of size at each stage, as well as higher order accuracy, see [Crouzeix1976, Alexander1977, Carpenter2016]. Their popularity is, perhaps, in big part due to the paper [Ascher1997] which has been extremely influential in the scientific computing community. The rigorous study of the mathematical properties of DIRK schemes, however, seems to be rather underdeveloped. This is particularly the case if we are interested in the numerical approximation of PDEs satisfying a discrete energy-balance law such as (2). This brings us to the main motivation for our current work: we wish to present classes of DIRK schemes for which we can prove discrete energy-balance laws, and study under which conditions the solution to these schemes enjoy suitable compactness properties.
To achieve these goals we organize our presentation as follows. Notation and the functional framework we shall operate under are introduced in Section 2. Here we also discuss the minimal set of assumptions we require on the mapping . Section 3 then begins by introducing the general form of DIRK schemes and making some comments regarding their implementation when applied to PDEs. Two notions of balance laws: discrete energy-balance, and dissipative discrete energy-balance, respectively, are introduced here for DIRK schemes; their importance and meaning is also discussed. This section then presents an exhaustive literature review regarding the existing (algebraic) notions of stability and why we believe these are not suitable for our purposes. A list of some popular two- and three-stage DIRK schemes finalizes this section. The core of our work is Sections 4 and 5 where, for two- and three-stage schemes, respectively, we explore the existence of discrete energy-balance laws. We arrive at a property, which we call remarkable stability which immediately implies the existence of a dissipative discrete energy law for a DIRK scheme. We note that, given the Butcher tableau of an
-stage DIRK scheme, verifying remarkable stability only requires some algebraic manipulations using the entries of the tableau, and the solution of an algebraic eigenvalue problem of size. Each section concludes by verifying remarkable stability for a list of widely used DIRK schemes. Finally, in Section 6, we refine the assumptions on to arrive at our strongest notion: discrete Bochner-stability. We show that every remarkably stable scheme is discretely Bochner-stable, and that (families of) solutions to discretely Bochner-stable schemes possess suitable compactness properties.
Finally, we believe that one important feature of our exposition is its simplicity. The main results of this manuscript rely, in essence, on the simple polarization identity
and some algebraic manipulations. At times these manipulations may be long and tedious, and some assistance from a computer algebra software package333In this manuscript we used Mathematica©. may be desired. Despite of this, these are nothing but algebraic manipulations. When describing compactness properties, well-known elementary order conditions of RK schemes may be necessary, which are summarized in Appendix C. No high level tools or specialized notions of algebraic stability for ODE solvers are used in our work.
Here we describe the framework and assumptions we shall operate under. If its Hölder conjugate is . We extend this notation such that , , and for all . By we denote a final time.
By we shall mean for a nonessential constant that may change at each occurrence.
2.1. Functional framework
Throughout our work we shall assume that is a separable Hilbert space with inner product and norm . By we denote a Banach space, and its norm is denoted by . The dual of is denoted by , the duality pairing between them is denoted by . The norm in is denoted by .
We shall assume that is a Gelfand triple. We recall that in this setting the duality pairing is an extension of the inner-product. In other words, every defines an element of whose action is defined by
This identification will be repeatedly used in our discussion.
Let be a separable Banach space with norm and . For measurable, we define
Then, we define the Bochner space
which is Banach for the norm . The space of functions that are continuous is denoted by . We endow this space with the -norm.
2.2. Initial value problems
With all the previous notation and preparations at hand we can proceed to rigorously describe the class of problems we are interested in approximating. We assume that the initial condition satisfies , and that the slope function satisfies . We then seek for such that and solves (1).
We will also assume that can be split into an autonomous and purely non-autonomous time-dependent part as
where and . In the context of PDEs and/or ODEs on graphs, we assume that nonhomogeneous boundary data can always be assimilated into .
The minimal set of assumptions we shall impose on the mapping are as follows.
Nonnegativity: The mapping is nonnegative, i.e.,
Local-solvability. We assume that, for every , there exists such that for every the problem
has a unique solution . This assumption guarantees that nonlinear problems associated to each stage in DIRK schemes have a unique solution, possibly under some timestep size constraint.
As we shall see below, property (6) is sufficient in order to derive discrete energy-balance laws for DIRK schemes. While a discrete energy balance law is not, in general, enough to prove a priori bounds in Bochner-type norms for the discrete solution or its time derivative, (6) covers a large family of relevant problems; see Appendix A. Stronger assumptions on will be imposed in Section 6, and these will allow us to establish a priori bounds in Bochner-type norms.
3. DIRK schemes
In this section we recall some general notions related to RK schemes; and, in particular, present some details regarding DIRK schemes. We will also detail the main stability notion for these schemes that we shall pursue.
We recall that RK schemes are uniquely characterized by their so-called Butcher tableau
Here is the number of stages, is the matrix of coefficients, are the pseudo-collocation times, and are the weights. For the sake of completeness, necessary order conditions for (8) are summarized in Appendix C. We remind the reader that a DIRK scheme is one where the matrix is lower triangular.
Let us now, to make things precise, detail how a DIRK scheme is applied to (1). Let be the number of steps. We introduce a partition of , i.e.,
and set , for to be the local timestep. Starting from we will compute the sequence such that as follows. For we solve the following equations in
where we introduced a shorthand notation defined by
For fixed , the quantities are called the stages.
Notice that a generic step in (9) requires, given , to find that solves (7) with, for some , . Thus, owing to the local solvability condition, for this scheme to be well-defined for any partition , it is sufficient to require that . For this reason, in what follows we shall assume that this is always the case for a DIRK scheme.
Finally, observe that, although the equations are posed in , it is not difficult to show that, for all , .
[smoothness of the right hand side] We comment that usually the theory regarding well-posedness of (1), only requires the right hand side to be such that for some . This makes, in general, the quantities , for and , meaningless, as point evaluations of are not possible. This usually is circumvented by replacing by a suitable approximation . In order to avoid unnecessary clutter of the notation we will ignore this issue.
3.1. Discrete energy-balance
As we described above, -stage DIRK methods are characterized by its tableau, and are such that the matrix is lower triangular and has positive diagonal entries. The following definition will be our main notion of stability for DIRK schemes.
[discrete energy-balance] We say that a DIRK scheme with tableau (8), where is lower triangular and has positive diagonal entries, satisfies a discrete energy-balance if, for any , every partition , and all , we have that
where we introduced the notation and . The coefficients and depend only on the tableau, and are expected to satisfy the following constraints:
for all .
for all .
The coefficients must satisfy the constraint
Notice that (10) differs from (3) in at least two salient terms. First, the presence of the terms may seem out of place. However, these terms represent purely numerical artificial damping and are usually quite desirable when dealing with dissipative/parabolic PDEs444For instance, in some very specific contexts such as projection methods for incompressible Navier-Stokes equations, artificial damping terms are critical to guarantee numerical stability of the scheme, see for instance [GS2009, GS2011] and references therein.. However, artificial damping is not desirable when considering the discretization of Hamiltonian, or related, PDEs (e.g., PDEs that preserve quadratic invariants). For this reason, we allow the coefficients to be all zero if that is suitable for the problem at hand. Second, while the terms are expected, the off-diagonal terms, i.e., with , may be problematic and, thus, it is possible that (10
) will not lead to suitable a priori estimates for discrete solutions. For this reason, we introduce a stronger notion, which we call “dissipative discrete energy-balance”.
[dissipative discrete energy-balance] We say that a DIRK scheme with tableau (8) satisfies a dissipative discrete energy-balance if there are strictly positive , and
a nonnegative definite quadratic form on such that, for any , every partition , and all , we have
Expression (12) states that, beyond the “energy” introduced into the system by the non-autonomous part , the scheme will not produce any spurious surplus of energy. Moreover, the fact that is nonnegative definite implies that the scheme may even dissipate some energy. The following result makes this intuition rigorous.
[discrete energy dissipation] Assume that, in (1), the mapping satisfies (6). If the solution to (1) is approximated with a DIRK scheme that satisfies the dissipative discrete energy-balance of Definition 3.1, then for all , any partition , and all , we have
3.2. Literature review
At this point, it is worth making some comments about RK methods, discrete energy laws and the preexisting literature.
There is a significant body of numerical ODE literature attempting to bridge the gap between algebraic notions of stability and nonlinear notions of stability; see for instance: A-stability [Dahlquist1963], L-Stability [Ehle1969, Ehle1973], B-Stability [Butcher1975], AN-Stability [Burrage1979], BN-stability [Burrage1979], BS-stability and BSI-stability [Frank1985I, Frank1985II] and G-stability [Dahlquist1978] among many others (see also [Hairer1993I, Butcher2008]). Similarly, there is a specific body of scientific literature relating algebraic notions of stability and discrete energy laws [Burrage1979, Burrage1980, Humph1994, Ranocha2020, Wu2022]. However, we find that the classical techniques used in the numerical ODE literature are largely incompatible with our current goals, and the notions of stability that we are trying to advance; see Definitions 3.1–3.1 and Definition 6 below. We explain our reasoning in more detail.
Functional setting. A common assumption in the numerical ODE literature is that boundedly; see, for instance [Ranocha2018, Ranocha2020, Higueras2005, Wu2022]. This is a rather stringent assumption that, in general, is not suitable for PDEs. For instance, it does not allow us to capture the linear heat equation, incompressible Navier-Stokes equation, nor large families of advection-reaction-diffusion systems. For this reason we, instead, focus on the Gelfand-triple functional framework and assume that .
Choice of norms. The numerical ODE literature focuses on the development of estimates, i.e., for all ; see, for instance [Burrage1980, Humph1994]. For ODEs in and some limited cases of linear hyperbolic PDEs, this may suffice. However, for many PDE problems, such as parabolic-like problems, this type of estimate may be insufficient. Without a priori bounds on spatial derivatives in Bochner-type norms, it is not possible to assert stability of such schemes. To assert convergence, usually, one additionally requires an a priori estimate on the time derivative, again in a Bochner-type norm. In other words: discrete energy balances of the form (12) are suitable for the analysis of parabolic-like problems, while estimates of the form (13), in general cannot yield enough compactness.
We note that estimates in , for some , are standard in the PDE and numerical-PDE literature [Lions1969, MarionTemam1998, Thomee2006, Walk2010, Gllou2012, Feng2006, Layton2008, Shen2010, Roubicek2013, Elliott2021]. On the other hand, to our knowledge, the numerical ODE literature [Hairer1993I, HumphBook, Butcher2008, Hairer1996II] has not focused on a priori bounds in Bochner-type norms, space-time compactness, or convergence without regularity assumptions for problems of growing dimensionality (i.e. discretization of evolutionary PDEs).
Finite dimensionality. We want to develop stability results that are valid for finite dimensional problems as well as their infinite dimensional limits. This is a somewhat delicate issue when dealing with operators of the form . Let us explain what we mean here. As we detailed above, see (7) and Remark 3, a generic stage must be interpreted as: find such that
At this point one may be tempted to set . However, that is not necessarily well-defined unless additional assumptions are made555For instance, if , the product is meaningful only if we assume -regularity of .. Similarly, higher order compositions of the operator, i.e., , are not meaningful unless maps a Banach space to itself. We note that energy identities and a priori bounds in norm using such constructions are common in the numerical ODE literature; see for instance [Ranocha2018, Ranocha2020, Higueras2005, Wu2022] and the review paper [Humph1994, p. 1464–1465].
In light of the shortcomings mentioned above, in this manuscript we develop stability results that:
Target specifically DIRK schemes in the framework of a Gelfand triple and unbounded operators. We limit ourselves to the case of two- and three-stage schemes.
Energy identities and a priori bounds will solely rely on the inner product in , the duality pairing , and additive telescopic cancellation arguments. We do not use or invoke higher order compositions of the operator , higher order products such as , nor similar “multiplicative” constructions.
While this may be necessary to show existence of solutions to (1), we make no assumption of contractivity/monotonicity of our operator to obtain stability. Our primary notion of nonlinear stability revolves around “dissipative discrete energy-balances”, see (12), which is, strictly speaking, a property of the scheme, not a property of the operator. This enables the proof of a priori bounds in Bochner-type norms for and its time derivative for some families of operators; see Proposition 6.
Our a priori Bochner-type norm estimates make a clear cut distinction between “artificial/numerical damping” and “physical or PDE dissipation” terms. A precise identification of artificial damping terms plays a pivotal role in order to establish stability of the scheme. On the other hand, physical dissipation is fundamental to establish dual norm estimates on the time derivative of the discrete solution.
As a final comment we note that the mathematical theory about Galerkin-in-time and/or full-tableau RK methods developed by the numerical PDE community; see [Lubich1995, Makri2006, Akri2011, Walk2014, Hochbruck2018, Ern2019] and references therein, rarely ever applies (or even mentions) DIRK schemes. We do however, highlight the preexistence of quite relevant material with specific focus on DIRK schemes that shares a few common points of intersection with the material presented in this manuscript. In particular [Shin2020] addresses the issue of gradient flow stability for DIRK schemes, while [Osterman2010] addresses convergence without regularity assumptions using semi-group methods. However, the material presented in the current paper differs quite significantly in relationship to the mathematical tools used and the degree of generality.
3.3. Some popular DIRK schemes
Let us present some popular two- and three-stage DIRK schemes, and briefly mention some of their known properties. These will be our specific examples under consideration used to illustrate the developed theory.
3.3.1. Two-stage schemes
We will consider the following two-stage schemes.
Alexander’s DIRK22 scheme:
Tableau (14) seems to appear for the first time in [Alexander1977].
Butcher-Burrage DIRK22 scheme:
To the best of our knowledge this tableau appears for the first time in [Burrage1979, p. 51].
Kraaijevanger-Spijker DIRK22 scheme
see [Kraaije1989, p. 77].
Crouzeix’s DIRK23 scheme:
This tableau can be found in [Crou1979]. This scheme is third order accurate.
3.3.2. Three-stage schemes
Regarding three-stage methods we will consider:
Alexander’s DIRK33 scheme
where is the root of in the interval . More precisely we have that
Tableau (18) seems appear for the first time in [Alexander1977, p. 1012].
Nørsett DIRK34 order method:
and is one of the roots of the equation . More precisely these roots are
The case of appears in the literature as the Crouzeix-Raviart scheme [Crou1979].
3.4. An alternative representation of DIRK schemes
Let us finish the general discussion about DIRK schemes with an alternative representation of the solution at the next time step as an extrapolation. This will be useful when deriving discrete energy-balance laws.
[extrapolation] Assume that the RK scheme with tableau (8) is such that is invertible. Then , the solution at the next discrete time, has the following representation
and is the canonical basis of .
For we introduce the notation
Therefore, the stages in (8) can be rewritten as
where we used that the matrix is invertible.
We can then write the solution at the next discrete time as
the result follows. ∎
4. Discrete energy-balance for two-stage schemes
Here we study discrete energy-balance laws for two-stage schemes. Our main contribution here is to find a class of schemes, which we will call remarkably stable, see Definition 4.1, which automatically satisfy a dissipative discrete energy-balance, see Definition 3.1.
4.1. General discrete energy-balance laws
We begin by specializing Proposition 3.4 for the case of two-stage schemes.
[two-stage extrapolation] Let the RK scheme (8) be such that , is lower triangular, and with positive diagonal entries. Then, for all ,
A direct computation shows that
as we intended to show. ∎
[some useful identities] Let , , be any partition of , and . If in (8) the matrix is lower triangular and with positive diagonal entries, then we have
where come from (9).
These identities follow from taking duality pairings of each of the stages with suitable functions. Identity (23) comes from testing the equation for the first stage with and using the well–known polarization identity (4). Similarly, identity (24) comes from testing the second stage with . Identity (25) comes from testing the first stage with . Similarly, we get (26) by testing the second stage with . Finally, we combine the two stages and test the result with to obtain (27). ∎
We are now in position to prove a precursor to (10). Notice that here we are not assuming any order conditions on the entries of the Butcher table.
[discrete energy identity I] Let , , be a partition of , and . If for all , then the solution to (9) satisfies
and where introduced in Corollary 4.1.
We begin by taking the inner product of the extrapolation identity (21) with itself to obtain
[consistency check] A direct computation shows that . Assuming that the scheme satisfies second order conditions, see Appendix C, we have that .
We now write a precursor to the dissipative energy identity (12).
[discrete energy identity II] The discrete energy-balance law (28) can be rewritten as
where , , and is a quadratic form given by
Exploiting the bilinearity of the duality pairing we have that
We use this identity to replace the last term on the right hand side of (28). After reorganizing the terms we get
We note that the discrete energy-balance laws (28) and (30) do not carry much practical value unless the sign of the coefficients is correct, and the quadratic form is nonnegative. However, if this is the case, our schemes will have all requisite stability properties. We encode this in the following definition.
[remarkable stability I] We say that the DIRK scheme (8) with , lower triangular and with positive diagonal entries is remarkably stable if the following conditions hold
Remarkable stability defines an exceptional class of schemes for which the off-diagonal term on the right hand side of (28) can always be absorbed into artificial damping terms regardless of the nature of
(coercive, linear, nonlinear, degenerate, skew symmetric, etc.).
The following result provides sufficient, easy to check, conditions for to be nonnegative.
[nonnegativity] Assume that, in the setting of Corollary 4.1, we have
Then, the quadratic from , introduced in (31) in nonnegative definite.
If and we have that
This, in particular, implies that
for all satisfying . ∎
[nonnegativity] We comment that the condition of Proposition 4.1 is only sufficient. Necessary and sufficient conditions are obtained by looking at the spectrum of the coefficient matrix of the quadratic form . In this case we have
Lengthy and painful, but trivial, computations reveal that
[computational aspects] Two important aspects of computational practice are time-adaptivity and nonlinear solver tolerances. Assume that we are using a remarkably stable scheme and that we are able to solve for the stages exactly (i.e., to machine accuracy). As a consequence, we obtain
We note that the functional gives us exactly how much numerical dissipation occurred from time instance to . In this context, the value of the quadratic form