1. Introduction
The purpose of this work is the study of structure preserving timemarching schemes for a class of evolution problems in Banach spaces; which essentially are used to describe possibly degenerate parabolic and hyperbolic initial boundary value problems. At this stage, it is enough for our purposes to say that, given a Banach space , a final time , and a mapping , we seek for that solves
(1) 
As usual, denotes the dual of , and is the duality pairing; see Appendix A for a, far from exhaustive, list of example problems of interest. Solutions of problem (1) satisfy the following energybalance law:
(2) 
where the notation is to be specified. Our goal in this work is to find numerical schemes that mimic this law. For instance, given a time partition with being the local timestep size, an approximate solution to (1) could be computed by the backward Euler scheme. Ignoring solvability issues, we obtain and it is immediate to see that
(3) 
where, once again, the notation is to be specified.
There are several reasons why discrete energy laws such as (3) are important. From the practical point of view, these are a nonperturbative form of stability: they do not rely on any smallness assumption, linearization, asymptotic argument, or proximity to some equilibrium state. For complex PDE problems, where a thorough quantitative and qualitative analysis of the solution, and corresponding numerical scheme, are far out of reach, satisfaction of a discrete energybalance is usually an excellent surrogate to stability.
On the other hand, from a theoretical point of view, it is often the case that with the help of discrete energybalance laws one can assert compactness of families of numerical solutions. Once again, for complex PDE problems, (weak) convergence of discrete solutions via compactness is all one can hope to achieve without introducing additional assumptions. Examples of a successful application of this approach are numerous. The interested reader is referred to the following references: elliptic problems [Gidas1981, Evans1987], hyperbolic [Tartar1979, Necas1996], and parabolic [Lions1969]; where one of the main tools is the AubinLions compactness lemma [Lions1969], or some refinement of it [Simon1987, Jungel2012, Jungel2014, Andrei2011, Gllou2012].
The backward Euler scheme is one of the simplest timestepping schemes, yet it is often possible to prove that it possesses suitable energybalance laws; see (3). Its major drawback, however, is that it is only first order accurate. In order to remedy this issue, higher order generalizations, like Galerkinintime schemes, have been developed and analyzed; see for instance [Lubich1995, Makri2006, Akri2011, Walk2014, Hochbruck2018, Walk2010, Ern2019] and references therein. These schemes can also be shown to possess energybalance laws, and are of arbitrary high order. However, their practical impact beyond academic examples has been rather limited; see [Pazner2017, Crockatt2019] for a few exceptions. This is due to the fact that Galerkinintime methods are algebraically equivalent to fulltableau RungeKutta (RK) methods [Akri2011]. As such, they require the solution of linear algebraic systems where the system matrix is of size for each Newton iteration: here is the number of stages of the RK method and
is the number of degrees of freedom of the spatial discretization, see for instance
[Smears2017].Diagonally implicit RungeKutta (DIRK) methods lie between these two extreme possibilities (backward Euler and Galerkinintime discretizations). DIRK methods offer significantly added computational benefit, since they only require the solution of linear systems of size at each stage, as well as higher order accuracy, see [Crouzeix1976, Alexander1977, Carpenter2016]. Their popularity is, perhaps, in big part due to the paper [Ascher1997] which has been extremely influential in the scientific computing community. The rigorous study of the mathematical properties of DIRK schemes, however, seems to be rather underdeveloped. This is particularly the case if we are interested in the numerical approximation of PDEs satisfying a discrete energybalance law such as (2). This brings us to the main motivation for our current work: we wish to present classes of DIRK schemes for which we can prove discrete energybalance laws, and study under which conditions the solution to these schemes enjoy suitable compactness properties.
To achieve these goals we organize our presentation as follows. Notation and the functional framework we shall operate under are introduced in Section 2. Here we also discuss the minimal set of assumptions we require on the mapping . Section 3 then begins by introducing the general form of DIRK schemes and making some comments regarding their implementation when applied to PDEs. Two notions of balance laws: discrete energybalance, and dissipative discrete energybalance, respectively, are introduced here for DIRK schemes; their importance and meaning is also discussed. This section then presents an exhaustive literature review regarding the existing (algebraic) notions of stability and why we believe these are not suitable for our purposes. A list of some popular two and threestage DIRK schemes finalizes this section. The core of our work is Sections 4 and 5 where, for two and threestage schemes, respectively, we explore the existence of discrete energybalance laws. We arrive at a property, which we call remarkable stability which immediately implies the existence of a dissipative discrete energy law for a DIRK scheme. We note that, given the Butcher tableau of an
stage DIRK scheme, verifying remarkable stability only requires some algebraic manipulations using the entries of the tableau, and the solution of an algebraic eigenvalue problem of size
. Each section concludes by verifying remarkable stability for a list of widely used DIRK schemes. Finally, in Section 6, we refine the assumptions on to arrive at our strongest notion: discrete Bochnerstability. We show that every remarkably stable scheme is discretely Bochnerstable, and that (families of) solutions to discretely Bochnerstable schemes possess suitable compactness properties.Finally, we believe that one important feature of our exposition is its simplicity. The main results of this manuscript rely, in essence, on the simple polarization identity
(4) 
and some algebraic manipulations. At times these manipulations may be long and tedious, and some assistance from a computer algebra software package^{3}^{3}3In this manuscript we used Mathematica©. may be desired. Despite of this, these are nothing but algebraic manipulations. When describing compactness properties, wellknown elementary order conditions of RK schemes may be necessary, which are summarized in Appendix C. No high level tools or specialized notions of algebraic stability for ODE solvers are used in our work.
2. Preliminaries
Here we describe the framework and assumptions we shall operate under. If its Hölder conjugate is . We extend this notation such that , , and for all . By we denote a final time.
By we shall mean for a nonessential constant that may change at each occurrence.
2.1. Functional framework
Throughout our work we shall assume that is a separable Hilbert space with inner product and norm . By we denote a Banach space, and its norm is denoted by . The dual of is denoted by , the duality pairing between them is denoted by . The norm in is denoted by .
We shall assume that is a Gelfand triple. We recall that in this setting the duality pairing is an extension of the innerproduct. In other words, every defines an element of whose action is defined by
(5) 
This identification will be repeatedly used in our discussion.
Let be a separable Banach space with norm and . For measurable, we define
Then, we define the Bochner space
which is Banach for the norm . The space of functions that are continuous is denoted by . We endow this space with the norm.
2.2. Initial value problems
With all the previous notation and preparations at hand we can proceed to rigorously describe the class of problems we are interested in approximating. We assume that the initial condition satisfies , and that the slope function satisfies . We then seek for such that and solves (1).
We will also assume that can be split into an autonomous and purely nonautonomous timedependent part as
where and . In the context of PDEs and/or ODEs on graphs, we assume that nonhomogeneous boundary data can always be assimilated into .
The minimal set of assumptions we shall impose on the mapping are as follows.

Nonnegativity: The mapping is nonnegative, i.e.,
(6) 
Localsolvability. We assume that, for every , there exists such that for every the problem
(7) has a unique solution . This assumption guarantees that nonlinear problems associated to each stage in DIRK schemes have a unique solution, possibly under some timestep size constraint.
As we shall see below, property (6) is sufficient in order to derive discrete energybalance laws for DIRK schemes. While a discrete energy balance law is not, in general, enough to prove a priori bounds in Bochnertype norms for the discrete solution or its time derivative, (6) covers a large family of relevant problems; see Appendix A. Stronger assumptions on will be imposed in Section 6, and these will allow us to establish a priori bounds in Bochnertype norms.
3. DIRK schemes
In this section we recall some general notions related to RK schemes; and, in particular, present some details regarding DIRK schemes. We will also detail the main stability notion for these schemes that we shall pursue.
We recall that RK schemes are uniquely characterized by their socalled Butcher tableau
(8) 
Here is the number of stages, is the matrix of coefficients, are the pseudocollocation times, and are the weights. For the sake of completeness, necessary order conditions for (8) are summarized in Appendix C. We remind the reader that a DIRK scheme is one where the matrix is lower triangular.
Let us now, to make things precise, detail how a DIRK scheme is applied to (1). Let be the number of steps. We introduce a partition of , i.e.,
and set , for to be the local timestep. Starting from we will compute the sequence such that as follows. For we solve the following equations in
(9) 
where we introduced a shorthand notation defined by
For fixed , the quantities are called the stages.
Notice that a generic step in (9) requires, given , to find that solves (7) with, for some , . Thus, owing to the local solvability condition, for this scheme to be welldefined for any partition , it is sufficient to require that . For this reason, in what follows we shall assume that this is always the case for a DIRK scheme.
Finally, observe that, although the equations are posed in , it is not difficult to show that, for all , .
[smoothness of the right hand side] We comment that usually the theory regarding wellposedness of (1), only requires the right hand side to be such that for some . This makes, in general, the quantities , for and , meaningless, as point evaluations of are not possible. This usually is circumvented by replacing by a suitable approximation . In order to avoid unnecessary clutter of the notation we will ignore this issue.
[interpretation] We note that (9) must be understood by its action against suitable “test functions”. Owing to (5), a generic stage for must be such that, for every ,
3.1. Discrete energybalance
As we described above, stage DIRK methods are characterized by its tableau, and are such that the matrix is lower triangular and has positive diagonal entries. The following definition will be our main notion of stability for DIRK schemes.
[discrete energybalance] We say that a DIRK scheme with tableau (8), where is lower triangular and has positive diagonal entries, satisfies a discrete energybalance if, for any , every partition , and all , we have that
(10) 
where we introduced the notation and . The coefficients and depend only on the tableau, and are expected to satisfy the following constraints:

for all .

for all .

The coefficients must satisfy the constraint
(11)
Notice that (10) differs from (3) in at least two salient terms. First, the presence of the terms may seem out of place. However, these terms represent purely numerical artificial damping and are usually quite desirable when dealing with dissipative/parabolic PDEs^{4}^{4}4For instance, in some very specific contexts such as projection methods for incompressible NavierStokes equations, artificial damping terms are critical to guarantee numerical stability of the scheme, see for instance [GS2009, GS2011] and references therein.. However, artificial damping is not desirable when considering the discretization of Hamiltonian, or related, PDEs (e.g., PDEs that preserve quadratic invariants). For this reason, we allow the coefficients to be all zero if that is suitable for the problem at hand. Second, while the terms are expected, the offdiagonal terms, i.e., with , may be problematic and, thus, it is possible that (10
) will not lead to suitable a priori estimates for discrete solutions. For this reason, we introduce a stronger notion, which we call “dissipative discrete energybalance”.
[dissipative discrete energybalance] We say that a DIRK scheme with tableau (8) satisfies a dissipative discrete energybalance if there are strictly positive , and
a nonnegative definite quadratic form on such that, for any , every partition , and all , we have
(12) 
Expression (12) states that, beyond the “energy” introduced into the system by the nonautonomous part , the scheme will not produce any spurious surplus of energy. Moreover, the fact that is nonnegative definite implies that the scheme may even dissipate some energy. The following result makes this intuition rigorous.
3.2. Literature review
At this point, it is worth making some comments about RK methods, discrete energy laws and the preexisting literature.
There is a significant body of numerical ODE literature attempting to bridge the gap between algebraic notions of stability and nonlinear notions of stability; see for instance: Astability [Dahlquist1963], LStability [Ehle1969, Ehle1973], BStability [Butcher1975], ANStability [Burrage1979], BNstability [Burrage1979], BSstability and BSIstability [Frank1985I, Frank1985II] and Gstability [Dahlquist1978] among many others (see also [Hairer1993I, Butcher2008]). Similarly, there is a specific body of scientific literature relating algebraic notions of stability and discrete energy laws [Burrage1979, Burrage1980, Humph1994, Ranocha2020, Wu2022]. However, we find that the classical techniques used in the numerical ODE literature are largely incompatible with our current goals, and the notions of stability that we are trying to advance; see Definitions 3.1–3.1 and Definition 6 below. We explain our reasoning in more detail.

Functional setting. A common assumption in the numerical ODE literature is that boundedly; see, for instance [Ranocha2018, Ranocha2020, Higueras2005, Wu2022]. This is a rather stringent assumption that, in general, is not suitable for PDEs. For instance, it does not allow us to capture the linear heat equation, incompressible NavierStokes equation, nor large families of advectionreactiondiffusion systems. For this reason we, instead, focus on the Gelfandtriple functional framework and assume that .

Choice of norms. The numerical ODE literature focuses on the development of estimates, i.e., for all ; see, for instance [Burrage1980, Humph1994]. For ODEs in and some limited cases of linear hyperbolic PDEs, this may suffice. However, for many PDE problems, such as paraboliclike problems, this type of estimate may be insufficient. Without a priori bounds on spatial derivatives in Bochnertype norms, it is not possible to assert stability of such schemes. To assert convergence, usually, one additionally requires an a priori estimate on the time derivative, again in a Bochnertype norm. In other words: discrete energy balances of the form (12) are suitable for the analysis of paraboliclike problems, while estimates of the form (13), in general cannot yield enough compactness.
We note that estimates in , for some , are standard in the PDE and numericalPDE literature [Lions1969, MarionTemam1998, Thomee2006, Walk2010, Gllou2012, Feng2006, Layton2008, Shen2010, Roubicek2013, Elliott2021]. On the other hand, to our knowledge, the numerical ODE literature [Hairer1993I, HumphBook, Butcher2008, Hairer1996II] has not focused on a priori bounds in Bochnertype norms, spacetime compactness, or convergence without regularity assumptions for problems of growing dimensionality (i.e. discretization of evolutionary PDEs).

Finite dimensionality. We want to develop stability results that are valid for finite dimensional problems as well as their infinite dimensional limits. This is a somewhat delicate issue when dealing with operators of the form . Let us explain what we mean here. As we detailed above, see (7) and Remark 3, a generic stage must be interpreted as: find such that
At this point one may be tempted to set . However, that is not necessarily welldefined unless additional assumptions are made^{5}^{5}5For instance, if , the product is meaningful only if we assume regularity of .. Similarly, higher order compositions of the operator, i.e., , are not meaningful unless maps a Banach space to itself. We note that energy identities and a priori bounds in norm using such constructions are common in the numerical ODE literature; see for instance [Ranocha2018, Ranocha2020, Higueras2005, Wu2022] and the review paper [Humph1994, p. 1464–1465].
In light of the shortcomings mentioned above, in this manuscript we develop stability results that:

Target specifically DIRK schemes in the framework of a Gelfand triple and unbounded operators. We limit ourselves to the case of two and threestage schemes.

Energy identities and a priori bounds will solely rely on the inner product in , the duality pairing , and additive telescopic cancellation arguments. We do not use or invoke higher order compositions of the operator , higher order products such as , nor similar “multiplicative” constructions.

While this may be necessary to show existence of solutions to (1), we make no assumption of contractivity/monotonicity of our operator to obtain stability. Our primary notion of nonlinear stability revolves around “dissipative discrete energybalances”, see (12), which is, strictly speaking, a property of the scheme, not a property of the operator. This enables the proof of a priori bounds in Bochnertype norms for and its time derivative for some families of operators; see Proposition 6.

Our a priori Bochnertype norm estimates make a clear cut distinction between “artificial/numerical damping” and “physical or PDE dissipation” terms. A precise identification of artificial damping terms plays a pivotal role in order to establish stability of the scheme. On the other hand, physical dissipation is fundamental to establish dual norm estimates on the time derivative of the discrete solution.
As a final comment we note that the mathematical theory about Galerkinintime and/or fulltableau RK methods developed by the numerical PDE community; see [Lubich1995, Makri2006, Akri2011, Walk2014, Hochbruck2018, Ern2019] and references therein, rarely ever applies (or even mentions) DIRK schemes. We do however, highlight the preexistence of quite relevant material with specific focus on DIRK schemes that shares a few common points of intersection with the material presented in this manuscript. In particular [Shin2020] addresses the issue of gradient flow stability for DIRK schemes, while [Osterman2010] addresses convergence without regularity assumptions using semigroup methods. However, the material presented in the current paper differs quite significantly in relationship to the mathematical tools used and the degree of generality.
3.3. Some popular DIRK schemes
Let us present some popular two and threestage DIRK schemes, and briefly mention some of their known properties. These will be our specific examples under consideration used to illustrate the developed theory.
3.3.1. Twostage schemes
We will consider the following twostage schemes.

ButcherBurrage DIRK22 scheme:
(15) To the best of our knowledge this tableau appears for the first time in [Burrage1979, p. 51].

KraaijevangerSpijker DIRK22 scheme
(16) see [Kraaije1989, p. 77].

Crouzeix’s DIRK23 scheme:
(17) This tableau can be found in [Crou1979]. This scheme is third order accurate.
3.3.2. Threestage schemes
Regarding threestage methods we will consider:

Alexander’s DIRK33 scheme
(18) where is the root of in the interval . More precisely we have that
Tableau (18) seems appear for the first time in [Alexander1977, p. 1012].

Nørsett DIRK34 order method:
(19) and is one of the roots of the equation . More precisely these roots are
(20) The case of appears in the literature as the CrouzeixRaviart scheme [Crou1979].
3.4. An alternative representation of DIRK schemes
Let us finish the general discussion about DIRK schemes with an alternative representation of the solution at the next time step as an extrapolation. This will be useful when deriving discrete energybalance laws.
[extrapolation] Assume that the RK scheme with tableau (8) is such that is invertible. Then , the solution at the next discrete time, has the following representation
where
and is the canonical basis of .
Proof.
For we introduce the notation
Therefore, the stages in (8) can be rewritten as
where we used that the matrix is invertible.
We can then write the solution at the next discrete time as
Setting
the result follows. ∎
4. Discrete energybalance for twostage schemes
Here we study discrete energybalance laws for twostage schemes. Our main contribution here is to find a class of schemes, which we will call remarkably stable, see Definition 4.1, which automatically satisfy a dissipative discrete energybalance, see Definition 3.1.
4.1. General discrete energybalance laws
We begin by specializing Proposition 3.4 for the case of twostage schemes.
[twostage extrapolation] Let the RK scheme (8) be such that , is lower triangular, and with positive diagonal entries. Then, for all ,
(21) 
with
(22) 
Proof.
A direct computation shows that
as we intended to show. ∎
[some useful identities] Let , , be any partition of , and . If in (8) the matrix is lower triangular and with positive diagonal entries, then we have
(23)  
(24)  
(25)  
(26)  
(27) 
where come from (9).
Proof.
These identities follow from taking duality pairings of each of the stages with suitable functions. Identity (23) comes from testing the equation for the first stage with and using the well–known polarization identity (4). Similarly, identity (24) comes from testing the second stage with . Identity (25) comes from testing the first stage with . Similarly, we get (26) by testing the second stage with . Finally, we combine the two stages and test the result with to obtain (27). ∎
We are now in position to prove a precursor to (10). Notice that here we are not assuming any order conditions on the entries of the Butcher table.
[discrete energy identity I] Let , , be a partition of , and . If for all , then the solution to (9) satisfies
(28) 
where
(29) 
and where introduced in Corollary 4.1.
Proof.
We begin by taking the inner product of the extrapolation identity (21) with itself to obtain
[consistency check] A direct computation shows that . Assuming that the scheme satisfies second order conditions, see Appendix C, we have that .
We now write a precursor to the dissipative energy identity (12).
[discrete energy identity II] The discrete energybalance law (28) can be rewritten as
(30) 
where , , and is a quadratic form given by
(31) 
Proof.
Exploiting the bilinearity of the duality pairing we have that
We use this identity to replace the last term on the right hand side of (28). After reorganizing the terms we get
(32) 
We note that the discrete energybalance laws (28) and (30) do not carry much practical value unless the sign of the coefficients is correct, and the quadratic form is nonnegative. However, if this is the case, our schemes will have all requisite stability properties. We encode this in the following definition.
[remarkable stability I] We say that the DIRK scheme (8) with , lower triangular and with positive diagonal entries is remarkably stable if the following conditions hold
with , , , , and defined in (29), and the quadratic form , introduced in (31), is nonnegative definite.
Remarkable stability defines an exceptional class of schemes for which the offdiagonal term on the right hand side of (28) can always be absorbed into artificial damping terms regardless of the nature of
(coercive, linear, nonlinear, degenerate, skew symmetric, etc.).
The following result provides sufficient, easy to check, conditions for to be nonnegative.
[nonnegativity] Assume that, in the setting of Corollary 4.1, we have
Then, the quadratic from , introduced in (31) in nonnegative definite.
Proof.
If and we have that
Therefore,
This, in particular, implies that
for all satisfying . ∎
[nonnegativity] We comment that the condition of Proposition 4.1 is only sufficient. Necessary and sufficient conditions are obtained by looking at the spectrum of the coefficient matrix of the quadratic form . In this case we have
Lengthy and painful, but trivial, computations reveal that
[computational aspects] Two important aspects of computational practice are timeadaptivity and nonlinear solver tolerances. Assume that we are using a remarkably stable scheme and that we are able to solve for the stages exactly (i.e., to machine accuracy). As a consequence, we obtain
We note that the functional gives us exactly how much numerical dissipation occurred from time instance to . In this context, the value of the quadratic form