The Prager-Synge theorem in reconstruction based a posteriori error estimation

In this paper we review the hypercircle method of Prager and Synge. This theory inspired several studies and induced an active research in the area of a posteriori error analysis. In particular, we review the Braess--Schöberl error estimator in the context of the Poisson problem. We discuss adaptive finite element schemes based on two variants of the estimator and we prove the convergence and optimality of the resulting algorithms.



There are no comments yet.


page 5


Residual-based a posteriori error estimates of mixed methods in Biot's consolidation model

We present residual-based a posteriori error estimates of mixed finite e...

Adaptive finite element approximation for steady-state Poisson-Nernst-Planck equations

In this paper, we develop an adaptive finite element method for the nonl...

Stabilized finite elements for Tresca friction problem

We formulate and analyze a Nitsche-type algorithm for frictional contact...

Recovery-based a posteriori error analysis for plate bending problems

We present two new recovery-based a posteriori error estimates for the H...

A pedagogical review on a posteriori error estimation in Finite Element computations

This article is a review on basic concepts and tools devoted to a poster...

Adaptive VEM: Stabilization-Free A Posteriori Error Analysis

In the present paper we initiate the challenging task of building a math...

An a posteriori error estimate of the outer normal derivative using dual weights

We derive a residual based a-posteriori error estimate for the outer nor...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1. Introduction

In this paper we review the hypercircle method introduced by Prager and Synge [PS47]

and some of its consequences for the a posteriori analysis of partial differential equations. We believe that it is useful to discuss a paper that has been the object of several studies and has induced an active research in the area of a posteriori analysis of partial differential equations. On the one hand, it turns out that the hypercircle method is well appreciated by people working in the field, but less known by applied mathematicians with a less deep knowledge of a posteriori error analysis. On the other hand, we think that it is useful to discuss the consequences of the hypercircle method for a posteriori error analysis after some years of active research in the field, which has led in particular to a nowadays mature study of adaptive finite element schemes. The hypercircle method provides a natural way to get guaranteed upper bounds for the error associated to Galerkin approximations; the corresponding lower bounds are more difficult to obtain and have been widely investigate in the literature.

It is now interesting to address the question whether an error estimator based on the hypercircle technique provides an optimally convergent method when combined with an adaptive strategy. This topic is less studied (see [KS11, CN12]) and we shall see that the answer to this question is not immediate.

The hypercircle method, originally developed for elasticity problems, can be used for several examples of PDEs. Starting from the pioneer work of Ladevèze and Leguillon [LL83], the Prager–Synge idea has led to several applications to the finite element approximation of elliptic problems [AO93, DM99, RSS04, RSS07, BS08b, BPS09b, Bra09, Ver09, Voh10, Voh11, CZ12b, Kim12, CM13] and of problems in elasticity [Bra13, Zha06, BMS10]. Other examples of applications include discontinuous Galerkin approximation of elliptic problems [BFH14] or for convection-diffusion problems [ESV10]; finite element approximation of convection-diffusion and reaction-diffusion problems has been studied in [CFPV09, DEV13]. The Stokes problem and two phase fluid-flow have been considered in [HSV12, DPVY15]. An intense activity is related to multiscale and mortar elements [PVWW13, TW13] as well as to porous media and porous elasticity [MN17, RDPE17, VY18]. Obstacle and contact problems have been studied in [BHS08, WW10, HW12]. Further examples of applications include Maxwell’s equations [CNT17], finite elements [DEV16]

, and eigenvalue problems 

[CDM17, LO13, BBS19]. An interesting unified approach is provided in [EV15a] where the -robustness of the error estimator is considered.

The hypercircle technique leads naturally to two methods: the so called gradient reconstruction (related to the construction of the function of Figure 1) and the equilibrated flux approach (related to the construction of the function of Figure 1).

We develop our study starting from the case of the Laplace operator and we shall focus on the equilibrated flux approach. More precisely, we are going to discuss what is generally known as Braess–Schöberl error estimator [BS08a]. For this estimator an a posteriori error analysis is well known which has been shown to be robust in the degree of the used polynomial [BS08a, BPS09a]. We refer the interested reader in particular to the nice unified framework presented in [EV15b] for more details on these results and for a complete survey of the use of equilibrated flux recovery in various applications.

The convergence analysis of the adaptive finite element method driven by non-residual error estimators has been performed in [KS11] and [CN12]. Both references start from the remark that it is not possible to expect in general a contraction property of the error and the estimator between two consecutive refinement levels. Since [KS11] is based on an assumption on the oscillations that might not be satisfied in our case (i.e., the oscillations are dominated by the estimator), in this paper we adopt the abstract setting of [CN12]. The Braess–Schöberl estimator is considered in [CN12, Section 3.5] where it is claimed that, up to oscillations, it is equivalent to the standard residual error estimator. We shall see that this property is not so immediate and that the consequence analysis has to be performed with particular care. In our paper we consider two variants of the Braess–Schöberl estimator: the first one is the most standard and it is based on single elements (we denote it by ); the second one is more elaborate and is based on patches of elements (denoted by ). The estimator has been introduced in [BS08a], while has been considered in [BPS09a]. We are going to show that actually is equivalent, up to oscillations, to a residual estimator arranged on patches of elements (see Section 4). We could not prove an analogous result for the estimator , which we analyze directly in Section 6. In both cases we have to pay attention to the appropriate definition and to the analysis of the oscillation terms. Oscillations are defined on patches of elements and the theory of [CN12] is modified accordingly. In turn, we present a clean theory where the convergence and the optimality of the adaptive schemes based on and on is rigorously proved.

The structure of the paper is the following: in Section 2 we recall the main results of the Prager–Synge hypercircle theory [PS47], in Section 3 we review the equilibrated flux reconstruction by Braess and Schöberl [BS08a, BPS09a], in Section 4 we show the equivalence of the estimator with the standard residual one. We are then ready to recall in Section 5 the main ingredients of the theory of [CN12] and to apply it to the adaptive finite element method based on . Finally, Section 6 shows how to apply directly the theory of [CN12] to the estimator without bounding it in terms of the standard residual estimator.

2. The Prager–Synge theory and its application to error estimates

We start this section by reviewing the main aspects of the hypercircle theory introduced by Prager and Synge in [PS47]. The theory was developed for the mixed elasticity equation: the problem under consideration was to seek with


where is the linear relationship between the stress and the strain.

In this paper we deal with the Poisson problem where a simplified version of the Prager–Synge theory can be applied.

Given a polytopal domain in and , our problem is to find such that


In this context, it is convenient to describe the Prager–Synge theory with the help of the mixed Laplacian equations. More precisely, let us consider the following problem: given , find and such that


Problem (2.3) corresponds to the case of homogeneous boundary conditions on . Clearly, more general boundary conditions can be considered. For the sake of completeness, we write down explicitly the general formulation associated to mixed boundary conditions on and on , where is split in a Dirichlet part and in a Neumann part . Let and denote the subspaces of vectorfields in with normal component vanishing or equal to , respectively, on . Then the problem is: find and such that

where the brackets in the first equation represent the duality pairing between and which, in the case of smooth functions, can be interpreted as

In this more general setting the analogue of (2.2) reads: find such that

where and denote the subspace of with boundary conditions on vanishing or equal to , respectively.

All the following theory could be stated in this general setting, but for the sake of readability we present it in the case when (so that and .

The equilibrium condition. Let be any function in satisfying the equilibrium equation ; then it is easily seen that

from which the following orthogonality is obtained


Equation (2.4) says that lies on a hypersphere having for diameter. The center of the sphere is denoted by in Figure 1.

Gradients of . Let now be the gradient of any function in

It follows that

and that

which imply


The orthogonality stated in (2.5) can be expressed by saying that and

lie on the same hyperplane orthogonal to


Putting together the orthogonalities of Equations (2.4) and (2.5) leads to the conclusion that and lie on the hypercircle given by the intersection of the hypersphere defined by (2.4) and the hyperplane given by (2.5). Moreover, let be the foot of on the hyperplane; since is orthogonal to and is the orthogonal projection of onto , we have the following orthogonality

which implies that the segment connecting to is a diameter of the hypercircle . The center of this hypercircle is denoted by in Figure 1.

The conclusion of this construction, summarized in Figure 1, is an energy bound with constant one which we state in the following theorem.


Figure 1. The hypercircle construction
Theorem 2.1.

Let be the second component of the solution to (2.3); let be any function in which satisfies the equilibrium condition in and let be the gradient of any function in . Then

where is the multiple of lying in the hyperplane orthogonal to and containing (see Figure 1).

We now state another important consequence of the previous geometrical construction which applies to problem (2.2) and which is usually referred to as Prager–Synge theorem.

Theorem 2.2.

Let be the solution of problem (2.2). Then it holds


for all and all satisfying the equilibrium condition .


From the orthogonalities defining the hypersphere and the hyperplane if follows immediately which gives the results with the identifications and . ∎

The Prager-Synge theorem has been used in order to obtain error estimates in various contexts, starting from  [LL83]. We describe the application of Theorem 2.2 in the case of the conforming finite element approximation of problem (2.2). Let be a finite dimensional subspace of and consider the discrete problem: find such that


We are going to consider a standard conforming , so that , the space of continuous piecewise polynomials of degree less than or equal to .

A direct application of Theorem 2.2 with and gives


where is any function in with in . It turns out that the right hand side in (2.8) is a reliable error estimator with constant one. Clearly, this fundamental idea leads to a viable approach only if it is possible to construct in a practical way. This is what is generally called equilibrated flux reconstruction.

Remark 2.3.

In the case when is piecewise polynomial, a possible (not practical) definition of could be obtained by solving an approximation of the mixed problem (2.3), so that is a discretization of . If is a generic function, a standard oscillation term will show up. A smart modification of this intuition is behind the Braess-Schöberl construction presented later in this paper.

Ainsworth and Oden in [AO00, Chap. 6.4] show that can be efficiently constructed by solving local problems. Let be the set of the interior edges of a shape-regular triangulation . We will also denote by the set of the boundary edges. In the case when problem (2.7) is solved with polynomials of degree , the reconstruction proposed in [AO00] seeks such that


where denotes the projection of onto polynomials of degree .

3. The Braess–Schöberl construction

In [BS08a] Braess and Schöberl show how to realize the above conditions (2.9a) and (2.9b) by exploiting some basic properties of the Raviart–Thomas finite element spaces. The resulting estimator is commonly called the Braess–Schöberl error estimator.

The local problems can be solved on patches around vertices of the mesh. The construction has been extended to different problems and geometrical configurations, thus allowing for a very powerful and general equilibration procedure. The reconstruction aims at defining in the broken Raviart–Thomas space of order , that is


where the Raviart–Thomas element is given by


and denotes the space of polynomials of degree at most on the domain . Clearly, since , we will have that belongs to by virtue of the jump conditions (2.9b).

The Braess-Schöberl reconstruction is performed as follows. Let denote the set of vertices of the triangulation, a vertex, and the patch of elements sharing the vertex


Let be the continuous piecewise linear Lagrange function with and whose support is , (that is, the hat function equal to one at the node ), so that the following partition of unity property holds


Hence can be decomposed into functions living on vertex patches, i.e.

where and .

Since each facet belongs to two elements the conditions (2.9a) and (2.9b) mean that the function has to fulfill


It is common to use a notation where the dependence on the discrete solution is made explicit, so that in general we are going to denote the reconstruction by or its contribution coming from a patch .

Two options are now given for the design of an error indicator based on the above reconstruction. The first one, introduced in [BS08a], considers directly the quantity on each single element


while the second on, presented in [BPS09a], is based on patches of elements


The estimators and are clearly not equivalent. People usually tend to consider as the standard Breass–Schöberl estimator, but it is clear that for the analysis sometimes may be more convenient.

An a posteriori analysis for both estimators is available in the sense that both satisfy a global reliability


and a global efficiency


up to oscillations (see, in particular, [Bra13, Theorems 9.4 and 9.5], and [BS08a, BPS09a]). The definition of the oscillation terms need particular attention. We shall comment on that in the next sections.

Explicit formulas in the case for the computation of are given in [BKMSa]. The direct construction is extended to in [CZ12a].

4. Equivalence with the residual error estimator

In this section we are going to show that, up to an oscillation term, the estimator is equivalent to an estimator based on the standard residual error estimator.

A crucial step for the analysis of the convergence of the adaptive scheme based on the Braess–Schöberl error estimator is its local equivalence with a standard residual error estimator. This fact has been observed (without rigorous proof) in [CN12, Section 3.5] and it has been used (without oscillations) in [KS11, Equation 2.17]. The interested reader is referred to [CFPP14, Section 8] for a more elaborate discussion about the equivalence between residual and non-residual error estimators.

The standard residual estimator for Laplace equation is based on two contributions: the element and jump residuals


where is an element of the triangulation and is a facet in the set of facets . The residual estimator for then reads


where is viewed as a piecewise function over and where as usual denotes the diameter of the element .

It is well known that the error estimator defines a functional as follows


The global residual error estimator on a triangulation is usually defined by adding up the local contributions


Unfortunately, no equivalence holds in general between and ; a crucial difference between the two estimators is that if an element belonging to the patch is refined and the discrete solution doesn’t change, then the error is not reduced, but the estimator decreases because of the reduction of the mesh-size; on the other hand, may not decrease since it is based on the equilibration procedure that might generate a reconstruction that is not different from the one computed on the coarser mesh.

An interesting alternative, described in [BPS09a] for piecewise constant , consists in building a residual error estimator which is based on element patches, so that the comparison with is more natural. This leads, for every node with corresponding Lagrangian function , to the following definition


We denote the corresponding global estimator by




The next lemma states the local equivalence between and the patchwise residual estimator .

Lemma 4.1.

Let be the solution of the variational formulation (2.7) and consider a node of the triangulation . Then, it holds


up to the oscillation term , that is


Let us start with the upper bound (4.9a). When is piecewise polynomial of degree , from [BPS09a, Theorem 7] we have


and thus, using standard scaling arguments,

If now is a generic function in , then the first term in (4.10) transforms into

so that it remains to show that



since . This implies

Let us now show how to prove the lower bound (4.9b). Recall that (3.5) implies

for any satisfying either zero boundary conditions or in the case when is an internal node. Now, take

with is defined as follows

where denotes the cubic Lagrange bubble function corresponding to the barycenter of and one of the Lagrange functions of degree associated to the edge . Since the norm of is bounded, we have

Moreover, we have that

By inserting the expression for and by evaluating the different terms separately we finally obtain

5. Optimal convergence rate for

In this section we recall the abstract theory developed in [CN12] for the analysis of AFEM formulations where nonresidual estimators are used and we show how to use it for the analysis of the AFEM based on the Braess–Schöberl error estimator. The interested reader is referred to [CN12, Sections 4–6]

for all details of the theory. The main results, stated in Theorems 5.1 and 6.6, are the contraction property for the total error (which guarantees the convergence of the AFEM procedure) and the quasioptimality of the rate of convergence in terms of number of degrees of freedom.

As usual when dealing with adaptive schemes, we use a notation that takes into account the levels of refinement instead of the mesh size. We denote by the initial triangulation of and by the discretization of on the triangulation obtained from after refinements. For some of the remaining notation we will adopt the one from [CN12].

Contraction property. If is the solution of problem (2.2) and is the solution of the corresponding discrete problem after refinements, the contraction property states the existence of constants , , and such that


where the norm denotes the -seminorm (equivalent to the norm in ). The main difference with respect to the standard contraction property commonly used in this context is that in general there might not be a contraction between two consecutive refinement levels and , but contraction is guaranteed every levels.

Quasioptimal decay rate. The quasioptimality in terms of degrees of freedom is described as usual in the framework of approximation classes. The triple , of the solution, the right hand side, and the other data of problem (2.2), is in the approximation class if

where the total error , in the set of conforming triangulations generated from with at most elements more than , is defined as


With this notation, the quasioptimal decay rate is expressed by the following formula


where the constant is independent of . We refer the interested reader to [CN12] for more detail on the constant , especially for its dependence on . Clearly, will depend in particular on the initial triangulation and on the integer appearing in the above contraction property.

The assumptions needed in order to get (5.1) and (5.3) are divided into three main groups: assumptions related to the a posteriori error estimators, assumptions related to the oscillations, and assumptions related to the design of the adaptive finite element method. We are going to use the newest vertex bisection algorithm for the refinement of the mesh (see, for instance, [Ste08]). While assumptions on oscillations and on the design of AFEM do not change when residual or nonresidual a posteriori estimators are used, the main modification for the analysis of nonresidual estimators is given by the verification of the assumptions related to the a posteriori error estimators. For this reason, we focus in this section only on these assumptions (see [CN12, Assumption 4.1]), which are the main object of our analysis in the present paper. We will also make more precise the reduction assumption about the oscillations (see condition [H5] later on). We adopt the notation of the previous section and we state the assumptions for a generic error estimator . In [CN12] there are some typos ( instead of , for instance) that we have corrected here.

[CN12] considers a closed set called -element made of elements or sides and denoted by . We restrict to the case when is a triangle. The following definition of refined set of order is needed between to (not necessarily consecutive) meshes and