An adaptive stabilized finite element method based on residual minimization

by   Victor M. Calo, et al.

We devise and analyze a new adaptive stabilized finite element method. We illustrate its performance on the advection-reaction model problem. We construct a discrete approximation of the solution in a continuous trial space by minimizing the residual measured in a dual norm of a discontinuous test space that has inf-sup stability. We formulate this residual minimization as a stable saddle-point problem which delivers a stabilized discrete solution and an error representation that drives the adaptive mesh refinement. Numerical results on the advection-reaction model problem show competitive error reduction rates when compared to discontinuous Galerkin methods on uniformly refined meshes and smooth solutions. Moreover, the technique leads to optimal decay rates for adaptive mesh refinement and solutions having sharp layers.


page 14

page 21

page 22


Residual minimization for goal-oriented adaptivity

In [12], the authors introduce an adaptive, stabilized finite element me...

An adaptive discontinuous Petrov-Galerkin method for the Grad-Shafranov equation

In this work, we propose and develop an arbitrary-order adaptive discont...

Adaptive staggered DG method for Darcy flows in fractured porous media

Modeling flows in fractured porous media is important in applications. O...

Adaptive stabilized finite elements: Continuation analysis of compaction banding in geomaterials

Under compressive creep, visco-plastic solids experiencing internal mass...

Sparse residual tree and forest

Sparse residual tree (SRT) is an adaptive exploration method for multiva...

Optimization and variational principles for the shear strength reduction method

This paper is focused on the definition, analysis and numerical solution...

1 Introduction

Continuous Galerkin Finite Element Methods (CG-FEM) are popular solution strategies in many engineering applications. However, these methods can suffer from instability under reasonable physical assumptions if the mesh is not sufficiently refined. This limitation led to the development of several alternative methods that achieve stability differently (see, e.g., ern2013theory; HUGHES2017 and references therein). Broadly speaking, and referring to Ern_Guermond_16 for a more detailed overview, there are two approaches to enhance stability in CG-FEM: residual and fluctuation-based stabilization techniques. Residual-based stabilization is exemplified by the Least-Squares Finite Element Method (LS-FEM) which was pioneered in the late 60’s and early 70’s (cf., Diska:68; Lucka:69; BrSc70 and Jiang99 for a more recent overview). The suboptimal convergence rate of LS-FEM in the -norm was improved by the Galerkin/Least-Squares (GaLS) method HFH89. The extension of the Least-Squares approach to more general dual norms was studied in CohDahWelM2AN2012 where the authors make the connection between residual minimization in test dual norms and the mixed formulation (cf., Equation (27)). In this context, the use of different test norms to stabilize convection-diffusion problems was further investigated in ChaEvaQiuCAMWA2014. Interestingly, the idea of residual minimization in non-standard dual norms is also at the heart of the recent Discontinuous Petrov–Galerkin (DPG) methods (see, e.g., zitelli2011class; DemHeuSINUM2013; ChaHeuBuiDemCAMWA2014, and DemGopBOOK-CH2014 for a general overview). Alternatively, several strategies based on fluctuation stabilization exist. Some methods penalize the gradient of some fluctuation or some fluctuation of the gradient as in GuM2AN; BecBra:01; MaSkT:07. Other methods penalize the gradient jumps across the mesh interfaces as in Burman:05; BurHa:04. Yet, other techniques enlarge the trial and test spaces with discontinuous functions and penalize the solution jumps across the mesh interfaces, as in the discontinuous Galerkin (DG) methods osti_4491151; LesRa:74; JohPi:86; CoKaS:00; BrMaS:04; Ern2006 (see also the textbook di2011mathematical for a recent overview).

In this work, we introduce a new adaptive stabilized FEM and study its numerical performance when approximating advection-reaction problems. We combine the residual minimization idea with the inf-sup stability offered by a large class of DG methods. More precisely, the discrete solution we seek is the minimizer in a continuous trial space (e.g., -conforming finite elements) of the residual measured in a dual norm with respect to DG test functions. This DG norm provides inf-sup stability to the formulation. In practice, such a residual minimization implies a stable saddle-point problem involving the continuous trial space and the discontinuous test space. The price paid for solving the larger linear system (compared to just forming the normal equations as in LS-FEM) is compensated by two advantages. Firstly, the present solution is more accurate, especially in the

-norm, than the LS-FEM solution. Actually, we prove that the present method leads to error estimates of the same quality as those delivered by DG methods, yet for a discrete solution belonging to a continuous trial space only. The second advantage is that in the present approach, the component in the DG space plays the role of an error representative that can be used to drive adaptive mesh refinement. Thus, we compute on the fly a discrete solution in the continuous trial space and an error representation in the discontinuous test space. From the practical point of view, we illustrate numerically the benefits of the adaptive mesh refinement procedure in the context of the approximation of the advection-reaction model problem with solutions possessing sharp inner layers.

We now put our work in perspective with respect to the state-of-the-art literature. Compared with the LS-FEM paradigm, we solve a larger linear system, but the advantage is that we obtain an error representation that guides the mesh adaptation and that the error decay rates with respect to the number of degrees of freedom are better. Compared to the recent developments on DPG methods, we share the goal of minimizing the residual with respect to an adequate norm, using broken test spaces, implying discrete stability and an error representation to guide adaptivity. Nevertheless, the main difference is that DPG methods are constructed considering conforming formulations and localizable test norms. In particular, for first-order PDEs and if the trial space is continuous, the norm equipping the test space must be the

-norm. Stronger norms can be applied using ultra-weak formulations which only require regularity for the trial space. This in practice leads to a discontinuous approximation of the trial solution, requiring additional unknowns to represent traces of the solution over the mesh skeleton, and therefore increasing the total number of degrees of freedom in the system (however, the broken test space structure with localizable norm allows for practical static condensation procedures, which lead to linear cost solvers when combined with multigrid techniques). Here, the trial space is continuous, but the test space is equipped with a stronger norm than , and the stability of the method is inherited from the DG formulation that provides the inner product and norm on top of which we build our method.

We organize the paper as follows. In Section 2, we recall some basic facts concerning the continuous and discrete settings. In particular, we present the abstract framework from Ern2006 for the error analysis of nonconforming approximation methods (see also di2011mathematical and Strang’s Second Lemma). We devise and analyze the present stabilized FEM in Section 3, where our main results are Theorems 2 and 3. In Section 4, we briefly describe the mathematical setting for the advection-reaction model problem and recall its DG discretization using both centered and upwind fluxes. In Section 5, we present numerical results illustrating the performance of our adaptive stabilized FEM. Finally, we draw conclusions in Section 6.

2 Continuous and discrete settings

2.1 Well-posed model problem

Let (continuous trial space) and (continuous test space) be (real) Banach spaces equipped with norms and , respectively. Assume that is reflexive. We want to solve the following linear model problem:


where is a bounded bilinear form on , is a bounded linear form on , and is the duality pairing in . Equivalently, problem (1) can be written in operator form by introducing the operator such that:


leading to the following problem:


We assume that there exists a constant such that


and that . These two assumptions are equivalent to problem (1) (or equivalently (3)) being well-posed owing to the Banach–Nec̆as–Babus̆ka theorem (see, e.g., (ern2013theory, Theorem 2.6)). Moreover, the following a priori estimate is then satisfied:


Finally, we mention that when , a sufficient condition for well-posedness is that the bilinear form is coercive, that is, there exists a constant such that


Then problem (1) (or equivalently (3)) is well-posed owing to the the Lax–Milgram lemma (see, e.g., lax2005parabolic), and the same a priori estimate (5) is satisfied.

2.2 Functional setting

For any open and bounded set , let be the standard Hilbert space of square-integrable functions over for the Lebesgue measure, and denote by and its inner product and inherited norm, respectively. Denote by

the corresponding space composed of square-integrable vector-valued functions and, abusing the notation, still by

the associated inner product. Considering weak derivatives, we recall the following well-known Hilbert space:


equipped with the following inner product and norm (respectively):


Let be the standard Dirichlet trace space of over the boundary , and be its dual space with as pivot space. More generally, to warrant that traces are well defined at least in , we will call upon the fractional-order Sobolev spaces , with .

2.3 Discrete setting

Figure 1: Skeleton orientation over the internal face .

Let be a conforming partition of the domain into open disjoint elements , such that


We denote by the boundary of the element , by , the set of interior and boundary edges/faces (respectively) of the mesh, and by the skeleton of . Over , we define the following standard broken Hilbert space:


with inner product defined as


For any and any interior face/edge (see Figure 1), we define the jump and the average of a smooth enough function as follows:


for a.e. , where and denote the traces over defined from a predefined orientation which is fixed by the unit normal vector . The subscript is omitted from the jump and average operators when there is no ambiguity.

Finally, we denote by , , the set of polynomials of total degree at most defined on the element , and we consider the following broken polynomial space:


2.4 Abstract setting for nonconforming approximation methods

Let be a finite-dimensional space composed of functions defined on (typically a broken polynomial space). We approximate the continuous problem (1) as follows:


where denotes a discrete bilinear form defined over , and a discrete linear form over . We say that the approximation setting is nonconforming whenever or .

To ascertain that the discrete problem (15) is well posed and to perform the error analysis, we follow the framework introduced for DG approximations in Ern2006 (see also di2011mathematical and Strang’s Second Lemma) which relies on the following three assumptions:

Assumption 1.

(Stability): The space can be equipped with a norm such that there exists a constant , uniformly with respect to the mesh size, such that:

Assumption 2.

(Strong consistency with regularity) The exact solution of (1) belongs to a subspace such that the discrete bilinear form supports evaluations in the extended space with , and the following holds true:


which amounts to , for all (Galerkin’s orthogonality).

Assumption 3.

(Boundedness): The stability norm can be extended to and there is a second norm on satisfying the following two properties:

  1. , for all ;

  2. there exists a constant , uniformly with respect to the mesh size, such that:


For a linear form , we set


where is the stability norm identified in Assumption 1. The above assumptions lead to the following a priori and error estimates (see Ern2006; di2011mathematical).

Theorem 1 (A priori and error estimates).

Denote by the solution of the continuous problem (1) and suppose that the assumptions 13 are satisfied. Then there exists a unique solution to the discrete problem (15), and the following two estimates are satisfied:




3 Residual minimization problem

We obtain the schemes we propose from the following two ingredients:

  • First, we select a finite-dimensional space and a discrete bilinear form such that Assumptions 12, and 3 hold true.

  • Second, we identify a subspace such that has the same approximation capacity as the original space for the types of solutions we want to approximate. The main example we have in mind is to choose as a broken polynomial space and as the -conforming subspace. We refer the reader to Remark 5 for a further discussion on the appoximation capacity of the spaces and in the context of advection-reaction equations.

Starting from a stable formulation of the form (15) in , we use the trial subspace to solve the following residual minimization problem:


where is defined as:


and denotes the inverse of the Riesz map:


The second equality in (22) follows from the fact that the Riesz map is an isometric isomorphism. Classically, the minimizer in (22) is a critical point of the minimizing functional, which translates into the following linear problem:


Defining the residual representation function as


problem (25) is equivalent to finding the pair , such that:


Conversely, if the pair solves (27), then (26) holds true and is the minimizer of the quadratic functional in (22).

Theorem 2 (A priori bounds and error estimates).

If the assumptions 1-3 are satisfied, then the saddle-point problem (27) has one and only one solution . Moreover, such a solution satisfies the following a priori bounds:


and the following a priori error estimate holds true:


recalling that is the exact solution to the continuous problem (1).


See A. ∎

Assumption 4.

(Saturation) Let be the second component of the pair solving the saddle-point problem (27). Let be the unique solution to (15). There exists a real number , uniform with respect to the mesh size, such that .

Theorem 3 (Error representative).

Let be the second component of the pair solving the saddle-point problem (27). Let be the unique solution to (15). Then the following holds true:


Moreover, if the saturation Assumption 4 is satisfied, then the following a posteriori error estimate holds true:


We observe that

which leads to the bound (30) owing to the inf-sup condition (16). Then, invoking the triangle inequality and the saturation assumption leads to

Re-arranging the terms and using (30) proves the a posteriori estimate (31). ∎

Remark 1 ().

In the particular case where , one readily verifies that the unique solution to the saddle-point problem (27) is the pair , where is the unique solution to (15). In this situation, the bound (30) is not informative.

Remark 2 (Ls-Fem).

Assume that , , is the broken polynomial space defined in (14) and that is the -conforming subspace . Assume that is equipped with the -norm and that the inf-sup condition (16) holds true (in the examples we have in mind, for example, a first-order PDE such as the advection-reaction equation described in Section 4, the discrete bilinear form is -coercive). Then, the residual minimization problem (22) coincides with the Least-Squares Finite Element Method (LS-FEM) set in , and the error representative is the -projection onto of the finite element residual. Unfortunately, the -norm is too weak, leading to an error estimate (29) that is suboptimal, typically by one order in the mesh size (i.e., of the form ). To remedy this difficulty, we shall equip with a stronger norm inspired by the DG method, thereby leading to (asymptotic) quasi-optimality in (29), that is, an upper bound of the form .

4 Model problem: Advection-reaction equation

In this section, we present examples of the above formulations in the context of the advection-reaction model problem.

4.1 Continuous setting

Let , with , be an open, bounded Lipschitz polyhedron with boundary and outward unit normal . Let denote a bounded reaction coefficient, and let denote velocity field such that . Assume that the boundary can be split into the three subsets (inflow/outflow) and (characteristic boundary). Finally, let denote a source term and denote a boundary datum, where


The advection-reaction problem reads:


Consider the graph space equipped with the inner product leading to the norm such that . Then, is a Hilbert space. Moreover, assuming that and are well separated, that is, , traces are well defined over , in the sense that the operator:


extends continuously to (cf., Ern2006).

One possible weak formulation of (33) with weak imposition of the boundary condition reads:


with denoting the negative part such that for any real number , and the continuous bilinear form is such that:


The model problem (35) is well-posed under some mild assumptions on and (see  Ern2006 or CanCR2017 for the details). The well-posedness is specific to right-hand sides of the form (35) (and not to any right-hand side in ), so that well-posedness is not established by reasoning directly on (35), but on an equivalent weak formulation where the boundary condition is strongly enforced by invoking a surjectivity property of the trace operator.

4.2 DG discretization

Recall the discrete setting from Section 2.3. For a polynomial degree , the standard DG discretization of the model problem (35) is of the form:


where and are the bilinear forms over such that




where is the penalty parameter, and is the linear form over such that


The choice corresponds to the use of centered fluxes, and the choice (typically ) to the use of upwind fluxes, see BrMaS:04; di2011mathematical. To verify that the assumptions 1-3 are satisfied, we need to make a reasonable regularity assumption on the exact solution.

Assumption 5.

(Partition of   and regularity of exact solution ) There is a partition of into open disjoint polyhedra such that:

  1. The inner part of the boundary of is characteristic with respect to the advective field, that is, for all and all , where is the unit outward normal to ;

  2. The exact solution is such that

  3. The mesh is aligned with this partition, that is, any mesh cell belongs to one and only one subset .

Assumption 5 is instrumental to ensure that the trace of is meaningful on the boundary of each mesh cell, and that can only jump across those interfaces that are subsets of some with . In the case where has no jumps, we can broadly assume that , in which case the statement of Assumption 5-(i) is void.

The broken polynomial space can be equipped with various norms. We consider the following choices:


In view of Assumption 3, we define with defined in (41), and we consider the following extensions of the above norms:


For the proof of the following result, we refer the reader to di2011mathematical.

Proposition 1.

(Verification of the assumptions) In the above framework, Assumptions 1-3 hold true in the following situations: (i) (centered fluxes) and the norms and ; (ii) (upwind fluxes) and the norms and .

Remark 3.

(Coercivity) The case of centered fluxes in Proposition 1 leads to stability in the form of coercivity, whereas the case of upwind fluxes leads to an inf-sup condition, and the norm that is controlled is stronger than with centered fluxes.

An immediate consequence of Theorem 1 is the bound


with uniform with respect to the mesh size, where the lower bound is trivial and simply results from the fact that . When comparing the decay rates with respect to the mesh size for the best-approximation errors of smooth solutions in the norms and , the following terminology is useful: the estimate (44) is said to be suboptimal if the left-hand side decays at a faster rate than the right-hand side, and it is said to be (asymptotically) quasi-optimal if both sides decay at the same rate. The inequality in (44) is suboptimal when centered fluxes are used (the left-hand side decays at a rate of order and the right-hand side decays at rate ), and it is (asymptotically) quasi-optimal when upwind fluxes are used (both sides decay at a rate of order ) (cf., di2011mathematical).

Remark 4 (Continuous trial space).

If we set , that is, if is composed of continuous, piecewise polynomial functions, the discrete bilinear form reduces on to

since functions in have zero jumps across the mesh interfaces and their piecewise gradient coincides with their weak gradient over .

Remark 5 (Best approximation).

An interesting property of the present setting regarding the approximation capacity of the discrete spaces and is that, for all , ,

with uniform with respect to the mesh size (see B). The converse bound is trivially satisfied with constant equal to since .

5 Numerical examples

In this section, we present the 2D and 3D test cases, cover some implementation aspects, and discuss the numerical results.

5.1 Model problems

5.1.1 2D model problem

(a) 2D problem setting
(b) 2D initial mesh
Figure 2: 2D model problem and initial mesh
(a) 2D exact solution for
(b) 2D exact solution for
Figure 3: 2D exact solutions for and

We consider the purely advective problem (33) () over the unit square , with a constant velocity field (see Figure 1(a)). We consider the source term in and, for a parameter , an inflow boundary datum , where , and the exact solution is given by


The parameter tunes the width of the inner layer along the line of equation . The limit value as M grows becomes


The inner layer can be seen in Figure 2(b) for the case (compare with Figure 2(a) for the case ). Although the inflow and outflow boundaries are not well-separated here, the exact solution matches the partition and regularity assumption 5. Indeed, when , whereas is composed of two subsets with the common characteristic interface when (see Figure 1(a)). In addition, the absence of a reactive term precludes the straightforward derivation of -stability by a coercivity argument. However, -stability is recovered via an inf-sup argument which remains valid whenever the advective field is filling (which is trivially the case for a constant field across a square domain); we refer the reader to devinatz1974asymptotic; azerad1996inegalite; ayuso2009discontinuous; cantin2017edge for further insight into this point.

5.1.2 3D model problem

(a) 3D initial mesh
(b) 3D exact solution for
Figure 4: 3D initial mesh and exact solution

We still consider the purely advective problem (33), this time over the unit cube , the source term , the spiral-type velocity field , and the inflow boundary datum , where the exact solution is given by


with and . The limit value as grows becomes


Figure 3(b) illustrates the inner layer for . The inflow boundary corresponds to the following union of portions of planes:


where , , , and . As for the 2D model problem, the solution does not satisfy the regularity assumption 5 since the inflow and outflow boundaries are not well separated; moreover, whenever , the exact solution is piecewise smooth, but the subsets in the corresponding partition are not polyhedra.

5.2 Implementation aspects

We consider the broken polynomial space , with , defined in (14), and consider the -conforming subspace , that is, is composed of continuous, piecewise polynomial functions of degree . Since the trial space for the minimization problem (22) is composed of continuous functions, we use the label “CT” in our figures. We equip the space with one of the two norms defined in (42), leading to the labels “CT-cf” and “CT-up”. For comparison purposes, we also compute the DG solution solving the primal problem (15). When reporting the corresponding error, we use the labels “DT-cf” and “DT-up”, where ”DT” means discontinuous trial space, and the labels “cf” and “up” indicate the use of centered fluxes and upwind fluxes (with ), respectively, as well as the norm in which the error is evaluated. The solution “CT-cf” can be loosely interpreted as a LS-FEM solution. Indeed the residual minimization is performed over the -conforming finite element space, and the minimizing functional is the