    # Auxiliary Space Preconditioning of Finite Element Equations Using a Nonconforming Interior Penalty Reformulation and Static Condensation

We modify the well-known interior penalty finite element discretization method so that it allows for element-by-element assembly. This is possible due to the introduction of additional unknowns associated with the interfaces between neighboring elements.The resulting bilinear form, and a Schur complement (reduced) version of it, are utilized in a number of auxiliary space preconditioners for the original conforming finite element discretization problem. These preconditioners are analyzed and their performance is illustrated on model second order scalar elliptic problems discretized with high order elements.

## Authors

##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## 1. Introduction

The well-known interior penalty (IP) discretization method [4, 11, 20, 15]

couples degrees of freedom across two neighboring elements, so it does not possess the element-by-element assembly property, which is inherent to conforming finite element discretization methods. The element-by-element assembly property is useful, for example, in “matrix-free” computations, since it minimizes the coupling across element interfaces. Also, certain element-based coarsening as in the AMGe methods,

, can be employed then, such that it maintains the element-by-element assembly property on coarse levels. The modification we propose consists of the following simple idea. The IP bilinear form originally contains a jump term , over each interface between any two neighboring elements and . Here, this term is replaced by two other terms, involving a new unknown, , associated with . We have then , where and come from the element , while and come from the other element, . Clearly, letting (and similarly, ), recovers the original jump term, scaled by . It is also clear that introducing the additional space of functions , associated with the set of interfaces , allows associating each of the two new terms with a unique neighboring element, i.e., with and with . Consequently, the coupling occurs only through these interface unknowns. We consider discontinuous from face to face.

Clearly, the modification can be employed with and replaced by subdomains and (for example, being unions of finite elements) and replaced by the interface between and . This is the approach we exploit, when building preconditioners for an original conforming discretization (i.e., no interior penalty terms to begin with). One may view the set of subdomains, , as a coarse triangulation . In that case, each is a union of fine elements from an initial fine triangulation . We refer to as an agglomerated element, or simply an agglomerate. The modified IP method employs two sets of discontinuous spaces: one space of functions associated with the agglomerates and the space of functions associated with the interfaces between any two neighboring and from . The resulting bilinear form consists of the local bilinear forms associated with each ( is the penalty parameter). The trial functions are and , whereas, similarly, the test functions are and . Here, is a local, on , version of the bilinear form in the original conforming discretization.

The goal of the present paper is to study the modified IP bilinear form as a tool for building preconditioners for the original conforming bilinear form utilizing the auxiliary space technique which goes back to S. Nepomnyaschikh (see ) and studied in detail by J. Xu . We analyze both additive and multiplicative versions of the auxiliary space preconditioners for general smoothers, following [19, Theorem 7.18], by verifying the assumptions needed there. Another use of the element-by-element assembly property of the modified IP method is the application of the spectral AMGe technique to construct algebraic multigrid (AMG) preconditioners for the IP bilinear form and for its reduced Schur complement form. We note that the modified IP form allows for static condensation, i.e., we can eliminate the unknowns (they are decoupled from each other) and, thus, end up with a problem for the interface unknowns, , only.

We have implemented these auxiliary space preconditioners and tested their theoretically proven mesh independent performance on model 2D and 3D scalar second order elliptic problems, including high order elements.

The remainder of the paper is structured as follows. Section 2 introduces basic concepts, spaces, notation, and a problem of interest. The IP formulation and the respective auxiliary space preconditioners are presented in Section 3. Their properties are studied, showing (Theorem 3.4) the general optimality of a fine-scale auxiliary space preconditioning strategy via the IP reformulation. Section 4 describes a polynomial smoother employed in the preconditioner. A generic AMGe approach that can be utilized for solving the IP auxiliary space problem is outlined in Section 5. Numerical examples are demonstrated in Section 6, while conclusions and future work are left for the final Section 7.

## 2. Basic setting

This section is devoted to providing foundations by addressing generic notions and basic procedures. Notation and abbreviated designations are introduced to facilitate a more convenient presentation in the rest of the paper.

### 2.1. Mesh and agglomeration

A domain ( is the space dimension) with a Lipschitz-continuous boundary, a (fine-scale) triangulation of , and a finite element space on are given. The mesh is seen as a set of elements and respective associated faces, where a face is the interface, of dimension , between two adjacent elements. In general, the degrees of freedom (dofs) of are split into dofs associated with the interiors of the elements and dofs associated with the faces (cf. Fig. 3), where a face dof belongs to multiple elements and can also belong to multiple faces, whereas an interior dof always belongs to a single element. The focus of this paper is on consisting of continuous piece-wise polynomial functions, equipped with the usual nodal dofs.

Let be a partitioning of into non-overlapping connected sets of fine-scale elements, called agglomerate elements or, simply, agglomerates; see Fig. 1. In general, can be obtained by partitioning the dual graph of , which is a graph with nodes the elements in , where two elements are connected by an edge if they share a face. It can be expressed as a relation elementelement. That is, all are described in terms of the elements via a relation elementelement. Note that capitalization indicates agglomerate entities in , like element (short for “agglomerate element”), face, or entity, whereas regular letters indicate fine-scale entities in , like element, face, or entity. This convention, unless otherwise specified, is used in the rest of the paper. Using elementface next, provides

 {element}face={element}element×elementface,

where elementface represents the relation, in , between elements and their adjacent faces. Then, an intersection procedure over the sets described by elementface constructs the agglomerate faces in as sets of fine-scale faces, expressed via a relation faceface, and related to elements in the form of elementface; cf. Fig. 2. Consequently, each face can be consistently recognized as the -dimensional surface that serves as an interface between two adjacent elements in . The set of obtained faces in is denoted by . Considering the dofs in , the relations elementdof, facedof, and

 elementdof ={element}element×elementdof, facedof ={face}face×facedof

are determined.

For additional information on relation tables (matrices) and their utilization, see [19, 18]. (a) Figure 2. An illustration of the designation of a face as a set of (fine-scale) faces, serving as an interface between elements. Here, the elements T_1 and T_2 are related to the face F in the relation table elementface.

### 2.2. Pairs of nonconforming spaces Figure 3. An illustration of the construction of (fine-scale) nonconforming finite element spaces Eh and Fh from a conforming space Uh, utilizing agglomeration that provides elements and faces. Note that the only inter-entity coupling in the IP reformulation is between elements and their respective faces.

A main idea in this work is to obtain discontinuous (nonconforming) finite element spaces and formulations on and . To that purpose, define the finite element spaces and via restrictions (or traces) of functions in onto and , respectively. Namely,

 Eh ={v_eh∈L∞(Ω);\allowbreak\nonscript ∀T∈TH,∃vh∈Uh:v_eh∣∣T=vh∣∣T}, Fh ={v_bh∈L∞(∪_F∈ΦHF);\allowbreak\nonscript

Note that and are fine-scale spaces despite the utilization of agglomerate mesh structures like and , which justifies the parameter “” (versus “”). Accordingly, the bases in and are derived via respective restrictions (or traces) of the basis in . The degrees of freedom in and are obtained in a corresponding manner from the dofs in , as illustrated in Fig. 3. For simplicity, “dofs” is reserved for the degrees of freedom in , whereas “edofs” and “bdofs” are reserved for and , respectively. Moreover, “adofs” designates the edofs and bdofs collectively and is associated with the product space . In more detail, edofs and bdofs are obtained by “cloning” all respective dofs for every agglomerate entity that contains the dofs, in accordance with elementdof and facedof. Hence, each entity receives and it is the sole owner of a copy of all dofs it contains and there is no intersection between entities in terms of edofs and bdofs, i.e., they are completely separated without any sharing, making and spaces of discontinuous functions. Nevertheless, dofs, edofs, and bdofs are connected via their common “ancestry” founded on the above “cloning” procedure. Thus, the restrictions (or traces) of finite element functions in one of the spaces and their representations as functions in some of the other spaces is seen and performed in purely algebraic context. For example, if

is a vector in terms of the edofs of some

, that represents locally, on , a function in , and , then denotes a vector in terms of the bdofs of , that represents locally, on , a function in . This is unambiguous and should lead to no confusion as it only involves a subvector and appropriate index mapping from edofs to bdofs, without any actual transformations, since are are trace (restriction) spaces for . This constitutes an “algebraic” procedure, mimicking the behavior of restrictions (or traces) of finite element functions, on , that vectors like represent, which are, generally, also supported on . In what follows, finite element functions are identified with vectors on the degrees of freedom in the respective space.

Coarse subspaces and

can be constructed by, respectively, selecting linearly independent vectors

, for every , and , for every , forming the bases for the coarse spaces. This is an “algebraic” procedure, formulating the coarse basis functions as linear combinations of fine-level basis functions, i.e., as vectors in terms of the fine-level degrees of freedom. The basis vectors are organized appropriately as columns of prolongation (or interpolation) matrices and , forming as . For consistency, the corresponding degrees of freedom (associated with the respective coarse basis vectors) in and are respectively called “edofs” and “bdofs”, and the collective term “adofs” is associated with . In essence, this is based on the ideas in AMGe (element-based algebraic multigrid) methods [19, 18, 9, 10, 14, 8].

### 2.3. Model problem

The model problem considered in this paper is the second order scalar elliptic partial differential equation (PDE)

 (2.1) −div(κ∇u)=f(x) in Ω,

where , , is a given permeability field, is a given source, and is the unknown function. For simplicity of exposition, the boundary condition on , the boundary of , is considered, i.e., . The ubiquitous variational formulation of (2.1) is utilized, providing the weak form

 (2.2) Find u∈H1_0(Ω):(κ∇u,∇v)=(f,v),∀v∈H1_0(Ω),

where denotes the inner products in both and . Consider the fine-scale finite element space defined on . Using the finite element basis in , (2.2) induces the following linear system of algebraic equations:

 (2.3) Au=f,

for the global symmetric positive definite (SPD) stiffness matrix . Moreover, the local, on agglomerates, symmetric positive semidefinite (SPSD) stiffness matrices , for , are obtainable, such that (the summation involves an implicit local-to-global mapping).

## 3. Interior penalty approach

In this section, an approach for obtaining preconditioners, based on the ideas of an interior penalty (IP) method (see ) is described and studied. A main idea is that, instead of using penalty terms on the jumps directly between elements, the interface space is employed to avoid direct coupling between elements.

### 3.1. Formulation

Consider the nonconforming discrete quadratic minimization formulation of (2.1)

 min∑_T∈TH[(κ∇v_eh,∇v_eh)_T+1δ∑_F⊂∂T⟨v_eh∣∣F−v_bh,v_eh∣∣F−v_bh⟩_F]−2(f,v_eh),

for , providing, in lieu of (2.2), the weak form: Find , such that

 (3.1)

for all . Here, and denote the inner products in and , respectively, and is a small penalty parameter that can, generally, depend on . The notation in (3.1) requires an explanation. Namely, (3.1) involves an implicit restriction of and , which are generally discontinuous across , onto . This removes the ambiguity from and by considering “one-sided” traces, involving the implicit restrictions to prior to further restricting to . Therefore, the only inter-entity coupling, introduced by the bilinear form in (3.1), is between the edofs in and the bdofs on . Thus, the edofs are not directly coupled across the elements and an assembly procedure for the matrix, corresponding to the bilinear form in (3.1), would only need to add (accumulate) contributions associated with bdofs.

Equation (3.1) is slightly modified to obtain a formulation that is easier to implement algebraically. Particularly, only the interface penalty term is changed. Namely, consider – the local, on , and corresponding to a summand in (3.1), matrix associated with the quadratic form

 (3.2) [v_ev_b]TA_T[v_ev_b]=v_eTA_Tv_e+1δ∑_F⊂∂T(v_e|F−v_b|F)TD_F(v_e|F−v_b|F),

where is defined on the edofs of , – on the bdofs on , is the restriction of the diagonal, , of the global onto the bdofs of , and takes the block form

 (3.3) A_T=[A_T,eeA_T,ebA_T,beA_T,bb].

Here, the discussion in Section 2.2, concerning index mapping as part of the restriction, applies. Respective consistent, with the construction of , relations elementedof, elementbdof, and, consequently, elementadof are also available. The global (SPD) IP matrix is obtainable via standard accumulation (assembly) – – which, as mentioned above, would involve only accumulation (addition) of the blocks, while the remaining portions of the local matrices are simply copied into the global matrix.

Both and are SPSD with null spaces spanned by respective constant vectors (excluding essential boundary conditions). However, both and are positive definite due to the interface terms. In general, using that has strictly positive diagonal entries, is positive definite if and only if every nonzero vector in the null space of has a nonzero value on some face . This is the case here, since the null space of is spanned by the constant vector.

### 3.2. Auxiliary space preconditioners

Here, preconditioners based on the IP formulation are derived. Consider the finite element space , which in the context here is regarded as an “auxiliary” space. Let be a linear transfer operator, to be defined momentarily. Identifying finite element functions with vectors allows to consider as a matrix and obtain . In order to define the action of , recall that the relation between dofs, on one side, and adofs, on the other, is known and unambiguous. Therefore, it is reasonable to define the action of as taking the arithmetic average, formulated in terms of adofs that correspond to a particular dof, of the entries of a given (auxiliary space) vector and obtaining the respective entries of a mapped vector defined on dofs. That is, for any dof , let be the set of corresponding adofs (the respective “cloned” degrees of freedom) and consider a vector , defined in terms of adofs. Then,

 (Π_h^v)_l=1|J_l|∑_j∈J_l(^v)_j.

Clearly, all row sums of equal 1 and each column of has exactly one nonzero entry. Assuming that is a regular (non-degenerate) mesh , is bounded, independently of . That is,

 (3.4) 1≤|J_l|≤ϰ,

for a constant , which depends only on the regularity of , but not on . Indeed, let be the global maximum number of elements and faces, in , that a dof can belong to, which, in turn, is bounded by the global maximum number of elements and faces, in , that a dof can belong to.

Let be a “smoother” for , such that is SPD, and – a symmetric preconditioner for . Define the additive auxiliary space preconditioner for

and the multiplicative auxiliary space preconditioner for

 (3.6) B−1_mult=¯¯¯¯¯¯M−1+(I−M−TA)Π_hB−1Π_hT(I−AM−1),

where is the symmetrized (in fact, SPD) version of . In case is symmetric, in can be replaced by . The action of is obtained via a standard “two-level” procedure:

Given , is computed by the following steps:

1. [label=()]

2. “pre-smoothing”: ;

3. residual transfer to the auxiliary space: ;

4. auxiliary space correction: ;

5. correction transfer from the auxiliary space: ;

6. “post-smoothing”: .

### 3.3. Analysis

Properties of the preconditioners, showing their optimality, are studied next. Define the operator , for111Finite element functions and algebraic vectors are identified, which should cause no confusion. , via

 (I_hv)_j=(v)_l,

for each adof , where , for the corresponding dof . This describes a procedure that appropriately copies the entries of , so that the respective finite element functions, corresponding to and , can be viewed as coinciding in . That is, in a sense, is an injection (embedding) of into . Considering the respective matrices, has the fill-in pattern of , but all nonzero entries are replaced by 1.

It is easy to see that , the identity operator on , implying that is surjective, i.e., it has a full row rank. Moreover, for any , (exactly) approximates in the sense

 (3.7) v−Π_hI_hv=0,

and is “energy” stable, since

 (3.8) (I_hv)TAI_hv=vTAv.

This is to be expected, since, in a sense, “includes” .

The more challenging task is to show the continuity of in terms of the respective “energy” norms. This is addressed next.

###### Lemma 3.1.

The operator , defined as , where is the identity on , is bounded in the sense

 (I_hΠ_h^v−^v)TA(I_hΠ_h^v−^v)≤(1+Λδϰ2)^vTA^v,

for all , where is the one in (3.2), is from (3.4), and is a constant, independent of , , the coefficient in (2.1), and the regularity of (see Remark 3.3).

###### Proof.

The portions of auxiliary space vectors corresponding to edofs and bdofs are respectively indexed by “” and “”, leading to the notation, for , , where and . By further splitting , it is obtained , where “” denotes the edofs in the interiors of all and “” are the edofs that can be mapped to some bdofs, based on the previously-described procedure of “cloning” dofs into edofs and bdofs. Analogously, the splitting , for , is introduced in terms of dofs, indexed “”, in the interiors of all and dofs, indexed “”, related to bdofs. Figure 3 illustrates the utilized indexing. Note that there is a clear difference between “” and “”, but also a clear relation. Locally, on , “” and “” can, in fact, be equated, but in a global setting, it is necessary to distinguish between “” and “” indices. This should not cause any ambiguity below. Based on the splittings, define

 A =[A_iiA_irA_riA_rr], (3.9) A Π_h =[Π_iiΠ_rsΠ_rb], I_h =⎡⎢⎣II_srI_br⎤⎥⎦,

where is the map from “” to “” indices and – from “” to “” indices. Note that is a matrix with the fill-in structure of , where all nonzero entries are replaced by 1, each row has exactly one nonzero entry and each column has at least two nonzero entries (exactly two when the respective dofs are in the interior of a face). Similarly, is a matrix with the fill-in of , where all nonzero entries are replaced by 1, each row has exactly one nonzero entry and each column has at least one nonzero entry (exactly one when the respective dofs are in the interior of a face). Observe that, assuming everything is consistently numbered and ordered, and . It holds that . This is due to the fact that and represent the same “one-sided connections” (meaning that the “connections” are on one side of the faces ), but in terms of different indices.

Let . Then,

 I_hv =I_hΠ_h^v=[^v_iT,[I_sr(Π_rs^v_s+Π_rb^v_b)]T,[I_br(Π_rs^v_s+Π_rb^v_b)]T]T, ~v=I_hv−^v =[0T,[I_sr(Π_rs^v_s+Π_rb^v_b)−^v_s]T,[I_br(Π_rs^v_s+Π_rb^v_b)−^v_b]T]T,

and

 ~vTA~v =1δ∑_T∈TH∑_F⊂∂T(^v_s|F−^v_b|F)TD_F(^v_s|F−^v_b|F) =+∑_T∈TH[(Π_rs^v_s+Π_rb^v_b)|T−^v_s|T]TA_T,rr[(Π_rs^v_s+Π_rb^v_b)|T−^v_s|T],

where is the local version of and it is utilized that locally, as noted above, “” and “” can be identified. Using that the stiffness matrices can be bounded from above by their diagonals, with some constant , it follows:

 (3.10) ~vTA~v≤1δ∑_T∈TH∑_F⊂∂T(^v_s|F−^v_b|F)TD_F(^v_s|F−^v_b|F)≤+Λ∑_T∈TH∑_F⊂∂T[(Π_rs^v_s+Π_rb^v_b)|F−^v_s|F]TD_F[(Π_rs^v_s+Π_rb^v_b)|F−^v_s|F].

Let, for a dof from the “” dofs, denote the set of related “” edofs and – the set of related bdofs; illustrated in Fig. 4. That is, with corresponding to the -th column of and – to the -th column of . Furthermore, let be the respective diagonal entry in , denote, for all , the “” neighbors (see Fig. 5) of , according to the connectivity of , and , are respectively the minimum and maximum values of on the adofs in . Clearly,

 d_l∑_j∈J_ls[1|J_l|(∑_k∈J_ls(^v_s)_k+∑_p∈J_lb(^v_b)_p)−(^v_s)_j]2≤|J_ls|d_l(M_l−m_l)2.

Notice that the adofs corresponding to and are connected via a path, with respect to the connectivity of , whose length is bounded by . Following along this path, applying the triangle inequality and (3.4), it holds

 |J_ls|d_l(M_l−m_l)2≤ϰ2d_l∑_j∈J_ls∑_k∈N_jb[(^v_s)_j−(^v_b)_k]2.

Hence,

 d_l∑_j∈J_ls[1|J_l|(∑_k∈J_ls(^v_s)_k+∑_p∈J_lb(^v_b)_p)−(^v_s)_j]2≤ϰ2d_l∑_j∈J_ls∑_k∈N_jb[(^v_s)_j−(^v_b)_k]2.

Summing over in the last inequality provides

 ∑_T∈TH∑_F⊂∂T[(Π_rs^v_s+Π_rb^v_b)|F−^v_s|F]TD_F[(Π_rs^v_s+Π_rb^v_b)|F−^v_s|F] ≤ϰ2∑_T∈TH∑_F⊂∂T(^v_s|F−^v_b|F)TD_F(^v_s|F−^v_b|F).

Combining the last estimate and (

3.10) implies

 ~vTA~v ≤(1+Λδϰ2)1δ∑_T∈TH∑_F⊂∂T(^v_s|F−^v_b|F)TD_F(^v_s|F−^v_b|F) ≤(1+Λδϰ2)^vTA^v.\qed
###### Corollary 3.2.

The operator is continuous in the sense

 (Π_h^v)TAΠ_h^v≤2(2+Λδϰ2)^vTA^v,

for all , where the constants are the same as in Lemma 3.1.

###### Proof.

Owing to (3.8), the reverse triangle inequality, and Lemma 3.1, it follows:

 [(Π_h^v)TAΠ_h^v]12−[^vTA^v]12 =[(I_hΠ_h^v)TAI_hΠ_h^v]12−[^vTA^v]12 ≤[(I_hΠ_h^v−^v)TA(I_hΠ_h^v−^v)]12 ≤√1+Λδϰ2[^vTA^v]12.\qed Figure 4. An illustration of the sets of adofs, associated with an “r” dof, used in the proof of Lemma 3.1. Figure 5. An illustration of the “b” neighbors, according to the connectivity of A_sb in (3.9), of an “s” dof. These are designated as N_jb and N_kb and used in the proof of Lemma 3.1.
###### Remark 3.3.

The independence of the constants, in the bounds in Corollaries 3.2 and 3.1, on the coefficient in (2.1) holds, since the coefficient information is contained in in (3.2). In fact, the use of the diagonal entries of the global is crucial. However, the particular argument does not exclude dependence of the constants on the order of the finite element spaces. This is due to in (3.10) depending on the maximum number of dofs in an element. This can be mitigated, if necessary, via “strengthening” the interface terms in (3.2) by using a properly scaled version of the diagonal of (e.g., adjusting the value of ), or employing a so called weighted -smoother (see ) in lieu of the diagonal of , which would lead to . Note that the numerical results in Section 6.3 do not indicate a practical necessity for adjusting the interface terms, when increasing the polynomial order.

Finally, based on the above properties, the optimality of the auxiliary space preconditioners can be established.

###### Theorem 3.4 (spectral equivalence).

Let the smoother satisfy the property that is SPD. Assume that in (3.5) and (3.6) is a spectrally equivalent preconditioner for the IP matrix , in the sense that there exist positive constants and , such that

 (3.11) α−1^vTA^v≤^vTB^v≤β^vTA^v,∀^v∈Eh×Fh.

Then, the additive and multiplicative auxiliary space preconditioners,