# FEMPAR: An object-oriented parallel finite element framework

FEMPAR is an open source object oriented Fortran200X scientific software library for the high-performance scalable simulation of complex multiphysics problems governed by partial differential equations at large scales, by exploiting state-of-the-art supercomputing resources. It is a highly modularized, flexible, and extensible library, that provides a set of modules that can be combined to carry out the different steps of the simulation pipeline. FEMPAR includes a rich set of algorithms for the discretization step, namely (arbitrary-order) grad, div, and curl-conforming finite element methods, discontinuous Galerkin methods, B-splines, and unfitted finite element techniques on cut cells, combined with h-adaptivity. The linear solver module relies on state-of-the-art bulk-asynchronous implementations of multilevel domain decomposition solvers for the different discretization alternatives and block-preconditioning techniques for multiphysics problems. FEMPAR is a framework that provides users with out-of-the-box state-of-the-art discretization techniques and highly scalable solvers for the simulation of complex applications, hiding the dramatic complexity of the underlying algorithms. But it is also a framework for researchers that want to experience with new algorithms and solvers, by providing a highly extensible framework. In this work, the first one in a series of articles about FEMPAR, we provide a detailed introduction to the software abstractions used in the discretization module and the related geometrical module. We also provide some ingredients about the assembly of linear systems arising from finite element discretizations, but the software design of complex scalable multilevel solvers is postponed to a subsequent work.

## Authors

• 24 publications
• 12 publications
• 4 publications
08/02/2019

### A tutorial-driven introduction to the parallel finite element library FEMPAR v1.0.0

This work is a user guide to the FEMPAR scientific software library. FEM...
11/20/2019

### MFEM: a modular finite element methods library

MFEM is an open-source, lightweight, flexible and scalable C++ library f...
12/05/2020

### A finite-element framework for a mimetic finite-difference discretization of Maxwell's equations

Maxwell's equations are a system of partial differential equations that ...
11/07/2017

### Exposing and exploiting structure: optimal code generation for high-order finite element methods

Code generation based software platforms, such as Firedrake, have become...
09/27/2021

### The software design of Gridap: a Finite Element package based on the Julia JIT compiler

We present the software design of Gridap, a novel finite element library...
09/15/2020

### Smoothed Particle Hydrodynamics Techniques for the Physics Based Simulation of Fluids and Solids

Graphics research on Smoothed Particle Hydrodynamics (SPH) has produced ...
02/01/2018

### Slate: extending Firedrake's domain-specific abstraction to hybridized solvers for geoscience and beyond

Within the finite element community, discontinuous Galerkin (DG) and mix...
##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## 1. Introduction

Even though the origins of the finite element (FE) method trace back to the 50s, the field has drastically evolved during the last six decades, leading to increasingly complex algorithms to improve accuracy, stability, and performance. The use of the -version of the FE method and its exponential convergence makes high-order approximations an excellent option in many applications [1]. Adaptive mesh refinement driven by a posteriori

error estimates, i.e.,

-adaptivity, is an essential ingredient to reduce computational cost in an automatic way [2]. For smooth solutions, -adaptivity or hybrid -adaptivity can further reduce computational cost for a target level of accuracy [3]. Originally, FE methods were restricted to nodal Lagrangian bases for structural problems. The extension of FE methods to other applications, like porous media flow or electromagnetism, motivated the design of more complex bases and require different mappings from the reference to the physical space, complicating the implementation of these techniques in standard FE codes. Saddle-point problems also require particular mixed FE discretizations for stability purposes [4, 5]. More recently, novel FE formulations have been proposed within the frame of exterior calculus, e.g., for mixed linear elasticity problems [6]. Physics-compatible discretization are also gaining attention, e.g., in the field of incompressible fluid mechanics. Divergence-free mixed FEs satisfy mass conservation up to machine precision, but their implementation is certainly challenging [7]. During the last decade, a huge part of the computational mechanics community has embraced isogeometric analysis techniques [8], in which the discretization spaces are defined in terms of NURBS (or simply splines), leading to smoother global spaces. In the opposite direction, discontinuous galerkin (DG) methods have also been actively developed, and novel approaches, like hybridizable DG and Petrov-Galerkin DG methods, have been proposed [9, 10]. As the discretization methods become more and more complex, the efficient implementation of these techniques is more complicated. It also poses a challenge in the design of scientific software libraries, which should be extensible and provide a framework for the (easy) implementation of novel techniques, to be resilient to new algorithmic trends.

The hardware in which scientific codes run evolves even faster. During 40 years, core performance has been steadily increasing, as predicted by Moore’s law. In some years, supercomputers will reach 1 exaflop/s, a dramatic improvement in computational power that will not only affect the extreme scale machines but radically transform the whole range of platforms, from desktops to high performance computing (HPC) clouds. The ability to efficiently exploit the forthcoming 100x boost of computational performance will have a tremendous impact on scientific discoveries/economic benefits based on computational science, reaching almost every field of research. However, all the foreseen exascale growth in computational power will be delivered by increasing hardware parallelism (in distinct forms), and the efficient exploitation of these resources will not be a simple task. HPC architectures will combine general-purpose fat cores, fine-grain many-cores accelerators (GPUs, DSPs, FPGAs, Intel MIC, etc.), and multiple-level disruptive-technology memories, with high non-uniformity as common denominator [11]. This (inevitable) trend challenges algorithm/software design. Traditional bulk-synchronous message passing interface (MPI) approaches are likely to face significant performance obstacles. Significant progress is already being made by MPI+X [12] (with X=OpenMP, CUDA, OpenCL, OmpSs, Kokkos, etc.) hybrid execution models. Going a step further, asynchronous many-task execution models (e.g., Charm++[13], Legion [14], or HPX [15]) and their supporting run-time systems hold great promise [16].

Traditionally, researchers in the field of scientific computing used to develop codes with a very reduced number of developers, e.g., a university department, and a limited life span. The software engineering behind scientific codes was poor. Codes were rigid and non-extensible, and developed for a target application and a specific numerical method. However, the increasing levels of complexity both in terms of algorithms and hardware make the development of scientific software that can efficiently run state-of-the-art numerical algorithms on HPC resources a real challenge. Considering to start from scratch a project of this kind has an ever increasing level of complexity. Furthermore, due to the huge resources required to carry out such a project, it is natural to develop a framework that will be resilient to new algorithmic and hardware trends, in order to maximize life time, and to be applicable to a broad range of applications. In this sense, object-oriented (OO) programming, which provides modularity of codes and data-hiding, is the key for the software design of flexible and scalable (in terms of developers) projects.

There is a number of open source OO FE libraries available through the Internet, e.g., deal.II [17, 18], FEniCS [19], GRINS [20], Nektar++ [21], MOOSE [22], MFEM [23], FreeFem++ [24], and DUNE [25]. In general, these libraries aim to provide all the machinery required to simulate complex problems governed by partial differential equations using FE techniques. In any case, every library has its main goal and distinctive features. Some libraries, like FreeFem++ or FEniCS, have extremely simple user interfaces. FEniCS has its own domain specific language for weak forms to automatically generate the corresponding FE code (preventing -adaptivity) and includes a collection of Python wrappers to provide user-friendly access to the services of the library. Other sophisticated libraries like deal.II or DUNE have a slightly more demanding learning curve. In general, parallel adaptivity is at most partially supported; as far as we know, none of the libraries above have support for parallel -adaptivity, unless DG methods are being used. Some libraries are restricted to a particular cell topology, e.g., deal.II is limited to hexahedral/quadrilateral (n-cubes) meshes, while FEniCS only supports simulations on triangular/tetrahedral (n-simplices) meshes.

In general, these libraries provide modules for some of the different steps in the simulation pipeline, which involves the set-up of the mesh, the construction of the FE space, the integration and assembly of the weak form, the solution of the resulting linear system, and the visualization of the computed solution. The solution of the linear system is clearly segregated from the discretization step in all the scientific software libraries described above (for parallel computations); the linear system is transferred to a general-purpose sparse linear algebra library, mainly PETSc [26, 27, 28], Hypre [29], and Trilinos [30, 31]. As a result, the coupling between the discretization step and the linear solver step is somehow weak, since they rely on general purpose solvers, which usually involve simple interfaces. The strong point of these general purpose numerical linear algebra libraries is to be problem-independent, but it also limits their performance for specific applications, since they cannot fully exploit the underlying properties of the PDE operator and the numerical discretization.111A paradigmatic example is the design of scalable solvers for the discretization of the Maxwell equations using edge elements, which involve the discretization of additional operators (discrete gradients) and changes of basis at the reference FE level [32]. This segregation has a clear impact on the type of methods to be used. This black-box approach to general-purpose linear solvers has favoured the use of algebraic multigrid methods, the de facto linear solver [29]. On the other hand, geometric multigrid methods and domain decomposition (DD) methods, which are very specific to mesh-based PDE solvers, are not common, even though they can be superior to algebraic methods in many cases. A geometric multigrid method that exploits the -adaptive structure of the FE space is included in deal.II, but it can only be used in the serial case. In parallel scenarios, DD methods have certainly evolved during the last decade. Modern DD methods do not (necessarily) rely on a static condensation of the internal variables, which requires sparse direct methods for the local subdomain problems. Instead, inexact solvers can be used, e.g., multigrid methods, and linear complexity DD preconditioners can be defined (see [33, 34]). The definition of two-level DD methods resembles the one of FE methods, by exchanging the FE and subdomain concepts, and their definition is strongly related to the one of multiscale FEs [35]. Furthermore, multilevel extensions can be naturally defined. In short, state-of-the-art multilevel DD methods can be understood (in their inexact version) as a non-conforming multigrid method. Even though the mathematical theory of the DD methods is very sound, high performance implementations are quite recent (see [36, 37, 38]). On the other hand, we are not aware of any general purpose FE code that integrates a DD algorithm in the solution workflow. DD methods require sub-assembled matrices to be used, and are not supported by the majority of the existing advanced OO FE libraries. Analogously, the use of block-preconditioning is in general poorly supported, because it involves the discretization of additional operators to define the approximated Schur complement, and the corresponding block-based assembly of matrices.

On the other hand, based on the supercomputing trends, the segregation between time discretization, linearization, space discretization, and linear system solve, will progressively blur. As an example, nonlinear preconditioning and parallel-in-time solvers are two natural ways to attain the higher levels of concurrency of the forthcoming exascale supercomputers [39, 36]. These facts will complicate even more the rigid workflow of current advanced FE libraries. In this sense, current efforts in PETSc to provide nonlinear preconditioning interfaces can be found in [40], relying on call-back functions, and the XBraid solver [41] aims to provide time-parallelism in a non-intrusive way.

## 2. The Fempar project

In this work, we present FEMPAR, an OO FE framework for the solution of PDEs, designed from inception to be highly scalable on supercomputers and to easily handle complex multiphysics problems. The first public release of FEMPAR has almost 300K lines of code written in (mostly) OO Fortran and makes intensive use of the features defined in the 2003 and 2008 standards of the language. The source code that is complementary to this work corresponds to the first public release of FEMPAR, i.e., version 1.0.0. It is available at a git repository [42]. In particular, the first public release was assigned the git tag FEMPAR-1.0.0, in accordance with the “Semantic Versioning” system.222Available at http://semver.org/.

FEMPAR is very rich in terms of FE technology. In particular, it includes not only Lagrangian FEs, but also curl- and div-conforming ones, e.g., Nédélec (edge) and Raviart-Thomas FEs. The library supports n-cube and n-simplex meshes, and arbitrary high-order bases for all the FEs included. Continuous and discontinuous spaces can be used, providing all the machinery for the integration of DG facet (i.e., edges in 2D and faces in 3D) terms. Recently, in a beta version of the code, B-splines have also been added, together with the support for cut cell methods (using XFEM-type techniques) and -adaptivity, but we will not discuss these developments for the sake of brevity.

Moreover, FEMPAR has been developed with the aim to provide a framework that will allow developers to implement complex techniques that are not well-suited in the traditional segregated workflow commented above. FEMPAR also provides a highly scalable built-in numerical linear algebra module based on state-of-the-art domain decomposition solvers. FEMPAR can provide partially assembled matrices, required for DD solvers; the multilevel BDDC solver in FEMPAR has scaled up to almost half a million cores and 1.75 million MPI tasks (subdomains) in the JUQUEEN Supercomputer [34, 37]. It includes an abstract framework to construct applications and preconditioners based on multilevel nonoverlapping partitions. Even though every block within the library preserves modularity, the interface between discretization and numerical linear algebra modules within FEMPAR is very rich and focused on PDE-based linear systems. In the path to the exascale, FEMPAR has been designed to permit an asynchronous implementation of multilevel methods, both in terms of multiphysics FEs and multilevel solvers, which have been exploited, e.g., in [37]. It is a unique feature that is not available in other similar libraries. The library also allows the user to define blocks in multiphysics applications, that can be used to easily implement complex block preconditioners [43, 44, 45]. All these blocks are very customizable, which has already been used to develop scalable DD solvers for electromagnetics problems and block preconditioners for multiphysics problems, e.g., magnetohydrodynamics [44]. These distinctive features of FEMPAR, however, are not discussed in this article but in a forthcoming one. A general discussion of the main ingredients of our implementation of the discretization step using FE-like approximations is first necessary, which is the purpose of this work.

FEMPAR has already been successfully used in a wide set of applications by the authors of the library: simulation of turbulent flows and stabilized FE methods [46, 47, 48, 49], magnetohydrodynamics [50, 51, 52, 53, 54], monotonic FEs [55, 56, 57, 58, 59], unfitted FEs and embedded boundary methods [60], and additive manufacturing simulations [61]. It has also been used for the highly efficient implementation of DD solvers [62, 34, 63, 37, 64, 65, 66, 39] and block preconditioning techniques [44].

This work is more than an overview article with the main features of the library. It is a detailed description of the software abstractions being used within FEMPAR to develop an efficient, modular, and extensible implementation of FE methods and supporting modules in a broad sense. To this end, we enrich the discussion with code snippets that describe data structures, bindings, and examples of use.333The code snippets are written in advanced OO Fortran 200X [67]. There is a close relationship between these language features and those available in the C++ language [68] and we established some code style rules to emphasize it. In particular, Fortran modules in FEMPAR are always named with the suffix _names, to indicate the analogy with namespaces in C++. Derived types, analog to C structs or C++ classes, are always named with _t to distinguish them from instances. However it should be kept in mind that, whereas structs in C++ are passive data containers and classes are used to carry also methods, Fortran derived data types are used in both cases since the introduction in the 2003 standard of the so called type-bound procedures. This document is intended to be used as a guide for new FEMPAR developers that want to get familiarized with its software abstractions. But it can also be a useful tool for developers of FE codes that want to learn how to implement FE methods in an advanced OO framework. In any case, due to the size of the library itself, many details cannot be exposed, to keep a reasonable article length. The article can be read in different ways, since it is not necessary to fully understand all the preceding sections to grasp the main ideas of a section. For instance, the section about the abstract implementation of polytopes in arbitrary dimensions and its related algorithms is quite technical and a reader that is not particularly interested in the internal design of this type and its bindings implementations can skip it. Experienced FE researchers can skip the short section with the basics of FE methods, and only look at this one (if needed) when referred in subsequent sections.

The article is organized as follows. In Sect. 3 we present a concise mathematical description of the FE framework. The main mathematical abstractions are expressed in software by means of a set of derived data types and their associated TBPs, which are described in subsequent sections. In particular, the main software abstractions in FEMPAR and their roles in the solution of the problem are:

• The polytope, which describes a set of admissible geometries and permits the automatic, dimension-independent generation of reference cells and structured domains. The mathematics underlying the polytope are presented in Sect. 3.14, while its software implementation in Sect. 4.

• The polynomial abstraction and related data types, which are presented in Sect. 3.4 and Sect. 5, respectively. These sections describe how shape functions bases can be generated for arbitrary orders and for n-cube and n-simplex topologies.

• The reference FE in Sect. 6

, which describes the reference cell and defines a set of basis functions and degrees of freedom on each cell.

• The triangulation in Sect. 7, which represents a discrete approximation of the physical domain .

• A set of tools required to perform numerical integration (e.g., quadratures and geometrical maps) produced by the reference FE and described in Sect. 8 for cell integrals and in Sect. 9 for facet integrals.

• The FE space described in Sect. 10, built from a triangulation and a set of reference FEs, which represents a global space of FE functions.

• The discrete integration, an abstract class to be extended by the user to define an affine FE operator, which describes the numerical integration of the weak form of the problem to be solved, described in Sect. 11.2.

• The linear (affine) operator in Sect. 11, whose root is the solution of the problem at hand, constructed using the FE space and a discrete integration.

• An example of a user driver in Sect. 12, in which the different ingredients previously described are used to simulate a problem governed by PDEs, the Stokes system.

A (very simplified) graphical overview of the main software abstractions in FEMPAR and some of their relationships is shown in Fig. 1.

## 3. The Fe framework

In this section, we briefly introduce all the mathematical abstractions behind the FE method for the discretization of problems governed by PDEs. For a more detailed exposition of the topics, we refer to [69, 70, 71, 72]. The FEs described below (and many other not covered herein) can be formulated and analyzed using the finite element exterior calculus framework [6], which makes use of exterior algebra and exterior calculus concepts. In this framework, one can define FEs, e.g., div and curl-conforming ones, in arbitrary space dimensions, using the concept of differential -forms. However, we have decided not to use such presentation of FE methods to simplify the exposition for readers not familiar with these abstractions.

### 3.1. The boundary value problem in weak form

We are interested in problems governed by PDEs posed in a physical domain with boundary . In practice but we are also interested in for some particular applications (see Sect. 3.14). Let us consider a differential operator , e.g., the Laplace operator , and a force term . Let us also consider a partition of into a Dirichlet boundary and a Neumann boundary , and the corresponding boundary data and . The boundary value problem reads as follows: find such that

 Au(x)=f(x) in Ω,BDu(x)=uD(x) on ΓD,BNu(x)=gN(x) on ΓN. (1)

The operator is a trace operator and is the flux operator. Other boundary conditions, e.g., Robin (mixed) conditions can also be considered. We assume that the unknown in (1

) can be a scalar, vector, or tensor field. (The case of multi-field problems is considered in Sect.

3.11.)

For FE analysis, we must consider the weak form of (1). The weak formulation can be stated in an abstract setting as follows. Let us consider an abstract problem determined by a Banach space (trial space), a reflexive Banach space (test space), a continuous bilinear form , and a continuous linear form . The abstract problem is stated as: find such that

 a(u,v)=ℓ(v), for any v∈Y. (2)

The link between the two formulations is the following. Let be the space of functions with compact support in ; the dual space is the space of distributions. We have that:

 a(u,φ)≐⟨Au,φ⟩Ω,ℓ(φ)≐⟨gN,φ⟩ΓN+⟨f,φ⟩Ω, for any φ∈D(Ω),

where the derivatives are understood in distributional sense. E.g., for the Laplace operator, the bilinear form reads . Furthermore, homogeneous Dirichlet boundary conditions, i.e., on , are usually enforced in a strong way; the functions in satisfy these boundary conditions. The extension to non-homogeneous boundary conditions is straightforward. One can define an arbitrary extension of the Dirichlet data, i.e., on . Next, we define the function with zero trace on and solve (2) for with the right-hand side

 ℓ(v)−a(EuD,v). (3)

Let us consider two classical examples.

###### Example 3.1 (Heat equation).

Let us consider the Poisson problem with on and ; is the outward normal. Let us assume that , , , and . Let us also consider an extension such that on . The weak form of the problem reads as: find such that

 ∫Ωκ∇u0⋅∇vdΩ=∫ΩfvdΩ+∫ΓNgvdΓ−∫Ωκ∇EuD⋅∇vdΩ,for any v∈H10(Ω).

The solution is .

###### Example 3.2 (Stokes problem).

The Stokes problem consists on finding a velocity field and a pressure field such that

 −∇⋅(μϵ(u))+∇p=f,∇⋅u=0, (4)

and (for example) on , where is the strain tensor. The weak form of the problem consists of finding such that

 μ∫Ωϵ(u0):ϵ(v)−∫Ω∇⋅vp+∫Ωq∇⋅u0=∫Ωv⋅f−μ∫Ωϵ(EuD):ϵ(v)−∫Ωq∇⋅EuD, (5)

for any , where is an extension of the Dirichlet data, i.e., on . The solution is .

### 3.2. Space discretization with FEs

Problem (2) is an infinite-dimensional problem. In order to end up with a computable one, we must introduce finite-dimensional subspaces with some approximability properties. We restrict ourselves to FE schemes in a broad sense that involve conforming and non-conforming spaces. Thus, our aim is to explicitly build spaces (and ) with some approximability properties. If the discrete spaces are subspaces of the original ones (conforming), i.e., and , the discrete problem reads as: find such that

 a(uh,vh)=ℓ(vh), for any vh∈Yh. (6)

This is the Petrov-Galerkin problem. In the particular case when , we have a Galerkin problem. The previous problem can be ill-posed for some choices of the FE spaces, e.g., using discrete spaces that do not satisfy the inf-sup condition for indefinite problems [5]. In some cases, judiciously chosen perturbations of and , represented with and respectively, can stabilize the problem and make it stable and optimally convergent, circumventing the inf-sup condition restriction. In the most general case, we can describe any FE space as: find such that

 ah(uh,vh)=ℓh(vh), for any vh∈Yh, (7)

replacing the continuous bilinear form by a general discrete bilinear form. One can also define the affine operator

 Fh(uh)=ah(uh,⋅)−ℓh(⋅)∈Y′h, (8)

and state (7) as: find such that . This statement is the one being used for the practical implementation of FE operators in FEMPAR (see Sect. 11).

In order to define FE spaces, we require a triangulation of the domain into a set of cells. This triangulation is assumed to be conforming, i.e., for two neighbour cells , its intersection is a whole -face () of both cells (note that -face refers to a geometrical entity, e.g. cells, faces, edges and vertices for , see Sect. 3.14). In practice, the cells must be expressed as a particular type of mapping over a set of admissible geometries (polytopes, see Sect. 3.14). Thus, for every element , we assume that there is a reference cell and a diffeomorphism . In what follows, we usually use the notation .

The definition of the functional space also relies on a reference functional space as follows: 1) we define a functional space in the reference cell ; 2) we define a set of functions in the physical cell via a function mapping; 3) we define the global space as the assemble of cell-based spaces plus continuity constraints between cells. In order to present this process, we introduce the concept of reference FE, FE, and FE space, respectively.

### 3.3. The Fe concept in the reference and physical spaces

Using the abstract definition of Ciarlet, a FE is represented by the triplet , where is a compact, connected, Lipschitz subset of , is a vector space of functions, and is a set of linear functionals that form a basis for the dual space . The elements of are the so-called degrees of freedom

of the FE. We denote the number of moments as

. The moments can be written as for . We can also define the basis for such that for . These functions are the so-called shape functions of the FE, and there is a one-to-one mapping between shape functions and DOFs. Given a function , we define the

local interpolator

for the FE at hand as

 πK(v)≐∑a∈NΣσa(v)ϕa. (9)

It is easy to check that the interpolation operator is in fact a projection.

In the reference space, we build reference FEs as follows. First, we consider a bounded set of possible cell geometries, denoted by ; see the definition of polytopes in Sect. 3.14. On , we build a functional space and a set of DOFs . We consider some examples of reference FEs in Sect. 3.8, 3.9, and 3.10.

In the physical space, the FE triplet on a mesh cell relies on: 1) a reference FE , 2) a geometrical mapping such that , and 3) a linear bijective function mapping . The functional space in the physical space is defined as ; we will also use defined as . The set of DOFs in the physical space is defined as . Given the set of shape functions in the reference FE, it is easy to check that are the set of shape functions of the FE in the physical space.

The reference FE space is usually a polynomial space. Thus, the first ingredient is to define bases of polynomials; see Sect. 3.4. The analytical expression of the basis of shape functions is not straightforward for complicated definitions of moments; this topic is covered in Sect. 3.5. After that, we will consider how to build global (and conforming) FE spaces in Sect. 3.6, and how to integrate the bilinear forms in the corresponding weak formulation in Sect. 3.7. We finally provide three examples of FEs in Sect. 3.83.9, and 3.10 .

### 3.4. Construction of polynomial spaces

Local FE spaces are usually polynomial spaces. Given an order and a set of distinct points (nodes) in (we will indistinctly represent nodes by their index or position ), we define the corresponding set of Lagrangian polynomials as:

 ℓkm(x)≐Πn∈Nk∖{m}(x−xs)Πn∈Nk∖{m}(xm−xs). (10)

We can also define the Lagrangian basis . This set of polynomials are a basis for -th order polynomials. We note that , for .

For multi-dimensional spaces, we can define the set of nodes as the Cartesian product of 1D nodes. Given a -tuple order , we define the corresponding set of nodes for n-cubes as: . Analogously, we define the multi-dimensional Lagrange basis

 Lk={ℓkm:m∈Nk}, where ℓkm(x)≐Πdi=1ℓkimi(xi). (11)

Clearly, , for .

This Cartesian product construction leads to a basis for the local FE spaces usually used on n-cubes, i.e., the space of polynomials that are of degree less or equal to with respect to each variable . We can define monomials by a -tuple as , and the polynomial space of order as . We have .

The definition of polynomial spaces on n-simplices is slightly different. It requires the definition of the space of polynomials of degree equal or less than in the variables . It does not involve a full Cartesian product of 1D Lagrange polynomials (or monomials) but a truncated space, i.e., the corresponding polynomial space of order is , with . Analogously as for n-cubes, a basis for the dual space of are the values at the set of nodes . It generates the typical grad-conforming FEs on n-simplices.

### 3.5. Construction of the shape functions basis

The analytical expression of shape functions can become very complicated for high order FEs and non-trivial definitions of DOFs, e.g., for electromagnetic applications. Furthermore, to have a code that provides a basis for an arbitrary high order, an automatic generator of shape functions must be implemented. When the explicit construction of the shape functions is not obvious, we proceed as follows.

Let us consider a FE defined by .444In this section, we do not make difference between reference and physical spaces, e.g., using the symbol. In any case, all the following developments are usually performed at the reference FE level. First, we generate a pre-basis that spans the local FE space , e.g., a Lagrangian polynomial basis (see Sect. 3.4). On the other hand, given the set of local DOFs, we proceed as follows. The shape functions can be written as , where are the elements of the pre-basis. By definition, the shape functions must satisfy for . As a result, let us define . We have (using Einstein’s notation):

 σa(ϕb)=σa(Φbcψc)=σa(ψc)Φbc=δab,

or in compact form, , and thus . As a result, . The shape functions are computed as a linear combination of the pre-basis functions.

### 3.6. Global Fe space and conformity

Finally, we must define the global FE space. Conforming FE spaces are defined as: The main complication in this definition is to enforce the conformity of the FE space, i.e., . In fact, the conformity constraint is the one that motivates the choice of and , and as a consequence, . In practice, the conformity constraint must be re-stated as a continuity constraint over FE DOFs. In general, these constraints are implicitly enforced via a global DOF numbering, even though it is not possible in general for adaptive schemes with non-conforming meshes and/or variable order cells, which require more involved constraints.

Let us define by the Cartesian product of local DOFs for all cells. We define the global DOFs as the quotient space of by an equivalence relation . Using standard notation, given , the equivalence class of with respect to is represented with , and the corresponding quotient set is . The set is the set of global DOF and represents the local-to-global DOF map. We assume that the equivalence relation is such that if two elements are such that , then .555This assumption in fact applies for FEs of any kind, since the local functional spaces are already conforming and do not require an equivalence class at the cell level. Using the one-to-one mapping between moments and shape functions, the same operator allows one to define global shape functions . We assume that the choices above are such that they satisfy the conformity constraint, i.e., .

Let us consider an infinite-dimensional space such that 1) and 2) for every function and global DOF , all the local DOFs are such that , i.e., local DOFs related to the same global DOF are continuous among cells. The global interpolator is defined as:

 πXh(v)≐∑K∈ThπK(v)=∑K∈Th∑b∈NΣKσb(v)ϕbK,for v∈~X. (12)

It is easy to check that it is in fact a projector. In any case, we use projection operator to refer to other projectors that involve the solution of a global FE system, e.g., based on the minimization of the or norm.

Below, we provide details about how to choose the local DOFs , the function map , and the equivalence relation such that the conformity property is satisfied for grad, div, and curl-conforming FE spaces. The case of non-conforming methods, e.g., DG methods, can readily be considered. In this case, the conformity constraint is not required, which leads to much more flexibility in the definition of DOFs. On the other side, these schemes require numerical perturbations of the continuous bilinear and linear forms in (7) that involve integrals over the facets of FEs to weakly enforce the conformity. (Facets are -faces, e.g., faces in 3D and edges in 2D).

Once we have defined a basis for the FE spaces and using the FE machinery presented above, every FE function can be uniquely represented by a vector as . In fact, problem (7) can be re-stated as: find such that

 ah(ϕb,ψa)ub=ℓh(ψa),for % any a∈Nh.

We have ended up with a finite-dimensional linear problem, i.e., a linear system. We note that in general, the trial space moments can be different from the ones of the test space, as soon as the cardinality is the same. In matrix form, the problem can be stated as:

 Solve   Au=f,with Aab≐ah(ϕb,ψa),fa≐ℓh(ψa). (13)

Assuming that the bilinear form can be split into cell contributions as , e.g., by replacing by , the construction of the matrix is implemented through a cell-wise assembly process, as follows:

 A[a][b]=∑K∈Th∑a,b∈NΣKAKab≐∑K∈Th∑a,b∈NΣKaK(ϕbK,ψaK). (14)

The FE affine operator (8) can be represented as , i.e., it can be represented with a matrix and a vector of size .

### 3.7. Numerical integration

In general, the local bilinear form can be stated as:

 aK(ϕbK,ψaK)=∫KF(x)dΩ,

where the evaluation of involves the evaluation of shape function derivatives. Let us represent the Jacobian of the geometrical mapping with . We can rewrite the cell integration in the reference cell, and next consider a quadrature rule defined by a set of points/weights , as follows:

 ∫KF(x)dΩ=∫^KF∘Φ(x)|JK|dΩ=∑^xgp∈QF∘Φ(^xgp)w(^xgp)|JK(^xgp)|. (15)

Here, the main complication is the evaluation of . By construction, the evaluation of this functional only requires the evaluation of for some values of the multi-index (idem for the test functions). Usually, in FEs, since higher-order derivatives would require higher inter-cell continuity. The second derivatives, which only have sense for broken cell-wise integrals, are in fact only needed for some method based on stabilization techniques based on the pointwise evaluation of residuals in the interior of cells [46].

Let us consider the case of zero and first derivatives, i.e., the evaluation of and . The values of the shape functions (times the geometrical mapping) on the quadrature points is determined as follows:

 ϕbK∘ΦK(^xgp)=^Ψ(^ϕb)(^xgp), (16)

whereas shape function gradients are computed as:

 ∇ϕbK∘ΦK(^xgp)=∇(^Ψ(^ϕb)∘Φ−1K)∘ΦK(^xgp)=∇^x^Ψ(^ϕb)(^xgp)J−1K(^xgp), (17)

where we have used some elementary differentiation rules and the inverse function theorem in the last equality; represents the gradient in the reference space. Thus, one only needs to provide the values of the Jacobian, its inverse, and its determinant, from one side, and the value of the shape functions and their gradients in the reference space, on the other side, at all quadrature points, to compute all the entries of the FE matrices; second order derivatives can be treated analogously.

Quadrature rules for being an n-cube can readily be obtained as a tensor product of a 1D quadrature rule, e.g., the Gauss-Legendre quadrature. Symmetric quadrature rules on triangles and tetrahedra for different orders can be found, e.g., in [69]. In any case, to create arbitrarily large quadrature rules for n-simplices, one can consider the so-called Duffy transformation [73, 74].

As it is well known, considering n-cube topologies for , Gauss quadratures with points per direction can integrate exactly order polynomials. E.g., for a Lagrangian reference FE of order and an affine geometrical map, we choose per direction to integrate exactly a mass matrix. For n-simplex meshes, we use either symmetric quadratures (if available) or tensor product rules plus the Duffy transformation [73, 74]. The latter case is based on introducing a change of variables that transform our n-simplex integration domain into an n-cube, and integrate on the n-cube using tensor product quadratures. It is worth noting that this change of variables introduces a non-constant Jacobian. The determinant of the Jacobian is of order at most with respect to each variable. To integrate a mass matrix exactly, we must be able to integrate exactly polynomials of order . Therefore, we need to take to exactly integrate mass matrices.

### 3.8. Grad-conforming FEs: Lagrangian (nodal) elements

In this section, we consider one characterization of the abstract FE technology above. First, we are interested in the so-called nodal FEs, based on Lagrange polynomials and DOFs based on nodal values.

Let us consider the same order for all components, i.e., . When the reference geometry is an n-cube, we define the reference FE space as . The set of nodes can be generated, e.g., from the equidistant Lagrangian nodes. Let us define the bijective mapping from the set of nodes to , i.e., the local node numbering. The set of local DOFs are the nodal values, i.e., , for . Clearly, the reference FE shape functions related to these DOFs are . On the other hand, we simply take .

For n-simplices, we consider the reference FE space spanned by the pre-basis and the set of nodes (see Sect. 3.4). The set of local DOFs are the nodal values. Since the pre-basis elements are not shape functions, we proceed as in Sect. 3.5 to generate the expression of the shape functions basis for arbitrary order reference FEs on n-simplices.

The global FE space is determined by the following equivalence relation. The set of local DOFs for n-cubes is due to the one-to-one mapping between DOFs and nodes; we replace the set of nodes by for n-simplices. Furthermore, we say that iff . The implementation of this equivalence relation, and thus, the global numbering, relies on the ownership relation between n-faces and DOFs (e.g., in 3D we can say whether a DOF belongs to a vertex, edge, or face) and a permutation between the local node numbering in to the one in for nodes on . See Sect. 3.14 for more details. With such global DOF definition, it is easy to check that the global FE space functions are and thus grad-conforming.

Since Lagrangian moments involve point-wise evaluations of functions and for , the interpolator (12) is not defined in such space. Instead, we consider that functions to be interpolated belong, e.g., to the space .

When one has to deal with vector or tensor fields, we can generate them as a Cartesian product of scalar spaces as follows. We define the local FE space and the function map . In the vector case, the local DOFs set is represented with , and iff and . Analogously, shape functions are computed as ; represents the -th canonical basis vector of . We proceed analogously for n-simplices.

The verification that two nodes are in the same position is not straightforward. First, for every node in , we can assign an n-face owner (e.g., a vertex, edge, face, or cell); cell DOFs are not replicated. Given a node of cell that belongs to the n-face , it can be determined by an index with respect to and . Analogously, another node that belongs to the same n-face but cell , is represented by . On the other hand, one can define a permutation mapping

 pF(F,K,K′;⋅), (18)

that, given the local index of a node within the n-face with respect to , it provides the index in the n-face with respect to (see Sect. 3.13 and 3.16 for more details). Thus, iff .

### 3.9. Div-conforming FEs

We present the so-called Raviart-Thomas FEs for vector fields [5]; the implementation of Brezzi-Douglas-Marini FEs is analogous. In this case, the order being used is different at every space dimension. Let us start with Raviart-Thomas FEs on n-cubes. In 2D, the space reads as , whereas in 3D it reads as ; the Raviart-Thomas element can in fact be considered for any dimension. The basis for in 3D is composed of two types of DOFs, boundary and interior DOFs, defined as

 1∥^F0∥∫^F0v⋅n∘Φ^FqdΓ,q∈Pk,1∥^K∥∫^Kv⋅ qdΩ,q∈Q(k−1,k,k)×Q(k,k−1,k)×Q(k,k,k−1), (19)

respectively666The test function spaces in the definition of the moments are always considered with respect to the corresponding domain of integration.; the 2D case is straightforward, replacing the space of shape functions for the interior moments by . The definition of the boundary facets involves mappings from a reference facet to all facets of the FE , i.e., . Every boundary moment can be associated to a function in a Lagrangian space, and thus, a node index. As a result, the boundary DOFs can be indexed with a node in (for ) on the corresponding facet , i.e., . We say that iff and . To check whether holds, we can proceed similarly as for Lagrangian elements. The shape functions are built as in Sect. 3.5. We consider a Lagrangian pre-basis for , and compute the shape functions via a change-of-basis. The function mapping reads as follows:

 ^ΨK(v)≐1|JK|JKv; (20)

the mapping is the so-called contravariant Piola transformation. One can check that the definition of this mapping together with the assembly defined above leads to a global FE space that is div-conforming; i.e., its functions have continuous normal component across inter-cell facets. Thus, [5].

On n-simplices, the reference FE space is , for , and the basis for is composed of the following boundary and interior DOFs:

 1∥^F0∥∫^F0v⋅n∘Φ^FqdΓ,q∈Pk,1∥^K∥∫^Kv⋅ qdΩ,q∈[Pk−1]d. (21)

In this case, the generation of the pre-basis is not a Lagrangian FE space of functions, but it can easily be expressed as the span of vector functions with components in a selected subset of .

### 3.10. Curl-conforming FEs

The weak formulation of electromagnetic problems involve the functional space . Conforming FE spaces for must preserve the continuity of the tangential component of the field. The so-called edge elements (or Nédélec elements) are curl-conforming FEs [72]. As Raviart-Thomas elements, the edge elements pre-basis on n-cubes involves different orders per dimension and per component. In 2D, the space reads as , whereas in 3D it reads as . The basis for is composed of three types of DOFs (in 3D), namely edge, face, and interior DOFs, defined as:

 1∥^E0∥∫^E0(v⋅τ)∘Φ^EqdΛ,∀q∈Pk−1, (22) 1∥^F0∥∫^F0(JT^F(v×n))∘Φ^F⋅qdΓ,∀q∈Q(k−2,k−1)×Q(k−1,k−2), (23) 1∥^K∥∫^Kv⋅qdΩ,∀q∈Q(k−1,k−2,k−2)×Q(k−2,k−1,k−2)×Q(k−2,k−2,k−1), (24)

respectively, where the edge map is defined as the one for the face. The boundary DOFs can be indexed by a triplet , where can be an edge or a face in 3D, following the same ideas as for Raviart-Thomas elements. In this case, the function mapping reads as follows:

 ^ΨK(v)≐J−TKv; (25)

the mapping is the so-called covariant Piola transformation, which leads to a global FE space that is curl-conforming [72], i.e., its functions have continuous tangential component across inter-cell facets.

On n-simplices, the space reads as:

 Vk≐[Pk]d+Sk, where Sk≐{v∈[Pk+1]d:v(x)⋅x=0∀x∈^K}. (26)

The basis for in 3D is composed of the following boundary and interior DOFs:777We note that we can take instead of in the definition of the face moments, since the rows of the Jacobian matrix are the transformation of the axes in the reference face to the actual face of the reference cell and the space of test functions is invariant to rotations.

 1∥^E0∥∫^E0(v⋅τ)∘Φ^EqdΛ,∀q∈Pk−1, (27) 1∥^F0∥∫^F0(JT^F(v×n))∘Φ^F⋅qdΓ,∀q∈[Pk−2]2 (28) 1∥^K∥∫^Kv⋅qdΩ,∀q∈[Pk−3]3. (29)

In 2D, only the first two types of DOFs are required, where the first one is now related to facets (edges in 2D) and the second one are interior DOFs owned by the cell. As for Raviart-Thomas elements, the pre-basis functions are not Lagrangian shape functions, but they can again be expressed as the span of vector functions with components in a selected subset of . We refer to [75] for a discussion about the actual generation of a pre-basis for the space (26) in FEMPAR.

### 3.11. Cartesian product of FEs for multi-field problems

Many problems governed by PDEs involve more than one field, e.g., the Navier-Stokes equations or any multi-physics problem. Let us consider a PDE that involves a set of unknown fields , defined as the Cartesian product of functional spaces. We can proceed as above, and define a FE space for every field space separately, leading to a global FE space defined by composition of FE spaces. To define the global numbering of DOFs in the multi-field case, we consider that two DOFs are equivalent if they are related to the same field and satisfy the equivalence relation of the FE space of this field.

The Cartesian product of FE spaces is enough to define volume-coupling multi-physics problems governed on the same physical domain, i.e., the different physics are defined on the whole domain and coupled through volume terms in the formulation. However, many multi-physics problems are interface-based, i.e., the coupling between different physics that are defined on different subdomains is through transmission conditions on the interface. This is the case, e.g., of fluid-structure problems (see, e.g., [76, 77, 78, 79]). In these cases, different FE spaces could be defined on different parts of the global mesh, i.e., one must describe the set of subdomains of the whole domain in which the corresponding FE spaces are defined.

### 3.12. Non-conforming methods

Up to now, we have considered a global FE space that is conforming, i.e., . Alternatively, one can consider FE schemes that are not conforming. Since the original bilinear form has no sense in general for a non-conforming FE space , one shall consider a stabilized bilinear form that is well-posed (stable and continuous) in the discrete setting. In general, these schemes replace the required inter-cell continuity for conformity by a weak imposition of such continuity. Thus, the inter-cell continuity is imposed weakly through penalty-like terms. DG methods are schemes of this type [71].

In one sense, non-conforming FE spaces are simpler than conforming ones, since the conformity is not required; one has more flexibility in the definition of local DOFs and the equivalence class concept is not needed, since a DOF never belongs to more than one element. However, the bilinear form usually requires the integration of facet terms, i.e., terms of the type:

 ∑F∈Fh∫FF(x)dΩ.

The integration of facet terms is far more complicated than cell terms.

Let us first briefly illustrate a simple application of non-conforming methods, namely the FE discretization of the Poisson problem using the so-called interior penalty (IP) family of DG formulations [71]. Dirichlet boundary conditions constraints, say on the whole boundary of the domain , are to be weakly imposed, as it is natural in such kind of formulations. The global discrete trial space is composed of functions that are continuous within each cell, but discontinuous across cells, i.e., , and the discrete test space . If we denote and as the set of interior and boundary facets of , respectively, the discrete weak form underlying this family of methods reads as: find such that

 (30)

where is a fixed constant that characterizes the particular method at hand, is a facet-wise positive constant referred to as penalty parameter, and denotes the surface of the facet; and should be suitably chosen such that the bilinear form on the left-hand side of (30) is well-posed (stable and continuous) in the discrete setting, and the resulting FE formulation enjoys optimal rates of convergence [71]. Finally, if we denote as and the two cells that share a given facet, then and