Saddle-point problems are ubiquitous in applied mathematics.1 Their importance motivates the development of effective parallel solvers. Block preconditioners and monolithic multigrid methods are established approaches for solving linear (or linearized) saddle-point problems. Block preconditioning is highly effective when the Schur complement of the system is well understood; for the Stokes equations, the Schur complement is spectrally equivalent to a weighted mass matrix, forming the basis for efficient solvers that use multigrid for the viscous term.2, 3 Monolithic methods that apply multigrid to the entire system can also offer superb efficiency if an appropriate relaxation can be devised. For example, Adler et al. 4 proposed a monolithic multigrid method with Braess-Sarazin relaxation for the Stokes equations that provided the fastest time to solution when compared with several block preconditioners and other monolithic multigrid methods. While many block preconditioners have been successfully employed in (massively) parallel computing environments 5, 6, the same cannot be said for monolithic multigrid, whose parallelization was largely absent from the literature until recently 7, 8, 9.
Common approaches to multigrid relaxation for coupled systems are distributive relaxation 10 (which relies on continuum commutativity that may not hold at the discrete level), Braess-Sarazin relaxation, 11, 12 and Vanka relaxation. 13 Based on distributive relaxation, Wang and Chen 14 developed a least squares commutator distributive Gauss-Seidel relaxation for the Stokes equations. Furthermore, this technique has been extended to the Oseen problem by Chen et al. 15 Braess-Sarazin relaxation is known to be highly efficient, and has been applied to nematic liquid crystals,16 magnetohydrodynamics, 12 and other coupled systems. Considering parallel computation, recently, He and MacLachlan presented a local Fourier analysis (LFA) for both distributive weighted Jacobi and Braess-Sarazin relaxations for the Stokes equations discretized by the Marker-and-Cell finite-difference scheme 17 and by mixed finite-element methods, 18 showing the power of LFA for designing efficient algorithms.
Vanka-type relaxation has been used in many contexts, such as for the Navier–Stokes equations, 13, 19 and extended to Vanka–like schemes for other problems or to improve performance. 4, 12, 7, 20, 21, 22 However, Vanka relaxation is typically considered in its multiplicative variant. This seems overly constraining, particularly in comparison to Braess-Sarazin relaxation, which can naturally be done in additive form.17, 18 While multiplicative Vanka relaxation is very efficient, the cost is also high and it is not suitable for parallel computation. We therefore consider additive variants of Vanka-type relaxation in this work.
There are two challenging choices to be made for Vanka relaxation, which are observed to be more critical in the additive setting. First, many choices are possible for the underlying patches within the overlapping Schwarz framework. While we would naturally choose small patches for efficiency or large patches for effectiveness, no general results are known. Secondly, relaxation weights play an important role in ensuring best possible performance of the multigrid algorithm, particularly for additive methods. Thus, there is a need for analysis to inform the algorithmic choices, and LFA seems well-suited. LFA has already been applied to Vanka relaxation in the multiplicative23, 24, 25, 26, 27 and multicoloured28 contexts; here, in contrast, we aim to develop LFA for additive schemes and use it to drive parameter choices in practical experiments for the Stokes equations. To our knowledge, this is the first time that LFA has been applied to additive overlapping Vanka relaxation.
We consider the Stokes equations as a model problem, with both and discretizations. We propose two constructions of the patches for Vanka relaxation, and two approaches to determining relaxation weights. It is shown that using weighting based on patch geometry outperforms a simpler approach. We also find that using small patches with low-degree Chebyshev iterations leads to more efficient multigrid algorithms than with bigger patches or more relaxation steps per iteration, when cost per sweep is accounted for. Although there are no general rules to facilitate the choice of patches or weights, taking advantage of LFA, we can optimize the weights. For validation, our numerical tests are implemented using Firedrake29 and PETSc.30, 31 Numerical experiments are shown to match the LFA predictions for both periodic and Dirichlet boundary conditions. We observe that performance is less sensitive to overestimates of the weights for relaxation schemes considered here, which has also been seen in other works. 4, 12 Last but not least, we compare the cost and performance of relaxation schemes considered here.
This paper is organized as follows. In Section 2, we introduce the and discretizations considered here and the multigrid framework with additive Vanka relaxation for the Stokes equations. In Section 3, we first give an introduction to LFA, then propose an LFA for the Stokes equations with additive Vanka relaxation. In Section 4, two types of overlapping patches are considered, and we validate the LFA predictions with two-grid experiments. Conclusions and remarks are given in Section 5.
2 Discretization and solution of the Stokes equations
2.1 Mixed finite-element discretization of the Stokes equations
In this paper, we consider the Stokes equations,
is the velocity vector,is the scalar pressure of a viscous fluid, and represents a (known) forcing term, together with suitable boundary conditions.
The natural finite-element approximation of Problem (1) when coupled with Dirichlet boundary conditions on on some portion of the domain boundary is: Find and such that
and , are finite-element spaces. Here, satisfies homogeneous Dirichlet boundary conditions in place of any non-homogenous essential boundary conditions on . Problem (2) has a unique solution only when and satisfy an inf-sup condition. 32, 33, 34, 35
If considering an outflow boundary condition, the stress-divergence form of the viscous term should be used instead36. The framework for LFA presented in this paper can easily be extended to this situation; however, the two-grid error-propagation operator (and its LFA representation) depends directly on the stencils of the discretized operators. Thus, considering this form instead of (2) may affect the optimal choice of parameters and the resulting performance of the two-grid method.
Here, we consider two types of stable finite-element methods for the Stokes equations. First, we consider the stable mixed approximation for structured meshes of triangular elements using continuous quadratic approximations for the velocity components and continuous linear approximations for the pressure, the approximation.32 Secondly, we consider the stable approximation for rectangular meshes, using continuous biquadratic approximations for the velocity components and continuous bilinear approximations for the pressure, the (Taylor–Hood) approximation. Both approximations can be represented via nodal basis functions, as illustrated in Figure 1.
Meshes and finite-element degrees of freedom (see definitions in (7)) , with denoting -type and DoFs, denoting -type DoFs, denoting -type DoFs, and denoting -type DoFs. At left, discretization on triangles. At right, discretization on quadrilaterals.
Discretizations of (1) typically lead to linear systems of the form
where corresponds to the discretized vector Laplacian, and is the negative of the discrete divergence operator. If the discretization is naturally unstable, then is the stabilization matrix, otherwise . 32 For the stable and finite-element discretizations considered here, we take .
2.2 Monolithic multigrid for the Stokes equations
) using a monolithic multigrid iteration applied to the full system collectively with a suitable (coupled) relaxation method. As is typical for geometric multigrid, a relaxation technique is employed to quickly damp all oscillatory components of the error. Subsequently, a coarse-grid correction scheme, where a projected problem is solved on a coarser grid and the solution is interpolated as an error correction to the fine-grid approximation, is used to damp the smooth components of the error. In order to describe monolithic multigrid, assume we have two meshes, with fine-grid meshsizeand coarse-grid meshsize (often, , by doubling the meshsize in each spatial direction).
For a general nonsingular linear system, , we consider a stationary iteration as the relaxation scheme. Given an approximation, , to that can be inverted easily, the approximate solution is updated via the iteration
The matrix is the error propagation operator for relaxation. With restriction and interpolation operators, and , respectively, and coarse-grid matrix , the two-grid error propagation operator corresponding to the relaxation scheme in (4) can be written as
where is called the coarse-grid correction operator.
The Jacobi, Gauss-Seidel, and Richardson schemes are often used for relaxation, particularly for discretizations of scalar PDEs. For the restriction operator, , there are many choices, which depend on the problem under consideration. Here, we focus on choices of tied to the mesh and the particular discretization scheme used to generate . The coarse-grid operator, can be the Galerkin operator, , or the natural rediscretization operator (or any other choice). The interpolation operator, , is usually taken to be the conjugate transpose of , with scaling depending on the discretization scheme and the dimension of the considered problem. For more details on the choice of multigrid components, see 37, 38, 39.
If we solve the coarse-grid problem recursively by the two-grid method, then we obtain a multigrid method. Over the past decades, a variety of types of multigrid methods have been developed, including and –cycles 38. In this paper, we focus on using additive Vanka-type relaxation in combination with a monolithic multigrid method to solve (3). This means that is constructed by the Vanka approach, and updates both components of the solution to (3) at the same time in the relaxation scheme. Only two-grid schemes are considered.
2.2.1 Overlapping Schwarz relaxation
Here, we present the multiplicative and additive Schwarz approaches to solve Let the degrees of freedom (DoFs) of be the set , and , be subsets of unknowns with . Let be the restriction operator mapping from vectors over the set of all unknowns, , to vectors whose unknowns consist of the DoFs in . Then is the restriction of to the -th block of DoFs. Moreover, let for be a diagonal weight matrix for each block , where is the dimension of . Then, the multiplicative and additive Schwarz iterations are presented in Algorithm 1.
Multiplicative Schwarz iteration:
. For ,
Additive Schwarz iteration:
The error-propagation operator for the multiplicative Schwarz procedure can be written as
and, for the additive Schwarz procedure, it is
More details about this algebraic viewpoint on the multiplicative and additive Schwarz iterations can be found, for example, in the work of Saad. 40 Overlapping multiplicative Schwarz approaches have been used as the relaxation scheme for the Stokes equations,4, 13, 7, 20, 21, 22, 23, 24 which we refer to as multiplicative Vanka relaxation. Extending this, we will refer to such overlapping additive Schwarz approaches as additive Vanka relaxation. In this paper, we focus on using additive Vanka as a relaxation scheme within monolithic multigrid for the Stokes equations. Some key questions in doing this are
How should the subsets be chosen?
How should be chosen?
In what follows, we consider uniform meshes for the domain, . We will use the pressure DoFs to “seed” the sets, , so that (away from the boundary) all sets will have the same structure and size . In this paper, we use local Fourier analysis to guide the choice of and other aspects of the relaxation scheme.
3 Local Fourier Analysis
3.1 Definitions and notations
We refer to and as the -, -, -, and -type points on the grid , respectively, see Figure 1. The coarse grids, , are defined similarly. Note that in much of the literature, LFA is applied to discretizations on . Here, we consider the more general case as needed for and finite elements.
Let be a scalar Toeplitz operator defined by its stencil acting on as follows,
with constant coefficients , where is a function in . Here, is a finite index set. Because is formally diagonalized by the Fourier modes , where and , we use as a Fourier basis with (or any pair of intervals with length ).
For smoothing and two-grid analysis, we have to distinguish high and low frequency components on with respect to .37 Note that for any ,
if and only if . This means that only those frequency components, , with are distinguishable on . Thus, high and low frequencies for standard coarsening () are given by
We call the symbol of . Note that for all functions ,
For a relaxation scheme, represented by matrix for operator , the error-propagation operator for relaxation can be written as
where represents parameters within . A typical relaxation scheme often reduces high-frequency error components quickly, but is slow to reduce low-frequency errors. Thus, it is natural to define the smoothing factor as follows.
The error-propagation symbol, , for relaxation scheme on the infinite grid satisfies
for all , and the corresponding smoothing factor for is given by
In many cases, the LFA smoothing factor offers a good prediction of actual two-grid performance, and we can optimize the smoothing factor with respect to the parameters, , to obtain an efficient algorithm. However, this is generally not true for higher-order finite-element approximations.18, 42, 24 Thus, we next introduce two-grid LFA, which still offers good predictions of performance in this setting.
In many applications of LFA, we consider a system operator rather than the discretization of a scalar PDE, and is a block smoother. However, the definition of symbol and smoothing factor presented here can be extended to a system easily. Details on these extensions will be presented in section 3.3.
3.2 Two-grid LFA
In general, LFA smoothing analysis gives a good prediction for the actual multigrid performance, under the assumption that we have an “ideal” coarse-grid-correction operator that annihilates low-frequency error components and leaves high-frequency components unchanged. However, in our setting, this assumption about ideal coarse-grid correction (CGC) does not hold (due to the discretization 42), but the two-grid LFA convergence factor still offers useful predictions.
To apply LFA to the two-grid operator, (5), and calculate the two-grid convergence factor, we need to analyse how the operators and act on the Fourier components . From (9), we know that values of coincide on for four values of , known as harmonic frequencies. Let
For a given , we define the four-dimensional harmonic space
Under standard assumptions, the space is invariant under the two-grid operator . 37, 41 We use the ordering of for the four harmonics in the following, although, as with any invariant subspace, the ordering of the basis elements is unimportant.
Inserting the representations of into (5), we obtain the Fourier representation of two-grid error-propagation operator as
in which stands for the block diagonal matrix with diagonal blocks, , and .
The asymptotic two-grid convergence factor, , is defined as
In practical use, we typically consider a discrete form of , denoted by , resulting from sampling over only a finite set of frequencies. In many cases, provides a sharp prediction of actual two-grid performance. It is well known, for example, that LFA gives the exact two-grid convergence factor for problems with periodic boundary conditions.38 The calculation of is much cheaper, however, than direct calculation of from (5). More importantly, since is a function of the parameters, , arising from the relaxation scheme (or the coarse-grid correction), we can optimise to achieve an optimally efficient algorithm. In our setting, such parameters appear in the diagonal scaling matrices mentioned in Section 2.2.1. One of our goals in this paper is to use LFA to optimise the two-grid convergence factor of multigrid when using such relaxation schemes.
Next, we provide details on LFA for additive Vanka relaxation for the Stokes equations. Considering practical use, we first examine the stable discretization, as is easily generated using general-purpose FEM tools on simplicial meshes, such as Firedrake29 and FEniCS.43, 44 We will also show LFA predictions for additive Vanka relaxation for the discretization. The details of the symbols for the discretization of the Stokes equations are presented in the work of He and MacLachlan 18, so we only present details of these symbols for the case herein.
3.3 LFA for
In what follows, we consider the discretized Stokes equations, which read
For the discretization, the degrees of freedom for velocity are located on , containing four types of meshpoints as shown at the left of Figure 1. The Laplace operator in (11) is defined by its weak form restricted to the finite-element basis functions. Here, we use the standard nodal basis and, consequently, can write this in stencil form, extending (8), with taken to be a finite index set of values, with , , , and . With this, the (scalar) discrete Laplace operator is naturally treated as a block operator, and the Fourier representation of each block can be calculated based on Definition 3.1, with the Fourier bases adapted to account for the staggering of the mesh points. Thus, the symbols of and are matrices. Similarly to the Laplace operator, both terms in the gradient, and , can be treated as ()-block operators. Then, the symbols of and are matrices, calculated based on Definition 3.1 adapted for the mesh staggering. The symbols of and are the conjugate transposes of those of and , respectively. Accordingly, is a matrix for the discretization.
We denote the symbols of the finite-element discretizations of the Stokes equations as
Next, we discuss the stencils and symbols for the operators in (11). For the Laplace operator, the stencil can be split into four types which correspond to the -, -, -, and -type points, shown in Figure 2. For the -type, the stencil is a rotation of that of -type, so we do not include it.
Rewriting the stencils shown in Figure 2, we can write the four stencils of as follows,
Note that each stencil connects multiple types of meshpoints, so we further split each stencil into four substencils based on the type of DoFs. Taking as an example, we see it connects three of the four types of meshpoints. Thus, can be written as , where
By standard calculation based on Definition 3.1, we have
Similarly, can be treated in this way. Thus, the symbol of and can be written as
Similarly to the stencil of the Laplacian operator, the stencils of and can be split into four types of substencil, respectively. Figure 3 shows the stencils of the gradient, that is, the pressure-to-velocity unknowns (-, -, - and -type) connections, for the pressure unknown located at the middle of the hexagon.
Thus, the stencil of shown in Figure 3 can be written as . However, here, we calculate the stencils of and its symbols given by
Similarly to , the symbol of the stencil of can be written as
Rodrigo et al. 45
presented a framework for LFA for edge-based discretizations on triangular grids. They consider an expression for the Fourier transform in a non-orthogonal coordinate system in space and frequency variables to adapt to arbitrary structured triangular meshes. This idea can be applied to the discretization considered here. However, for the triangle mesh considered here, it is not necessary to use non-orthogonal coordinates. In the framework of LFA provided here, we use a different Fourier basis to be consistent with the different types of stencil located on different types of grid-points, which simplifies the calculation.
3.4 LFA representation of Grid-transfer operators
Here, we use the standard finite-element interpolation operators and their transposes for restriction. In the following, we discuss the stencils and symbols of these restriction and interpolation operators.
To derive symbols for the grid-transfer operators, we first consider an arbitrary restriction operator characterized by a constant coefficient stencil . Then, an infinite grid function is transferred to the coarse grid, , in the following way
Taking to be the Fourier mode, , we have
with , which is called the symbol of . However, since we consider discretizations on staggered meshes, where different “types” of variables interact in the interpolation and restriction operators, the symbol definition for the restriction operator acting on , where , must be modified.
Similarly to the stencils of , the restriction operator for the components of velocity can also be decomposed based on the partitioning of the DoFs associated with the -, -, -, and -type meshpoints. Each of these restriction operators connects between all four types of meshpoints, and we partition each restriction operator into four blocks based on the DoFs. For , where , by standard calculation,42 (12) becomes
While appears in the above formulation, it serves only to indicate which type of DoF is acting on, since
Thus, it is natural to give the following general definition of a restriction symbol on a staggered mesh. We call the restriction symbol of . We emphasize that we must first split the restriction operator into the different types of DoFs that it restricts from and to before we can apply Definition 3.4.
We first consider restriction to the -type DoFs of a function, which can be split into four blocks,
The -to- connection is
where the denotes the position (on the coarse grid) at which the discrete operator is applied. From Definition 3.4, . The -to- connections yield the stencil
By standard calculation, we have
Similarly, the -to- connection has the stencil
with its symbol
The -to- connection has the stencil
with its symbol