Convergence analysis of inexact two-grid methods: A theoretical framework

07/24/2020 ∙ by Xuefeng Xu, et al. ∙ Chinese Academy of Sciences ∙ Purdue University

Multigrid methods are among the most efficient iterative techniques for solving large-scale linear systems that arise from discretized partial differential equations. As a foundation for multigrid analysis, two-grid theory plays an important role in understanding and designing multigrid methods. Convergence analysis of exact two-grid methods (i.e., the Galerkin coarse-grid system is solved exactly) has been well developed: the convergence factor of exact two-grid methods can be characterized by an identity. However, convergence theory of inexact ones (i.e., the coarse-grid problem is solved approximately) is still less mature. In this paper, a theoretical framework for the convergence analysis of inexact two-grid methods is developed. More specifically, two-sided bounds for the energy norm of the error propagation matrix are established under different approximation conditions, from which one can readily get the identity for the convergence factor of exact two-grid methods.




1. Introduction

Multigrid is a popular and effective solver for systems of linear equations stemming from discretized partial differential equations. For a large class of linear systems, it has been proved to possess uniform convergence with (nearly) optimal complexity (i.e., it requires about $O(n)$ work for a linear system with $n$ unknowns); see, e.g., [7, 23, 24]. The fundamental module of multigrid is a two-grid scheme, which involves two alternate processes: the smoothing (or local relaxation) step and the coarse-grid correction step. The optimality is achieved when the smoothing and coarse-grid correction steps are complementary.

Typically, the smoothing step is a stationary iterative procedure, such as the (weighted) Jacobi-type and Gauss–Seidel-type iterations. These classical methods are generally effective at eliminating the high-frequency (i.e., oscillatory) error components, whereas the low-frequency (i.e., smooth) parts cannot be eliminated effectively [7, 23]. To remedy this defect, the coarse-grid correction step is designed to reduce the low-frequency error components by solving a coarse problem with far fewer unknowns (the number of these unknowns is denoted by $n_c$). The coarse-grid correction step involves two intergrid operators that transfer information between the fine and coarse grids: a restriction matrix $R \in \mathbb{R}^{n_c \times n}$ that restricts the fine-grid residual to the coarse grid; a prolongation (or interpolation) matrix $P \in \mathbb{R}^{n \times n_c}$ with full column rank that extends the correction computed on the coarse grid to the fine one. Usually, $R$ is taken to be the transpose of $P$ (as considered in this paper). The Galerkin coarse-grid matrix is then defined as $A_c := RAP = P^T A P$, which gives the coarse representation of the fine-grid matrix $A$.

Most of the existing literature on two-grid theory (see, e.g., [10, 35, 18, 30, 6]) focuses on exact two-grid methods, with some exceptions such as [17, 24]. A powerful identity has been established to characterize the convergence factor of exact two-grid methods [29, 10]. To design a well-converging two-grid method, it is not necessary to solve the coarse problem exactly, especially when the problem size is still large. Multigrid is typically a recursive call (e.g., the V- and W-cycles) of the two-grid scheme and hence can be treated as an inexact two-grid scheme. As is well known, two-grid convergence is often sufficient to assess the convergence of the W-cycle multigrid methods; see, e.g., [11, 23]. With the aid of the hierarchical basis idea [2] and the minimization property of the Schur complement (see, e.g., [1, Theorem 3.8]), Notay [17] derived a convergence estimate for inexact two-grid methods. Based on this estimate, he also showed that, if the convergence factor of exact two-grid methods is uniformly bounded by $\sigma < 1/2$, then the convergence factor of the corresponding W-cycle multigrid method is bounded by $\sigma/(1 - \sigma)$.

Besides theoretical considerations, two-grid theory can also guide the design of multigrid algorithms. The implementation of multigrid schemes on large-scale parallel machines is still a challenging topic, especially in the era of exascale computing. For instance, stencil sizes (the number of nonzero entries in a row) of the standard Galerkin coarse-grid matrices tend to increase further down in the multilevel hierarchy of algebraic multigrid methods [5, 3, 19], which will increase the communication costs. As the problem size increases and the number of levels grows, the overall efficiency of parallel algebraic multigrid methods may decrease dramatically. To maintain multigrid convergence and improve parallel efficiency, some sparse approximation strategies for the Galerkin coarse-grid matrix $A_c$ have been proposed; see, e.g., [4, 22, 21, 8]. Motivated by the convergence analysis in [17], Falgout and Schroder [8] proposed a non-Galerkin coarsening strategy to improve the parallel performance of algebraic multigrid algorithms.

In this paper, we present a systematic convergence analysis of inexact two-grid methods. Two-sided bounds for the energy norm of the error propagation matrix are established in a purely algebraic manner. Our main results include the following three types of estimates.

  • The first one (3.3) is a general convergence estimate, which slightly improves the existing one in [17, Theorem 2.2]. This estimate is valid for any symmetric and positive definite (SPD) coarse-grid matrix $B_c$. In practice, we are more interested in the situation that $B_c$ is a suitable approximation to the Galerkin coarse-grid matrix $A_c$, which motivates the next two estimates.

  • The second one (3.10) is established under the approximation condition


    where is a symmetrized smoother defined by (2.6), , and . Clearly, the condition (1.1) measures how far deviates from by reference to the restricted smoother (it can be viewed as an approximation to ).

  • The third one (3.44) is established under the “relative error” condition


    where and . A special case of the condition (1.2) () appeared in [24, Page 145].

It is worth mentioning that our estimates generalize the identity for the convergence factor of exact two-grid methods [10].

The rest of this paper is organized as follows. In Section 2, we first introduce some fundamental matrices involved in the analysis of two-grid methods, and then present the identity for the convergence factor of exact two-grid methods. In Section 3, we establish the convergence theory of inexact two-grid methods, which mainly contains three types of estimates. In Section 4, we give some concluding remarks.

2. Preliminaries

In this section, we introduce some algebraic properties of two-grid methods, which play an important role in the convergence analysis of inexact two-grid methods. For convenience, we first list some notation used in the subsequent discussions.

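  • $I_n$ denotes the $n \times n$ identity matrix (or $I$ when its size is clear from context).

  • $\lambda_i(\cdot)$ denotes the $i$-th eigenvalue of a matrix (assuming that the eigenvalues are algebraically arranged in the same order throughout this paper).

  • $\lambda_{\min}(\cdot)$, $\lambda_{\min}^{+}(\cdot)$, and $\lambda_{\max}(\cdot)$ stand for the smallest eigenvalue, the smallest positive eigenvalue, and the largest eigenvalue of a matrix, respectively.

  • $\lambda(\cdot)$ denotes the spectrum of a matrix.

  • $\rho(\cdot)$ denotes the spectral radius of a matrix.

  • $\|\cdot\|_2$ denotes the spectral norm of a matrix.

  • $\kappa(\cdot)$ denotes the spectral condition number of a matrix.

  • $\|\cdot\|_A$ denotes the energy norm induced by an SPD matrix $A \in \mathbb{R}^{n \times n}$: for any $v \in \mathbb{R}^n$, $\|v\|_A = \sqrt{v^T A v}$; for any $B \in \mathbb{R}^{n \times n}$, $\|B\|_A = \max_{v \in \mathbb{R}^n \setminus \{0\}} \|Bv\|_A / \|v\|_A$.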

2.1. Two-grid methods

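Consider solving the linear system

$$Au = f, \tag{2.1}$$

where $A \in \mathbb{R}^{n \times n}$ is SPD, $u \in \mathbb{R}^n$, and $f \in \mathbb{R}^n$. Given an initial guess $u_0 \in \mathbb{R}^n$ and a nonsingular matrix $M \in \mathbb{R}^{n \times n}$, we perform the smoothing process

$$u_{k+1} = u_k + M^{-1}(f - Au_k), \quad k = 0, 1, \ldots, \tag{2.2}$$

where $M$ is called a smoother and $f - Au_k$ is the residual at the $k$-th iteration. Let $e_k = u - u_k$. From (2.2), we have

$$e_{k+1} = (I - M^{-1}A)e_k,$$

which yields

$$\|e_k\|_A \le \|I - M^{-1}A\|_A^{k} \, \|e_0\|_A. \tag{2.3}$$

If $\|I - M^{-1}A\|_A < 1$, we deduce from (2.3) that, for any initial error $e_0 \in \mathbb{R}^n$, the error vector $e_k$ tends to zero as $k \to +\infty$. Since

$$\|I - M^{-1}A\|_A^2 = 1 - \lambda_{\min}\big(A^{1/2} M^{-T}(M + M^T - A)M^{-1} A^{1/2}\big),$$

a sufficient and necessary condition for the iteration (2.2) to be $A$-convergent (i.e., $\|I - M^{-1}A\|_A < 1$) is that $M + M^T - A$ is SPD.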
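The convergence criterion for the smoother can be illustrated numerically. The following is a minimal sketch, assuming a small 1D Laplacian as the SPD matrix `A` and two hypothetical smoothers `M` (Gauss–Seidel, and an over-relaxed Jacobi variant chosen so that `M + M^T - A` is indefinite); the helper names `energy_norm` and `is_spd` are illustrative and not part of the paper.

```python
import numpy as np

def energy_norm(B, A):
    """Operator norm of B induced by the SPD matrix A: ||A^{1/2} B A^{-1/2}||_2."""
    w, V = np.linalg.eigh(A)                 # A = V diag(w) V^T with w > 0
    A_half = (V * np.sqrt(w)) @ V.T
    A_half_inv = (V / np.sqrt(w)) @ V.T
    return np.linalg.norm(A_half @ B @ A_half_inv, 2)

def is_spd(S, tol=1e-12):
    """True if S is symmetric and all of its eigenvalues exceed tol."""
    return bool(np.allclose(S, S.T) and np.linalg.eigvalsh(S).min() > tol)

n = 8
# Assumed test matrix: 1D Laplacian with stencil [-1, 2, -1], which is SPD.
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
I = np.eye(n)

# Gauss-Seidel smoother: M + M^T - A equals diag(A), which is SPD.
M_gs = np.tril(A)
norm_gs = energy_norm(I - np.linalg.solve(M_gs, A), A)

# Over-relaxed Jacobi smoother M = D / 2.5: M + M^T - A = 1.6 I - A is indefinite here.
M_bad = np.diag(np.diag(A)) / 2.5
norm_bad = energy_norm(I - np.linalg.solve(M_bad, A), A)

print("Gauss-Seidel:", is_spd(M_gs + M_gs.T - A), norm_gs)
print("over-relaxed Jacobi:", is_spd(M_bad + M_bad.T - A), norm_bad)
```

In this sketch the energy norm of `I - M^{-1}A` falls below one exactly for the smoother whose `M + M^T - A` is SPD, in line with the stated equivalence.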

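In view of the $A$-convergent smoother $M$, we define

$$\overline{M} := M(M + M^T - A)^{-1}M^T, \tag{2.4}$$

which is often referred to as a symmetrized smoother. It is easy to check that

$$I - \overline{M}^{-1}A = (I - M^{-T}A)(I - M^{-1}A). \tag{2.5}$$

Interchanging the roles of $M$ and $M^T$ in (2.4) yields another symmetrized smoother

$$\widetilde{M} := M^T(M + M^T - A)^{-1}M, \tag{2.6}$$

which satisfies

$$I - \widetilde{M}^{-1}A = (I - M^{-1}A)(I - M^{-T}A). \tag{2.7}$$

According to (2.5) and (2.7), we deduce that both $\overline{M} - A$ and $\widetilde{M} - A$ are symmetric and positive semidefinite (SPSD).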
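The two symmetrized smoothers and their factorization identities can be checked on a small example. This is only a sketch under assumptions: `A` is a 1D Laplacian, `M` is the Gauss–Seidel smoother, and the symmetrized smoothers are taken in the standard forms $M(M + M^T - A)^{-1}M^T$ and $M^T(M + M^T - A)^{-1}M$; the variable names are illustrative.

```python
import numpy as np

n = 8
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # assumed SPD test matrix
M = np.tril(A)                                            # Gauss-Seidel smoother
I = np.eye(n)

S = M + M.T - A                                # SPD for an A-convergent smoother
M_bar = M @ np.linalg.solve(S, M.T)            # M (M + M^T - A)^{-1} M^T
M_tilde = M.T @ np.linalg.solve(S, M)          # M^T (M + M^T - A)^{-1} M

fwd = I - np.linalg.solve(M, A)                # I - M^{-1} A
bwd = I - np.linalg.solve(M.T, A)              # I - M^{-T} A

# Factorization identities for the two symmetrized smoothers.
ok_bar = np.allclose(I - np.linalg.solve(M_bar, A), bwd @ fwd)
ok_tilde = np.allclose(I - np.linalg.solve(M_tilde, A), fwd @ bwd)

def min_eig_sym(X):
    """Smallest eigenvalue of the symmetric part of X."""
    return np.linalg.eigvalsh((X + X.T) / 2.0).min()

# Both differences should be positive semidefinite up to rounding.
psd_bar = min_eig_sym(M_bar - A) > -1e-10
psd_tilde = min_eig_sym(M_tilde - A) > -1e-10

print(ok_bar, ok_tilde, psd_bar, psd_tilde)
```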

The following lemma provides two useful relations between the symmetrized smoothers and (see [18, Lemma 1]).

Lemma 2.1.

Let and be defined by (2.4) and (2.6), respectively. Then


Let $P \in \mathbb{R}^{n \times n_c}$ be a prolongation (or interpolation) matrix with full column rank, where $n_c$ ($< n$) is the number of coarse variables. Let $P^T$ be a restriction matrix. The Galerkin coarse-grid matrix is then denoted by $A_c := P^T A P$. For a given initial guess $u_0 \in \mathbb{R}^n$, the standard two-grid scheme (i.e., the presmoothing and postsmoothing steps are performed in a symmetric way) for solving (2.1) can be described as Algorithm 1. If the coarse-grid matrix $B_c$ is chosen as $A_c$, then Algorithm 1 is called an exact two-grid method; otherwise, it is called an inexact two-grid method.

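Algorithm 1  Two-grid method
1: Presmoothing: $u_1 \leftarrow u_0 + M^{-1}(f - Au_0)$  ▷ $M + M^T - A$ is SPD
2: Restriction: $r_c \leftarrow P^T(f - Au_1)$  ▷ $P$ has full column rank
3: Coarse-grid correction: $e_c \leftarrow B_c^{-1} r_c$  ▷ $B_c$ is SPD
4: Prolongation: $u_2 \leftarrow u_1 + P e_c$
5: Postsmoothing: $u_{\rm ITG} \leftarrow u_2 + M^{-T}(f - Au_2)$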
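One sweep of the symmetric two-grid scheme can be sketched in a few lines. The code below is an illustration under assumptions rather than the authors' implementation: the 1D Laplacian `A`, the Gauss–Seidel smoother `M`, the linear interpolation `P`, and the perturbed coarse matrix `Bc = 1.1 * Ac` are all hypothetical choices, and the function name `two_grid_step` is invented for this example.

```python
import numpy as np

def two_grid_step(A, f, u0, M, P, Bc):
    """One sweep of the symmetric two-grid scheme with coarse-grid matrix Bc."""
    u1 = u0 + np.linalg.solve(M, f - A @ u0)        # presmoothing with M
    rc = P.T @ (f - A @ u1)                         # restriction of the residual
    ec = np.linalg.solve(Bc, rc)                    # (possibly inexact) coarse solve
    u2 = u1 + P @ ec                                # prolongation and correction
    return u2 + np.linalg.solve(M.T, f - A @ u2)    # postsmoothing with M^T

nc = 7
n = 2 * nc + 1
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # assumed SPD test matrix

# Assumed 1D linear interpolation: coarse point j sits at fine index 2j + 1.
P = np.zeros((n, nc))
for j in range(nc):
    P[2 * j:2 * j + 3, j] = [0.5, 1.0, 0.5]

M = np.tril(A)            # Gauss-Seidel smoother
Ac = P.T @ A @ P          # Galerkin coarse-grid matrix
Bc = 1.1 * Ac             # hypothetical perturbation; Bc = Ac gives the exact method

rng = np.random.default_rng(0)
u_true = rng.standard_normal(n)
f = A @ u_true

u = np.zeros(n)
errors = []               # energy norm of the error before each sweep
for _ in range(6):
    e = u_true - u
    errors.append(float(np.sqrt(e @ A @ e)))
    u = two_grid_step(A, f, u, M, P, Bc)
e = u_true - u
errors.append(float(np.sqrt(e @ A @ e)))
print(errors)
```

Passing `Bc = Ac` recovers the exact two-grid method, so the same function can be used to compare exact and inexact coarse solves on a given problem.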

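From Algorithm 1, the output $u_{\rm ITG}$ satisfies

$$u - u_{\rm ITG} = E_{\rm ITG}(u - u_0),$$

where

$$E_{\rm ITG} = (I - M^{-T}A)(I - P B_c^{-1} P^T A)(I - M^{-1}A) \tag{2.10}$$

is called the iteration matrix (or error propagation matrix) of Algorithm 1. It can be rewritten as

$$E_{\rm ITG} = I - B_{\rm ITG}^{-1}A, \tag{2.11}$$

where, with $\overline{M}$ the symmetrized smoother defined by (2.4),

$$B_{\rm ITG}^{-1} = \overline{M}^{-1} + (I - M^{-T}A) P B_c^{-1} P^T (I - AM^{-1}). \tag{2.12}$$

Obviously, $B_{\rm ITG}$ is an SPD matrix, which is called the inexact two-grid preconditioner. From (2.11), we deduce that

$$\|E_{\rm ITG}\|_A = \rho(E_{\rm ITG}) = \max\big\{\lambda_{\max}(B_{\rm ITG}^{-1}A) - 1, \; 1 - \lambda_{\min}(B_{\rm ITG}^{-1}A)\big\}. \tag{2.13}$$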
2.2. Convergence of exact two-grid methods

The convergence theory of exact two-grid methods has been well studied in the literature. For readers interested in its algebraic analysis, we refer to [24, 13, 18] and the references therein.

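For the special case $B_c = A_c$, we denote the iteration matrix by

$$E_{\rm TG} = (I - M^{-T}A)(I - P A_c^{-1} P^T A)(I - M^{-1}A), \tag{2.14}$$

which can be written as

$$E_{\rm TG} = I - B_{\rm TG}^{-1}A, \tag{2.15}$$

where, with $\overline{M}$ the symmetrized smoother defined by (2.4),

$$B_{\rm TG}^{-1} = \overline{M}^{-1} + (I - M^{-T}A) P A_c^{-1} P^T (I - AM^{-1}). \tag{2.16}$$

It is easy to see that $B_{\rm TG}$ is an SPD matrix, which is called the exact two-grid preconditioner.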
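The preconditioner form of the iteration matrix can be verified numerically for both exact and inexact coarse solves. The sketch below assumes the product form $E = (I - M^{-T}A)(I - PB_c^{-1}P^TA)(I - M^{-1}A)$ and the inverse preconditioner $B^{-1} = M^{-T}(M + M^T - A)M^{-1} + (I - M^{-T}A)PB_c^{-1}P^T(I - AM^{-1})$, which follows from expanding that product. The test problem (1D Laplacian, Gauss–Seidel, linear interpolation, `Bc = 1.1 * Ac`) is a hypothetical choice.

```python
import numpy as np

nc = 7
n = 2 * nc + 1
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # assumed SPD test matrix
P = np.zeros((n, nc))
for j in range(nc):
    P[2 * j:2 * j + 3, j] = [0.5, 1.0, 0.5]              # assumed linear interpolation
M = np.tril(A)                                            # Gauss-Seidel smoother
I = np.eye(n)
Minv = np.linalg.inv(M)
Ac = P.T @ A @ P

# Inverse of the symmetrized smoother M (M + M^T - A)^{-1} M^T.
M_bar_inv = Minv.T @ (M + M.T - A) @ Minv

# Square root of A, used to evaluate energy norms and symmetric eigenproblems.
w, V = np.linalg.eigh(A)
A_half = (V * np.sqrt(w)) @ V.T
A_half_inv = (V / np.sqrt(w)) @ V.T

results = {}
for name, Bc in {"exact": Ac, "inexact": 1.1 * Ac}.items():
    Bc_inv = np.linalg.inv(Bc)
    E = (I - Minv.T @ A) @ (I - P @ Bc_inv @ P.T @ A) @ (I - Minv @ A)
    B_inv = M_bar_inv + (I - Minv.T @ A) @ P @ Bc_inv @ P.T @ (I - A @ Minv)

    # Eigenvalues of B^{-1} A via the similar symmetric matrix A^{1/2} B^{-1} A^{1/2}.
    S = A_half @ B_inv @ A_half
    lam = np.linalg.eigvalsh((S + S.T) / 2.0)

    norm_direct = np.linalg.norm(A_half @ E @ A_half_inv, 2)
    norm_formula = max(lam.max() - 1.0, 1.0 - lam.min())
    results[name] = (bool(np.allclose(E, I - B_inv @ A)), norm_direct, norm_formula)

for name, (same, nd, nf) in results.items():
    print(name, same, nd, nf)
```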

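The following theorem gives an identity for the convergence factor of Algorithm 1 with $B_c = A_c$ [10, Theorem 4.3], which is the so-called two-level XZ-identity [29, 35].

Theorem 2.2.

Let $\widetilde{M}$ be defined by (2.6), and define

$$\Pi_{\widetilde{M}} := P(P^T \widetilde{M} P)^{-1} P^T \widetilde{M}. \tag{2.17}$$

The convergence factor of exact two-grid methods can be characterized as

$$\|E_{\rm TG}\|_A = 1 - \frac{1}{K_{\rm TG}} \quad \text{with} \quad K_{\rm TG} = \max_{v \in \mathbb{R}^n \setminus \{0\}} \frac{\|(I - \Pi_{\widetilde{M}})v\|_{\widetilde{M}}^2}{\|v\|_A^2}. \tag{2.18}$$

The matrix $\Pi_{\widetilde{M}}$ defined by (2.17) is an $\widetilde{M}$-orthogonal projection (i.e., $\Pi_{\widetilde{M}}$ is orthogonal with respect to the inner product $(u, v)_{\widetilde{M}} = v^T \widetilde{M} u$) onto the coarse space $\operatorname{range}(P)$. Similarly, we define a useful $A$-orthogonal projection onto $\operatorname{range}(P)$:

$$\Pi_A := P A_c^{-1} P^T A.$$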
For a fixed smoother $M$ (e.g., the weighted Jacobi or Gauss–Seidel smoother), an optimal interpolation can be obtained by minimizing the quantity $K_{\rm TG}$ appearing in the identity of Theorem 2.2. Unfortunately, the optimal interpolation is typically expensive to compute, because it requires explicit knowledge of eigenvectors corresponding to small eigenvalues of the generalized eigenvalue problem $Ax = \lambda \widetilde{M} x$; see [30, 6] for details.

To maintain two-grid convergence and design an interpolation with simple structure, one can minimize an upper bound of . Let , where and . Obviously, is a projection onto . Then

which, together with (2.18), yields

By minimizing over all interpolations, one can obtain the so-called ideal interpolation [9, 33], which provides new insights for designing an interpolation with sparse or simple structure (see, e.g., [15, 16, 33, 14]). In particular, if is taken to be , then . Hence, the ideal interpolation can be viewed as a “relaxation” of the optimal one. Furthermore, a quantitative relation between and can be found in [33].
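The two-level XZ-identity and the projection properties used in this subsection can be checked on a small example. The sketch below assumes the standard statement of the identity, namely $\|E_{\rm TG}\|_A = 1 - 1/K_{\rm TG}$ with $K_{\rm TG} = \max_{v \neq 0} \|(I - \Pi_{\widetilde{M}})v\|_{\widetilde{M}}^2 / \|v\|_A^2$, where $\widetilde{M} = M^T(M + M^T - A)^{-1}M$ and $\Pi_{\widetilde{M}} = P(P^T\widetilde{M}P)^{-1}P^T\widetilde{M}$. The test problem and all variable names are hypothetical.

```python
import numpy as np

nc = 7
n = 2 * nc + 1
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # assumed SPD test matrix
P = np.zeros((n, nc))
for j in range(nc):
    P[2 * j:2 * j + 3, j] = [0.5, 1.0, 0.5]              # assumed linear interpolation
M = np.tril(A)                                            # Gauss-Seidel smoother
I = np.eye(n)
Minv = np.linalg.inv(M)
Ac = P.T @ A @ P

M_tilde = M.T @ np.linalg.solve(M + M.T - A, M)           # symmetrized smoother
Pi_A = P @ np.linalg.solve(Ac, P.T @ A)                   # A-orthogonal projection
Pi_M = P @ np.linalg.solve(P.T @ M_tilde @ P, P.T @ M_tilde)

# Projection checks: idempotent, self-adjoint in the weighted inner product,
# and acting as the identity on range(P).
proj_ok = bool(
    np.allclose(Pi_A @ Pi_A, Pi_A)
    and np.allclose(A @ Pi_A, (A @ Pi_A).T)
    and np.allclose(Pi_A @ P, P)
    and np.allclose(Pi_M @ Pi_M, Pi_M)
    and np.allclose(M_tilde @ Pi_M, (M_tilde @ Pi_M).T)
    and np.allclose(Pi_M @ P, P)
)

w, V = np.linalg.eigh(A)
A_half = (V * np.sqrt(w)) @ V.T
A_half_inv = (V / np.sqrt(w)) @ V.T

# Energy norm of the exact two-grid iteration matrix.
E_tg = (I - Minv.T @ A) @ (I - Pi_A) @ (I - Minv @ A)
norm_E = np.linalg.norm(A_half @ E_tg @ A_half_inv, 2)

# K_TG as the largest eigenvalue of A^{-1/2} (I - Pi_M)^T M_tilde (I - Pi_M) A^{-1/2}.
G = (I - Pi_M).T @ M_tilde @ (I - Pi_M)
S = A_half_inv @ G @ A_half_inv
K_tg = np.linalg.eigvalsh((S + S.T) / 2.0).max()

print(proj_ok, norm_E, 1.0 - 1.0 / K_tg)
```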

3. Convergence analysis

In this section, we establish the convergence theory of inexact two-grid methods. More specifically, two-sided bounds for the energy norm of the iteration matrix are derived under different approximation conditions.

3.1. Convergence estimate of the first kind

The first estimate (see (3.3) below) is a general convergence result, which does not need any additional conditions on the coarse-grid matrix $B_c$ except for its positive definiteness.

We first prove an important lemma, which gives two relations between the extreme eigenvalues of $B_{\rm ITG}^{-1}A$ and $B_{\rm TG}^{-1}A$.

Lemma 3.1.





From (2.12) and (2.16), we have

Analogously, it holds that


which yields

Thus, the inequality (3.1) holds. The inequality (3.2) can be proved similarly. ∎

Remark 3.2.

It is easy to see that

From (3.1) and (3.2), we deduce that

which are the results derived by Notay [17, Theorem 2.2]. It is worth noting that the specific form of $A_c$ is not used in the proof of Lemma 3.1. As with the results in [17], $A_c$ here does not have to be of Galerkin type.

The expression (2.14) implies that is an SPSD matrix and

Since , the matrix is also SPSD and

Due to

it follows that

Hence, the estimates (3.1) and (3.2) become

which, together with (2.13), yield the following convergence estimate.

Theorem 3.3.

The convergence factor of Algorithm 1 satisfies


The only assumption on the coarse-grid matrix $B_c$ is its positive definiteness. Hence, the estimate (3.3) is valid for any SPD matrix $B_c$. Nevertheless, to design a well-converging two-grid method, we are more interested in the situation that $B_c$ is a suitable approximation to the Galerkin coarse-grid matrix $A_c$. In what follows, we focus on the convergence analysis of Algorithm 1 under general approximation conditions. These conditions arise from measuring the difference between $B_c$ and $A_c$.

3.2. Convergence estimate of the second kind

In light of (2.12), we can derive the following explicit expression for $B_{\rm ITG}$.

Lemma 3.4.

The inexact two-grid preconditioner $B_{\rm ITG}$ can be expressed as


Using (2.12) and the Sherman–Morrison–Woodbury formula [20, 25, 31], we obtain

By (2.8) and (2.9), we have


The relation (2.9) implies that


Combining (3.5) and (3.6), we can arrive at the expression (3.4) immediately. ∎
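The Sherman–Morrison–Woodbury formula invoked in the proof can be checked in its generic form, $(G + UCU^T)^{-1} = G^{-1} - G^{-1}U(C^{-1} + U^TG^{-1}U)^{-1}U^TG^{-1}$. The sketch below uses randomly generated SPD matrices `G` and `C` and a random `U`; these symbols are generic placeholders and do not correspond to the specific matrices to which the formula is applied in the proof.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 6, 2

# Generic SPD matrices G (n x n) and C (k x k), and a rectangular factor U (n x k).
X = rng.standard_normal((n, n))
G = X @ X.T + n * np.eye(n)
Y = rng.standard_normal((k, k))
C = Y @ Y.T + k * np.eye(k)
U = rng.standard_normal((n, k))

G_inv = np.linalg.inv(G)

# Left-hand side: direct inverse of the rank-k update.
lhs = np.linalg.inv(G + U @ C @ U.T)

# Right-hand side: only a k x k system has to be solved once G^{-1} is known.
small = np.linalg.inv(C) + U.T @ G_inv @ U
rhs = G_inv - G_inv @ U @ np.linalg.solve(small, U.T @ G_inv)

smw_holds = bool(np.allclose(lhs, rhs))
print(smw_holds)
```

The practical appeal of the formula is that the correction term only requires inverting a matrix whose size equals the rank of the update, which in the two-grid setting corresponds to the coarse dimension.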

Remark 3.5.

In particular, if , we get from (3.4) that


from which one can easily see that is SPSD.

The following lemma provides some useful eigenvalue identities, which play an important role in the subsequent convergence analysis.

Lemma 3.6.

The extreme eigenvalues of and have the following properties:


Since is an SPSD matrix and is an -orthogonal projection, we have

which yields

Similarly, we have

Due to

there exists a nonsingular matrix such that

Let be partitioned into the block form

where , , , and . Straightforward computations yield

Hence, the identities (3.8a), (3.8c), and (3.8d) hold.

In addition, using (2.7), (3.7), and the relation , we obtain

which yields the identity (3.8b). ∎

We are now in a position to present the convergence estimate of the second kind, which is based on characterizing the difference $B_c - A_c$ by reference to the restricted smoother $P^T \widetilde{M} P$.

Theorem 3.7.

Let and . If the coarse-grid matrix satisfies






The proof is divided into two parts: the first part follows directly from (3.3); the second one is based on (2.13), (3.4), and Lemma 3.6.

Part I: From (3.9), we deduce that and are SPSD matrices. Hence,

An application of (3.3) yields


Part II: The relation (2.13) can be rewritten as


In order to establish two-sided bounds for , we need to estimate the extreme eigenvalues and . By (3.4), we have

which leads to


where we have used the relation (2.7).

(i) The positive semidefiniteness of implies that

is SPSD. Since is also SPSD, the matrix

is SPSD. This leads to, for any ,


In particular, we have


Using Weyl's theorem in matrix theory (see, e.g., [12, Theorem 4.3.1]), (3.8a), and (3.8d), we obtain


In view of (3.13), (3.15), and (3.16), it holds that