Multigrid is a popular and effective solver for systems of linear equations stemming from discretized partial differential equations. For a large class of linear systems, it has been proved to possess uniform convergence with (nearly) optimal complexity (i.e., it requires $\mathcal{O}(n)$ work for a linear system with $n$ unknowns); see, e.g., [7, 23, 24]. The fundamental module of multigrid is a two-grid scheme, which involves two alternating processes: the smoothing (or local relaxation) step and the coarse-grid correction step. Optimality is achieved when the smoothing and coarse-grid correction steps are complementary.
Typically, the smoothing step is a stationary iterative procedure, such as the (weighted) Jacobi-type and Gauss–Seidel-type iterations. These classical methods are generally effective at eliminating the high-frequency (i.e., oscillatory) error components, whereas the low-frequency (i.e., smooth) components cannot be eliminated effectively [7, 23]. To remedy this defect, the coarse-grid correction step is designed to reduce the low-frequency error components by solving a coarse problem with far fewer unknowns (the number of coarse unknowns is denoted by $n_{\rm c}$). The coarse-grid correction step involves two intergrid operators that transfer information between the fine and coarse grids: a restriction matrix $R\in\mathbb{R}^{n_{\rm c}\times n}$ that restricts the fine-grid residual to the coarse grid, and a prolongation (or interpolation) matrix $P\in\mathbb{R}^{n\times n_{\rm c}}$ with full column rank that extends the correction computed on the coarse grid to the fine one. Usually, $R$ is taken to be the transpose of $P$ (as considered in this paper). The Galerkin coarse-grid matrix is then defined as $A_{\rm c}=P^{T}AP$, which gives the coarse representation of the fine-grid matrix $A\in\mathbb{R}^{n\times n}$.
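As a concrete illustration, the Galerkin construction $A_{\rm c}=P^{T}AP$ can be assembled in a few lines. The sketch below is not taken from the paper: it assumes a 1D Poisson model matrix and linear interpolation, which are standard textbook choices.

```python
import numpy as np

n, nc = 7, 3  # assumed fine and coarse problem sizes

# Fine-grid matrix: 1D Poisson stencil [-1, 2, -1] (SPD)
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

# Prolongation P: linear interpolation from coarse nodes
# (coarse node j sits at fine node 2j+1, 0-based)
P = np.zeros((n, nc))
for j in range(nc):
    i = 2 * j + 1
    P[i, j] = 1.0
    P[i - 1, j] += 0.5   # fine neighbors receive averaged contributions
    P[i + 1, j] += 0.5

R = P.T        # restriction taken as the transpose of P
Ac = R @ A @ P  # Galerkin coarse-grid matrix
```

Note that $A_{\rm c}$ inherits symmetry and positive definiteness from $A$, which is one reason the Galerkin choice is so convenient.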
Most of the existing literature on two-grid theory (see, e.g., [10, 35, 18, 30, 6]) focuses on exact two-grid methods, with some exceptions like [17, 24]. A powerful identity has been established to characterize the convergence factor of exact two-grid methods [29, 10]. To design a convergent two-grid method, it is not necessary to solve the coarse problem exactly, especially when its size is still large. Multigrid is typically a recursive application (e.g., the V- and W-cycles) of the two-grid scheme and hence can be treated as an inexact two-grid scheme. As is well known, two-grid convergence is often sufficient to assess the convergence of W-cycle multigrid methods; see, e.g., [11, 23]. With the aid of the hierarchical basis idea and the minimization property of the Schur complement (see, e.g., [1, Theorem 3.8]), Notay derived a convergence estimate for inexact two-grid methods. Based on this estimate, he also showed that, if the convergence factor of exact two-grid methods is uniformly bounded by a constant, then the convergence factor of the corresponding W-cycle multigrid method admits a uniform bound depending only on that constant.
Besides theoretical considerations, two-grid theory can also guide the design of multigrid algorithms. The implementation of multigrid schemes on large-scale parallel machines is still a challenging topic, especially in the era of exascale computing. For instance, the stencil sizes (the number of nonzero entries in a row) of the standard Galerkin coarse-grid matrices tend to increase further down in the multilevel hierarchy of algebraic multigrid methods [5, 3, 19], which increases the communication costs. As the problem size increases and the number of levels grows, the overall efficiency of parallel algebraic multigrid methods may decrease dramatically. To maintain multigrid convergence and improve parallel efficiency, some sparse approximation strategies for the Galerkin coarse-grid matrix have been proposed; see, e.g., [4, 22, 21, 8]. Motivated by two-grid convergence analysis, Falgout and Schroder proposed a non-Galerkin coarsening strategy to improve the parallel performance of algebraic multigrid algorithms.
In this paper, we present a systematic convergence analysis of inexact two-grid methods. Two-sided bounds for the energy norm of the error propagation matrix are established in a purely algebraic manner. Our main results include the following three types of estimates.
The first one (3.3) is a general convergence estimate, which slightly improves the existing one in [17, Theorem 2.2]. This estimate is valid for any symmetric and positive definite (SPD) coarse-grid matrix. In practice, we are more interested in the situation where the coarse-grid matrix is a suitable approximation to the Galerkin matrix $A_{\rm c}$, which motivates the next two estimates.
It is worth mentioning that our estimates generalize the identity for the convergence factor of exact two-grid methods.
The rest of this paper is organized as follows. In Section 2, we first introduce some fundamental matrices involved in the analysis of two-grid methods, and then present the identity for the convergence factor of exact two-grid methods. In Section 3, we establish the convergence theory of inexact two-grid methods, which mainly contains three types of estimates. In Section 4, we give some concluding remarks.
In this section, we introduce some algebraic properties of two-grid methods, which play an important role in the convergence analysis of inexact two-grid methods. For convenience, we first list some notation used in the subsequent discussions.
$I_{n}$ denotes the $n\times n$ identity matrix (or $I$ when its size is clear from context).
$\lambda_{i}(\cdot)$ denotes the $i$th eigenvalue of a matrix (assuming that the eigenvalues are arranged in algebraically decreasing order throughout this paper).
$\lambda_{\min}(\cdot)$, $\lambda_{\min}^{+}(\cdot)$, and $\lambda_{\max}(\cdot)$ stand for the smallest eigenvalue, the smallest positive eigenvalue, and the largest eigenvalue of a matrix, respectively.
$\sigma(\cdot)$ denotes the spectrum of a matrix.
$\rho(\cdot)$ denotes the spectral radius of a matrix.
$\|\cdot\|_{2}$ denotes the spectral norm of a matrix.
$\kappa(\cdot)$ denotes the spectral condition number of a matrix.
$\|\cdot\|_{A}$ denotes the energy norm induced by an SPD matrix $A\in\mathbb{R}^{n\times n}$: for any $v\in\mathbb{R}^{n}$, $\|v\|_{A}=(v^{T}Av)^{1/2}$; for any $X\in\mathbb{R}^{n\times n}$, $\|X\|_{A}=\|A^{1/2}XA^{-1/2}\|_{2}$.
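The two energy norms in the last item can be evaluated directly. The following is a minimal NumPy check on an arbitrary assumed SPD matrix (not an example from the paper); it also verifies the submultiplicative compatibility $\|Xv\|_{A}\leq\|X\|_{A}\|v\|_{A}$.

```python
import numpy as np

A = np.array([[2.0, -1.0], [-1.0, 2.0]])  # an assumed SPD matrix
v = np.array([1.0, 1.0])

# ||v||_A = (v^T A v)^{1/2}
norm_v = np.sqrt(v @ A @ v)

# ||X||_A = ||A^{1/2} X A^{-1/2}||_2, with A^{1/2} built from the eigendecomposition
w, Q = np.linalg.eigh(A)
A_half = Q @ np.diag(np.sqrt(w)) @ Q.T
A_half_inv = Q @ np.diag(1.0 / np.sqrt(w)) @ Q.T
X = np.array([[0.5, 0.0], [0.3, 0.2]])
norm_X = np.linalg.norm(A_half @ X @ A_half_inv, 2)  # spectral norm
```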
2.1. Two-grid methods
Consider solving the linear system
\[
Au=b,\tag{2.1}
\]
where $A\in\mathbb{R}^{n\times n}$ is SPD, $u\in\mathbb{R}^{n}$, and $b\in\mathbb{R}^{n}$. Given an initial guess $u^{(0)}\in\mathbb{R}^{n}$ and a nonsingular matrix $M\in\mathbb{R}^{n\times n}$, we perform the smoothing process
\[
u^{(k+1)}=u^{(k)}+M^{-1}\big(b-Au^{(k)}\big),\quad k=0,1,2,\ldots,\tag{2.2}
\]
where $M$ is called a smoother and $b-Au^{(k)}$ is the residual at the $k$th iteration. Let $e^{(k)}:=u-u^{(k)}$. From (2.2), we have
\[
e^{(k+1)}=\big(I-M^{-1}A\big)e^{(k)}.\tag{2.3}
\]
If $\rho(I-M^{-1}A)<1$, we deduce from (2.3) that, for any initial error $e^{(0)}$, the error vector $e^{(k)}$ tends to zero as $k\to\infty$. Since
\[
\|I-M^{-1}A\|_{A}^{2}=\rho\big((I-M^{-T}A)(I-M^{-1}A)\big)=\rho\big(I-M^{-1}(M^{T}+M-A)M^{-T}A\big),
\]
a sufficient and necessary condition for the iteration (2.2) to be $A$-convergent (i.e., $\|I-M^{-1}A\|_{A}<1$) is that $M^{T}+M-A$ is SPD.
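The smoothing property described above can be made concrete. The sketch below (an assumed 1D Poisson problem with the common weighted Jacobi choice $\omega=2/3$, not an example from the paper) shows that one sweep strongly damps a high-frequency error mode while barely reducing a low-frequency one, and checks the $A$-convergence condition that $M^{T}+M-A$ is SPD.

```python
import numpy as np

n = 31
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # 1D Poisson (SPD)

omega = 2.0 / 3.0
M = (2.0 / omega) * np.eye(n)           # weighted Jacobi: M = D/omega with D = 2I
E = np.eye(n) - np.linalg.solve(M, A)   # error propagation matrix I - M^{-1}A

i = np.arange(1, n + 1)
e_low = np.sin(1 * np.pi * i / (n + 1))   # smooth (low-frequency) error mode
e_high = np.sin(n * np.pi * i / (n + 1))  # oscillatory (high-frequency) error mode

ratio_low = np.linalg.norm(E @ e_low) / np.linalg.norm(e_low)
ratio_high = np.linalg.norm(E @ e_high) / np.linalg.norm(e_high)

# A-convergence requires M^T + M - A to be SPD
min_eig = np.linalg.eigvalsh(M.T + M - A).min()
```

One sweep leaves the smooth mode almost untouched while reducing the oscillatory mode by roughly a factor of three, which is exactly the behavior the coarse-grid correction is designed to complement.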
In view of the $A$-convergent smoother $M$, we define
\[
\overline{M}:=M\big(M^{T}+M-A\big)^{-1}M^{T},\tag{2.4}
\]
which is often referred to as a symmetrized smoother. It is easy to check that
\[
I-\overline{M}^{-1}A=\big(I-M^{-T}A\big)\big(I-M^{-1}A\big).\tag{2.5}
\]
Interchanging the roles of $M$ and $M^{T}$ in (2.4) yields another symmetrized smoother
\[
\widetilde{M}:=M^{T}\big(M^{T}+M-A\big)^{-1}M.\tag{2.6}
\]
The following lemma provides two useful relations between the symmetrized smoothers $\overline{M}$ and $\widetilde{M}$ (see [18, Lemma 1]).
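The defining relations of the two symmetrized smoothers are easy to verify numerically. The sketch below assumes a Gauss–Seidel smoother $M=\operatorname{tril}(A)$ on a small 1D Poisson matrix (illustrative choices, not the paper's) and checks $I-\overline{M}^{-1}A=(I-M^{-T}A)(I-M^{-1}A)$ together with its counterpart $I-\widetilde{M}^{-1}A=(I-M^{-1}A)(I-M^{-T}A)$, where $\overline{M}=M(M^{T}+M-A)^{-1}M^{T}$ and $\widetilde{M}=M^{T}(M^{T}+M-A)^{-1}M$.

```python
import numpy as np

n = 8
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # SPD model matrix
M = np.tril(A)                                        # forward Gauss-Seidel smoother
I = np.eye(n)

S = M.T + M - A                      # for Gauss-Seidel this equals diag(A) (SPD)
Mbar = M @ np.linalg.solve(S, M.T)   # \overline{M} = M (M^T + M - A)^{-1} M^T
Mtil = M.T @ np.linalg.solve(S, M)   # \widetilde{M} = M^T (M^T + M - A)^{-1} M

lhs_bar = I - np.linalg.solve(Mbar, A)
rhs_bar = (I - np.linalg.solve(M.T, A)) @ (I - np.linalg.solve(M, A))

lhs_til = I - np.linalg.solve(Mtil, A)
rhs_til = (I - np.linalg.solve(M, A)) @ (I - np.linalg.solve(M.T, A))
```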
Let $P\in\mathbb{R}^{n\times n_{\rm c}}$ be a prolongation (or interpolation) matrix with full column rank, where $n_{\rm c}$ ($n_{\rm c}<n$) is the number of coarse variables. Let $P^{T}$ be the corresponding restriction matrix. The Galerkin coarse-grid matrix is then denoted by $A_{\rm c}:=P^{T}AP$. For a given initial guess $u^{(0)}$, the standard two-grid scheme (i.e., the presmoothing and postsmoothing steps are performed in a symmetric way) for solving (2.1) can be described as Algorithm 1. If the coarse-grid matrix in Algorithm 1 is chosen as $A_{\rm c}$, then Algorithm 1 is called an exact two-grid method; otherwise, it is called an inexact two-grid method.
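A minimal NumPy sketch of one such symmetric cycle (presmoothing with $M$, a coarse solve, postsmoothing with $M^{T}$) is given below. The model problem, smoother, and function names are illustrative assumptions, not the paper's; the coarse matrix is passed as an argument so the same routine covers both the exact and the inexact variants.

```python
import numpy as np

def two_grid_cycle(A, M, P, Bc, b, u):
    """One symmetric two-grid iteration for A u = b with coarse matrix Bc."""
    u = u + np.linalg.solve(M, b - A @ u)    # presmoothing with M
    rc = P.T @ (b - A @ u)                   # restrict the residual
    u = u + P @ np.linalg.solve(Bc, rc)      # coarse-grid correction
    u = u + np.linalg.solve(M.T, b - A @ u)  # postsmoothing with M^T
    return u

n, nc = 15, 7
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # 1D Poisson (SPD)
M = np.tril(A)                                        # Gauss-Seidel smoother
P = np.zeros((n, nc))
for j in range(nc):                                   # linear interpolation
    P[2 * j + 1, j] = 1.0
    P[2 * j, j] = 0.5
    P[2 * j + 2, j] = 0.5
Ac = P.T @ A @ P                                      # exact (Galerkin) coarse matrix

rng = np.random.default_rng(0)
u_exact = rng.standard_normal(n)
b = A @ u_exact
u = np.zeros(n)
for _ in range(20):
    u = two_grid_cycle(A, M, P, Ac, b, u)
err = np.linalg.norm(u - u_exact)
```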
2.2. Convergence of exact two-grid methods
For the special case where the coarse problem is solved exactly (i.e., the coarse-grid matrix is $A_{\rm c}$), we denote the iteration matrix by
\[
E_{\rm TG}:=\big(I-M^{-T}A\big)\big(I-PA_{\rm c}^{-1}P^{T}A\big)\big(I-M^{-1}A\big),
\]
which can be written as
\[
E_{\rm TG}=I-B_{\rm TG}^{-1}A\quad\text{with}\quad B_{\rm TG}^{-1}=\overline{M}^{-1}+\big(I-M^{-T}A\big)PA_{\rm c}^{-1}P^{T}\big(I-AM^{-1}\big).
\]
It is easy to see that $B_{\rm TG}$ is an SPD matrix, which is called the exact two-grid preconditioner.
Let $\widetilde{M}$ be defined by (2.6), and define
\[
K_{\rm TG}:=\max_{v\in\mathbb{R}^{n}\setminus\{0\}}\frac{\big\|\big(I-\Pi_{\widetilde{M}}\big)v\big\|_{\widetilde{M}}^{2}}{\|v\|_{A}^{2}},
\]
where $\Pi_{\widetilde{M}}:=P\big(P^{T}\widetilde{M}P\big)^{-1}P^{T}\widetilde{M}$ is the $\widetilde{M}$-orthogonal projection onto $\operatorname{range}(P)$. The convergence factor of exact two-grid methods can then be characterized as
\[
\|E_{\rm TG}\|_{A}=1-\frac{1}{K_{\rm TG}}.
\]
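The exact two-grid identity $\|E_{\rm TG}\|_{A}=1-1/K_{\rm TG}$, with $K_{\rm TG}$ built from the symmetrized smoother $\widetilde{M}=M^{T}(M^{T}+M-A)^{-1}M$ as in the standard algebraic two-grid theory, can be checked numerically. The sketch below assumes a 1D Poisson problem, a Gauss–Seidel smoother, and linear interpolation (illustrative choices, not the paper's data), and evaluates $K_{\rm TG}$ as the largest eigenvalue of the associated generalized eigenproblem.

```python
import numpy as np

n, nc = 15, 7
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # 1D Poisson (SPD)
M = np.tril(A)                                        # Gauss-Seidel smoother
I = np.eye(n)
P = np.zeros((n, nc))
for j in range(nc):                                   # linear interpolation
    P[2 * j + 1, j] = 1.0
    P[2 * j, j] = 0.5
    P[2 * j + 2, j] = 0.5

# Exact two-grid iteration matrix E_TG = (I - M^{-T}A)(I - Pi_A)(I - M^{-1}A)
Pi_A = P @ np.linalg.solve(P.T @ A @ P, P.T @ A)
E = (I - np.linalg.solve(M.T, A)) @ (I - Pi_A) @ (I - np.linalg.solve(M, A))

# Energy norm ||E||_A = ||A^{1/2} E A^{-1/2}||_2
w, Q = np.linalg.eigh(A)
A_half = Q @ np.diag(np.sqrt(w)) @ Q.T
norm_E = np.linalg.norm(A_half @ E @ np.linalg.inv(A_half), 2)

# K_TG = max_v ||(I - Pi_Mtil) v||_Mtil^2 / ||v||_A^2 = lambda_max(A^{-1} N)
Mtil = M.T @ np.linalg.solve(M.T + M - A, M)          # \widetilde{M}
Pi_Mtil = P @ np.linalg.solve(P.T @ Mtil @ P, P.T @ Mtil)
N = (I - Pi_Mtil).T @ Mtil @ (I - Pi_Mtil)
K = np.linalg.eigvals(np.linalg.solve(A, N)).real.max()
```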
The matrix $\Pi_{A}:=P\big(P^{T}AP\big)^{-1}P^{T}A$ defined by (2.17) is an $A$-orthogonal projection (i.e., it is orthogonal with respect to the inner product $(\cdot,\cdot)_{A}$) onto the coarse space $\operatorname{range}(P)$. Similarly, we define a useful $\widetilde{M}$-orthogonal projection onto $\operatorname{range}(P)$:
\[
\Pi_{\widetilde{M}}:=P\big(P^{T}\widetilde{M}P\big)^{-1}P^{T}\widetilde{M}.
\]
For a fixed smoother (e.g., the weighted Jacobi or Gauss–Seidel smoother), an optimal interpolation can be obtained by minimizing $\|E_{\rm TG}\|_{A}$ over all full-column-rank interpolations $P\in\mathbb{R}^{n\times n_{\rm c}}$. Unfortunately, the optimal interpolation is typically expensive to compute, because it requires explicit knowledge of the eigenvectors corresponding to small eigenvalues of the underlying eigenvalue problem; see [30, 6] for details.
To maintain two-grid convergence and design an interpolation with simple structure, one can instead minimize an upper bound of $\|E_{\rm TG}\|_{A}$. Let $P=\begin{pmatrix}W\\ I_{n_{\rm c}}\end{pmatrix}$, where $W\in\mathbb{R}^{(n-n_{\rm c})\times n_{\rm c}}$ and the variables are ordered so that the coarse ones come last. Obviously, $P\,(0\;\;I_{n_{\rm c}})$ is a projection onto $\operatorname{range}(P)$. Then
\[
\big\|\big(I-\Pi_{\widetilde{M}}\big)v\big\|_{\widetilde{M}}\leq\big\|\big(I-P\,(0\;\;I_{n_{\rm c}})\big)v\big\|_{\widetilde{M}}\quad\forall\,v\in\mathbb{R}^{n},
\]
which, together with (2.18), yields an upper bound for $\|E_{\rm TG}\|_{A}$.
By minimizing this upper bound over all interpolations, one can obtain the so-called ideal interpolation [9, 33], which provides new insights for designing an interpolation with sparse or simple structure (see, e.g., [15, 16, 33, 14]). The ideal interpolation can be viewed as a “relaxation” of the optimal one, and a quantitative relation between the ideal and optimal interpolations is available in the literature.
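As a concrete instance, with the F/C-splitting form standard in the algebraic multigrid literature (coarse variables ordered last, an assumption of this sketch rather than a statement from the paper), the ideal interpolation is $P_{\rm ideal}=\begin{pmatrix}-A_{ff}^{-1}A_{fc}\\ I\end{pmatrix}$, and its Galerkin coarse matrix is exactly the Schur complement:

```python
import numpy as np

n, nc = 10, 4
rng = np.random.default_rng(1)
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)          # a generic SPD matrix
nf = n - nc                          # fine (F) variables first, coarse (C) last
Aff, Afc = A[:nf, :nf], A[:nf, nf:]
Acf, Acc = A[nf:, :nf], A[nf:, nf:]

# Ideal interpolation: P_ideal = [ -Aff^{-1} Afc ; I ]
P_ideal = np.vstack([-np.linalg.solve(Aff, Afc), np.eye(nc)])

# Its Galerkin coarse matrix equals the Schur complement Acc - Acf Aff^{-1} Afc
Ac_ideal = P_ideal.T @ A @ P_ideal
schur = Acc - Acf @ np.linalg.solve(Aff, Afc)
```

This Schur-complement property is what connects the ideal interpolation to the minimization property of the Schur complement mentioned in the introduction.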
3. Convergence analysis
In this section, we establish the convergence theory of inexact two-grid methods. More specifically, two-sided bounds for the energy norm of the iteration matrix are derived under different approximation conditions.
3.1. Convergence estimate of the first kind
The first estimate (see (3.3) below) is a general convergence result, which requires no condition on the coarse-grid matrix other than positive definiteness.
We first prove an important lemma, which gives two relations between the extreme eigenvalues of the matrices involved in the analysis.
The expression (2.14) implies that is a symmetric positive semidefinite (SPSD) matrix and
Since , the matrix is also SPSD and
it follows that
which, together with (2.13), yields the following convergence estimate.
The convergence factor of Algorithm 1 satisfies that
The only assumption on the coarse-grid matrix is its positive definiteness. Hence, the estimate (3.3) is valid for any SPD coarse-grid matrix. Nevertheless, to design a rapidly convergent two-grid method, we are more interested in the situation where the coarse-grid matrix is a suitable approximation to the Galerkin matrix $A_{\rm c}$. In what follows, we focus on the convergence analysis of Algorithm 1 under general approximation conditions. These conditions arise from measuring the difference between the coarse-grid matrix and $A_{\rm c}$.
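The effect of approximating $A_{\rm c}$ can be observed directly. The sketch below (an assumed 1D Poisson example with a Gauss–Seidel smoother, not from the paper) compares the energy-norm convergence factor of the symmetric two-grid iteration using the exact Galerkin coarse matrix against an SPD perturbation of it; the perturbed variant still converges, but no faster than the exact one.

```python
import numpy as np

n, nc = 15, 7
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # 1D Poisson (SPD)
M = np.tril(A)                                        # Gauss-Seidel smoother
I = np.eye(n)
P = np.zeros((n, nc))
for j in range(nc):                                   # linear interpolation
    P[2 * j + 1, j] = 1.0
    P[2 * j, j] = 0.5
    P[2 * j + 2, j] = 0.5
Ac = P.T @ A @ P

w, Q = np.linalg.eigh(A)
A_half = Q @ np.diag(np.sqrt(w)) @ Q.T
A_half_inv = np.linalg.inv(A_half)

def tg_norm(Bc):
    """Energy norm of the symmetric two-grid iteration matrix with coarse matrix Bc."""
    E = ((I - np.linalg.solve(M.T, A))
         @ (I - P @ np.linalg.solve(Bc, P.T @ A))
         @ (I - np.linalg.solve(M, A)))
    return np.linalg.norm(A_half @ E @ A_half_inv, 2)

rho_exact = tg_norm(Ac)                        # exact two-grid method
rho_inexact = tg_norm(Ac + 0.1 * np.eye(nc))   # inexact SPD coarse matrix
```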
3.2. Convergence estimate of the second kind
In light of (2.12), we can derive the following explicit expression for the inexact two-grid preconditioner.
The inexact two-grid preconditioner can be expressed as
In particular, if , we get from (3.4) that
from which one can easily see that is SPSD.
The following lemma provides some useful eigenvalue identities, which play an important role in the subsequent convergence analysis.
The extreme eigenvalues of and have the following properties:
Since is an SPSD matrix and is an -orthogonal projection, we have
Similarly, we have
there exists a nonsingular matrix such that
Let be partitioned into the block form
where , , , and . Straightforward computations yield
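The existence of such a congruence transformation is the standard simultaneous diagonalization of a symmetric matrix pair in which one member is SPD. A generic NumPy sketch of the construction (with assumed matrices, not the paper's specific ones) is:

```python
import numpy as np

n = 6
rng = np.random.default_rng(2)
B0 = rng.standard_normal((n, n))
X = B0 @ B0.T + n * np.eye(n)   # an SPD matrix
C0 = rng.standard_normal((n, n))
Y = C0 + C0.T                   # a symmetric matrix

# Whiten X: with X = Q diag(w) Q^T, the matrix X^{-1/2} satisfies X^{-1/2} X X^{-1/2} = I
w, Q = np.linalg.eigh(X)
X_inv_half = Q @ np.diag(1.0 / np.sqrt(w)) @ Q.T

# Diagonalize the symmetric matrix X^{-1/2} Y X^{-1/2}
mu, U = np.linalg.eigh(X_inv_half @ Y @ X_inv_half)

S = X_inv_half @ U              # nonsingular congruence transformation
D1 = S.T @ X @ S                # = identity
D2 = S.T @ Y @ S                # = diag(mu)
```

The eigenvalues `mu` are those of the generalized eigenproblem $Yv=\mu Xv$, which is how such congruence arguments reduce two-matrix estimates to scalar eigenvalue comparisons.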
We are now in a position to present the convergence estimate of the second kind, which is based on characterizing the difference between the coarse-grid matrix and the Galerkin matrix $A_{\rm c}$.
Let and . If the coarse-grid matrix satisfies
Part I: From (3.9), we deduce that and are SPSD matrices. Hence,
An application of (3.3) yields
Part II: The relation (2.13) can be rewritten as
In order to establish two-sided bounds for , we need to estimate the extreme eigenvalues and . By (3.4), we have
which leads to
where we have used the relation (2.7).
(i) The positive semidefiniteness of implies that
is SPSD. Since is also SPSD, the matrix
is SPSD. This leads to, for any ,
In particular, we have