As highlighted in a recent SIAM News article, there is growing interest in the use of variable precision floating point arithmetic in numerical algorithms. In this paper, we describe how variable precision arithmetic can be exploited in the iterative solver GMRES. We show that the precision of some floating point operations carried out in the algorithm can be reduced as the iterations proceed, without affecting the convergence rate or final accuracy achieved by the iterates.
There is already a literature on the use of inexact matrix-vector products in GMRES and other Krylov subspace methods; see, e.g., [19, 6, 3, 7, 8] and the references therein. This work is not a simple extension of such results. To illustrate, when performing inexact matrix-vector products in GMRES, one obtains an inexact Arnoldi relation
\[
A V_k + E_k = V_{k+1} \underline{H}_k, \tag{1}
\]
where $E_k$ accounts for the errors in the matrix-vector products.
On the other hand, if only inexact inner products are performed, the Arnoldi relation continues to hold exactly, but the orthogonality of the Arnoldi vectors is lost:
\[
A \widetilde{V}_k = \widetilde{V}_{k+1} \underline{\widetilde{H}}_k, \qquad \widetilde{V}_{k+1}^T \widetilde{V}_{k+1} \ne I. \tag{2}
\]
Thus, to understand the convergence behaviour and maximum attainable accuracy of GMRES implemented in variable precision arithmetic, it is essential to understand the resulting loss of orthogonality of the Arnoldi vectors. We adapt techniques used in the rounding-error analysis of the Modified Gram-Schmidt (MGS) algorithm (see [1, 2] or  for a more recent survey) and of the MGS-GMRES algorithm (see [5, 9, 14]). We also introduce some new analysis techniques. For example, we show that (2) is equivalent to an exact Arnoldi relation in a non-standard inner product, and we analyze the convergence of GMRES with variable precision arithmetic in terms of exact GMRES in this inner product. For more results on GMRES in non-standard inner products, see, e.g., [10, 18] and the references therein.
We focus on inexact inner products and matrix-vector products (as opposed to the remaining operations in the algorithm, such as saxpys) because these are the two most time-consuming operations in parallel computations. The rest of the paper is organized as follows. We start with a brief discussion of GMRES in non-standard inner products in Section 2. Next, in Section 3, we analyze GMRES with inexact inner products. We then show how inexact matrix-vector products can be incorporated into this setting in Section 4. Some numerical illustrations are presented in Section 5.
2 GMRES in weighted inner products
Shown below is the Arnoldi algorithm, with $\langle \cdot, \cdot \rangle$ denoting the standard Euclidean inner product.
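For reference, the Arnoldi process with modified Gram-Schmidt can be sketched in a few lines of NumPy. This is an illustration only: the function name `arnoldi_mgs` and the dense-matrix setting are ours, and the line numbers referenced later in the paper refer to the paper's Algorithm 1, not to this sketch.

```python
import numpy as np

def arnoldi_mgs(A, b, k):
    """Arnoldi process with modified Gram-Schmidt (MGS).

    Returns V (n x (k+1)) with orthonormal columns spanning the Krylov
    subspace K_{k+1}(A, b), and the (k+1) x k upper-Hessenberg matrix H
    satisfying A @ V[:, :k] = V @ H.
    """
    n = len(b)
    V = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(k):
        w = A @ V[:, j]
        for i in range(j + 1):           # MGS orthogonalization loop
            H[i, j] = V[:, i] @ w        # standard Euclidean inner product
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)  # normalization coefficient
        V[:, j + 1] = w / H[j + 1, j]
    return V, H
```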
After $k$ steps of the algorithm are performed in exact arithmetic, the output is $V_{k+1} = [v_1, \dots, v_{k+1}]$ and upper-Hessenberg $\underline{H}_k \in \mathbb{R}^{(k+1) \times k}$ such that
\[
A V_k = V_{k+1} \underline{H}_k.
\]
The columns of $V_{k+1}$ form an orthonormal basis for the Krylov subspace $\mathcal{K}_{k+1}(A, b)$. In GMRES, we restrict the iterate to this subspace: $x_k = V_k y_k$, where $y_k$ is the solution of
\[
\min_y \| b - A V_k y \|_2.
\]
It follows that
\[
\| r_k \|_2 = \min_y \| \beta e_1 - \underline{H}_k y \|_2,
\]
where $r_k = b - A x_k$ and $\beta = \| b \|_2$.
Any given symmetric positive definite matrix $W$ defines a weighted inner product $\langle x, y \rangle_W = y^T W x$ and associated norm $\| x \|_W = \langle x, x \rangle_W^{1/2}$. Suppose we use this inner product instead of the standard Euclidean inner product in the Arnoldi algorithm. We use tildes to denote the resulting computed quantities. After $k$ steps, the result is $\widetilde{V}_{k+1}$ and upper-Hessenberg $\underline{\widetilde{H}}_k$ such that
\[
A \widetilde{V}_k = \widetilde{V}_{k+1} \underline{\widetilde{H}}_k.
\]
The columns of $\widetilde{V}_{k+1}$ form a $W$-orthonormal basis for $\mathcal{K}_{k+1}(A, b)$. Let $\tilde{x}_k = \widetilde{V}_k \tilde{y}_k$, where $\tilde{y}_k$ is the solution of
\[
\min_y \| b - A \widetilde{V}_k y \|_W.
\]
We denote the above algorithm $W$-GMRES.
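Writing $W$ for the symmetric positive definite matrix defining the inner product, the algorithm above can be realized by running the same MGS loop with every inner product and norm taken in the weighted sense. The sketch below is a minimal dense-matrix illustration (the function name `w_gmres` is ours); because the basis is $W$-orthonormal, the small least-squares subproblem is identical to the one in ordinary GMRES.

```python
import numpy as np

def w_gmres(A, b, W, k):
    """Sketch of W-GMRES: GMRES run entirely in the inner product
    <x, y>_W = y.T @ W @ x defined by an SPD matrix W. Returns the
    iterate x_k minimizing the W-norm of the residual over K_k(A, b)."""
    n = len(b)
    wip = lambda x, y: y @ (W @ x)            # W-inner product
    wnorm = lambda x: np.sqrt(wip(x, x))      # W-norm
    V = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    beta = wnorm(b)
    V[:, 0] = b / beta
    for j in range(k):
        w = A @ V[:, j]
        for i in range(j + 1):                # MGS in the W-inner product
            H[i, j] = wip(w, V[:, i])
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = wnorm(w)
        V[:, j + 1] = w / H[j + 1, j]
    # W-orthonormality of the basis reduces the subproblem to a small
    # standard least-squares problem, exactly as in ordinary GMRES:
    e1 = np.zeros(k + 1)
    e1[0] = beta
    y, *_ = np.linalg.lstsq(H, e1, rcond=None)
    return V[:, :k] @ y
```

With $W = I$ this reduces to standard MGS-GMRES.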
The following lemma shows that if the condition number $\kappa(W)$ is small, the Euclidean norm of the residual vector in $W$-GMRES converges at essentially the same rate as in standard GMRES. The result is known; see, e.g., . We include a proof for completeness.
Let $x_k$ and $\tilde{x}_k$ denote the iterates computed by standard GMRES and $W$-GMRES, respectively, with corresponding residual vectors $r_k = b - A x_k$ and $\tilde{r}_k = b - A \tilde{x}_k$. Then
\[
\| \tilde{r}_k \|_2 \le \kappa(W)^{1/2} \, \| r_k \|_2. \tag{4}
\]
Both $x_k$ and $\tilde{x}_k$ lie in the same Krylov subspace, $\mathcal{K}_k(A, b)$. Because $x_k$ is chosen to minimize the Euclidean norm of the residual over $\mathcal{K}_k(A, b)$, while $\tilde{x}_k$ minimizes the $W$-norm of the residual over $\mathcal{K}_k(A, b)$,
\[
\| \tilde{r}_k \|_W \le \| r_k \|_W.
\]
Additionally, because for any vector $v$
\[
\lambda_{\min}(W) \| v \|_2^2 \le \| v \|_W^2 \le \lambda_{\max}(W) \| v \|_2^2,
\]
we have
\[
\| \tilde{r}_k \|_2 \le \lambda_{\min}(W)^{-1/2} \| \tilde{r}_k \|_W \le \lambda_{\min}(W)^{-1/2} \| r_k \|_W \le \left( \frac{\lambda_{\max}(W)}{\lambda_{\min}(W)} \right)^{1/2} \| r_k \|_2,
\]
from which (4) follows. ∎
3 GMRES with inexact inner products
3.1 Recovering orthogonality
We will show that the standard GMRES algorithm implemented with inexact inner products is equivalent to $M$-GMRES implemented exactly, for some well-conditioned symmetric positive definite matrix $M$. To this end, we need the following theorem.
Consider a given matrix $V \in \mathbb{R}^{n \times k}$ of rank $k$ such that
\[
\| I - V^T V \|_2 = s.
\]
If $s < 1$, then there exists a matrix $M$ such that $M$ is symmetric positive definite and
\[
V^T M V = I. \tag{6}
\]
In other words, the columns of $V$ are exactly orthonormal in an inner product defined by $M$. Furthermore,
\[
\kappa(M) \le \frac{1+s}{1-s}. \tag{7}
\]
Equation (6) is equivalent to the linear matrix equation
It is straightforward to verify that one matrix satisfying this equation is
which implies that the matrix is positive definite. From the above and (8), provided ,
from which (7) follows. ∎
Note that $\kappa(M)$ remains small even for values of $s$ close to $1$. For example, suppose $s = 0.99$, indicating an extremely severe loss of orthogonality. Then $\kappa(M) \le 199$, so $V$ still has exactly orthonormal columns in an inner product defined by a very well-conditioned matrix.
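The theorem's existence claim is easy to check numerically with one concrete construction (a standard choice, not necessarily the paper's explicit formula): with $G = V^T V$, the matrix $M = V G^{-2} V^T + (I - V G^{-1} V^T)$ is symmetric positive definite, satisfies $V^T M V = I$, and its condition number obeys the stated $(1+s)/(1-s)$ bound.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 8

# Build V with deliberately degraded orthogonality, s = ||I - V.T V|| < 1.
Q, _ = np.linalg.qr(rng.standard_normal((n, k)))
V = Q + 0.05 * rng.standard_normal((n, k))     # perturb an orthonormal basis

G = V.T @ V
s = np.linalg.norm(np.eye(k) - G, 2)
assert s < 1                                   # hypothesis of the theorem

# One matrix M with V.T @ M @ V = I (a standard construction, used here
# only for illustration):
Ginv = np.linalg.inv(G)
M = V @ Ginv @ Ginv @ V.T + (np.eye(n) - V @ Ginv @ V.T)
```

Numerically, `V.T @ M @ V` agrees with the identity to machine precision, and the eigenvalues of `M` lie in $[1/(1+s),\, 1/(1-s)] \cup \{1\}$.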
Paige and his coauthors [2, 13, 17] have developed an alternative measure of the loss of orthogonality. Given $V$ with normalized columns, the measure is $\| S \|_2$, where $S = (I + U)^{-1} U$ and $U$ is the strictly upper-triangular part of $V^T V$. Additionally, orthogonality can be recovered by augmentation: the matrix
\[
\begin{bmatrix} S \\ V (I - S) \end{bmatrix}
\]
has orthonormal columns. This measure was used in the groundbreaking rounding error analysis of the MGS-GMRES algorithm . In the present paper, under the condition $\| I - V^T V \|_2 < 1$, we use the measure $\| I - V^T V \|_2$ and recover orthogonality in the $M$-inner product. For future reference, Paige's approach is likely to be the most appropriate for analyzing the Lanczos and conjugate gradient algorithms, in which orthogonality is quickly lost, long before convergence.
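Paige's augmentation identity can be verified numerically: the sketch below builds a deliberately non-orthogonal matrix with exactly normalized columns and checks that the augmented matrix has orthonormal columns.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 40, 6

# V with exactly normalized but non-orthogonal columns
V = rng.standard_normal((n, k))
V /= np.linalg.norm(V, axis=0)

U = np.triu(V.T @ V, 1)                  # strictly upper part of V.T @ V
S = np.linalg.solve(np.eye(k) + U, U)    # Paige's measure: S = (I + U)^{-1} U

# Augmentation recovers exact orthonormality: [S; V (I - S)] has
# orthonormal columns whenever the columns of V are exactly normalized.
Q = np.vstack([S, V @ (np.eye(k) - S)])
```

The identity follows because, with exactly normalized columns, $V^T V = I + U + U^T$ and $I - S = (I + U)^{-1}$.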
3.2 Bounding the loss of orthogonality
Suppose the inner products in the Arnoldi algorithm are computed inexactly, i.e., line 6 in Algorithm 1 is replaced by
with bounded by some tolerance. We use tildes to denote the resulting computed quantities. It is straightforward to show that despite the inexact inner products in (10), the relation continues to hold exactly (under the assumption that all other operations besides the inner products are performed exactly). On the other hand, the orthogonality of the Arnoldi vectors is lost. We have
The relation between the individual errors and the overall loss of orthogonality is difficult to untangle. To simplify the analysis, we suppose that each Arnoldi vector is normalized exactly. (This is not an uncommon assumption; see, e.g.,  and .) Under this simplification, we have
i.e., contains the strictly upper-triangular part of . Define
Following Björck’s seminal rounding error analysis of MGS, it can be shown that
For completeness, a proof of (14) is provided in the appendix. Note that if GMRES has not terminated by step , i.e., for , then  must be invertible. Using (14), the following theorem shows how the convergence of GMRES with inexact inner products relates to that of exact GMRES. The idea is similar to [14, Section 5], in which the quantity  must be bounded, where  is a matrix containing rounding errors.
The Arnoldi algorithm implemented with inexact inner products has computed an -orthonormal basis for . The iterate is the same as the iterate that would have been obtained by running -GMRES exactly. The result follows from Lemma 1. ∎
3.3 A strategy for bounding the
The challenge in applying Theorem 2 is bounding the tolerances at step to ensure that (15) holds for all subsequent iterations . Theorem 3 below leads to a practical strategy for bounding the . We will use
to denote the residual computed in the GMRES subproblem at step . We will use the known fact that for ,
This follows from
Additionally, in order to understand how increases as the residual norm decreases, we will need the following rather technical lemma, which is essentially a special case of [15, Theorem 4.1]. We defer its proof to the appendix.
Let and be the least squares solution and residual vector of
Given , let be any nonsingular matrix such that
for any and any positive numbers such that , then at step either (16) holds with , or
implying that GMRES has converged to a relative residual of .
where is an upper-triangular matrix containing only ones in its upper-triangular part, so that , and . Then in (15),
Let denote the first row of , so that . For any we have
Therefore, in (23),
Theorem 3 can be interpreted as follows. If at all steps of GMRES the inner products are computed inaccurately with tolerances in (21), then convergence at the same rate as exact GMRES is achieved until a relative residual of essentially  is reached. Notice that  is inversely proportional to the residual norm. This allows the inner products to be computed more and more inaccurately as the iterations proceed.
If no more than iterations are to be performed, we can let (although more elaborate choices for could be considered; see for example ). Then the factor in (21) can be absorbed along with the in (22).
One important difficulty with (21) is that  is required to pick  at the start of step , but is not available until the final step . A similar problem occurs in GMRES with inexact matrix-vector products; see [19, 6] and the comments in Section 4. In our experience, it is often possible to replace  in (21) by , without significantly affecting the convergence of GMRES. This leads to the following:
This prevents potential early stagnation of the residual norm, but is often unnecessarily stringent. (It goes without saying that if the conservative threshold is less than , where is the machine precision, then the criterion is vacuous: according to this criterion no inexact inner products can be carried out at iteration .) Numerical examples are given in Section 5.
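The general shape of this strategy can be sketched as follows. This is a hypothetical helper, not the paper's exact criteria: the constants of the aggressive criterion (24) and the conservative criterion (25) are not reproduced, and `conservative_factor` merely stands in for the stricter constant of the conservative variant.

```python
import numpy as np

EPS_MACHINE = np.finfo(np.float64).eps

def inner_product_tolerance(eta, res_norm_prev, conservative_factor=None):
    """Sketch of the relaxation strategy: the tolerance for the inexact
    inner products at step k grows as the residual shrinks,
    eps_k ~ eta / ||r_{k-1}||.  Returns None when the criterion is
    vacuous (threshold below machine precision), i.e., no inexact inner
    products may be carried out at this step."""
    eps_k = eta / res_norm_prev              # inversely proportional to ||r||
    if conservative_factor is not None:
        eps_k *= conservative_factor         # more stringent threshold
    return None if eps_k < EPS_MACHINE else eps_k
```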
4 Incorporating inexact matrix-vector products
As mentioned in the introduction, there is already a literature on the use of inexact matrix-vector products in GMRES. These results are obtained by assuming that the Arnoldi vectors are orthonormal and analyzing the inexact Arnoldi relation
The purpose of this section is to show that the framework used in  and  to analyze inexact matrix-vector products in GMRES is still valid when the orthogonality of the Arnoldi vectors is lost, i.e., under the inexact Arnoldi relation
This settles a question left open in [19, Section 6].
4.1 Bounding the residual gap
As in previous sections, we use to denote the computed GMRES iterate, with for the actual residual vector and for the residual vector updated in the GMRES iterations. From
From the fact that the columns of are orthonormal as well as (9), we obtain
Suppose the matrix-vector products in the Arnoldi algorithm are computed inexactly, i.e., line 4 in Algorithm 1 is replaced by
where for some given tolerance . Then in (26),
The following theorem bounds the residual gap at step in terms of the tolerances and , for .
4.2 A strategy for picking the
Theorem 4 suggests the following strategy for picking the tolerances that bound the level of inexactness in the matrix-vector products in (29). Similarly to Theorem 3, let be any positive numbers such that . If for all steps ,
Interestingly, this result is independent of . Similarly to (21), the criterion for picking  at step  involves a quantity that is only available at the final step . A large number of numerical experiments [6, 3] indicate that it can often be replaced by . Absorbing the factor  into  in (32) and replacing  by  or by  leads, respectively, to the same aggressive and conservative thresholds for  as we obtained for  in (24) and in (25). This suggests that matrix-vector products and inner products in GMRES can be computed with the same level of inexactness. We illustrate this with numerical examples in the next section.
5 Numerical examples
We illustrate our results with a few numerical examples.
We run GMRES with different matrices and right-hand sides, and compute the inner products and matrix-vector products inexactly as in (10) and (29). At each step, we pick the inner-product error randomly, uniformly distributed between prescribed bounds, and pick the matrix-vector-product perturbation to be a matrix of independent standard normal random variables, scaled to have the prescribed norm, for chosen tolerances. Throughout we use the same level of inexactness for inner products and matrix-vector products.
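The perturbations described above can be implemented as small wrappers around the exact operations. The helper names below are ours, and the exact distribution of the errors in the paper's experiments is not reproduced; this is a sketch of the general recipe: a relative perturbation of the inner product as in (10), and a random additive perturbation of the matrix-vector product, scaled to a prescribed norm, as in (29).

```python
import numpy as np

rng = np.random.default_rng(2)

def inexact_dot(x, y, eps):
    """Inner product perturbed by a relative error of size at most eps:
    result = (x . y) * (1 + xi) with |xi| <= eps."""
    exact = x @ y
    return exact * (1.0 + eps * rng.uniform(-1, 1))

def inexact_matvec(A, v, eps):
    """Matrix-vector product A v + f, where f is a random direction
    scaled so that ||f|| = eps * ||A v||."""
    w = A @ v
    f = rng.standard_normal(len(w))
    f *= eps * np.linalg.norm(w) / np.linalg.norm(f)
    return w + f
```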
In our first example,  is the Grcar matrix of order . This is a highly non-normal Toeplitz matrix. The right-hand side is . Results are shown in Figure 1. The solid green curve is the relative residual . For reference, the dashed blue curve is the relative residual when GMRES is run in double precision. The solid magenta curve shows the loss of orthogonality in (11). The black dotted curve is the chosen tolerance .
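For readers who wish to reproduce this experiment, the Grcar matrix is easy to construct directly (this is the standard definition; the order used in the paper's experiment is not filled in here):

```python
import numpy as np

def grcar(n, k=3):
    """Grcar matrix: -1 on the subdiagonal, +1 on the main diagonal and
    on the first k superdiagonals.  A classic highly non-normal Toeplitz
    test matrix."""
    A = -np.eye(n, k=-1)             # subdiagonal of -1's
    for d in range(k + 1):           # diagonal and k superdiagonals of +1's
        A += np.eye(n, k=d)
    return A
```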
In Example 1(a),
The large increase in the inexactness of the inner products at iterations and immediately leads to a large increase in . This clearly illustrates the connection between the inexactness of the inner products and the loss of orthogonality in the Arnoldi vectors. As proven in Theorem 2, until , the residual norm is the same as it would have been had all computations been performed in double precision. Due to its large increases at iterations and , approaches , and the residual norm starts to stagnate, long before a backward stable solution is obtained.
In Example 1(b), the tolerances are chosen according to the aggressive criterion (24) with . With this judicious choice, does not reach , and the residual norm does not stagnate, until a backward stable solution is obtained.
In our second example,  is the matrix  from the SuiteSparse matrix collection . This is a matrix with condition number . The right-hand side is once again .
Results are shown in Figure 2. In Example 2(a), the tolerances are chosen according to the aggressive threshold (24) with . In this more ill-conditioned problem, the residual norm starts to stagnate before a backward stable solution is obtained. In Example 2(b), the tolerances are chosen according to the conservative threshold (25) with , and such stagnation no longer occurs. Because of these lower tolerances, the inner products and matrix-vector products have to be performed in double precision until about iteration 200. This example illustrates the tradeoff between the level of inexactness and the maximum attainable accuracy: if one requires a backward stable solution, the more ill-conditioned the matrix, the less opportunity there is for performing floating-point operations inexactly in GMRES.
We have shown how inner products can be performed inexactly in MGS-GMRES without affecting the convergence or final achievable accuracy of the algorithm. We have also shown that a known framework for inexact matrix-vector products is still valid despite the loss of orthogonality in the Arnoldi vectors. It would be interesting to investigate the impact of scaling or preconditioning on these results. Additionally, in future work, we plan to address the question of how much computational savings can be achieved by this approach on massively parallel computer architectures.
Appendix A Proof of (14)
In line 7 of Algorithm 1, in the th pass of the inner loop at step , we have
for and with . Writing this equation for to , we have