The generalized singular value decomposition (GSVD) of a matrix pair was first introduced by van Loan van1976generalizing and then developed by Paige and Saunders paige1981towards . It has become a standard decomposition and an important computational tool golub2012matrix , and has been widely used in a wide range of contexts, e.g., solutions of discrete linear ill-posed problems hansen1998rank , weighted or generalized least squares problems bjorck1996numerical , information retrieval howland2003structure , linear discriminant analysis park2005relationship , and many others betcke2008generalized ; chu1987singular ; golub2012matrix ; kaagstrom1984generalized ; vanhuffel .
Let () and () be of full rank, i.e., . The GSVD of is
where is nonsingular, and are orthonormal, and positive numbers and satisfy , . Such is called a GSVD component of with the generalized singular value , the left generalized singular vectors and , and the right generalized singular vector , . Denote the generalized singular value matrix of by
Throughout this paper, we also refer the scalar pair as the generalized singular value of . Particularly, we will denote by and the largest and smallest generalized singular values of , respectively. Obviously, the generalized singular values of the pair are , the reciprocals of those of , and their generalized singular vectors are the same as those of .
For a prescribed target , assume that the generalized singular values of are labeled by
Specifically, if we are interested in the smallest generalized singular values of and/or the associated left and right generalized singular vectors, we assume in (1.3), so that the generalized singular values are labeled in increasing order; if we are interested in the largest generalized singular values of and/or the corresponding generalized singular vectors, we assume in (1.3), so that the generalized singular values are labeled in decreasing order. In both cases, the GSVD components are called extreme (smallest or largest) GSVD components of . Otherwise they are called interior GSVD components of if the given is inside the spectrum of the generalized singular values of . We will abbreviate any one of the desired GSVD components as or with the subscripts dropped.
For a large and possibly sparse matrix pair , one kind of approach to compute desired GSVD components works on the pair directly. Zha zha1996 proposes a joint bidiagonalization method to compute the extreme generalized singular values and the associated generalized singular vectors , which is a generalization of Lanczos bidiagonalization type methods jia2003implicitly ; jia2010 for computing a partial ordinary SVD of when . The main bottleneck of this method is that a large-scale least squares problem with the coefficient matrix must be solved at each step of the joint bidiagonalization. Jia and Yang jiayang2018 has made a further analysis on this method and its variant, and provided more theoretical supports for its rationale.
Another kind of commonly used approach formulates the GSVD as a generalized eigenvalue problem, solves it using an eigensolver parlett1998symmetric ; saad2011numerical ; stewart2001matrix , and recovers the desired GSVD components. There are two types of formulations. The first one is to apply an eigensolver to the cross product matrix pair to compute the corresponding eigenpairs and then recover the desired GSVD components from the computed eigenpairs zwaan2017generalized . However, because of the squaring of the generalized singular values of , for small, the eigenvalues of are much smaller. As a consequence, the smallest generalized singular values may be recovered much less accurately and even may have no accuracy jia2006 . Therefore, we shall not consider such a formulation in this paper. The second type of formulation, which we shall consider in this paper, transforms the GSVD into the generalized eigenvalue problem of the augmented definite matrix pair
or the augmented definite matrix pair
see, e.g., hochstenbach2009jacobi . One then applies an eigensolver to either of them, computes the corresponding generalized eigenpairs, and recovers the desired GSVD components from those computed generalized eigenpairs. As is easily verified, the nonzero eigenvalues of and are and , , respectively; in the next section we will give more details on close connections between the GSVD of and the generalized eigenpairs of and . Therefore, the extreme or interior generalized singular values of become the extreme or interior eigenvalues of and . In principle, we may use a number of projection methods, e.g., Lanczos type methods, to compute the extreme GSVD components via solving the generalized eigenvalue problem of or . For a unified account of projection algorithms, we refer to baiedit2000 . For the computation of interior GSVD components of , we may employ the Jacobi-Davidson type methods hochstenbach2009jacobi , referred as JDGSVD, where at each step a linear system, i.e., the correction equation, is solved iteratively and its approximate solution is used to expand the current searching subspaces. A JDGSVD method deals with the generalized eigenvalue problem of (1.4) or (1.5) and recovers the desired GSVD components from the converged generalized eigenpairs of the chosen augmented matrix pair.
As far as numerical computations are concerned, an important question arises naturally: which of the mathematically equivalent formulations (1.4) and (1.5) is numerically preferable, so that the desired GSVD components can be computed more accurately? In this paper, rather than proposing or developing any algorithm for computing the GSVD, we focus on this issue carefully and suggest a deterministic choice. We first make a sensitivity analysis on the eigenpairs of (1.4) and (1.5
). Based on the results to be obtained, we establish accuracy estimates for the approximate generalized singular values and the left and right generalized singular vectors that are recovered from the obtained approximate eigenpairs. Then by comparing the accuracy of the approximate GSVD components recovered from both approximate generalized eigenpairs, we make a correct choice between (1.4) and (1.5).
This paper is organized as follows. In Section 2 we make a sensitivity analysis on the generalized eigenvalue problems of the augmented matrix pairs and , respectively, and give error bounds for the generalized singular values
and the generalized eigenvectors ofand . In Section 3 we carry out a sensitivity analysis on the approximate generalized singular vectors that are recovered from the approximate eigenpairs of (1.4) and (1.5). Based on the results and analysis, we conclude that (1.5) is preferable to compute the GSVD more accurately when is well conditioned and is ill conditioned, and (1.4) is preferable when is ill conditioned and is well conditioned. In Section 4 we propose a few practical choice strategies on (1.4) and (1.5). In Section 5 we report the numerical experiments. We conclude the paper in Section 6.
Throughout this paper, denote by the 2-norm of a vector or matrix and the condition number of a matrix with and being the largest and smallest singular values of , respectively, and by the transpose of . Denote by
the identity matrix of order, by and the zero matrices of order and , respectively. The subscripts are omitted when there is no confusion. Also denote by the column space or range of . In practice, the matrices and are usually scaled to have (approximately) the same length in 2-norm. When making an analysis, we always assume that and are already normalized such that , which means that and .
2 Perturbation analysis of generalized eigenvalue problems and accuracy of generalized singular values
The generalized eigenvalue decompositions of the augmented matrix pairs and are closely related to the GSVD of in the following way, which is straightforward to verify.
Let the GSVD of be defined by (1.1) with the generalized singular values defined by (1.2). Let and be such that and are orthogonal. Then the matrix pairs and defined by (1.4) and (1.5) have the generalized eigenvalue decompositions
with , and
with and . Moreover, the columns of the eigenvector matrices and are - and -orthonormal, respectively, i.e.,
Lemma 2.1 illustrates that the GSVD component of corresponds to the generalized eigenpair
of the augmented matrix pair with the eigenvector satisfying and and the generalized eigenpair
of the augmented matrix pair with the eigenvector satisfying and . Therefore, the GSVD of is mathematically equivalent to the generalized eigenvalue problems of (1.4) and (1.5). In order to obtain , one can compute the generalized eigenpair of or the generalized eigenpair of by applying a generalized eigensolver to (1.4) or (1.5), respectively, and then recover the desired GSVD component.
However, in numerical computations, we can obtain only approximate eigenpairs of (1.4) and (1.5), and thus recover only approximate GSVD components of . Therefore, when backward stable eigensolvers solve the generalized eigenvalue problems of (1.4) and (1.5) with the computed eigenpairs whose residuals have about the same size, a natural and central concern is: which of the computed eigenpairs of (1.4) and (1.5) will yield more accurate approximations to the desired GSVD components of , that is, which of (1.4) and (1.5) is numerically preferable to compute the GSVD components more accurately?
To this end, we need to carefully estimate the accuracy of the computed eigenpairs and that of the recovered GSVD components. Given a generalized eigensolver applied to (1.4) and (1.5), let and be the computed approximations to and , respectively. Then and are the exact eigenpairs of some perturbed matrix pairs
respectively, where the perturbations satisfy
for small. Typically, , the level of machine precision , when the QZ algorithm golub2012matrix ; parlett1998symmetric ; stewart2001matrix is used for small to medium sized problems, or when iterative projection algorithms are used for large problems. Here in (2.7), to distinguish from the exact augmented matrices defined in (1.4) and (1.5), we have used the bold letters to denote the perturbed matrices. Notice that the assumption made in Section 1 means . As a consequence, (2.8) is equivalent to
In what follows, we will analyze how accurate the computed eigenpairs and are for a given small .
2.1 Accuracy of the generalized singular values
Stewart and Sun stewart1990matrix use a chordal metric to measure the distance between the approximate and exact eigenvalues of a regular matrix pair. Let and be the eigenvalues of and . Then the chordal distance between them is
Stewart and Sun (stewart1990matrix, , p. 313-316) prove the error bound
Replacing by and (, ) by (, in (2.11), and exploiting the invariance of the chordal distance under reciprocal, i.e.,
we have the error bound
Let and be the eigenpairs of and corresponding to the GSVD component of . Assume that their approximations and are the generalized eigenpairs of the perturbed and , respectively, where the perturbations satisfy (2.9). If is sufficiently small, the following error estimates hold within :
Notice from (2.9) that the perturbation terms in the right hand sides of both (2.13) and (2.14) are equally small and bounded by . Theorem 2.2 illustrates that the accuracy of the approximate generalized singular value and that of are determined by and , and by and , respectively. Apparently, a large could severely impair the accuracy of both and . Fortunately, the following bounds show that must be modest under some mild conditions.
Let be the right generalized singular vector matrix of as defined in (1.1) and be an arbitrary column of . Then
where the superscript denotes the Moore-Penrose generalized inverse of a matrix, and
Particularly, with , the lower and upper bounds for are and , respectively.
Lemma 2.3 indicates that provided that one of and is well conditioned, must be modest. In applications, to our best knowledge, there seems no case that both and are ill conditioned. Therefore, without loss of generality, we will assume that at least one of and is well conditioned. Then we have . Under this assumption, the stacked matrix must be well conditioned, too (stewart1990matrix, , Theorem 4.4).
Moreover, Theorem 2.4 of hansen1989regularization shows that provided is well conditioned, the singular values of and those of behave like and , correspondingly: the ratios of the singular values of and (resp. those of the singular values of and ), when labeled by the same order, are bounded from below and above by and , respectively. As a consequence, it is straightforward to justify the following basic properties, which will play a vital role in analyzing the results in this paper.
Assume that at least one of and is well conditioned.
If both and are well conditioned, no and are small. In this case, all the generalized singular values of are neither large nor small.
If or is ill conditioned, there must be some small or , that is, some generalized singular values must be small or large. Moreover, the small generalized singular values for ill conditioned and the large for ill conditioned.
If is ill conditioned and is well conditioned, all the cannot be large but some of them are small; if is well conditioned and is ill conditioned, all the cannot be small but some of them are large.
Recall from (2.16) that and notice that . It is easily verified that
Therefore, the numerators of the first factors in the right hand sides of (2.13) and (2.14) are modest and comparable, and it is their denominators that decide the size of the bounds in (2.13) and (2.14). As a consequence, in terms of Theorem 2.2 and Property 2.4, we can draw the following conclusions for the accurate computation of :
2.2 Accuracy of the generalized eigenvectors
In terms of the angles between the approximate and exact eigenvectors, we present the following accuracy estimates for the approximate eigenvectors of the symmetric definite matrix pairs in (1.4) and (1.5).
Let and be the eigenvalues of and with the corresponding eigenvector matrices and satisfying and , where the positive integer is the multiplicity of and . Assume that and are the eigenvalues of the perturbed matrix pairs and with the perturbations satisfying (2.8), and that the corresponding (unnormalized) eigenvectors are written as and with and satisfying and , where and with and are the eigenvectors of and corresponding to and , respectively. Then the following bounds hold within :
where the are the eigenvalues of other than , and the are the eigenvalues of other than .
Since and are the eigenpairs of and , respectively, we have and Combining these two relations, we obtain
where , and is a second order small term, i.e., . Let the generalized eigenvalue decomposition of be written by (2.1) and the eigenvalue matrix and eigenvector matrix be partitioned as and . Then we have
Therefore, the assumption means for some . Premultiplying the two hand sides of (2.19) and noticing that
and , we obtain
Since is nonsingular, we have
where is a second order small term. As a consequence, taking norms on the two hand sides of the above equation, we obtain
within , where the are the eigenvalues of other than .
By definition, the sine of the angle between and satisfies
within . Notice that is positive definite and satisfies . We have
Following the same derivations as above, we obtain (2.18). ∎
Theorem 2.5 gives accuracy estimates for the approximate generalized eigenvectors of the matrix pairs and . It presents the results in the form of the augmented matrix pairs. For our use in the GSVD context, substituting the definitions of and in (1.4) and (1.5) as well as their eigenvectors in (2.2) and (2.3) into Theorem 2.5, we can present the results in terms of the generalized singular values of and the matrices and directly.
For and , it is direct from Theorem 2.5 that
holds within . Since the eigenvalues of are and zeros, we have
where the are the generalized singular values of other than . On the other hand, by definition (1.4) of and the assumption , we have
Notice that the eigenvalues of are and zeros. Following the same derivations as above, we obtain
which proves (2.24). ∎
Denote and with being the generalized singular values of other than . Then
To prove this theorem, we need the following lemma.
Define and for . Then
Proof of Theorem 2.7.
Denote by and the generalized singular values of that minimize and