# On choices of formulations of computing the generalized singular value decomposition of a matrix pair

For the computation of the generalized singular value decomposition (GSVD) of a matrix pair (A,B) of full column rank, the GSVD is commonly formulated as two mathematically equivalent generalized eigenvalue problems, so that a generalized eigensolver can be applied to one of them and the desired GSVD components are then recovered from the computed generalized eigenpairs. Our concern in this paper is which formulation of the generalized eigenvalue problems is preferable to compute the desired GSVD components more accurately. A detailed perturbation analysis is made on the two formulations and show how to make a suitable choice between them. Numerical experiments illustrate the obtained results.

## Authors

• 3 publications
• 8 publications
04/05/2020

### New Formulation and Computation for Generalized Singular Values of Grassman Matrix Pair

In this paper, we derive new model formulations for computing generalize...
01/09/2022

### Two harmonic Jacobi–Davidson methods for computing a partial generalized singular value decomposition of a large matrix pair

Two harmonic extraction based Jacobi–Davidson (JD) type algorithms are p...
04/29/2020

### A cross-product free Jacobi-Davidson type method for computing a partial generalized singular value decomposition (GSVD) of a large matrix pair

A Cross-Product Free (CPF) Jacobi-Davidson (JD) type method is proposed ...
01/10/2020

### The joint bidiagonalization process with partial reorthogonalization

The joint bidiagonalization(JBD) process is a useful algorithm for the c...
08/31/2019

### Implicit Hari–Zimmermann algorithm for the generalized SVD on the GPUs

A parallel, blocked, one-sided Hari–Zimmermann algorithm for the general...
11/30/2018

### Asymmetry Helps: Eigenvalue and Eigenvector Analyses of Asymmetrically Perturbed Low-Rank Matrices

This paper is concerned with a curious phenomenon in spectral estimation...
02/12/2020

### Towards a more robust algorithm for computing the restricted singular value decomposition

A new algorithm to compute the restricted singular value decomposition o...
##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## 1 Introduction

The generalized singular value decomposition (GSVD) of a matrix pair was first introduced by van Loan van1976generalizing and then developed by Paige and Saunders paige1981towards . It has become a standard decomposition and an important computational tool golub2012matrix , and has been widely used in a wide range of contexts, e.g., solutions of discrete linear ill-posed problems hansen1998rank , weighted or generalized least squares problems bjorck1996numerical , information retrieval howland2003structure , linear discriminant analysis park2005relationship , and many others betcke2008generalized ; chu1987singular ; golub2012matrix ; kaagstrom1984generalized ; vanhuffel .

Let () and () be of full rank, i.e., . The GSVD of is

 (1.1)

where is nonsingular, and are orthonormal, and positive numbers and satisfy , . Such is called a GSVD component of with the generalized singular value , the left generalized singular vectors and , and the right generalized singular vector , . Denote the generalized singular value matrix of by

 Σ=CS−1=diag{σ1,…,σn}. (1.2)

Throughout this paper, we also refer the scalar pair as the generalized singular value of . Particularly, we will denote by and the largest and smallest generalized singular values of , respectively. Obviously, the generalized singular values of the pair are , the reciprocals of those of , and their generalized singular vectors are the same as those of .

For a prescribed target , assume that the generalized singular values of are labeled by

 |σ1−τ|≤|σ2−τ|≤⋯≤|σℓ−τ|<|σℓ+1−τ|≤⋯≤|σn−τ|. (1.3)

Specifically, if we are interested in the smallest generalized singular values of and/or the associated left and right generalized singular vectors, we assume in (1.3), so that the generalized singular values are labeled in increasing order; if we are interested in the largest generalized singular values of and/or the corresponding generalized singular vectors, we assume in (1.3), so that the generalized singular values are labeled in decreasing order. In both cases, the GSVD components are called extreme (smallest or largest) GSVD components of . Otherwise they are called interior GSVD components of if the given is inside the spectrum of the generalized singular values of . We will abbreviate any one of the desired GSVD components as or with the subscripts dropped.

For a large and possibly sparse matrix pair , one kind of approach to compute desired GSVD components works on the pair directly. Zha zha1996 proposes a joint bidiagonalization method to compute the extreme generalized singular values and the associated generalized singular vectors , which is a generalization of Lanczos bidiagonalization type methods jia2003implicitly ; jia2010 for computing a partial ordinary SVD of when . The main bottleneck of this method is that a large-scale least squares problem with the coefficient matrix must be solved at each step of the joint bidiagonalization. Jia and Yang jiayang2018 has made a further analysis on this method and its variant, and provided more theoretical supports for its rationale.

Another kind of commonly used approach formulates the GSVD as a generalized eigenvalue problem, solves it using an eigensolver parlett1998symmetric ; saad2011numerical ; stewart2001matrix , and recovers the desired GSVD components. There are two types of formulations. The first one is to apply an eigensolver to the cross product matrix pair to compute the corresponding eigenpairs and then recover the desired GSVD components from the computed eigenpairs zwaan2017generalized . However, because of the squaring of the generalized singular values of , for small, the eigenvalues of are much smaller. As a consequence, the smallest generalized singular values may be recovered much less accurately and even may have no accuracy jia2006 . Therefore, we shall not consider such a formulation in this paper. The second type of formulation, which we shall consider in this paper, transforms the GSVD into the generalized eigenvalue problem of the augmented definite matrix pair

 (ˆA,ˆB):=([AAT],[IBTB]), (1.4)

or the augmented definite matrix pair

 (˜B,˜A):=([BBT],[IATA]); (1.5)

see, e.g., hochstenbach2009jacobi . One then applies an eigensolver to either of them, computes the corresponding generalized eigenpairs, and recovers the desired GSVD components from those computed generalized eigenpairs. As is easily verified, the nonzero eigenvalues of and are and , , respectively; in the next section we will give more details on close connections between the GSVD of and the generalized eigenpairs of and . Therefore, the extreme or interior generalized singular values of become the extreme or interior eigenvalues of and . In principle, we may use a number of projection methods, e.g., Lanczos type methods, to compute the extreme GSVD components via solving the generalized eigenvalue problem of or . For a unified account of projection algorithms, we refer to baiedit2000 . For the computation of interior GSVD components of , we may employ the Jacobi-Davidson type methods hochstenbach2009jacobi , referred as JDGSVD, where at each step a linear system, i.e., the correction equation, is solved iteratively and its approximate solution is used to expand the current searching subspaces. A JDGSVD method deals with the generalized eigenvalue problem of (1.4) or (1.5) and recovers the desired GSVD components from the converged generalized eigenpairs of the chosen augmented matrix pair.

As far as numerical computations are concerned, an important question arises naturally: which of the mathematically equivalent formulations (1.4) and (1.5) is numerically preferable, so that the desired GSVD components can be computed more accurately? In this paper, rather than proposing or developing any algorithm for computing the GSVD, we focus on this issue carefully and suggest a deterministic choice. We first make a sensitivity analysis on the eigenpairs of (1.4) and (1.5

). Based on the results to be obtained, we establish accuracy estimates for the approximate generalized singular values and the left and right generalized singular vectors that are recovered from the obtained approximate eigenpairs. Then by comparing the accuracy of the approximate GSVD components recovered from both approximate generalized eigenpairs, we make a correct choice between (

1.4) and (1.5).

This paper is organized as follows. In Section 2 we make a sensitivity analysis on the generalized eigenvalue problems of the augmented matrix pairs and , respectively, and give error bounds for the generalized singular values

and the generalized eigenvectors of

and . In Section 3 we carry out a sensitivity analysis on the approximate generalized singular vectors that are recovered from the approximate eigenpairs of (1.4) and (1.5). Based on the results and analysis, we conclude that (1.5) is preferable to compute the GSVD more accurately when is well conditioned and is ill conditioned, and (1.4) is preferable when is ill conditioned and is well conditioned. In Section 4 we propose a few practical choice strategies on (1.4) and (1.5). In Section 5 we report the numerical experiments. We conclude the paper in Section 6.

Throughout this paper, denote by the 2-norm of a vector or matrix and the condition number of a matrix with and being the largest and smallest singular values of , respectively, and by the transpose of . Denote by

the identity matrix of order

, by and the zero matrices of order and , respectively. The subscripts are omitted when there is no confusion. Also denote by the column space or range of . In practice, the matrices and are usually scaled to have (approximately) the same length in 2-norm. When making an analysis, we always assume that and are already normalized such that , which means that and .

## 2 Perturbation analysis of generalized eigenvalue problems and accuracy of generalized singular values

The generalized eigenvalue decompositions of the augmented matrix pairs and are closely related to the GSVD of in the following way, which is straightforward to verify.

###### Lemma 2.1.

Let the GSVD of be defined by (1.1) with the generalized singular values defined by (1.2). Let and be such that and are orthogonal. Then the matrix pairs and defined by (1.4) and (1.5) have the generalized eigenvalue decompositions

 ˆAY=ˆBYˆΣand˜BZ=˜AZ˜Λ, (2.1)

respectively, where

 ˆΣ=[Σ−Σ0],Y=⎡⎢⎣1√2U1√2UU⊥1√2W−1√2W0⎤⎥⎦ (2.2)

with , and

 ˜Λ=[Λ−Λ0],Z=⎡⎢⎣1√2V1√2VV⊥1√2W′−1√2W′0⎤⎥⎦ (2.3)

with and . Moreover, the columns of the eigenvector matrices and are - and -orthonormal, respectively, i.e.,

 YTˆBY=Im+n,ZT˜AZ=Ip+n. (2.4)

Lemma 2.1 illustrates that the GSVD component of corresponds to the generalized eigenpair

 (σ,y):=(αβ,1√2[ux/β]) (2.5)

of the augmented matrix pair with the eigenvector satisfying and and the generalized eigenpair

 (2.6)

of the augmented matrix pair with the eigenvector satisfying and . Therefore, the GSVD of is mathematically equivalent to the generalized eigenvalue problems of (1.4) and (1.5). In order to obtain , one can compute the generalized eigenpair of or the generalized eigenpair of by applying a generalized eigensolver to (1.4) or (1.5), respectively, and then recover the desired GSVD component.

However, in numerical computations, we can obtain only approximate eigenpairs of (1.4) and (1.5), and thus recover only approximate GSVD components of . Therefore, when backward stable eigensolvers solve the generalized eigenvalue problems of (1.4) and (1.5) with the computed eigenpairs whose residuals have about the same size, a natural and central concern is: which of the computed eigenpairs of (1.4) and (1.5) will yield more accurate approximations to the desired GSVD components of , that is, which of (1.4) and (1.5) is numerically preferable to compute the GSVD components more accurately?

To this end, we need to carefully estimate the accuracy of the computed eigenpairs and that of the recovered GSVD components. Given a generalized eigensolver applied to (1.4) and (1.5), let and be the computed approximations to and , respectively. Then and are the exact eigenpairs of some perturbed matrix pairs

 (ˆA,ˆB)=(ˆA+ˆE,ˆB+ˆF)and(˜B,˜% A)=(˜B+˜F,˜A+˜E), (2.7)

respectively, where the perturbations satisfy

 ∥ˆE∥≤∥ˆA∥ϵ,∥ˆF∥≤∥ˆB∥ϵand∥˜F∥≤∥˜B∥ϵ,∥˜E∥≤∥˜A∥ϵ (2.8)

for small. Typically, , the level of machine precision , when the QZ algorithm golub2012matrix ; parlett1998symmetric ; stewart2001matrix is used for small to medium sized problems, or when iterative projection algorithms are used for large problems. Here in (2.7), to distinguish from the exact augmented matrices defined in (1.4) and (1.5), we have used the bold letters to denote the perturbed matrices. Notice that the assumption made in Section 1 means . As a consequence, (2.8) is equivalent to

 max{∥ˆE∥,∥ˆF∥,∥˜E∥,∥˜F∥}≤ϵ. (2.9)

In what follows, we will analyze how accurate the computed eigenpairs and are for a given small .

### 2.1 Accuracy of the generalized singular values

Stewart and Sun stewart1990matrix use a chordal metric to measure the distance between the approximate and exact eigenvalues of a regular matrix pair. Let and be the eigenvalues of and . Then the chordal distance between them is

 X(ˆσ,σ)=|ˆσ−σ|√1+ˆσ2√1+σ2. (2.10)

Stewart and Sun (stewart1990matrix, , p. 313-316) prove the error bound

 X(ˆσ,σ)≤∥y∥2√(yTˆAy)2+(yTˆBy)2√∥ˆE∥2+∥ˆF∥2 (2.11)

within .

Replacing by and (, ) by (, in (2.11), and exploiting the invariance of the chordal distance under reciprocal, i.e.,

 X(˜σ−1,σ−1)=X(˜σ,σ),

we have the error bound

 X(˜σ,σ)≤∥z∥2√(zT˜Az)2+(zT˜Bz)2√∥˜E∥2+∥˜F∥2. (2.12)

Combining (2.11) and (2.12) with Lemma 2.1, we can present the following results.

###### Theorem 2.2.

Let and be the eigenpairs of and corresponding to the GSVD component of . Assume that their approximations and are the generalized eigenpairs of the perturbed and , respectively, where the perturbations satisfy (2.9). If is sufficiently small, the following error estimates hold within :

 X(ˆσ,σ) ≤ ∥x∥2+β22β√∥ˆE∥2+∥ˆF∥2, (2.13) X(˜σ,σ) ≤ ∥x∥2+α22α√∥˜E∥2+∥˜F∥2. (2.14)
###### Proof.

We only give a proof of (2.13), and the proof of (2.14) is similar. From Lemma 2.1, notice that the eigenvector of satisfies and . From , and , we have

 ∥y∥2√(yTˆAy)2+(yTˆBy)2=12∥u∥2+∥x∥2β2√1+σ2=∥x∥2+β22β,

from which and (2.11) it follows that (2.13) holds within . ∎

Notice from (2.9) that the perturbation terms in the right hand sides of both (2.13) and (2.14) are equally small and bounded by . Theorem 2.2 illustrates that the accuracy of the approximate generalized singular value and that of are determined by and , and by and , respectively. Apparently, a large could severely impair the accuracy of both and . Fortunately, the following bounds show that must be modest under some mild conditions.

###### Lemma 2.3.

Let be the right generalized singular vector matrix of as defined in (1.1) and be an arbitrary column of . Then

 ∥X∥≤min{∥A†∥,∥B†∥}and∥X−1∥≤√∥A∥2+∥B∥2, (2.15)

where the superscript denotes the Moore-Penrose generalized inverse of a matrix, and

 1√∥A∥2+∥B∥2≤∥x∥≤min{∥A†∥,∥B†∥}. (2.16)

Particularly, with , the lower and upper bounds for are and , respectively.

###### Proof.

The bounds in (2.15) and the upper bound for in (2.16) are from Theorem 2.3 of hansen1989regularization . We only need to prove the lower bound for in (2.16). By definition, we have and . Therefore,

 ∥Ax∥2+∥Bx∥2=α2∥u∥2+β2∥v∥2=α2+β2=1,

from which and it follows that

 1√∥A∥2+∥B∥2≤∥x∥.

Lemma 2.3 indicates that provided that one of and is well conditioned, must be modest. In applications, to our best knowledge, there seems no case that both and are ill conditioned. Therefore, without loss of generality, we will assume that at least one of and is well conditioned. Then we have . Under this assumption, the stacked matrix must be well conditioned, too (stewart1990matrix, , Theorem 4.4).

Moreover, Theorem 2.4 of hansen1989regularization shows that provided is well conditioned, the singular values of and those of behave like and , correspondingly: the ratios of the singular values of and (resp. those of the singular values of and ), when labeled by the same order, are bounded from below and above by and , respectively. As a consequence, it is straightforward to justify the following basic properties, which will play a vital role in analyzing the results in this paper.

###### Property 2.4.

Assume that at least one of and is well conditioned.

• If both and are well conditioned, no and are small. In this case, all the generalized singular values of are neither large nor small.

• If or is ill conditioned, there must be some small or , that is, some generalized singular values must be small or large. Moreover, the small generalized singular values for ill conditioned and the large for ill conditioned.

• If is ill conditioned and is well conditioned, all the cannot be large but some of them are small; if is well conditioned and is ill conditioned, all the cannot be small but some of them are large.

Recall from (2.16) that and notice that . It is easily verified that

 13<∥x∥2+β2∥x∥2+α2<3.

Therefore, the numerators of the first factors in the right hand sides of (2.13) and (2.14) are modest and comparable, and it is their denominators that decide the size of the bounds in (2.13) and (2.14). As a consequence, in terms of Theorem 2.2 and Property 2.4, we can draw the following conclusions for the accurate computation of :

• For and well conditioned, both (1.4) and (1.5) work well.

• If is well conditioned but is ill conditioned, (1.5) is preferable to (1.4).

• If is ill conditioned but is well conditioned, (1.4) is better than (1.5).

### 2.2 Accuracy of the generalized eigenvectors

In terms of the angles between the approximate and exact eigenvectors, we present the following accuracy estimates for the approximate eigenvectors of the symmetric definite matrix pairs in (1.4) and (1.5).

###### Theorem 2.5.

Let and be the eigenvalues of and with the corresponding eigenvector matrices and satisfying and , where the positive integer is the multiplicity of and . Assume that and are the eigenvalues of the perturbed matrix pairs and with the perturbations satisfying (2.8), and that the corresponding (unnormalized) eigenvectors are written as and with and satisfying and , where and with and are the eigenvectors of and corresponding to and , respectively. Then the following bounds hold within :

 sin∠(y+Δy,y) ≤ ∥ˆB−1∥minμi≠σ|μi−σ|√∥ˆE∥2+σ2∥ˆF∥2, (2.17) sin∠(z+Δz,z) ≤ ∥˜A−1∥σminνi≠1σ|νi−1σ|√∥˜E∥2+σ2∥˜F∥2, (2.18)

where the are the eigenvalues of other than , and the are the eigenvalues of other than .

###### Proof.

Since and are the eigenpairs of and , respectively, we have and Combining these two relations, we obtain

 (ˆA−σˆB)Δy=−(ˆE−σˆF)(y+Δy)+ΔσˆBy+Δ2y, (2.19)

where , and is a second order small term, i.e., . Let the generalized eigenvalue decomposition of be written by (2.1) and the eigenvalue matrix and eigenvector matrix be partitioned as and . Then we have

 YT2ˆAY2=ˆΣ2,YT2ˆBY2=Im+n−k,YT2BY1=0.

Therefore, the assumption means for some . Premultiplying the two hand sides of (2.19) and noticing that

 YT2(ˆA−σˆB)Δy=YT2(ˆA−σˆB)Y2h=(ˆΣ2−σI)h

and , we obtain

 (ˆΣ2−σI)h=−YT2(ˆE−σˆF)(y+Δy)+YT2Δ2y.

Since is nonsingular, we have

 h=−(ˆΣ2−σI)−1YT2(ˆE−σˆF)(y+Δy)+Δ2h,

where is a second order small term. As a consequence, taking norms on the two hand sides of the above equation, we obtain

 ∥h∥≤∥Y2∥minμi≠σ|μi−σ|∥ˆE−σˆF∥∥y+Δy∥ (2.20)

within , where the are the eigenvalues of other than .

By definition, the sine of the angle between and satisfies

 sin∠(y+Δy,y)=∥(I−yyTyTy)(y+Δy)∥∥y+Δy∥≤∥Δy∥∥y+Δy∥. (2.21)

Substituting and (2.20) into (2.21) yields

 sin∠(y+Δy,y)≤∥Y2∥2minμi≠σ|μi−σ|√∥ˆE∥2+σ2∥ˆF∥2 (2.22)

within . Notice that is positive definite and satisfies . We have

 ∥Y2∥2 = ∥YT2Y2∥=∥YT2(ˆB)12(ˆB)−1(ˆB)12Y2∥ ≤ ∥ˆB−1∥∥(ˆB)12Y2∥2=∥ˆB−1∥,

applying which to (2.22) gives (2.17).

Following the same derivations as above, we obtain (2.18). ∎

Theorem 2.5 gives accuracy estimates for the approximate generalized eigenvectors of the matrix pairs and . It presents the results in the form of the augmented matrix pairs. For our use in the GSVD context, substituting the definitions of and in (1.4) and (1.5) as well as their eigenvectors in (2.2) and (2.3) into Theorem 2.5, we can present the results in terms of the generalized singular values of and the matrices and directly.

###### Theorem 2.6.

With the notations of Theorem 2.2 and Theorem 2.5, let and be the normalized approximations to and , respectively. Then the following results hold within :

 sin∠(ˆy,y) ≤ κ2(B)σminσi≠σ{|1−σiσ|,1}√∥ˆE∥2+σ2∥ˆF∥2, (2.23) sin∠(˜z,z) ≤ κ2(A)  minσi≠σ{|1−σσi|,1}√∥˜E∥2+σ2∥˜F∥2, (2.24)

where the are the generalized singular values of other than .

###### Proof.

For and , it is direct from Theorem 2.5 that

 sin∠(ˆy,y)=sin∠(y+Δy,y)≤∥ˆB−1∥minμi≠σ|μi−σ|√∥ˆE∥2+σ2∥ˆF∥2 (2.25)

holds within . Since the eigenvalues of are and zeros, we have

 minμi≠σ|μi−σ|=minσi≠σ{|σi−σ|,σ}=σminσi≠σ{∣∣1−σiσ∣∣,1}, (2.26)

where the are the generalized singular values of other than . On the other hand, by definition (1.4) of and the assumption , we have

 ∥ˆB−1∥=σ−1min(ˆB)=σ−2min(B)=κ2(B). (2.27)

Applying (2.26) and (2.27) to (2.25), we obtain (2.23).

Notice that the eigenvalues of are and zeros. Following the same derivations as above, we obtain

 sin∠(˜z,z) ≤ κ2(A)σminσi≠σ{|1σi−1σ|,1σ}√∥˜E∥2+σ2∥˜F∥2 = κ2(A)minσi≠σ{|1−σσi|,1}√∥˜E∥2+σ2∥˜F∥2,

which proves (2.24). ∎

For the minima in (2.23) and (2.24), we present the following result.

###### Theorem 2.7.

Denote and with being the generalized singular values of other than . Then

 12≤γ2γ1≤2. (2.28)

To prove this theorem, we need the following lemma.

###### Lemma 2.8.

Define and for . Then

 12≤g(t)f(t)≤2. (2.29)
###### Proof.

We classify the interval of

as three cases:

• if , then , and

• if , then , and

• if , then , and

Summarizing the above establishes (2.29). ∎

###### Proof of Theorem 2.7.

Denote by and the generalized singular values of that minimize and