A FEAST SVDsolver for the computation of singular value decompositions of large matrices based on the Chebyshev–Jackson series expansion

01/09/2022
by   Zhongxiao Jia, et al.
Tsinghua University
0

The FEAST eigensolver is extended to the computation of the singular triplets of a large matrix A with the singular values in a given interval. It is subspace iteration in nature applied to an approximate spectral projector associated with the cross-product matrix A^TA and constructs approximate left and right singular subspaces corresponding to the desired singular values, onto which A is projected to obtain approximations to the desired singular triplets. Approximate spectral projectors are constructed using the Chebyshev–Jackson series expansion other than contour integration and quadrature rules, and they are proven to be always symmetric positive semi-definite with the eigenvalues in [0,1]. Compact estimates are established for pointwise approximation errors of a specific step function that corresponds to the exact spectral projector, the accuracy of the approximate spectral projector, the number of desired singular triplets,the distance between the desired right singular subspace and the subspace generated each iteration, and the convergence of the FEAST SVDsolver. Practical selection strategies are proposed for the series degree and the subspace dimension. Numerical experiments illustrate that the FEAST SVDsolver is robust and efficient.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 1

page 2

page 3

page 4

07/05/2021

Dominant subspace and low-rank approximations from block Krylov subspaces without a gap

In this work we obtain results related to the approximation of h-dimensi...
12/18/2019

Cross product-free matrix pencils for computing generalized singular values

It is well known that the generalized (or quotient) singular values of a...
03/14/2022

A note on the eigenvalues, singular values, and eigenvectors of Toeplitz and Hankel matrices

In a series of recent papers the spectral behavior of the matrix sequenc...
10/22/2020

Fast Approximate CoSimRanks via Random Projections

Given a graph G with n nodes, and two nodes u,v in G, the CoSim-Rank val...
05/13/2022

Corner asymptotics of the magnetic potential in the eddy-current model

In this paper, we describe the magnetic potential in the vicinity of a c...
04/10/2020

Optimal Subspace Expansion for Matrix Eigenvalue Problems

In this paper, we consider the optimal subspace expansion problem for th...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1. Introduction

Matrix singular value decomposition (SVD) problems play a crucial role in many applications. For small to moderate problems, very efficient and robust SVD algorithms and softwares have been well developed and widely used [golub2013matrix, stewart2001matrix]

. They are often called direct SVD solvers and compute the entire singular values and/or singular vectors using predictable iterations. In this paper, we consider the following partial SVD problem: Given a matrix

with and a real interval contained in the singular spectrum of , determine the singular triplets with the singular values counting multiplicities, where

Over the past nearly two decades, a new class of numerical methods has emerged for computing the eigenvalues of a large matrix in a given region and/or the associated eigenvectors, and they are based on contour integration and rational filtering. Among them, representatives are the Sakurai–Sugiura (SS) method

[sakurai2003projection] and the FEAST eigensolver [polizzi2009density], which fall into the category of Rayleigh–Ritz projection methods. The SS method and subsequent variants [ikegami2010contour, ikegami2010filter, imakura2014SSarnoldi, sakurai2016, sakurai2007cirr] have resulted in the z-Pares package [Futamura2014Online]

that handles the large Hermitian and non-Hermitian matrix eigenvalue problems. The original SS method is the SS-Hankel method, and its variants include the SS-RR (Rayleigh–Ritz projection) method and SS-Arnoldi method as well as their block variants. The SS-Hankel method computes certain moments constructed by the contour integral with an integral domain containing all the desired eigenvalues to form small Hankel matrices or matrix pairs and may be numerically unstable. In computations, the contour integral is approximated by a numerical quadrature, and the moments are computed approximately. The SS method and its variants are essentially Krylov or block Krylov subspace based methods, realize the Rayleigh–Ritz projection onto them, and compute Ritz approximations

[sakurai2016]. The SS-RR method computes an orthonormal basis of the underlying subspace and projects the large matrix or matrix pair onto it, and the SS-Arnoldi method exploits the Arnoldi process to generate an orthonormal basis of the subspace and forms the projection matrix. We refer the reader to [sakurai2016] for a summary of these methods.

The FEAST eigensolver [guttel2015zolotarev, kestyn2016feast, peter2014feast, polizzi2009density], first introduced by Polizzi [polizzi2009density] in 2009, has led to the development of the FEAST numerical library [polizzi2020feast]. Unlike the SS method and its variants, this eigensolver works on subspaces of a fixed dimension and uses subspace iteration [golub2013matrix, parlett1998symmetric, saad2011numerical, stewart2001matrix] to generate a sequence of subspaces, onto which the Rayleigh–Ritz projection of the original matrix or matrix pair is realized and the Ritz approximations are computed.

The SS method and the FEAST eigensolver are common in that they both use the contour integral and a numerical quadrature to construct a good approximate spectral projector associated with all the eigenvalues in the region of interest. This involves solutions of certain linear systems with shifted coefficient matrices, where the shifts are the nodes of the numerical quadrature used. Particularly, for the FEAST eigensolver, at each iteration one needs to solve several, i.e., the subspace dimension times the number of nodes, large linear systems. These linear systems are typically indefinite, and are ill conditioned whenever a node is close to some eigenvalue, so that Krylov subspace iterative methods, e.g., the GMRES or MINRES method [saad2003], may be inefficient.

Another possible approach is to construct an approximate spectral projector by the Chebyshev or Chebyshev–Jackson series expansion, as done in, e.g., [di2016efficient]. However, such approximation approach has received little attention, compared with rational approximations based on the contour integral and numerical quadratures, and it is also secondary and a by-product in [di2016efficient]. Among others, a fundamental reason for this situation that accuracy estimates lacks for the Chebyshev–Jackson series and approximate spectral projector.

It is well known from, e.g., [mason2002chebyshev] that the Chebyshev series expansion is the best squares approximation to a given function with respect to the Chebyshev -norm. The function in our concern is a specific step function , and the researchers in [di2016efficient] derive a quantitative error estimate for the Chebyshev series approximation to this specific function in the Chebyshev -norm. Qualitatively, it is well known that the pointwise convergence of the Chebyshev series to holds. Specially, the series at a discontinuity or jump point converges to the mean , where (resp. ) means the limit of from the right (resp. left). However, just like the rational approximation to the step function, it is the pointwise quantitative error of the series that matters and is critically needed. Unfortunately, such error estimates cannot be obtained from the ones with respect to the Chebyshev -norm. Remarkably, for the step function , it is shown in, e.g., [di2016efficient] that Jackson coefficients [rivlin1981introduction] can considerably dampen Gibbs oscillations and it is better to exploit the resulting Chebyshev–Jackson series. However, just as the Chebyshev series, pointwise quantitative error estimates lack for this series. As a consequence, there is no result on the convergence of the FEAST eigensolver when using the Chebyshev–Jackson series to construct an approximate spectral projector, and nothing is known on the convergence rate of each desired eigenvalue, a reliable determination of subspace dimension and a proper selection of series degree.

Since the SVD of is mathematically equivalent to the eigendecomposition of its cross-product matrix , in principle, the afore-mentioned approaches can be adapted to the computation of the singular triplets of associated with the singular values in a given interval . In this paper, for such a partial SVD problem, we will focus on constructing an approximate spectral projector associated with by exploiting the Chebyshev–Jackson series expansion. We apply subspace iteration to the approximate spectral projector constructed and, using a certain reasonable approach, generate a sequence of approximate left and right singular spaces corresponding to . However, instead of explicitly dealing with the eigenvalue problem of , we work on directly and project onto the left and right subspaces generated and compute the Ritz approximations to the desired singular triplets. We call the resulting algorithm a FEAST SVDsolver.

We will make a detailed analysis on the pointwise convergence of the Chebyshev–Jackson series and establish quantitative pointwise error estimates for the series. We make full use of these results to estimate the accuracy of the approximate spectral projector, the distance between the desired right singular subspace and the approximate right singular subspace generated, and the convergence rate of each Ritz approximation computed by our FEAST SVDsolver. In the meantime, we show how to exploit randomized trace estimation results to reliably estimate the number of desired singular triplets with . With this result, we are able to propose practical and robust selection strategies for the subspace dimension and series degree.

Precisely, we prove that the value of the Chebyshev–Jackson series always lies in . By this property, we are able to prove that the approximate spectral projectors constructed are unconditionally symmetric positive semi-definite (SPSD) and their eigenvalues always lie in . This is a unique property that the approximate spectral projectors constructed by some commonly used numerical quadratures, such as the Gauss–Legendre quadrature rule and Zolotarev quadrature rule, may not possess even when they are very good approximations to the spectral projector, as is clearly seen from, e.g., [guttel2015zolotarev]. An exception is the trapezoid quadrature rule, which generates an SPSD approximate spectral projector for real symmetric eigenvalue problems when all the shifted linear systems are solved accurately [guttel2015zolotarev]. Since 1989 (cf. the Hutchinson’s paper [hutchinson1989stochastic]), several algorithms have been proposed to reliably estimate the trace of a matrix by Monte Carlo methods [bekas2007estimator, iitaka2004random, wong2004computing]. As it turns out, the positive semi-definiteness is important and attractive since it enables us to the Rademacher random estimations in [avron2011randomized, Cortinovis2021onrandom, roosta2015improved] to reliably predict the number of desired singular triplets and propose a practical selection strategy for the subspace dimension in the FEAST SVDsolver.

Cortinovis and Kressner [Cortinovis2021onrandom] have derived randomized trace estimates for a symmetric indefinite matrix with Rademacher or Gaussian random vectors. Contrary to the SPSD case, their bound is for absolute rather than relative errors of the trace, and the number of samples can be much larger than that in the SPSD case; see the elaborations after Lemma 2.1 in this paper and Remark 4, Corollary 2 of [Cortinovis2021onrandom] for Rademacher random vectors. Therefore, a reliable and efficient use of the bound in the indefinite case is far from that for the SPSD case.

All the theoretical results and algorithms in this paper are either directly applicable or trivially adaptable to the real symmetric and complex Hermitian matrix eigenvalue problems, once we replace in this paper by a given symmetric matrix itself and the Rayleigh–Ritz projection for the SVD problem by that for the eigenvalue problem.

The paper is organized as follows. In section 2, we briefly review some preliminaries, the subspace iteration applied to an approximate spectral projector, and some results to be used in the paper. In section 3, we establish the quantitative pointwise convergence results on the Chebyshev–Jackson series expansion. Then we propose the FEAST SVDsolver in section 4 to compute the desired singular triplets of . We establish the accuracy estimates for the approximate spectral projector and subspace dimension, and present a number of convergence results on the FEAST SVDsolver. In section 5, we count the computational cost of the FEAST SVDsolver. In section 6, we report numerical experiments to illustrate the performance of the FEAST SVDsolver, where we consider robust and practical selections of the subspace dimension and the degree of Chebyshev–Jackson series. Finally, we conclude the paper in section 7.

We mention that for the concerning SVD problem of a matrix with we simply apply the FEAST SVDsolver to . Also, all the analysis and results in this paper work on a complex trivially with the transpose replaced by the conjugate transpose.

Throughout this paper, denote by the 2-norm of a vector or matrix, by

the identity matrix of order

with dropped whenever it is clear from the context, and by and the largest and smallest singular values of a matrix , respectively.

2. Preliminaries and a basic algorithm

Denote by

the cross-product matrix of . Let

be the SVD of with the diagonals ’s of being the singular values and the columns of and being the corresponding left and right singular vectors of ; see [golub2013matrix]. Then

(2.1)

is the eigendecomposition of . We mention that at this moment we do not label the order of the singular values ’s.

Given an interval with being the smallest singular value of , suppose we are interested in all the singular values of and/or the corresponding left and right singular vectors. Define

(2.2)

where consists of the columns of corresponding to the eigenvalues of in the open interval and consists of the columns of corresponding to the eigenvalues of that equal or . Notice that if neither of nor is a singular value of , then is the standard spectral projector of associated with its eigenvalues . If either or or both them are singular values, then is called a generalized spectral projector associated with all the . The factor is necessary, and it corresponds to the step function to be introduced later that is approximated by the Chebyshev–Jackson series expansion in this paper or by a rational function in the context of the contour integral. In the sequel, for brevity we simply call the spectral projector associated with .

For an approximate singular triplet of , its absolute residual is

(2.3)

and the size of will be used to judge the convergence of and design a stopping criterion.

Algorithm 1 is an algorithmic framework of our FEAST SVDsolver to be developed in this paper, where is an approximation to . It is a subspace iteration applied to for computing the desired singular triplets of and is an adaptation of the FEAST eigensolver to our SVD problem, where the right subspace is and the left subspace . creftypecap 7 mathematically realizes the Rayleigh–Ritz projection of with respect to the subspace , that is, the are the Ritz pairs of with respect to , and the corresponding are the left Ritz vectors that approximate the left singular vectors of associated with the desired singular values.

1:  Input: The interval , the approximate spectral projector , and a -dimensional subspace with .
2:  Output: converged Ritz triplets .
3:  .
4:  while not converged do
5:     .
6:     Form the projection subspace: , and construct the approximate right singular subspace and approximate left singular subspace .
7:     The Rayleigh–Ritz projection: find with satisfying .
8:     Compute the residual norms (2.3) of Ritz triplets for all the .
9:  end while
Algorithm 1 The basic FEAST SVDsolver: Subspace iteration on the approximate spectral projector for the partial SVD of .

If defined by (2.2) and the subspace dimension , then under a certain mild condition on the initial subspace , Algorithm 1 finds the desired singular triplets in one iteration since and are the exact left and right singular subspace of associated with all the . We will focus on a number of issues on this algorithm in the sequel.

The following lemma is about how to estimate the trace of an implicit matrix. By implicit we mean that the matrix of interest is not available explicitly and only matrix-vector products for any appropriate vectors are available. The results are based on the Monte–Carlo simulation and are stated as follows [avron2011randomized, Cortinovis2021onrandom].

Lemma 2.1.

Let be an symmetric matrix. Define , where the components of the random vectors

are independent and identically distributed Rademacher random variables, i.e.,

. Then the expectation

, variance

, and for .

Moreover, if is positive semi-definite, then for .

From this lemma, we can see that the estimate for is absolute rather than relative error of when is indefinite. This means that a reliable relative error estimate of depends on its size in the indefinite case. For the number of , more appealing is relative error of in the context of the FEAST eigensolver and our FEAST SVDsolver as well as the SS-methods and its variants. The result in the lemma does not enable us to reliably estimate the trace when is indefinite, which, in turn, makes us hard to reliably estimate the number of desired singular values in a given interval. Besides, for the same , though its implication is different in the indefinite and SPSD cases, the analysis and comparison in Remark 4 and Corollary 2 of [Cortinovis2021onrandom] indicate that the sample number needed for the indefinite case is proportional to , the Frobenius norm of , and could be times larger than it is in the SPSD case for the same , meaning that an effective absolute error estimate may be much more costly.

As a matter of fact, we should distinguish the SPSD and indefinite cases and treat them separately. For indefinite, a good choice of the absolute error bound depends critically on the size of the unknown , and we have no way but empirically choose as some modest number rather than small, say ; but in the SPSD case, to make sense, we must choose a reasonably small relative error bound , say , independent of the size of . If we take , the bounds for the corresponding sample numbers are almost the same. Unfortunately, this choice is impossible in practice since or a good estimate for it is unknown.

3. The Chebyshev–Jackson series expansion of a specific step function

For an interval , let be a step function defined by

(3.1)

where and are discontinuity points of , and equal the means

Suppose that is approximately expanded as the Chebyshev–Jackson polynomial series of degree :

(3.2)

where is the -degree Chebyshev polynomial of the first kind [mason2002chebyshev]:

the Fourier coefficients , are

(3.3)

the Jackson damping factors are

(3.4)

with

(3.5)

It is shown in [di2016efficient, jay1999electronic] and [rivlin1981introduction, pp.20] that

(3.6)

Define

(3.7)

which is a function with period . Then is an even step function and

(3.8)

where

Let

(3.9)

Since , by (3.2) we have

(3.10)

Next we quantitatively analyze the pointwise convergence of to . To this end, we need two lemmas.

Lemma 1.4 of [rivlin1981introduction, pp.17-18] shows that if is continuous on and has period then

The above equality obviously holds when is replaced by our step function with period . Therefore, we have

Making use of (3.8) and the fact that and

are even and odd functions, respectively, we obtain

Consequently, we have proved the following lemma.

Lemma 3.1.

Let and be defined as (3.7) and (3.9), respectively. Then

(3.11)

where

Lemma 3.2.

For , it holds that .

Proof.

By (3.4), it is known from [rivlin1981introduction, pp.20] that

where are defined by (3.5), is the imaginary unit, and is the natural constant. Since is also nonnegative, it follows from (3.11) that . On the other hand,

(3.12)

Therefore,

In what follows we establish quantitative results on how fast pointwisely converges to . Such kind of results is very appealing and plays a critical role in estimating accuracy of the approximate spectral projector to be constructed in terms of and its eigenvalues, which will be used to analyze the convergence of the resulting FEAST SVDsolver and estimate the number of the desired .

We first consider the pointwise convergence of to when .

Theorem 3.3.

Let and be defined as (3.7) and (3.9), respectively. For , and , define

Then

(3.13)
Proof.

According to (3.7) and (3.8), we have

(3.14)

For any given , define the function

We classify

as , and , and consider each case separately. Note that

Therefore, for any given and , if , then by (3.8) we have . As a result, we obtain

On the other hand, means that

Combining the above two results yields

Exploiting (3.12), we obtain

Therefore,

(3.15)

Making use of a result in [rivlin1981introduction, Section 1.1.2, pp.19-21] and (3.6) gives

(3.16)

Therefore, it follows from this relation and (3.15) that

The previous result requires that , the discontinuity points of the step function . If is equal to or , we need to make a special analysis and show how and converge to .

Theorem 3.4.

Let and be defined as (3.7) and (3.9), respectively. Then for and it holds that

(3.17)
(3.18)
Proof.

We first consider the case that . Define the functions

For , we have . Therefore, from (3.8) and (3.14), we obtain , showing that

On the other hand, means that

Combining the above two results yields

For , we have . Therefore, from (3.8), we must have , i.e.,

On the other hand, means that

The above two results show that

Since is an even function and (cf. (3.12)), we have

Therefore,

Exploiting (3.16), we obtain

from which it follows that

that is, (3.17) holds by noticing that .

Now we consider the case that . Define the function

For , we have . Therefore, from (3.8), we have , i.e.,

On the other hand, means that

Combining the above two results, we obtain

For , we have . Therefore, from (3.8), we have , i.e.,

On the other hand, means that

The above two results show that

Since , similarly to , we have

By (3.16), we have

Therefore, we obtain

which is (3.18) by noticing that . ∎

By definition (3.2) of and (3.10), Theorem 3.3 and Theorem 3.4 quantitatively shows how fast pointwisely converges to for by taking and that the approximation accuracy of to is proportional to , that is, the convergence rate of to is as least as fast as for . Furthermore, Theorem 3.3 shows that the convergence may be slow if is small, that is, is close to or or, equivalently, is close to or ; Theorem 3.4 indicates that may converge to slowly if or , that is, may converge to slowly if or . Similarly, the convergence of to may be slow if or , that is, the convergence of to may be slow if or .

4. The FEAST SVDsolver based on the cross-product matrix

4.1. Approximate spectral projector and its accuracy

We use the linear transformation

(4.1)

to map the spectrum interval