# Randomized block Krylov space methods for trace and log-determinant estimators

We present randomized algorithms based on block Krylov space method for estimating the trace and log-determinant of Hermitian positive semi-definite matrices. Using the properties of Chebyshev polynomial and Gaussian random matrix, we provide the error analysis of the proposed estimators and obtain the expectation and concentration error bounds. These bounds improve the corresponding ones given in the literature. Numerical experiments are presented to illustrate the performance of the algorithms and to test the error bounds.

## Authors

• 17 publications
• 2 publications
05/20/2020

### On randomized trace estimates for indefinite matrices with an application to determinants

Randomized trace estimation is a popular and well studied technique that...
03/18/2021

04/14/2020

### Norm and trace estimation with random rank-one vectors

A few matrix-vector multiplications with random vectors are often suffic...
05/13/2020

### A remark on approximating permanents of positive definite matrices

Let A be an n × n positive definite Hermitian matrix with all eigenvalue...
10/25/2019

### Randomized residual-based error estimators for the Proper Generalized Decomposition approximation of parametrized problems

This paper introduces a novel error estimator for the Proper Generalized...
11/12/2020

### Quantum algorithms for spectral sums

We propose and analyze new quantum algorithms for estimating the most co...
12/10/2020

### A generalised log-determinant regularizer for online semi-definite programming and its applications

We consider a variant of online semi-definite programming problem (OSDP)...
##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## 1 Introduction

Computing the trace and the log-determinant of Hermitian positive semi-definite matrices finds many applications in various problems such as inverse problem haber , generalized cross validation golub , spatial statistics zhang2008 , and so on. Naturally, it is a straightforward problem if the matrices are explicitly defined and we can access the individual entries. For example, a standard approach for computing the determinant of Hermitian positive definite matrices is to leverage its LU decomposition or Cholesky decomposition (higham, , Section 14.6)

. However, in big data age, it is difficult or expensive to explicitly access the individual entries or we can only access the matrix through matrix vector products. We will focus on the latter case in this paper. For this case, seeking the estimators with high accuracy for trace and log-determinant will be of great interest.

For the trace of Hermitian positive semi-definite matrix , the popular and simple estimator is the Monte Carlo estimator proposed by Hutchinson hutchinson :

 Tr(A)≈1NN∑i=1ciATci,

where is the sample size, and are the independently and identically distributed (i.i.d.) Rademacher random vectors. Hutchinson hutchinson showed that this estimator is unbiased. Later, by replacing the i.i.d. random vectors

with Gaussian random vectors, random unit vectors, or columns of the identity matrix sampled uniformly, some scholars produced some other unbiased estimators

avron ; roosta , where Avron and Toledo avron first considered the number of Monte Carlo samples with which a relative error

with probability

can be achieved and defined an (, ) estimator:

 P[∥Tr(A)−1NN∑i=1ciATci∥⩽ϵTr(A)]⩾1−δ.

Some lower bounds for for different choice of the random vectors were provided in avron , which were improved by Roosta-Khorasani and Ascher roosta . In 2017, Lin lin proposed two new trace estimators from the view of the randomized low-rank approximation of the matrix with order :

 Tr(A)≈Tr(AΩ(Ω∗AΩ)†(AΩ)∗),
 Tr(A)≈Tr(AΩ((AΩ)∗AΩ)†((AΩ)∗A(AΩ))((AΩ)∗AΩ)†(AΩ)∗),

where is a Gaussian random matrix of size with and, for a matrix , denotes its Moore-Penrose inverse. The author mainly investigated the first estimator and found that the method can be much faster than the Monte Carlo estimator. However, there was no formal error analysis for this estimator provided in lin . Later, Saibaba et al. saibaba also presented a new trace estimator based on randomized low-rank approximation and provided detailed error analysis to validate the theoretical reliability of the estimator.

For the log-determinant of Hermitian positive definite matrix , a popular approach is to combine the identity with the Monte Carlo estimators for trace introduced above. With this idea, Barry and Pace barry first proposed the Monte Carlo estimator of log-determinant for large sparse matrix. Later, Zhang and Leithead zhang generalized the estimator to general Hermitian positive definite matrix. However, both of the above two papers didn’t provide the rigorous error analysis of these estimators. Recently, Boutsidis et al. boutsidis continued the above work and investigated the error analysis in detail based on the results from avron . In the above log-determinant estimators barry ; zhang ; boutsidis , the Taylor expansion was used to expand . Pace and LeSage pace first introduced Chebyshev approximation to approximate , however, they didn’t combine the approximation with Monte Carlo estimators for trace and there was no formal error analysis. These works were done by Han et al. han2015 and they also generalized the method to trace estimator for matrix function han2017 . In addition, some scholars also used Cauchy integral formula or spline to expand and to devise the estimators for log-determinant aune ; anitescu . Recently, based on randomized low-rank approximation of matrix, Saibaba et al. saibaba proposed a new log-determinant estimator without using Taylor expansion or Chebyshev approximation, and discussed the error analysis for this estimator in detail.

For the more accurate trace and log-determinant estimators given in saibaba , a main and attractive feature is that they took advantage of randomized subspace iteration algorithm, which has been extensively studied and found many applications halko ; gittens ; gu2015 . In recent years, some scholars found that the randomized block Krylov space methods have more advantage compared randomized subspace iteration algorithm musco ; drineas ; yuan

. For example, the former has faster eigenvalue convergence rate when the target matrix has a large eigenvalue gap whose location is known. As a result, the randomized block Krylov space method receives more and more attention from the points of view of theory and applications

musco ; drineas ; yuan ; feng2018 . Inspired by the advantage of the randomized block Krylov space method and the work of Saibaba et al. saibaba , we consider the new estimators for trace and log-determinant of Hermitian positive definite matrices and their error analysis in the present paper. The obtained error bounds for these estimators will be tighter than the corresponding ones given in saibaba in most of cases.

The rest of this paper is organized as follows. Section 2 presents some preliminaries. In Section 3, we provide the main algorithms and the error analysis of our estimators. The comparisons between our results with the ones from saibaba are also discussed in this section. Section 4 is devoted to numerical experiments to illustrate our new randomized estimators and to test the error bounds. Finally, the concluding remarks of the whole paper are presented.

## 2 Preliminaries

In this section, we first clarify our assumptions. Then, we review some results on Chebyshev polynomials and the algorithms from saibaba .

### 2.1 Assumptions

Throughout this paper, we assume that is a Hermitian positive semi-definite matrix and has the following eigenvalue decomposition:

 A=UΛU∗,Λ=diag(λ1⋅⋅⋅λn)∈Rn×n, (1)

where is unitary, the eigenvalues satisfy , and we assume there is a gap in these eigenvalues: . As done in saibaba , we partition and as follows

 Λ=(Λ1Λ2),U=(U1U2),

where , , , and .

Given a number and a Gaussian random matrix with , set

 Kq=(AΩ,A2Ω,⋯,AqΩ).

Like drineas , we write

 Kq=range(AΩ,A2Ω,⋯,AqΩ),

call it the block Krylov space in and , and assume . That is, has full column rank. It is known that the elements of the Krylov subspace can be expressed in terms of matrices li2015 ; drineas , where is a polynomial of degree . Considering the eigenvalue decomposition of in (1), it is easy to verify that

 K=ϕ(A)Ω=Uϕ(Λ)U∗Ω,

where

 ϕ(Λ)=diag(ϕ(λ1),ϕ(λ2),⋯,ϕ(λn))=(ϕ(Λ1)ϕ(Λ2)).

Like gu2015 ; saibaba , we denote and hence

 ˆΩ=(U∗1ΩU∗2Ω)=(ˆΩ1ˆΩ2),

where and . We assume that , and hence its Moore-Penrose inverse satisfies .

### 2.2 Chebyshev polynomials

The th degree Chebyshev polynomial is recursively defined as follows

 T0(x)≡1;T1(x)≡x;Tq(x)≡2qTq−1(x)−Tq−2(x).

They can also be expressed as

By Chebyshev polynomials, we construct a polynomial with degree , which will play an important role in our error analysis for algorithms.

 f(x)=Tq−1(2x−λk+1λk+1)Tq−1(2λk−λk+1λk+1), (2)

where and are the eigenvalues of . By the properties of Chebyshev polynomials musco ; drineas , we can check that the polynomial has the following properties:

1. , when ;

2. , when .

From the property 1, it follows that is nonsingular, and

 ∥∥f−1(Λ1)∥∥2≤1. (3)

From the property 2, we have that

 ∥f(Λ2)∥2≤T−1q−1(2λk−λk+1λk+1). (4)

### 2.3 Randomized subspace iteration algorithm

The following algorithm was proposed by Saibaba et al. saibaba . Based on the algorithm, the authors presented the estimators of trace and log-determinant: and .

From this algorithm, it is easy to find that the information in , is discarded when computing . Collecting these information and using them for computing is one of the main motivations of this study, and is also an important topic in the field of randomized algorithm musco ; drineas ; yuan ; feng2018 .

## 3 Algorithms and error analysis

In this section, we first introduce our algorithms for estimating the trace and log-determinant, and then present the error bounds of these new estimators and their proofs.

### 3.1 Algorithms

Comparing with Algorithm 1, we can find that the randomized block Krylov space method collects the information discarded in Algorithm 1 and hence will be more accurate in theory. Numerical experiments in Section 4 also confirm this result. Moreover, the complexity of Algorithm 2 is only a little higher than that for Algorithm 1. This is because the main parts of complexity of the two algorithms, i.e., the complexity in step 1, are the same. Only in steps 2 and 3, the complexity of our algorithm increases. The factor in complexity is increased to . Since is a small number, the total complexity doesn’t change very much. In addition, similar to the discussions in saibaba , Algorithm 2 is only an idealized version and the idealized block Krylov space iteration can be numerically unstable. In practice, we can alternate matrix products and QR factorizations to tackle this problem (saad, , Algorithm 5.2).

### 3.2 Error bounds

###### Theorem 3.1

(Expectation bounds) Let be computed by Algorithm 2 and furthermore, let . Then

 0≤E[Tr(A)−Tr(T)]≤(1+λk+1λkT−2q−1(2λk−λk+1λk+1)Cge)Tr(Λ2)

and

 0 ≤ E[logdet(In+A)−logdet(Iql+T)] ≤ logdet(In−k+λk+1λkT−2q−1(2λk−λk+1λk+1)CgeΛ2)+logdet(In−k+Λ2),

where with .

###### Theorem 3.2

(Concentration bounds) Let be computed by Algorithm 2 and furthermore, let . If , then with probability at least

 0≤Tr(A)−Tr(T)≤(1+λk+1λkT−2q−1(2λk−λk+1λk+1)Cg)Tr(Λ2)

and

 0 ≤ logdet(In+A)−logdet(Iql+T) ≤ logdet(In−k+λk+1λkT−2q−1(2λk−λk+1λk+1)CgΛ2)+logdet(In−k+Λ2),

where with .

In saibaba , Saibaba et al. presented the following error bounds for estimators of trace and log-determinant produced by Algorithm 1.

###### Theorem 3.3

(Expectation bounds) saibaba Let be computed by Algorithm 1 and furthermore, let . Then

 0≤E[Tr(A)−Tr(T)]≤(1+(λk+1λk)2q−1Cge)Tr(Λ2)

and

 0 ≤ E[logdet(In+A)−logdet(Il+T)] ≤ logdet(In−k+(λk+1λk)2q−1CgeΛ2)+logdet(In−k+Λ2).
###### Theorem 3.4

(Concentration bounds) saibaba Let be computed by Algorithm 1 and furthermore, let . If , then with probability at least

 0≤Tr(A)−Tr(T)≤(1+(λk+1λk)2q−1Cg)Tr(Λ2)

and

 0 ≤ logdet(In+A)−logdet(Il+T) ≤ logdet(In−k+(λk+1λk)2q−1CgΛ2)+logdet(In−k+Λ2).

Note that

 2λk−λk+1λk+1≥λkλk+1>1,

and increases faster than when and . Thus, the term

 T−2q−1((2λk−λk+1)/λk+1)

in our bounds is smaller than in the bounds in Theorems 3.3 and 3.4 when . However, it must be pointed out that the order of the matrix in the bounds in Theorems 3.3 and 3.4 is not . If we set the order to be , i.e., set , the terms and in Theorems 3.3 and 3.4 will become small and hence the bounds will be reduced. In this case, our bounds can’t be always tighter than the corresponding ones in Theorems 3.3 and 3.4 when . However, numerical experiments show that in most of cases, our bounds are tighter. The following are some simple examples. We set , and , and , respectively. Upon some computations, we have Table 1, where and are derived from and , respectively, by replacing with .

### 3.3 Proofs of Theorems 3.1 and 3.2

We only prove the structural (deterministic) error bounds for trace and log-determinant estimators, i.e., we consider to be any matrix satisfying assumptions given in Section 2.1. The final results, i.e., the expectation and concentration error bounds in Theorems 3.1 and 3.2, can be derived immediately as done in (saibaba, , Sections 4.1.1 and 4.1.2) by combining the structural error bounds in Theorems 3.5 and 3.6 given below and (saibaba, , Lemmas 4 and 5). We first do some preparation for these proofs. A useful lemma is listed as follows.

###### Lemma 1

gu2015 Assume that is nonsingular, and the thin QR factorizations of and are and , respectively. Then

 QQ∗=˜Q˜Q∗.

This simple result plays an important role in deriving error bounds because we can choose a special to achieve the useful information of range of . This technique was proposed by Gu gu2015 . Now we introduce how to find the special . Using the notation introduced in Section 2.1, we write as follows,

 K=Uϕ(Σ)ˆΩ=U(ϕ(Λ1)ϕ(Λ2))(ˆΩ1ˆΩ2)=U(ϕ(Λ1)ˆΩ1ϕ(Λ2)ˆΩ2). (5)

To make the last block matrix in (5) be simplified, we choose a matrix in the following form:

 X=(ˆΩ†1ϕ−1(Λ1),X2)∈Cl×l,

where is such that is nonsingular and . Thus,

 KX=U(Ik0H1H2),

where

 (6)

In accord with the above block form, we write the thin factorization of in the following form:

 KX=˜Q˜R=(˜Q1,˜Q2)(˜R11˜R12˜R22)=U(Ik0H1H2). (7)

As a result, we have the following thin QR factorization

 U(IkH1)=˜Q1˜R11, (8)

which will be used in our error analysis.

Furthermore, we also need the following three results.

###### Lemma 2

Suppose . Then, for any Hermitian positive semi-definite matrix , the following inequality holds

 Tr(PNA)⩽Tr(PMA),

where and are the orthogonal projections on and , respectively.

Proof From the proof of (halko, , Proposition 8.5), we know , where denotes the Löwner partial order (horn13, , Definition 7.7.1). Then, by the known conjugation rule (see e.g., (horn13, , Theorem 7.7.2)),

 A12PNA12⪯A12PMA12.

Further, by the properties of Löwner partial order and trace, we have

 Tr(A12PNA12)⪯Tr(A12PMA12),  i.e., Tr(PNA)⩽Tr(PMA).

###### Lemma 3

For any two Hermitian positive semi-definite matrices and with the same order, the following inequalities hold

 Tr(AB)⩽Tr(A)λmax(B)⩽Tr(A)Tr(B), Tr(AB)⩽λmax(A)Tr(B)⩽Tr(A)Tr(B),

where and are the largest eigenvalues of and , respectively.

Proof These inequalities are well-known results and can be derived from, e.g., von Neumann trace theorem (horn13, , Theorem 7.4.1.1) directly. ∎

###### Lemma 4

(ouell, , Corollary 2.1) If and , then

 det(Im±BC)=det(In±CB).

#### 3.3.1 Structural bounds for trace estimator

###### Theorem 3.5

Let be computed by Algorithm 2. Then

 0≤Tr(A)−Tr(T)≤(1+T−1q−1(2λk−λk+1λk+1)∥∥∥ˆΩ2ˆΩ†1∥∥∥2)Tr(Λ2). (9)

When , the following bound is tighter,

 0≤Tr(A)−Tr(T)≤(1+λk+1λkT−2q−1(2λk−λk+1λk+1)∥∥∥ˆΩ2ˆΩ†1∥∥∥22)Tr(Λ2). (10)

Proof The lower bound has been proven in (saibaba, , Lemma 1). In the following, we show that the upper bounds hold.

Since is an element of , we have

 range(K)⊂Kq.

Thus, by Lemma 1, we get

 range(˜Q)=range(Q)⊆range(Qq), (11)

which together with Lemma 2 and (7) implies

 Tr(A)−Tr(T) = Tr(A)−Tr(Q∗qAQq)=Tr(A)−Tr(QqQ∗qA) (12) ≤ Tr(A)−Tr(QQ∗A)=Tr(A)−Tr(˜Q˜Q∗A) ≤ Tr(A)−Tr(˜Q1˜Q∗1A)=Tr(A)−Tr(˜Q∗1A˜Q1).

From (8), we have

 ˜Q1=U(IkH1)˜R−111  and  ˜R∗11˜R11=Ik+H∗1H1. (13)

As a result,

 ˜Q∗1A˜Q1 = (14) = (˜R∗11)−1(Λ1+H∗1Λ2H1)˜R−111,

and hence

 Tr(˜Q∗1A˜Q1) = Tr((Λ1+H∗1Λ2H1)(˜R∗11˜R11)−1)=Tr((Λ1+H∗1Λ2H1)(Ik+H∗1H1)−1) (15) = Tr(Λ1(Ik+H∗1H1)−1)+Tr(Λ2H1(Ik+H∗1H1)−1H∗1).

Substituting (15) into (12) and noting and Lemma 3 gives

 Tr(A)−Tr(T) ≤ Tr(Λ1(Ik−(Ik+H∗1H1)−1))+Tr(Λ2(Ik−H1(Ik+H∗1H1)−1H∗1)) (16) ≤ Tr(Λ1(Ik−(Ik+H∗1H1)−1)) + Tr(Λ2)λmax(Ik−H1(Ik+H∗1H1)−1H∗1) ≤ Tr(Λ1(Ik−(Ik+H∗1H1)−1))+Tr(Λ2).

Note that

 (Ik+H∗1H1)−1=Ik−H∗1(In−k+H1H∗1)−1H1.

Then

 Tr(Λ1(Ik−(Ik+H∗1H1)−1))=Tr(Λ1H∗1(In−k+H1H∗1)−1H1).

Further, setting , where is defined in (2), and considering in (6), von Neumann trace theorem (horn13, , Theorem 7.4.1.1)

, and singular value inequalities

(horn, , Theorem 3.3.14), we have

 Tr(Λ1(Ik − (Ik+H∗1H1)−1)) = Tr(Λ1Λ−11f−1(Λ1)(ˆΩ2ˆΩ†1)∗Λ2f(Λ2)(In−k+H1H∗1)−1H1) ≤ k∑j=1σj(f−1(Λ1)(ˆΩ2ˆΩ†1)∗Λ2f(Λ2))σj((In−k+H1H∗1)−1H1) ≤ ≤

which combined with (3) and (4) leads to

 Tr(Λ1(Ik−(Ik+H∗1H1)−1)) ≤T−1q−1(2λk−λk+1λk+1)∥∥∥ˆΩ2ˆΩ†1∥∥∥2∥∥(In−k+H1H∗1)−1H1∥∥2Tr(Λ2). (17)

Furthermore, from saibaba , we have

 ∥∥(In−k+H1H∗1)−1H1∥∥2≤1, (18)

or

 ∥∥(In−k+H1H∗1)−1H1∥∥2 ≤ ∥H1∥2≤∥Λ2f(Λ2)∥2∥∥Λ−11f−1(Λ1)∥∥2∥∥∥ˆΩ2ˆΩ†1∥∥∥2 (19) ≤ λk+1λkT−1q−1(2λk−λk+1λk+1)∥∥∥ˆΩ2ˆΩ†1∥∥∥2.

Thus, combining (16), (3.3.1), (18), and (19), we derive the desired upper bounds (9) and (10). ∎

#### 3.3.2 Structural bounds for log-determiant estimator

###### Theorem 3.6

Let be computed by Algorithm 2. Then

 0 ≤ logdet(In+A)−logdet(Iql+T) (20) ≤ logdet(In−k+ηΛ2)+logdet(In−k+Λ2),

where .

Proof The lower bound has been derived in (saibaba, , Lemma 2). It is sufficient to show that the upper bound holds.

From (11), it follows that there is an orthonormal matrix such that , and hence

 Q∗AQ=Y∗Q∗qAQqY=Y∗TY.

Thus, by the proved lower bound, i.e.,

 logdet(In+A)−logdet(Iql+T)⩾0, (21)

we have

 logdet(Iql+T)−logdet(Il+Y∗TY)⩾0.

That is,

 logdet(Iql+Q∗qAQq)−logdet(Il+Q∗AQ)⩾0.

Thus,

 logdet(In+A)−logdet(Iql+T)≤logdet(In+A)−logdet(Il+Q∗AQ).

By Lemmas 4 and 1, it is seen that

 logdet(In+A)−logdet(Iql+T) ≤ logdet(In+A)−logdet(Il+QQ∗A) (22) = logdet(In+A)−logdet(Il+˜Q˜Q∗A) = logdet(In+A)−logdet(Il+˜Q∗A˜Q).

From (7), we can check that and is orthonormal. Thus, by (21) again, we have

 logdet(Il+˜Q∗A˜Q)⩾logdet(Ik+˜Q∗1A˜Q1),

which together with (22) gives

 logdet(In+A)−logdet(Iql+T) ≤logdet(In+A)−logdet(Ik+˜Q∗1A˜Q1). (23)

Setting and considering (14), we get

 M=(˜R∗11)−1(Λ1+H∗1Λ2H1)˜R−111.

Thus, by Lemma 4 and noting (13), we have

 logdet(Ik+M)=logdet(Ik+M1)=logdet(Ik+M2), (24)

where

 M1 =(Λ1+H