The global extended-rational Arnoldi method for matrix function approximation

The numerical computation of matrix functions such as f(A)V, where A is a large and sparse n × n matrix, V is an n × p block with p ≪ n and f is a nonlinear function, arises in various applications such as network analysis (f(t) = exp(t) or f(t) = t^3), machine learning (f(t) = log(t)), the theory of quantum chromodynamics (f(t) = t^{1/2}), electronic structure computation, and others. In this work, we propose the use of the global extended-rational Arnoldi method for computing approximations of such expressions. The derived method projects the initial problem onto a global extended-rational Krylov subspace RK^e_m(A,V) = span{∏_{i=1}^m (A − s_iI_n)^{-1}V, …, (A − s_1I_n)^{-1}V, V, AV, …, A^{m−1}V} of low dimension. An adaptive procedure for the selection of the shift parameters {s_1,…,s_m} is given. The proposed method is also applied to solve parameter-dependent systems. Numerical examples are presented to show the performance of the global extended-rational Arnoldi method for these problems.

1 Introduction

Let A ∈ R^{n×n} be a large and sparse matrix, and let V ∈ R^{n×p} with p ≪ n. We are interested in approximating numerically expressions of the form

 I(f):=f(A)V (1)

where f is a function defined on the convex hull of the spectrum of A. The superscript T denotes transposition. The need to evaluate matrix functions of the form (1) arises in various applications such as network analysis ESTRADA, machine learning NGO, electronic structure computation BARONI; SAAD and the solution of ill-posed problems FENU; HANSEN. When the matrix A is of small to medium size, the matrix function f(A) can be determined via the spectral factorization of A; see Higham; HANSEN for discussions of several possible definitions of matrix functions. In many applications, the matrix A is so large that it is impractical to evaluate its spectral factorization. For this case, several projection methods have been developed. These methods project the problem (1) onto a Krylov subspace of small dimension. The projection of A is then used to evaluate f on a small matrix, whose spectral factorization is affordable, and thereby to obtain an approximation of f(A)V. In the context of approximating the action of a matrix function f on some vector v, several polynomial methods Beckermann; DruskinTwo; SAAD1; Hochbruck based on the standard Arnoldi and Lanczos Krylov methods have been proposed. Druskin and Knizhnerman DruskinExtended have shown that when f cannot be approximated accurately by polynomials on the spectrum of A, then f(A)v cannot be approximated accurately by these classical methods. They proposed the extended Krylov method for the symmetric case, and the process was generalized to nonsymmetric matrices by Simoncini in SimonciniNew. This method was applied to approximate the solution of Sylvester, Riccati and Lyapunov equations Agougil; Heyouni; SimonciniNew.

Another technique for the evaluation of matrix functions is the rational Arnoldi method. This process was first proposed by Ruhe Ruhe in the context of computing eigenvalues and has been used in recent years for the approximation of matrix functions; see Guttel; Pranic; Druskin; DruskinSim; DruskinKnizhnerman; Knizhnerman. In this paper, we present the global extended-rational Arnoldi method to approximate the matrix function (1). The extended-rational Arnoldi method was proposed and applied to model reduction in Abidi1. As mentioned in Abidi1, the extended-rational Krylov subspace (3) is richer than the rational Krylov subspace and represents a generalization of the extended Krylov subspace. We propose an adaptive computation of the shifts {s_1,…,s_m} used to generate an F-orthonormal basis for (3) in the case of a definite matrix A. This procedure is based on a generalization of the procedure used in Druskin. In addition, we apply the proposed method to solve parameter-dependent systems (19) with multiple right-hand sides Gu; SIMONCINI. These parameter systems have numerous applications in control theory, structural dynamics and time-dependent PDEs; see Feriani.

This paper is organized as follows. In Section 2, we give some preliminaries and then introduce the global extended-rational Arnoldi process together with some of its properties. Section 3 describes the application of this process to the approximation of the matrix function given in (1) and to the solution of the parameter systems. We also propose an adaptive computation of the shifts {s_1,…,s_m}. Finally, some numerical experiments that illustrate the quality of the computed approximations are presented in the last section.

2 The global extended-rational Arnoldi method

2.1 Preliminaries and notations

We begin by recalling some notations and definitions that will be used throughout this paper. For matrices of compatible dimensions, the Kronecker product A ⊗ B satisfies the following standard properties:

1. (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD).

2. (A ⊗ B)^T = A^T ⊗ B^T.

Definition 1

BOUYOULI Partition the matrices M ∈ R^{n×sp} and N ∈ R^{n×lp} into block columns M = [M_1,…,M_s] and N = [N_1,…,N_l], where each block M_i, N_j has p columns, and define the ⋄-product of the matrices M and N as

 M^T ⋄ N = [⟨N_j, M_i⟩_F] ∈ R^{s×l}, i = 1,…,s, j = 1,…,l. (2)

The following proposition gives some properties satisfied by the above product.

Proposition 1

BellalijJ; BOUYOULI Let M, N, P ∈ R^{n×sp}, L ∈ R^{s×s}, C ∈ R^{n×n} and α ∈ R. Then we have:

1. (M + N)^T ⋄ P = M^T ⋄ P + N^T ⋄ P.

2. M^T ⋄ (N + P) = M^T ⋄ N + M^T ⋄ P.

3. (αM)^T ⋄ N = α(M^T ⋄ N).

4. (M^T ⋄ N)^T = N^T ⋄ M.

5. M^T ⋄ (CN) = (C^T M)^T ⋄ N.

6. M^T ⋄ (N(L ⊗ I_p)) = (M^T ⋄ N)L.
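As an illustration, the ⋄-product of Definition 1 can be implemented directly from (2). The sketch below is ours, not from the paper: the function name `diamond` and the explicit block-size argument `p` are our own conventions. It also lets one check properties such as M^T ⋄ (N(L ⊗ I_p)) = (M^T ⋄ N)L numerically.

```python
import numpy as np

def diamond(M, N, p):
    """⋄-product of block matrices M (n x s*p) and N (n x l*p):
    returns the s x l matrix of Frobenius inner products <N_j, M_i>_F."""
    s = M.shape[1] // p
    l = N.shape[1] // p
    out = np.empty((s, l))
    for i in range(s):
        Mi = M[:, i*p:(i+1)*p]
        for j in range(l):
            Nj = N[:, j*p:(j+1)*p]
            out[i, j] = np.sum(Mi * Nj)  # <N_j, M_i>_F = trace(M_i^T N_j)
    return out
```

For instance, property 6 of Proposition 1 can be verified on random data by comparing `diamond(M, N @ np.kron(L, np.eye(p)), p)` with `diamond(M, N, p) @ L`.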

2.2 Description of the process

Global Krylov subspace techniques were first proposed in Jbilou for solving linear systems of equations with multiple right-hand sides and also for large-scale Lyapunov matrix equations. The global extended-rational Krylov subspace was first introduced in Abidi1 and is defined as the subspace of R^{n×p} spanned by the vectors (blocks)

 V, AV, …, A^{m−1}V, and (A − s_1I_n)^{-1}V, (A − s_1I_n)^{-1}(A − s_2I_n)^{-1}V, …, ∏_{i=1}^m (A − s_iI_n)^{-1}V.

This subspace is denoted by

 RK^e_m(A,V) = span{V, (A − s_1I_n)^{-1}V, …, A^{m−1}V, ∏_{i=1}^m (A − s_iI_n)^{-1}V} ⊂ R^{n×p} (3)

where s_1,…,s_m are some selected complex parameters, all distinct from the eigenvalues of A. We notice here that the subspace RK^e_m(A,V) is a subspace of R^{n×p}, so its elements are blocks of size n × p. We assume that all the vectors (blocks) in (3) are linearly independent. Now, we describe the global extended-rational Arnoldi process to generate an F-orthonormal basis {V_1,…,V_{2m}} with V_i ∈ R^{n×p} for the global extended-rational Krylov subspace RK^e_m(A,V), and we derive some algebraic relations related to this process. The block vectors V_i are said to be F-orthonormal (with respect to the Frobenius inner product) if ⟨V_i, V_j⟩_F = δ_{i,j} for 1 ≤ i, j ≤ 2m.

Based on the Gram-Schmidt orthogonalization process, the first two blocks V_1 and V_2 are computed via the formulas

 V_1 = V/α_{1,1},  V_2 = Ṽ_2/α_{2,2},  Ṽ_2 = (A − s_1I_n)^{-1}V − α_{1,2}V_1, (4)

where α_{1,1} = ∥V∥_F, α_{1,2} = ⟨(A − s_1I_n)^{-1}V, V_1⟩_F and α_{2,2} = ∥Ṽ_2∥_F. To compute the block vectors V_{2j+1} and V_{2j+2}, for j = 1,…,m, we use the following formulas

 h_{2j+1,2j−1} V_{2j+1} = AV_{2j−1} − ∑_{i=1}^{2j} h_{i,2j−1} V_i = Ṽ_{2j+1},
 h_{2j+2,2j} V_{2j+2} = (A − s_jI_n)^{-1} V_{2j} − ∑_{i=1}^{2j+1} h_{i,2j} V_i = Ṽ_{2j+2}, (5)

where the coefficients h_{i,2j−1} and h_{i,2j} are determined so that the block vectors satisfy the F-orthogonality conditions

 V2j+1⊥FV1,…,V2jandV2j+2⊥FV1,…,V2j+1.

Thus, the coefficients h_{i,2j−1} and h_{i,2j} are given by

 h_{i,2j−1} = ⟨AV_{2j−1}, V_i⟩_F  and  h_{i,2j} = ⟨(A − s_jI_n)^{-1}V_{2j}, V_i⟩_F. (6)

The normalization coefficients h_{2j+1,2j−1} and h_{2j+2,2j} are chosen such that ∥V_{2j+1}∥_F = 1 and ∥V_{2j+2}∥_F = 1, respectively. Hence,

 h2j+1,2j−1=∥˜V2j+1∥F and h2j+2,2j=∥˜V2j+2∥F (7)

The global extended-rational Arnoldi algorithm is summarized in the following algorithm (Algorithm 1).
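Since Algorithm 1 itself is not reproduced here, the following Python sketch shows one plausible realization of the recursions (4)-(7). The function name and the dense `np.linalg.solve` calls for the shifted systems are our assumptions (a practical implementation would reuse sparse factorizations), and the shift indexing follows formulas (4) and (5) as displayed.

```python
import numpy as np

def global_ext_rational_arnoldi(A, V, shifts):
    """Sketch of Algorithm 1 (assumed form): build an F-orthonormal block basis
    V_1, ..., V_{2m+2} of the global extended-rational Krylov subspace.
    `shifts` = [s_1, ..., s_m]; the basis is returned as a list of n x p blocks,
    together with the upper block Hessenberg matrix H of recursion coefficients."""
    m = len(shifts)
    n, p = V.shape
    frob = lambda X, Y: np.sum(X * Y)            # Frobenius inner product <X, Y>_F
    blocks = [V / np.linalg.norm(V, 'fro')]      # V_1, eq. (4)
    W = np.linalg.solve(A - shifts[0]*np.eye(n), V)
    W = W - frob(W, blocks[0]) * blocks[0]       # F-orthogonalize against V_1
    blocks.append(W / np.linalg.norm(W, 'fro'))  # V_2, eq. (4)
    H = np.zeros((2*m + 2, 2*m))
    for j in range(1, m + 1):
        # odd step of (5): multiply V_{2j-1} by A
        W = A @ blocks[2*j - 2]
        for i in range(2*j):                     # modified Gram-Schmidt
            H[i, 2*j - 2] = frob(W, blocks[i])
            W = W - H[i, 2*j - 2] * blocks[i]
        H[2*j, 2*j - 2] = np.linalg.norm(W, 'fro')
        blocks.append(W / H[2*j, 2*j - 2])       # V_{2j+1}
        # even step of (5): apply (A - s_j I)^{-1} to V_{2j}
        W = np.linalg.solve(A - shifts[j - 1]*np.eye(n), blocks[2*j - 1])
        for i in range(2*j + 1):
            H[i, 2*j - 1] = frob(W, blocks[i])
            W = W - H[i, 2*j - 1] * blocks[i]
        H[2*j + 1, 2*j - 1] = np.linalg.norm(W, 'fro')
        blocks.append(W / H[2*j + 1, 2*j - 1])   # V_{2j+2}
    return blocks, H
```

Running m steps produces 2m + 2 F-orthonormal blocks, which can be checked by verifying that the Gram matrix of pairwise Frobenius inner products is the identity.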

The set of shifts {s_1,…,s_m} is chosen a priori or adaptively during the algorithm. The selection of shifts will be explained later. If all h_{2j+1,2j−1} and h_{2j+2,2j} are numerically nonzero, then Algorithm 1 determines an F-orthonormal basis {V_1,…,V_{2m+2}} of the global extended-rational Krylov subspace RK^e_{m+1}(A,V). Algorithm 1 also constructs an upper block Hessenberg matrix H_{2m}. We now introduce the matrix T_{2m} given by

 T2m=VT2m⋄AV2m=[ti,j], (8)

where t_{i,j} = ⟨AV_j, V_i⟩_F. T_{2m} is the restriction of the matrix A to the extended-rational Krylov subspace RK^e_m(A,V).

Proposition 2

Let A ∈ R^{n×n} and V ∈ R^{n×p}. The F-orthonormal basis {V_1,…,V_{2m+2}} determined by the recursion formulas (5) satisfies, for j = 1,…,m,

 AV2j−1 and AV2j∈span{V1,…,V2j+1} (9)

We notice that, by the relations (9), T_{2m} is an upper block Hessenberg matrix with 2 × 2 blocks, since t_{i,2j−1} = 0 and t_{i,2j} = 0 for i > 2j + 1.

Proposition 3

Assume that m steps of Algorithm 1 have been run and let T_{2m} = V^T_{2m} ⋄ AV_{2m}; then we have the following relation

 AV_{2m} = V_{2m}(T_{2m} ⊗ I_p) + V_{2m+1}([t_{2m+1,2m−1}, t_{2m+1,2m}]E^T_m ⊗ I_p), (10)

where the matrix E_m = [e_{2m−1}, e_{2m}] corresponds to the last two columns of the identity matrix I_{2m}.

Proof

According to (9), we have AV_{2m} ⊂ span{V_1,…,V_{2m+1}}; then there exists a matrix T ∈ R^{(2m+2)×2m} such that

 AV_{2m} = V_{2m+2}(T ⊗ I_p).

Using the properties of the ⋄-product, we obtain

 V^T_{2m+2} ⋄ AV_{2m} = V^T_{2m+2} ⋄ [V_{2m+2}(T ⊗ I_p)] = (V^T_{2m+2} ⋄ V_{2m+2})T = T.

It follows that T = V^T_{2m+2} ⋄ AV_{2m}. Since T is an upper block Hessenberg matrix with 2 × 2 blocks and t_{i,2m−1} = t_{i,2m} = 0 for i > 2m + 1, T can be decomposed (stacking its rows) as

 T = [ T_{2m} ; [t_{2m+1,2m−1}, t_{2m+1,2m}]E^T_m ; 0 ],

which completes the proof.

The next proposition gives the entries of T_{2m} in terms of the recursion coefficients. This will allow us to compute these entries quite efficiently.

Proposition 4

Let T_{2m} = [t_{:,1},…,t_{:,2m}] and H_{2m} = [h_{:,1},…,h_{:,2m}] be the upper block Hessenberg matrices whose j-th columns are t_{:,j} and h_{:,j}, respectively. Then the odd columns are such that

 t_{:,2j−1} = h_{:,2j−1} for j = 1,…,m. (11)

The even columns satisfy the following relations

 t_{:,2} = [(α_{1,1} + s_1α_{1,2})e_1 + s_1α_{2,2}e_2 − α_{1,2}h_{:,1}]/α_{2,2}, (12)

and for j = 1,…,m−1,

 t_{:,2j+2} = (1/h_{2j+2,2j})[s_jh_{2j+2,2j}e_{2j+2} + e_{2j} − ∑_{i=1}^{2j+1} h_{i,2j}(t_{:,i} − s_je_i)], (13)

where e_i corresponds to the i-th column of the canonical basis of R^{2m}, and α_{1,1}, α_{1,2} and α_{2,2} are defined in (4).

Proof

We have t_{i,2j−1} = ⟨AV_{2j−1}, V_i⟩_F. Therefore, (11) follows from the expression of h_{i,2j−1} in (6). Using (4), we obtain

 α1,2V1+α2,2V2=α1,1(A−s1In)−1V1.

Multiplying this last equality by (A − s_1I_n) from the left gives

 α_{1,2}(A − s_1I_n)V_1 + α_{2,2}(A − s_1I_n)V_2 = α_{1,1}V_1.

Then the block vector AV_2 can be written as follows

 AV_2 = (1/α_{2,2})[(α_{1,1} + s_1α_{1,2})V_1 + s_1α_{2,2}V_2 − α_{1,2}AV_1].

The relation (12) is obtained by multiplying this equality by V^T_{2m} with the ⋄-product from the left, since t_{:,2} = V^T_{2m} ⋄ AV_2. The formula (13) is obtained from the expression of V_{2j+2} for j = 1,…,m−1. Thus, multiplying the second equality in (5) by (A − s_jI_n) from the left gives

 h_{2j+2,2j}(A − s_jI_n)V_{2j+2} = V_{2j} − ∑_{i=1}^{2j+1} h_{i,2j}(A − s_jI_n)V_i.

Then,

 AV_{2j+2} = (1/h_{2j+2,2j})[h_{2j+2,2j}s_jV_{2j+2} + V_{2j} − ∑_{i=1}^{2j+1} h_{i,2j}(AV_i − s_jV_i)].

The expression (13) is easily obtained by multiplying this equality by V^T_{2m} with the ⋄-product from the left. This concludes the proof of the proposition.

We notice that if A is a symmetric matrix, then the restriction matrix T_{2m} in (8) reduces to a symmetric pentadiagonal matrix with the following nontrivial entries:

 t_{i,2j−1} = h_{i,2j−1} for i ∈ {2j−3,…,2j+1},
 t_{1,2} = (α_{1,1} + s_1α_{1,2} − t_{1,1}α_{1,2})/α_{2,2},
 t_{2,2} = s_1 − t_{2,1}α_{1,2}/α_{2,2},
 t_{3,2} = −t_{3,1}α_{1,2}/α_{2,2}.

And for j = 1,…,m−1,

 t_{2j+1,2j+2} = (s_jh_{2j+1,2j} − ∑_{i=2j−1}^{2j+1} t_{2j+1,i}h_{i,2j})/h_{2j+2,2j},
 t_{2j+2,2j+2} = s_j − t_{2j+2,2j+1}h_{2j+1,2j}/h_{2j+2,2j}, and
 t_{2j+3,2j+2} = −t_{2j+3,2j+1}h_{2j+1,2j}/h_{2j+2,2j}.

3 Application to the approximation of matrix functions

In this section, we will show how to use the global extended-rational Arnoldi algorithm to approximate expressions of the form f(A)V. As in Jbilou, the global extended-rational Krylov subspace defined in (3) can be written as

 RK^e_m(A,V) = {X_{2m} ∈ R^{n×p} : X_{2m} = V_{2m}(Y_{2m} ⊗ I_p) where Y_{2m} ∈ R^{2m}}. (14)

Then the expression f(A)V can be approximated by

 f^{er}_{2m} := P_{2m}(f(A)V) = ∥V∥_F V_{2m}(f(T_{2m})e_1 ⊗ I_p), (15)

where P_{2m} denotes the F-orthogonal projection onto RK^e_m(A,V), given by P_{2m}(Z) = V_{2m}((V^T_{2m} ⋄ Z) ⊗ I_p) for Z ∈ R^{n×p}.
The matrix V_{2m} = [V_1,…,V_{2m}] corresponds to the F-orthonormal basis for RK^e_m(A,V) constructed by applying m steps of Algorithm 1 to the pair (A,V). T_{2m} is the projection matrix defined by (8) and e_1 corresponds to the first column of the identity matrix I_{2m}.
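A minimal self-contained sketch of the approximation (15): here, as an assumed simplification, the F-orthonormal basis is obtained by plain Gram-Schmidt on the raw blocks of (3) rather than by Algorithm 1, the shifted systems are solved densely, and `f_of_matrix` is a user-supplied routine evaluating f on the small matrix T_{2m}. All names are ours.

```python
import numpy as np

def f_approx(A, V, shifts, f_of_matrix):
    """Approximate f(A)V by (15): ||V||_F * V_2m (f(T_2m) e_1 (x) I_p)."""
    n, p = V.shape
    m = len(shifts)
    frob = lambda X, Y: np.sum(X * Y)
    # raw blocks of (3): V, AV, ..., A^{m-1}V and the nested shifted inverses
    raw = [V]
    for k in range(1, m):
        raw.append(A @ raw[-1])
    W = V.copy()
    for s in shifts:
        W = np.linalg.solve(A - s*np.eye(n), W)
        raw.append(W)
    # F-orthonormal basis via Gram-Schmidt w.r.t. the Frobenius inner product
    basis = []
    for W in raw:
        for Q in basis:
            W = W - frob(W, Q) * Q
        basis.append(W / np.linalg.norm(W, 'fro'))
    d = len(basis)
    # restriction of A to the subspace, eq. (8): t_{i,j} = <A V_j, V_i>_F
    T = np.array([[frob(A @ basis[j], basis[i]) for j in range(d)]
                  for i in range(d)])
    y = f_of_matrix(T)[:, 0]                       # f(T_2m) e_1
    return np.linalg.norm(V, 'fro') * sum(y[i] * basis[i] for i in range(d))
```

The exactness property of Lemma 1 below gives a convenient check: for the resolvent f(z) = 1/(z − s_1), whose denominator divides q_m, the computed approximation should agree with (A − s_1I_n)^{-1}V to machine precision.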

Lemma 1 (Exactness)

Let V_{2m} be the matrix generated by Algorithm 1 and let T_{2m} be as defined by (8). Then for any rational function r̃_{2m} ∈ Π_{2m}/q_m we have

 P_{2m}(r̃_{2m}(A)V) = ∥V∥_F V_{2m}[(r̃_{2m}(T_{2m})e_1) ⊗ I_p]. (16)

In particular, if r_{2m} ∈ Π_{2m−1}/q_m, then the global extended-rational Arnoldi approximation is exact, i.e., we have

 r_{2m}(A)V = ∥V∥_F V_{2m}[(r_{2m}(T_{2m})e_1) ⊗ I_p], (17)

where Π_k denotes the set of polynomials of degree at most k and q_m is the polynomial of degree m whose roots are the shifts {s_1,…,s_m}, i.e., q_m(z) = ∏_{i=1}^m (z − s_i).

Proof

Following the idea in (Guttel, Lemma 3.1), we consider q := q_m(A)^{-1}V and we first show by induction that

 P_{2m}(A^jq) = V_{2m}(T^j_{2m}(V^T_{2m} ⋄ q) ⊗ I_p) for j = 0,…,2m. (18)

Assertion (18) is obviously true for j = 0. Assume that it is true for some j < 2m. Then, by the definition of an extended-rational Krylov subspace, we have A^jq ∈ RK^e_m(A,V), and therefore

 P_{2m}(A^{j+1}q) = P_{2m}(A P_{2m}(A^jq)) = P_{2m}(AV_{2m}[(T^j_{2m}(V^T_{2m} ⋄ q)) ⊗ I_p])
 = V_{2m}([V^T_{2m} ⋄ (AV_{2m}[(T^j_{2m}(V^T_{2m} ⋄ q)) ⊗ I_p])] ⊗ I_p).

Using the properties of the ⋄-product, we obtain

 P_{2m}(A^{j+1}q) = V_{2m}([(V^T_{2m} ⋄ AV_{2m})T^j_{2m}(V^T_{2m} ⋄ q)] ⊗ I_p) = V_{2m}(T^{j+1}_{2m}(V^T_{2m} ⋄ q) ⊗ I_p),

which establishes (18). By linearity, we obtain

 V = q_m(A)q = P_{2m}(q_m(A)q) = V_{2m}(q_m(T_{2m})(V^T_{2m} ⋄ q) ⊗ I_p).

Using the properties of the -product, we obtain

 V^T_{2m} ⋄ q = ∥V∥_F q_m(T_{2m})^{-1}e_1.

Substituting this expression into (18) completes the proof.

We consider a convex compact set Λ ⊂ C and we define A(Λ) as the set of functions analytic in a neighborhood of Λ, equipped with the uniform norm ∥f∥_{L∞(Λ)} = max_{z∈Λ}|f(z)|. In CROUZEIX, it was shown that there exists a universal constant C such that

 ∥f(B)∥ ≤ C∥f∥_{L∞(Λ)}, ∀f ∈ A(Λ),

for any matrix B whose numerical range W(B) = {x^HBx : ∥x∥_2 = 1} is contained in Λ. Based on this inequality, the following result gives an upper bound for the norm of the error f(A)V − f^{er}_{2m}, where f^{er}_{2m} is the approximation given by (15).

Corollary 1

We assume that W(A) ∪ W(T_{2m}) ⊂ Λ, and let f ∈ A(Λ). Then the global extended-rational Arnoldi approximation f^{er}_{2m} defined by (15) satisfies

 ∥f(A)V − f^{er}_{2m}∥_F ≤ 2C∥V∥_F min_{r_{2m}∈Π_{2m−1}/q_m} ∥f − r_{2m}∥_{L∞(Λ)}.
Proof

According to (17), we have r_{2m}(A)V = ∥V∥_F V_{2m}(r_{2m}(T_{2m})e_1 ⊗ I_p) for every rational function r_{2m} ∈ Π_{2m−1}/q_m. Thus,

 ∥f(A)V − f^{er}_{2m}∥_F = ∥f(A)V − ∥V∥_FV_{2m}(f(T_{2m})e_1 ⊗ I_p) − r_{2m}(A)V + ∥V∥_FV_{2m}(r_{2m}(T_{2m})e_1 ⊗ I_p)∥_F
 ≤ ∥V∥_F(∥f(A) − r_{2m}(A)∥ + ∥V_{2m}[((f(T_{2m}) − r_{2m}(T_{2m}))e_1) ⊗ I_p]∥_F)
 ≤ ∥V∥_F(∥f(A) − r_{2m}(A)∥ + ∥f(T_{2m}) − r_{2m}(T_{2m})∥)
 ≤ 2C∥V∥_F∥f − r_{2m}∥_{L∞(Λ)},

which completes the proof.

3.1 Shifted linear systems

We consider the solution of the parameterized nonsingular linear systems with multiple right-hand sides

 (A−σIn)X=B, (19)

which need to be solved for many values of the parameter σ ∈ C, where B ∈ R^{n×p}. The solution may be written as X(σ) = fσ(A)B, with fσ(t) = (t − σ)^{-1} the resolvent function. Then the approximate solutions X_{2m}(σ) generated by the global extended-rational algorithm applied to the pair (A, R_0) are obtained as follows

 Z2m(σ)=X2m(σ)−X0(σ)∈RKem(A,R0)

where R_0(σ) = B − (A − σI_n)X_0(σ) are the residual block vectors associated with the initial guesses X_0(σ). By (14), Z_{2m}(σ) = V_{2m}(Y_{2m}(σ) ⊗ I_p), where Y_{2m}(σ) is determined such that the new residual R_{2m}(σ) = B − (A − σI_n)X_{2m}(σ) associated with X_{2m}(σ) is F-orthogonal to RK^e_m(A, R_0). This yields

 X_{2m}(σ) = X_0 + V_{2m}(Y_{2m}(σ) ⊗ I_p)  and  V^T_{2m} ⋄ R_{2m}(σ) = 0. (20)

Using the relations (20) and the following decomposition

 (A − σI_n)V_{2m} = V_{2m}[(T_{2m} − σI_{2m}) ⊗ I_p] + V_{2m+1}([t_{2m+1,2m−1}, t_{2m+1,2m}]E^T_m ⊗ I_p), (21)

the reduced linear system can be written as

 (T2m−σI2m)Y2m(σ)=∥R0∥Fe1, (22)

then the approximate solution will be

 X_{2m}(σ) = ∥R_0∥_FV_{2m}((T_{2m} − σI_{2m})^{-1}e_1 ⊗ I_p).
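The point of the reduced system (22) is that one subspace serves all parameter values: each large n × n shifted solve is replaced by a 2m × 2m solve. A sketch under the same assumed simplifications as before (basis by plain Gram-Schmidt on the raw blocks, X_0 = 0 so that R_0 = B, dense solves, names ours):

```python
import numpy as np

def shifted_solves(A, B, shifts, sigmas):
    """Approximate X(sigma) = (A - sigma I)^{-1} B for many sigma from one
    extended-rational subspace, via the reduced systems (22)."""
    n, p = B.shape
    m = len(shifts)
    frob = lambda X, Y: np.sum(X * Y)
    # raw blocks of the subspace built from R_0 = B
    raw = [B]
    for k in range(1, m):
        raw.append(A @ raw[-1])
    W = B.copy()
    for s in shifts:
        W = np.linalg.solve(A - s*np.eye(n), W)
        raw.append(W)
    basis = []
    for W in raw:                                  # Frobenius Gram-Schmidt
        for Q in basis:
            W = W - frob(W, Q) * Q
        basis.append(W / np.linalg.norm(W, 'fro'))
    d = len(basis)
    T = np.array([[frob(A @ basis[j], basis[i]) for j in range(d)]
                  for i in range(d)])              # restriction of A, eq. (8)
    beta = np.linalg.norm(B, 'fro')                # ||R_0||_F
    sols = {}
    for sigma in sigmas:
        # (T_2m - sigma I) Y = ||R_0||_F e_1, eq. (22)
        y = np.linalg.solve(T - sigma*np.eye(d), beta*np.eye(d)[:, 0])
        sols[sigma] = sum(y[i]*basis[i] for i in range(d))
    return sols
```

By the exactness result above, whenever σ coincides with one of the shifts s_i, the reduced solve reproduces the exact solution of (19).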

This equality is equivalent to (15) when f corresponds to the resolvent function. In order to find a good choice of the shift parameters {s_1,…,s_m} in Algorithm 1, we consider the following function

 f_{2m,m}(λ,s) = f_{λ_1,…,λ_{2m},s_1,…,s_m}(λ,s) = f_{m,m}(λ,s) − (1/(λ−s))[g_{2m}(λ)/g_{2m}(s) − g_m(λ)/g_m(s)], (23)

where f_{m,m} corresponds to the so-called skeleton approximation of 1/(λ − s) introduced by Tyrtyshnikov Tyrtyshnikov, i.e.,

 f_{m,m}(λ,s) := [1/(λ−s_1), …, 1/(λ−s_m)] M^{-1} [1/(λ_1−s), …, 1/(λ_m−s)]^T,  M_{i,j} = 1/(λ_i−s_j),

where λ_1,…,λ_{2m} are the eigenvalues of T_{2m}, and

 g_m(z) = ((z−λ_1)⋯(z−λ_m))/((z−s_1)⋯(z−s_m)),  g_{2m}(z) = ((z−λ_1)⋯(z−λ_{2m}))/((z−s_1)⋯(z−s_m)).
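The skeleton approximation f_{m,m} can be evaluated directly from its definition; the sketch below (function name ours) also makes its interpolation property easy to verify: at λ = λ_i the row vector is the i-th row of M, so f_{m,m}(λ_i, s) = 1/(λ_i − s), and symmetrically at s = s_j.

```python
import numpy as np

def skeleton(lams, shifts, lam, s):
    """Skeleton approximation f_{m,m}(lam, s) of 1/(lam - s) built from the
    nodes lams = [lam_1,...,lam_m] and shifts = [s_1,...,s_m] (Tyrtyshnikov)."""
    M = 1.0 / np.subtract.outer(lams, shifts)   # M[i, j] = 1/(lam_i - s_j)
    row = 1.0 / (lam - np.asarray(shifts))      # [1/(lam - s_j)]_j
    col = 1.0 / (np.asarray(lams) - s)          # [1/(lam_i - s)]_i
    return row @ np.linalg.solve(M, col)
```

For example, with nodes {1, 2, 3} and shifts {−1, −2, −3}, evaluating at λ = 1 returns exactly 1/(1 − s) for any s away from the nodes.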
Proposition 5

The function f_{2m,m} defined in (23) is a rational function of the first variable λ and a rational function of the second variable s, interpolating 1/(λ − s) at λ ∈ {λ_1,…,λ_{2m}} and s ∈ {s_1,…,s_m}. Moreover, the relative error of this interpolation is

 ϵ(λ,s) = 1 − (λ−s)f_{2m,m}(λ,s) = g_{2m}(λ)/g_{2m}(s).
Proof

The rational function f_{2m,m} can be expressed as

 f_{2m,m}(λ,s) = f_{m,m}(λ,s) − (1/(λ−s)) ψ(λ,s)/[(λ−s_1)⋯(λ−s_m)(s−λ_1)⋯(s−λ_{2m})],

where

 ψ(λ,s) = (λ−λ_1)⋯(λ−λ_m)(s−s_1)⋯(s−s_m)[(λ−λ_{m+1})⋯(λ−λ_{2m}) − (s−λ_{m+1})⋯(s−λ_{2m})],

Observe that ψ(λ,s) is divisible by (λ − s); then there exists a function h such that ψ(λ,s) = (λ − s)h(λ,s). Moreover, h is a polynomial function in each variable. Then f_{2m,m} simplifies to