The global extended-rational Arnoldi method for matrix function approximation

by A. H. Bentbib et al.

The numerical computation of matrix functions such as f(A)V, where A is an n × n large and sparse square matrix, V is an n × p block with p ≪ n and f is a nonlinear matrix function, arises in various applications such as network analysis (f(t) = exp(t) or f(t) = t^3), machine learning (f(t) = log(t)), the theory of quantum chromodynamics (f(t) = t^{1/2}), electronic structure computation, and others. In this work, we propose the use of the global extended-rational Arnoldi method for computing approximations of such expressions. The derived method projects the initial problem onto a global extended-rational Krylov subspace RK^e_m(A, V) = span{∏_{i=1}^{m}(A − s_iI_n)^{−1}V, …, (A − s_1I_n)^{−1}V, V, AV, …, A^{m−1}V} of low dimension. An adaptive procedure for the selection of the shift parameters {s_1, …, s_m} is given. The proposed method is also applied to solve parameter-dependent systems. Numerical examples are presented to show the performance of the global extended-rational Arnoldi method for these problems.






1 Introduction

Let A ∈ R^{n×n} be a large and sparse matrix, and let V ∈ R^{n×p} with p ≪ n. We are interested in approximating numerically expressions of the form

f(A)V,   (1)

where f is a function that is defined on the convex hull of the spectrum of A. The superscript T denotes transposition. The need to evaluate matrix functions of the form (1) arises in various applications such as network analysis ESTRADA, machine learning NGO, electronic structure computation BARONI; SAAD and the solution of ill-posed problems FENU; HANSEN. When the matrix A is of small to medium size, the matrix function can be determined via the spectral factorization of A; see Higham; HANSEN for discussions of several possible definitions of matrix functions. In many applications, the matrix A is so large that it is impractical to evaluate its spectral factorization. For this case, several projection methods have been developed. These methods project the problem (1) onto a Krylov subspace of small dimension. The restriction of A to this subspace is then used to evaluate the matrix function, by determining the spectral factorization of the small projected matrix, which yields an approximation of (1). In the context of approximating the action of a matrix function f(A) on some vector v, several polynomial methods Beckermann; DruskinTwo; SAAD1; Hochbruck based on the standard Arnoldi and Lanczos Krylov methods have been proposed. Druskin and Knizhnerman DruskinExtended have shown that when f cannot be approximated accurately by polynomials on the spectrum of A, then f(A)v cannot be approximated accurately by these classical methods. They proposed the extended Krylov method for the symmetric case, and the process was generalized to nonsymmetric matrices by Simoncini in SimonciniNew. This method was applied to approximate the solutions of Sylvester, Riccati and Lyapunov equations Agougil; Heyouni; SimonciniNew.

Another technique for the evaluation of matrix functions is the rational Arnoldi method. This process was first proposed by Ruhe Ruhe in the context of computing eigenvalues and has been used in recent years for the approximation of matrix functions; see Guttel; Pranic; Druskin; DruskinSim; DruskinKnizhnerman; Knizhnerman. In this paper, we present the global extended-rational Arnoldi method to approximate the matrix function (1). The extended-rational Arnoldi method was proposed and applied to model reduction in Abidi1. As mentioned in Abidi1, the extended-rational Krylov subspace (3) is richer than the rational Krylov subspace and represents a generalization of the extended Krylov subspace. We propose an adaptive computation of the shifts to generate an F-orthonormal basis for (3) in the case where A is a definite matrix. This procedure is based on a generalization of the procedure used in Druskin. In addition, we apply the proposed method to solve parameter-dependent systems (19) with multiple right-hand sides Gu; SIMONCINI. These parameterized systems have numerous applications in control theory, structural dynamics and time-dependent PDEs; see Feriani.

This paper is organized as follows. In Section 2, we give some preliminaries and then introduce the global extended-rational Arnoldi process together with some of its properties. Section 3 describes the application of this process to the approximation of the matrix function given in (1) and to the solution of the parameterized systems; we also propose an adaptive computation of the shifts. Finally, some numerical experiments that illustrate the quality of the computed approximations are presented in Section 4.

2 The global extended-rational Arnoldi method

2.1 Preliminaries and notations

We begin by recalling some notations and definitions that will be used throughout this paper. The Kronecker product A ⊗ B = [a_{ij}B] satisfies the following properties

  1. (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD), whenever the products AC and BD are defined.

  2. (A ⊗ B)^T = A^T ⊗ B^T.

Definition 1

BOUYOULI Partition the matrices V ∈ R^{n×sp} and W ∈ R^{n×ℓp} into block columns V = [V_1, …, V_s] and W = [W_1, …, W_ℓ], with V_i, W_j ∈ R^{n×p}, and define the ⋄-product of the matrices V and W as

V^T ⋄ W = [tr(V_i^T W_j)]_{1 ≤ i ≤ s, 1 ≤ j ≤ ℓ} ∈ R^{s×ℓ},

i.e., the (i, j) entry of V^T ⋄ W is the Frobenius inner product ⟨V_i, W_j⟩_F of the corresponding blocks.
The following proposition gives some properties satisfied by the above product.

Proposition 1

BellalijJ ; BOUYOULI Let A, B, C ∈ R^{n×sp}, M ∈ R^{n×n}, L ∈ R^{s×s}, and λ ∈ R. Then we have,

  1. (A^T ⋄ B)^T = B^T ⋄ A.

  2. A^T ⋄ (B + C) = A^T ⋄ B + A^T ⋄ C.

  3. A^T ⋄ (λB) = λ(A^T ⋄ B).

  4. (MA)^T ⋄ B = A^T ⋄ (M^T B).

  5. A^T ⋄ (B(L ⊗ I_p)) = (A^T ⋄ B)L.

  6. tr(A^T ⋄ A) = ‖A‖_F².
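To make the definition concrete, here is a small NumPy sketch of the ⋄-product (the function name `diamond`, the explicit loops, and the test matrices are illustrative choices, not from the paper): entry (i, j) is the Frobenius inner product tr(V_i^T W_j) of the i-th block of V and the j-th block of W.

```python
import numpy as np

def diamond(V, W, p):
    """Diamond-product: the (i, j) entry is the Frobenius inner product
    tr(V_i^T W_j) of the i-th block of V and the j-th block of W,
    where each block has p columns."""
    s, ell = V.shape[1] // p, W.shape[1] // p
    M = np.empty((s, ell))
    for i in range(s):
        for j in range(ell):
            M[i, j] = np.trace(V[:, i*p:(i+1)*p].T @ W[:, j*p:(j+1)*p])
    return M
```

For p = 1 the ⋄-product reduces to the ordinary product V^T W, and the identity (V^T ⋄ W)^T = W^T ⋄ V above is easy to verify numerically.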

2.2 Description of the process

Global Krylov subspace techniques were first proposed in Jbilou for solving linear systems of equations with multiple right-hand sides and also for large-scale Lyapunov matrix equations. The global extended-rational Krylov subspace was first introduced in Abidi1 and is defined as the subspace of R^{n×p} spanned by the vectors (blocks)

∏_{i=1}^{m}(A − s_iI_n)^{−1}V, …, (A − s_1I_n)^{−1}V, V, AV, …, A^{m−1}V.

This subspace is denoted by

RK^e_m(A, V),   (3)

where s_1, …, s_m are some selected complex parameters, all distinct from the eigenvalues of A. We notice here that the subspace RK^e_m(A, V) is a subspace of R^{n×p}, so its elements are blocks of size n × p. We assume that all the vectors (blocks) in (3) are linearly independent. Now, we describe the global extended-rational Arnoldi process that generates an F-orthonormal basis {V_1, …, V_{2m}} for the global extended-rational Krylov subspace RK^e_m(A, V), and we derive some algebraic relations related to this process. The block vectors V_i are said to be F-orthonormal (with respect to the Frobenius inner product) if

⟨V_i, V_j⟩_F = tr(V_i^T V_j) = δ_{ij},   1 ≤ i, j ≤ 2m.
Based on the Gram-Schmidt orthogonalization process, the first two blocks V_1 and V_2 are computed via the formulas

V_1 = V/‖V‖_F,   V_2 = V̂_2/‖V̂_2‖_F,   (4)

where V̂_2 = (A − s_1I_n)^{−1}V_1 − ⟨V_1, (A − s_1I_n)^{−1}V_1⟩_F V_1. To compute the block vectors V_{2k+1} and V_{2k+2}, for k = 1, …, m − 1, we use the following formulas

V̂_{2k+1} = AV_{2k−1} − ∑_{j=1}^{2k} h_{j,2k−1}V_j,   V̂_{2k+2} = (A − s_{k+1}I_n)^{−1}V_{2k} − ∑_{j=1}^{2k+1} h_{j,2k}V_j,   (5)

where the coefficients h_{j,2k−1} and h_{j,2k} are determined so that the vectors satisfy the F-orthogonality condition

⟨V̂_{2k+1}, V_j⟩_F = 0 and ⟨V̂_{2k+2}, V_j⟩_F = 0 for the previous indices j.

Thus, the coefficients h_{j,2k−1} and h_{j,2k} are written as

h_{j,2k−1} = ⟨V_j, AV_{2k−1}⟩_F,   h_{j,2k} = ⟨V_j, (A − s_{k+1}I_n)^{−1}V_{2k}⟩_F.   (6)

The normalization coefficients h_{2k+1,2k−1} and h_{2k+2,2k} are chosen such that ‖V_{2k+1}‖_F = 1 and ‖V_{2k+2}‖_F = 1, respectively. Hence,

V_{2k+1} = V̂_{2k+1}/h_{2k+1,2k−1},   V_{2k+2} = V̂_{2k+2}/h_{2k+2,2k},   with h_{2k+1,2k−1} = ‖V̂_{2k+1}‖_F and h_{2k+2,2k} = ‖V̂_{2k+2}‖_F.
The global extended-rational Arnoldi algorithm is summarized in the following algorithm (Algorithm 1).

Inputs: Matrix A, initial block V, number of steps m, and the shifts {s_1, …, s_m}.

  1. Set V_1 = V/‖V‖_F; compute V_2 by F-orthonormalizing (A − s_1I_n)^{−1}V_1 against V_1.

  2. For k = 1, …, m − 1

    1. U = AV_{2k−1} and W = (A − s_{k+1}I_n)^{−1}V_{2k}.

    2. For j = 1, …, 2k: U ← U − ⟨V_j, U⟩_F V_j; endfor

    3. Set V_{2k+1} = U/‖U‖_F. For j = 1, …, 2k + 1: W ← W − ⟨V_j, W⟩_F V_j; endfor

    4. Set V_{2k+2} = W/‖W‖_F; endfor

Output: F-orthonormal basis {V_1, …, V_{2m}} of RK^e_m(A, V).

Algorithm 1 The global extended-rational Arnoldi algorithm (GERA)
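The process can be sketched in a few lines of NumPy. This is a simplified illustration rather than the paper's exact Algorithm 1: both candidate directions within one shift iteration are generated from the most recent block, the shifted systems are solved with a dense `np.linalg.solve`, and two modified Gram-Schmidt sweeps are used for numerical stability; the function name `gera_basis` is an assumption.

```python
import numpy as np

def gera_basis(A, V, shifts):
    """Simplified sketch of the global extended-rational Arnoldi process:
    starting from V_1 = V / ||V||_F, candidate blocks are formed as
    A * (last block) and (A - s_k I)^(-1) * (last block), then
    F-orthonormalized against all previous blocks with the Frobenius
    inner product (global modified Gram-Schmidt, two sweeps)."""
    n = A.shape[0]
    blocks = [V / np.linalg.norm(V, 'fro')]
    for s in shifts:
        # Both candidates use the most recent block at the start of
        # this shift iteration (the tuple is evaluated before the loop).
        for cand in (A @ blocks[-1],
                     np.linalg.solve(A - s * np.eye(n), blocks[-1])):
            W = cand.copy()
            for _ in range(2):           # reorthogonalization for stability
                for U in blocks:
                    W = W - np.trace(U.T @ W) * U
            nrm = np.linalg.norm(W, 'fro')
            if nrm > 1e-12:              # skip numerically dependent blocks
                blocks.append(W / nrm)
    return blocks
```

The returned blocks are pairwise F-orthonormal: the Gram matrix of Frobenius inner products is the identity up to rounding.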

The set of shifts {s_1, …, s_m} is chosen a priori or adaptively during the algorithm. The selection of shifts will be explained later. If all the normalization coefficients h_{2k+1,2k−1} and h_{2k+2,2k} are numerically nonzero, then Algorithm 1 determines an F-orthonormal basis {V_1, …, V_{2m}} of the global extended-rational Krylov subspace RK^e_m(A, V). Algorithm 1 also constructs an upper block Hessenberg matrix H_m collecting the recursion coefficients. We now introduce the matrix T_m given by

T_m = V_m^T ⋄ (AV_m) ∈ R^{2m×2m},   (8)

where V_m = [V_1, …, V_{2m}]. T_m is the restriction of the matrix A to the extended-rational Krylov subspace RK^e_m(A, V).

Proposition 2

Let T_m = V_m^T ⋄ (AV_m), and let V_m = [V_1, …, V_{2m}]. The F-orthonormal basis determined by the recursion formulas (5) satisfies, for k = 1, …, m − 1,


We notice that in the formulas given by (9), T_m is a block upper Hessenberg matrix with 2 × 2 blocks, since its entries t_{i,j} vanish for i > j + 2.

Proposition 3

Assume that m steps of Algorithm 1 have been run and let V_{m+1} = [V_1, …, V_{2m+2}]; then we have the following relations


where the matrix E_m ∈ R^{2m×2} corresponds to the last two columns of the identity matrix I_{2m}.

According to (9), the blocks of AV_m belong to the subspace spanned by the blocks of V_{m+1}, so there exists a matrix G_m such that

AV_m = V_{m+1}(G_m ⊗ I_p).

Using properties of the ⋄-product, we obtain

V_{m+1}^T ⋄ (AV_m) = (V_{m+1}^T ⋄ V_{m+1})G_m = G_m.

It follows that G_m = V_{m+1}^T ⋄ (AV_m). Since G_m is an upper block Hessenberg matrix with 2 × 2 blocks, it can be decomposed into its leading 2m × 2m part T_m and the two trailing rows associated with V_{2m+1} and V_{2m+2}, which completes the proof.

The next proposition gives the entries of T_m in terms of the recursion coefficients. This will allow us to compute these entries quite efficiently.

Proposition 4

Let T_m = [t_{i,j}] and H_m = [h_{i,j}] be the upper block Hessenberg matrices, where t_j and h_j are the j-th columns of T_m and H_m, respectively. Then the odd columns are such that


The even columns satisfy the following relations

and for

where e_j corresponds to the j-th column vector of the canonical basis, and the coefficients h_{i,j} are defined from (4).


We have T_m = V_m^T ⋄ (AV_m). Therefore, (11) follows from the expression of the coefficients in (6). Using (4), we obtain

Multiplying this last equality from the left gives

Then the vector can be written as follows

The relation (12) is obtained by multiplying by V_m^T with the ⋄-product from the left, since V_m^T ⋄ V_m = I_{2m}. The formula (13) is obtained from the expression of the even-indexed blocks in (5). Thus, multiplying the second equality in (5) from the left gives

The expression (13) then follows easily by multiplying by V_m^T with the ⋄-product from the left. This concludes the proof of the proposition.

We notice that if A is a symmetric matrix, then the restriction matrix T_m in (8) reduces to a symmetric pentadiagonal matrix with the following nontrivial entries,

and for

3 Application to the approximation of matrix functions

In this section, we will show how to use the global extended-rational Arnoldi algorithm to approximate expressions of the form f(A)V. As in Jbilou, the global extended-rational Krylov subspace defined in (3) can be written as

RK^e_m(A, V) = {V_m(y ⊗ I_p) : y ∈ R^{2m}}.

Then the expression f(A)V can be approximated by

f_m := V_m(βf(T_m)e_1 ⊗ I_p),   β = ‖V‖_F,   (15)

where f_m = V_m(y ⊗ I_p) for some y ∈ R^{2m}.
The matrix V_m = [V_1, …, V_{2m}] corresponds to the F-orthonormal basis for RK^e_m(A, V) constructed by applying m steps of Algorithm 1 to the pair (A, V), T_m is the projection matrix defined by (8), and e_1 corresponds to the first column of the identity matrix I_{2m}.
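Putting the pieces together, the approximation (15) can be sketched end to end, under the simplifying assumptions that A is symmetric (so that f(T_m) can be evaluated through an eigendecomposition of the small projected matrix) and that the basis is generated with a simplified block ordering; all names are illustrative, not from the paper.

```python
import numpy as np

def approx_fAV(A, V, shifts, f):
    """Sketch of approximation (15): f(A)V ~ sum_i y_i U_i with
    y = beta * f(T_m) e_1 and beta = ||V||_F, where the U_i form an
    F-orthonormal basis of the extended-rational subspace and
    T_m[i, j] = tr(U_i^T A U_j).  A is assumed symmetric so that
    f(T_m) can be evaluated via np.linalg.eigh."""
    n = A.shape[0]
    beta = np.linalg.norm(V, 'fro')
    blocks = [V / beta]
    for s in shifts:  # alternate A-directions and shifted-inverse directions
        for cand in (A @ blocks[-1],
                     np.linalg.solve(A - s * np.eye(n), blocks[-1])):
            W = cand.copy()
            for _ in range(2):           # two MGS sweeps for stability
                for U in blocks:
                    W = W - np.trace(U.T @ W) * U
            nrm = np.linalg.norm(W, 'fro')
            if nrm > 1e-12:
                blocks.append(W / nrm)
    # Projected matrix T_m and the expansion coefficients y = beta*f(T_m)e_1.
    T = np.array([[np.trace(Ui.T @ (A @ Uj)) for Uj in blocks]
                  for Ui in blocks])
    lam, Q = np.linalg.eigh((T + T.T) / 2)   # T is symmetric up to rounding
    y = beta * (Q @ (f(lam) * Q[0]))          # Q f(Lambda) Q^T e_1, scaled
    return sum(yi * Ui for yi, Ui in zip(y, blocks))
```

When the generated basis spans the whole space, the computed result coincides with f(A)V, consistent with the exactness result of Lemma 1 below; for large n the interest is of course in bases of much smaller dimension.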

Lemma 1 (Exactness)

Let V_m be the matrix generated by Algorithm 1 and let T_m be the matrix defined by (8). Then for any rational function r = p_{2m−1}/q_m we have

r(A)V = V_m(βr(T_m)e_1 ⊗ I_p).   (17)

In particular, if f = p_{2m−1}/q_m then the global extended-rational Arnoldi approximation is exact, i.e., we have

f_m = f(A)V,

where p_{2m−1} ∈ Π_{2m−1}, Π_k denotes the set of polynomials of degree at most k, and q_m is the polynomial of degree m whose roots are the components of {s_1, …, s_m}, i.e., q_m(t) = ∏_{i=1}^{m}(t − s_i).


Following the idea in (Guttel, Lemma 3.1), we consider r = p_{2m−1}/q_m and we first show by induction that


Assertion (18) is obviously true in the base case. Assume that it is true for some index. Then, by the definition of an extended-rational Krylov subspace, we have, and therefore

Using the properties of the -product, we obtain

which establishes (18). By linearity, we obtain

Using the properties of the -product, we obtain

Replacing in (18) completes the proof.

We consider a convex compact set E and we define A(E) as the set of analytic functions in a neighborhood of E, equipped with the uniform norm ‖f‖_E = max_{z∈E}|f(z)|; co(·) will denote the convex hull. In CROUZEIX, it was shown that there exists a universal constant C such that

‖f(A)‖ ≤ C‖f‖_{W(A)},

where W(A) is the numerical range of A, and we set W = co(W(A) ∪ W(T_m)). Based on this inequality, the following result gives an upper bound for the norm of the error f(A)V − f_m, where f_m is the approximation given by (15).

Corollary 1

We assume that f ∈ A(W). Then the global extended-rational Arnoldi approximation f_m defined by (15) satisfies


According to (17), we have r(A)V = V_m(βr(T_m)e_1 ⊗ I_p) for every rational function r = p_{2m−1}/q_m. Thus,

which completes the proof.

3.1 Shifted linear systems

We consider the solution of the parameterized nonsingular linear systems with multiple right-hand sides

(A − σI_n)X(σ) = B,   σ ∈ Σ,   (19)

which need to be solved for many values of the parameter σ, where B ∈ R^{n×p}. The solution may be written as X(σ) = f_σ(A)B, where f_σ(t) = (t − σ)^{−1} is the resolvent function. Then the approximate solutions generated by the global extended-rational algorithm applied to the pair (A, B) are obtained as follows

where R_0(σ) denotes the residual block vectors associated with the initial guess X_0(σ). By (14), the correction is determined such that the new residual associated with X_m(σ) is F-orthogonal to RK^e_m(A, B). This yields


Using the relations (20) and the following decomposition


the reduced linear system can be written as


then the approximate solution will be

This equality is equivalent to (15) when f corresponds to the resolvent function. In order to find a good choice of the shift parameters in Algorithm 1, we consider the following function


which corresponds to the so-called skeleton approximation introduced by Tyrtyshnikov Tyrtyshnikov, i.e.,

λ_1, …, λ_n are the eigenvalues of A, and

Proposition 5

The function defined in (23) is a rational function of the first variable σ, and a rational function of the second variable, interpolating at the shifts and at the eigenvalues. Moreover, the relative error of this interpolation is


The rational function can be expressed as


Observe that the former is divisible by the latter, so there exists a function such that the corresponding factorization holds. Moreover, this function is a polynomial in each variable. Then the expression simplifies to