
A class of GADI methods for time-dependent linear systems with multitask kernel-learning parameter prediction

This paper develops a class of general alternating-direction implicit (GADI) iteration methods for solving time-dependent linear systems (TDLS), including linear differential systems and linear matrix systems. We present a GADI Kronecker product splitting (GADI-KP) method and prove its convergence under weak restrictions. The generalized Kronecker product splitting method and the Kronecker product splitting method can be unified in the GADI-KP framework. We then use the framework to design an effective preconditioner for Krylov subspace methods for solving TDLS. The GADI-KP method is sensitive to the splitting parameters. Different from traditional theoretical estimation methods, we propose a multitask kernel-learning Gaussian process regression (GPR) method to predict the relative optimal splitting parameters. This method addresses both the multi-parameter optimization in the GADI framework and the kernel selection in the GPR method. Finally, we apply our approach to solve a two-dimensional diffusion equation, a two-dimensional convection-diffusion equation, and a differential Sylvester matrix equation. Numerical experiments illustrate that the GADI-KP framework and its preconditioned form are more efficient than existing methods.



1 Introduction

1.1 Background

In this paper, we consider the time-dependent linear system (TDLS) of the form


where , is or (), is an initial value, and is a linear operator. TDLS appears in many branches of science and engineering, such as dynamical systems, quantum mechanics, linear Hamiltonian dynamics, linear kinetic equations, semi-discretizations of partial differential equations, and differential matrix equations [18, 13, 28, 24, 19, 2]. Among them, we focus on linear differential systems and linear matrix systems. If is a differential operator on space, TDLS becomes the linear differential system


where with is a bounded and open domain. is the initial condition and is the source term. The differential matrix equation, such as the Sylvester matrix equation, can take the following form


where for each , are large and sparse, and are full rank matrices with . The initial condition is , where .

The TDLS Eq. 1 can be discretized by appropriate approaches, such as linear multistep schemes, Runge-Kutta methods, general linear methods, block implicit methods, and boundary value methods (BVMs) [31, 17, 10, 16, 22, 8, 7]. After time discretization, TDLS Eq. 1 becomes a linear algebraic equation, , where is a block sparse matrix. This linear system is usually large, and solving it directly may be expensive. Alternatively, iteration methods become a better choice. Alternating direction implicit (ADI) methods are a class of useful iteration schemes, which were initially designed to solve partial differential equations [25, 15]. Afterward, the idea of ADI methods has been extended to other branches, such as numerical algebra and optimization [29, 23]. Among these ADI methods, a large class of Hermitian and skew-Hermitian splitting (HSS) type methods have been developed to solve linear systems [3, 4, 30, 5]. More recently, a general ADI (GADI) framework has been presented to put most existing ADI methods into a unified framework [21]. In this paper, we are concerned with the development of GADI methods for TDLS.

From the theoretical viewpoint, the convergence of existing GADI methods requires that one splitting matrix is positive definite and the other is positive semi-definite [21]. Similarly, the convergence of some HSS methods requires that the system matrix is positive definite [3, 4]. For TDLS, these assumptions may be difficult to meet. Therefore, it is necessary to design new GADI methods with more reasonable convergence assumptions.

Further, as splitting schemes, ADI methods require splitting the matrix into different parts with splitting parameters, and their convergence and efficiency are very sensitive to these parameters. Therefore, choosing the optimal splitting parameters is critical. A common approach is to use theoretical analysis to estimate the bound of the splitting parameters in a case-by-case way [3, 11]. However, in practice, this approach has certain limitations for large-scale systems, and its efficiency heavily depends on the theoretical bound. Recently, Ref. [21] presented a data-driven approach, the Gaussian process regression (GPR) method with an appropriately chosen kernel, to predict relatively optimal splitting parameters. The GPR method has many advantages, such as ease of implementation, high precision, and high generalization capability. It can efficiently predict one splitting parameter at a time [21]. However, there may be two or more splitting parameters in GADI methods, and these parameters may have complicated connections. Independently predicting each splitting parameter would inevitably affect the accuracy of the original GPR method. Therefore, the GPR method needs to be improved to predict multiple parameters simultaneously. Another critical component that determines the GPR method's applicability is the kernel function [33, 35]. The original GPR method chooses the kernel function based on the problem's properties [21], which might introduce artificial error. Meanwhile, a kernel chosen for one kind of problem may be difficult to extend to other problems. Therefore, how to automatically learn kernel functions from problems remains a challenge in predicting splitting parameters.

1.2 Our work

In this work, we mainly focus on developing new and efficient GADI methods to solve TDLS Eq. 1. Our contributions are summarized as follows:

  • Inspired by the idea of GADI framework [21] and Kronecker product splitting iteration method [11, 12], we propose a GADI Kronecker product splitting (GADI-KP) method for solving TDLS Eq. 1. We design a fast algorithm to implement the GADI-KP method using the properties of Kronecker product.

  • We prove that the GADI-KP method converges to the unique solution when all eigenvalues of one splitting matrix have positive real parts and those of another have nonnegative real parts, which does not require the positive definiteness.

  • We present a multitask kernel learning GPR method to predict multiple splitting parameters simultaneously and to automatically learn the kernel function.

  • Based on the GADI-KP method, we provide an effective preconditioner to accelerate Krylov subspace methods in solving TDLS Eq. 1. The spectrum of the preconditioned matrix is tightly clustered.

  • We apply our developed methods to some standard TDLSs, including two-dimensional (2D) diffusion and convection-diffusion equations, and the differential Sylvester matrix equation. Numerical results demonstrate the efficiency of our methods.

1.3 Organization

The rest of this paper is organized as follows. In Section 2, we describe the discretization of TDLS by BVMs in detail, and give two concrete applications: linear differential systems and linear matrix systems. In Section 3, we propose the GADI-KP method for TDLS and prove its convergence. We present a preconditioning strategy based on the GADI-KP method to speed up Krylov subspace methods. Additionally, to improve the GPR method, we propose the multitask kernel learning GPR method. In Section 4, we show the efficiency and superiority of our methods through solid numerical experiments. In Section 5, we summarize the paper and discuss future work.

2 Discretization of TDLS

In this section, we describe the discretization of TDLS Eq. 1 by BVM, and apply it to linear differential systems and linear matrix systems.

2.1 Notation

Throughout the paper, the sets of complex and real matrices are denoted by and , respectively. If , let and denote the transpose and inverse of . and denote the spectral set and spectral radius of . is an -order identity matrix. represents the -norm of a vector. Let , the Kronecker product of and is defined by

We recall some properties of Kronecker products. For more context on the Kronecker product, we refer to [20].

Lemma 1

Let , , then

Here, the ‘vec’ operator transforms matrices into vectors by stacking columns.
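One standard property typically collected in such a lemma is the identity vec(AXB) = (Bᵀ ⊗ A)·vec(X), with vec stacking columns. A minimal NumPy check of this identity (random matrices, illustrative sizes):

```python
import numpy as np

# Check vec(A X B) = (B^T kron A) vec(X), where vec stacks columns.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
X = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 4))

def vec(M):
    """Stack the columns of M into one long vector (column-major order)."""
    return M.flatten(order="F")

lhs = vec(A @ X @ B)
rhs = np.kron(B.T, A) @ vec(X)
assert np.allclose(lhs, rhs)
```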

2.2 Temporal Discretization

We give a brief description of BVM [8, 9]. For (), the -step BVM is


where is the step size, , , and are parameters. The extra initial and final equations are

where the coefficients and are chosen such that the truncation errors over all nodes are consistent.

By applying the BVM, the TDLS Eq. 1 is discretized as


where , , , the matrices have the following structures

2.3 Applications

In this work, we consider two types of TDLSs, the linear differential system Eq. 2, and the linear matrix system Eq. 3.

2.3.1 Linear differential system

By appropriate spatial discretization, such as finite difference and finite element methods, we can obtain the ordinary differential equation (ODE) system


where contains the approximate values of over the spatial grid nodes, is the degree of freedom of the spatial discretization, and and are similar notations corresponding to and . are the mass and stiffness matrices, respectively. Then, applying the BVM to Eq. 7, we arrive at the linear system


where , , .
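The precise BVM coefficients determine the blocks of this system. As a simpler illustration of the same all-at-once Kronecker structure, the sketch below assembles the backward Euler analogue for u' = Lu; the operator `L_op`, step size `h`, and step count `s` are illustrative choices, not taken from the paper:

```python
import numpy as np

# All-at-once backward Euler for u' = L u, u(0) = u0, over s steps:
#   (B (x) I_n - h I_s (x) L) u = e_1 (x) u0,
# where B is the backward-difference matrix. L_op, h, s are illustrative.
n, s, h = 4, 6, 0.1
L_op = -np.diag(np.arange(1.0, n + 1))   # a stable diagonal operator
u0 = np.ones(n)

B = np.eye(s) - np.eye(s, k=-1)          # 1 on diagonal, -1 on subdiagonal
A = np.kron(B, np.eye(n)) - h * np.kron(np.eye(s), L_op)
rhs = np.zeros(n * s)
rhs[:n] = u0                             # only the first block row sees u0

u = np.linalg.solve(A, rhs)

# Cross-check: the last time slice matches sequential backward Euler.
u_step = u0.copy()
for _ in range(s):
    u_step = np.linalg.solve(np.eye(n) - h * L_op, u_step)
assert np.allclose(u[-n:], u_step)
```

Solving the whole space-time system at once is what makes splitting methods such as GADI-KP attractive: the matrix is a Kronecker sum of a small time-stencil matrix and the spatial operator.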

2.3.2 Linear matrix system

Now we consider the time-dependent linear matrix system Eq. 3. The equivalent ODE system of Eq. 3 is


where is an identity matrix, , and . Then we can also apply the BVM to this ODE system to obtain a linear system similar to Eq. 8.

3 Our methods

In this section, we propose the GADI-KP method, a new splitting iteration method, to solve TDLS Eq. 1. Compared with existing ADI methods, this method converges to the unique solution under a mild condition. More significantly, to improve the GPR method for multi-parameter prediction, we introduce the multitask kernel learning GPR method. We also present a preconditioning strategy based on the GADI-KP method to accelerate Krylov subspace methods.

3.1 GADI-KP method

In this subsection, we propose the GADI-KP method and give a fast algorithm for it using the properties of the Kronecker product. The generalized Kronecker product splitting (GKPS) method [11, 12] is


Inspired by the GADI framework [21], we add a viscosity term to the right side of Eq. 10b

Then we obtain the GADI-KP scheme


where , and the splitting parameters and . Notice that when , the GADI-KP method reduces to the GKPS scheme [12]; when and , it becomes the Kronecker product splitting (KPS) approach [11].

Eliminating the intermediate vector , we can write the GADI-KP method in a fixed point form

where the iteration matrix of the GADI-KP method is




It is easy to see that there is a unique splitting


Notice that


Therefore, the GADI-KP method Eq. 11 reduces to

Using the properties of Kronecker product, we can obtain a fast implementation of the GADI-KP method (see Algorithm 1).

0:  Initial guess , splitting parameters and ;
1:  for  until converges do
2:     ;
3:     ;
4:     for  do
5:         (use GMRES);
6:     end for
7:     ;
8:     .
9:  end for
Algorithm 1 GADI-KP method for solving linear system Eq. 8.

For each iteration of the GADI-KP method, the computational complexity of a direct solver is . Based on Algorithm 1, the computational complexity can be reduced to .
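Algorithm 1 follows the classical two-sweep alternating pattern: each outer iteration solves two shifted systems, one per splitting factor. A minimal sketch of that pattern in its classical Peaceman-Rachford form with a single shift alpha (not the exact GADI-KP updates of Eq. 11, and using a dense toy splitting rather than the Kronecker factors):

```python
import numpy as np

def adi_two_sweep(M, N, b, alpha, x0, tol=1e-10, max_iter=500):
    """Classical two-sweep ADI for (M + N) x = b (Peaceman-Rachford form).

    Each sweep solves a shifted system with one splitting factor while
    treating the other explicitly; GADI-type schemes follow the same
    pattern with an additional relaxation parameter.
    """
    n = len(b)
    I = np.eye(n)
    A = M + N
    x = x0.copy()
    for _ in range(max_iter):
        x_half = np.linalg.solve(alpha * I + M, (alpha * I - N) @ x + b)
        x = np.linalg.solve(alpha * I + N, (alpha * I - M) @ x_half + b)
        if np.linalg.norm(b - A @ x) <= tol * np.linalg.norm(b):
            break
    return x

# Toy SPD test: tridiagonal M plus a scaled identity N.
n = 8
M = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
N = 0.5 * np.eye(n)
b = np.ones(n)
x = adi_two_sweep(M, N, b, alpha=1.0, x0=np.zeros(n))
assert np.linalg.norm((M + N) @ x - b) < 1e-8
```

Each half-step is cheap whenever the shifted factors are easy to solve, which is exactly the situation the Kronecker structure of Eq. 8 provides.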

3.2 Convergence analysis of GADI-KP

In this subsection, we discuss the convergence of the GADI-KP method.


Assume that all eigenvalues of have positive real parts and all eigenvalues of have nonnegative real parts. Then for any , and , the GADI-KP method Eq. 11 converges to the unique solution of the linear system Eq. 8. The spectral radius of the iteration matrix of the GADI-KP method satisfies




Let and be the eigenvalues of the matrices and , respectively. Then by the property of the Kronecker product, the eigenvalues of the iteration matrix Eq. 12 have the following form:

Let and , then

which is equivalent to

where , .

Notice that

Denote , then

Let (), then


If and , we obtain

When , , we have

This implies that the GADI-KP method Eq. 11 converges to the unique solution of linear system Eq. 8.

3.3 Krylov subspace acceleration with GADI-KP preconditioner

In this subsection, we give a preconditioning strategy based on the GADI-KP method to accelerate Krylov subspace methods.

The GADI-KP method can be used as a preconditioner to accelerate Krylov subspace methods such as GMRES [27]. From Eq. 14, the linear system Eq. 8 is equivalent to


where and the matrix is a preconditioner. This equivalent system can be solved by GMRES. Algorithm 2 gives the solution process for Eq. 8 using the matrix as a preconditioner in GMRES. Notice that using the GADI-KP preconditioner within GMRES requires solving a linear system at each iteration. We can use Algorithm 1 to solve this preconditioned linear system economically.
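As a sketch of this preconditioning strategy, the snippet below builds a small Kronecker-sum system and applies a splitting-based preconditioner P = (alpha I + M)(alpha I + N)/(2 alpha) inside SciPy's GMRES. The matrices `B1`, `B2` and the 1/(2 alpha) scaling are illustrative stand-ins for the paper's Eq. 13, not its exact definition:

```python
import numpy as np
from scipy.sparse import eye, kron, diags
from scipy.sparse.linalg import gmres, splu, LinearOperator

# Small Kronecker-sum model system A = I_s (x) B1 + B2 (x) I_n;
# B1 and B2 are illustrative, not the paper's test problems.
n, s = 16, 8
B1 = diags([2.0, -1.0, -1.0], [0, 1, -1], shape=(n, n)).tocsc()
B2 = (0.1 * eye(s)).tocsc()
M = kron(eye(s), B1).tocsc()
N = kron(B2, eye(n)).tocsc()
A = (M + N).tocsc()
b = np.ones(n * s)

# Splitting-based preconditioner P = (alpha I + M)(alpha I + N) / (2 alpha):
# applying P^{-1} means two shifted sparse solves, reused at every iteration.
alpha = 1.0
I_big = eye(n * s, format="csc")
lu1 = splu((alpha * I_big + M).tocsc())
lu2 = splu((alpha * I_big + N).tocsc())
P_inv = LinearOperator(A.shape,
                       matvec=lambda v: 2 * alpha * lu2.solve(lu1.solve(v)))

x, info = gmres(A, b, M=P_inv)
assert info == 0
assert np.linalg.norm(A @ x - b) < 1e-3 * np.linalg.norm(b)
```

Factorizing the two shifted factors once and reusing them inside the Krylov loop is the design choice that makes the preconditioner affordable.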

0:  Initial guess , splitting parameters and ;
1:  Calculate preconditioner according to Eq. 13;
2:  for  until converges do
3:     Calculate and ;
4:     for  do
5:        ;
6:        ;
7:        for  do
8:           ;
9:        end for
10:        ;
11:        ;
12:        ;
13:     end for
14:     , where minimizes .
15:  end for
Algorithm 2 GMRES-GADIKP method for solving linear system Eq. 8.

Assume that all the eigenvalues of have positive real parts, all the eigenvalues of have nonnegative real parts, , and . Since , we can conclude that all the eigenvalues of the preconditioned matrix are located in a circle with radius . Compared with the KPS and GKPS preconditioners (see Figure 1), the GADI-KP preconditioner can further reduce the condition number of the matrix , and the eigenvalue distribution of the preconditioned system has a tighter bound.

(a) Without preconditioner
(b) KPS preconditioner
(c) GKPS preconditioner
(d) GADI-KP preconditioner
Figure 1: The eigenvalue distribution of the matrix with different preconditioners. is the coefficient matrix generated by discretizing 2D diffusion equation with (see Section 4.1 for details).

3.4 Multitask kernel learning GPR

In this subsection, we propose a multitask kernel learning GPR method which improves both simultaneous prediction of multiple splitting parameters and data-driven kernel selection.

3.4.1 Multitask GPR

GADI-type methods have two or more splitting parameters. As noted above, the splitting parameters heavily affect the efficiency of GADI approaches. We now propose a practical multitask GPR method to simultaneously predict these splitting parameters. In the multitask setting, we are interested in learning related functions , from training data , with , and . We consider the following noised model to avoid singularity in numerical computation


where denotes the -th output (input) of the -th task, and is the white noise of the -th task.

Let be the output vector, and latent function be the input vector. The multitask regression problem can be presented as the Gaussian process prior over the latent function

where and are the task and data covariance matrices, respectively [6]. The noised model Eq. 18 becomes

where is a diagonal matrix with . The prediction distribution for the -th task on a new point is


where and are the -th diagonal element and the -th column of , respectively, is a covariance vector between the test point and the training points , and denotes the covariance matrix of all training points. The hyperparameters and appear in both the task and data covariance functions. To estimate the hyperparameters, we apply the L-BFGS method to maximize the marginal likelihood function in logarithmic form
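A minimal single-task analogue of this hyperparameter estimation, with an RBF data kernel only and no task covariance; the length-scale and noise level are optimized by L-BFGS-B through SciPy. This is an illustrative sketch rather than the paper's multitask objective:

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_marginal_likelihood(theta, X, y):
    """Negative log marginal likelihood of a GP with an RBF data kernel.

    theta = (log length-scale, log noise std); a single-task special case
    of the multitask objective, used here only for illustration.
    """
    log_l, log_sigma = theta
    l, sigma = np.exp(log_l), np.exp(log_sigma)
    d2 = (X[:, None] - X[None, :]) ** 2
    K = np.exp(-0.5 * d2 / l**2) + (sigma**2 + 1e-6) * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # 0.5 y^T K^{-1} y + 0.5 log det K + (n/2) log(2 pi)
    return (0.5 * y @ alpha + np.log(np.diag(L)).sum()
            + 0.5 * len(X) * np.log(2 * np.pi))

rng = np.random.default_rng(1)
X = np.linspace(0.0, 1.0, 25)
y = np.sin(4 * X) + 0.05 * rng.standard_normal(25)

theta0 = np.array([0.0, -2.0])
res = minimize(neg_log_marginal_likelihood, theta0, args=(X, y),
               method="L-BFGS-B")
# The optimizer should not do worse than the starting hyperparameters.
assert res.fun <= neg_log_marginal_likelihood(theta0, X, y)
```

In the multitask setting the same likelihood is maximized, but K becomes the Kronecker-structured task-times-data covariance, so the kernel weights and task parameters are fitted jointly.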

3.4.2 Kernel learning approach

Choosing an appropriate kernel function determines whether the GPR method succeeds. The original GPR method selects the kernel function directly from the features of the problem [21]. This might introduce artificial error and can hardly be extended to other problems. For example, Fig. 3 shows that the GPR method with a manually chosen kernel function can produce a poor regression curve. How to choose kernel functions automatically is worth investigating. Here we present a data-driven method to learn the kernel function. We build a kernel library that contains basic kernel functions and their multiplicative combinations. For the -th () training task, the required kernel function is a linear combination of library elements

For the training multitask, the weighted matrix is

All weights can be obtained by training on data from the concrete TDLS. A similar idea can also be found in pattern discovery [34, 32].


In GADI methods, the training data comes from smaller systems [21], so the data set is small. Learning a kernel function from small data with a (deep) neural network is difficult. Therefore, we predetermine the kernel library via a priori knowledge instead of learning the kernel function directly from data.

4 Numerical experiments

In this section, we present three numerical examples: 2D diffusion and convection-diffusion equations, and the differential Sylvester matrix equation, to show the performance of our proposed GADI-KP and GMRES with GADI-KP preconditioner (GMRES-GADIKP) methods. As a comparison, we also give the experimental data of the KPS, GKPS, GMRES, and GMRES with GKPS preconditioner (GMRES-GKPS) methods. All computations are carried out using PyCharm 2020.2 on a Mac laptop with a 2.3 GHz quad-core Intel Core i5. All tests start from the zero vector. All iteration methods are terminated when the relative residual error satisfies , where is the -th residual.

In the multitask kernel learning GPR method, the kernel library contains linear, Gaussian, and periodic kernels, and their multiplicative combinations. Concretely, the three basic kernel functions are

Linear kernel:
Gaussian kernel:
Periodic kernel:
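The exact parameterizations above are not reproduced in this copy; the sketch below uses common textbook forms with unit hyperparameters (linear k(x,z) = xz, Gaussian exp(-(x-z)²/2), periodic exp(-2 sin²(pi(x-z)))) together with an illustrative weight vector to form one library combination as in Section 3.4.2:

```python
import numpy as np

# Common textbook forms for the three basic kernels (unit hyperparameters);
# the paper's exact parameterizations may differ.
def linear_k(x, z):
    return x * z

def gaussian_k(x, z):
    return np.exp(-0.5 * (x - z) ** 2)

def periodic_k(x, z):
    return np.exp(-2.0 * np.sin(np.pi * (x - z)) ** 2)

# Library = basic kernels plus a multiplicative combination.
library = [linear_k, gaussian_k, periodic_k,
           lambda x, z: gaussian_k(x, z) * periodic_k(x, z)]

def combined_kernel(weights, x, z):
    """Task kernel as a weighted sum of library elements (Section 3.4.2)."""
    return sum(w * k(x, z) for w, k in zip(weights, library))

X = np.linspace(0.0, 1.0, 10)
w = np.array([0.1, 0.6, 0.2, 0.1])  # illustrative weights; trained in practice
K = np.array([[combined_kernel(w, a, c) for c in X] for a in X])

# A nonnegative combination of positive semi-definite kernels stays PSD,
# so the combined Gram matrix is a valid covariance.
assert np.linalg.eigvalsh(K).min() > -1e-10
```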