In this paper, we consider the time-dependent linear system (TDLS) of the form
where , is or (), is an initial value, and
is a linear operator. TDLS appears in many branches of science and engineering, such as dynamical systems, quantum mechanics, linear Hamiltonian dynamics, linear kinetic equations, semi-discretizations of partial differential equations, and differential matrix equations [18, 13, 28, 24, 19, 2]. Among them, we focus on linear differential systems and linear matrix systems. If is a differential operator on space, TDLS becomes the linear differential system
where with is a bounded and open domain. is the initial condition and is the source term. The differential matrix equation, such as the Sylvester matrix equation, can take the following form
where for each , are large and sparse, and are full rank matrices with . The initial condition is , where .
The TDLS Eq. 1 can be discretized by appropriate approaches, such as linear multistep schemes, Runge-Kutta methods, general linear methods, block implicit methods, and boundary value methods (BVMs) [31, 17, 10, 16, 22, 8, 7]. After time discretization, TDLS Eq. 1 becomes a linear algebraic equation, , where is a block sparse matrix. This linear system is usually large, and solving it directly may be expensive; iteration methods are therefore a better choice. Alternating direction implicit (ADI) methods are a class of useful iteration schemes, initially designed to solve partial differential equations [25, 15]. Afterward, the idea of ADI methods has been extended to other branches, such as numerical algebra and optimization [29, 23]. Among these ADI methods, a large class of Hermitian and skew-Hermitian splitting (HSS) type methods have been developed to solve linear systems [3, 4, 30, 5]. More recently, a general ADI (GADI) framework has been presented to put most existing ADI methods into a unified framework . In this paper, we are concerned with the development of GADI methods for TDLS.
From the theoretical viewpoint, the convergence of existing GADI methods requires that one splitting matrix be positive definite and the other positive semi-definite . Similarly, the convergence of some HSS methods requires the system matrix to be positive definite [3, 4]. For the TDLS, these assumptions may be difficult to meet. Therefore, new GADI methods with more reasonable convergence assumptions need to be designed.
Further, as splitting schemes, ADI methods require splitting the matrix into different parts with splitting parameters, and their convergence and efficiency are very sensitive to these parameters. Therefore, choosing the optimal splitting parameters is critical. A common approach is to use theoretical analysis to estimate bounds on the splitting parameters in a case-by-case way [3, 11]. In practice, however, this approach has limitations for large-scale systems, and its efficiency heavily depends on the theoretical bound. Recently, Ref.  presented a data-driven approach, the Gaussian process regression (GPR) method with an appropriately chosen kernel, to predict relatively optimal splitting parameters. The GPR method has many advantages, such as ease of implementation, high precision, and strong generalization capability. However, it can only predict one splitting parameter at a time, whereas GADI methods may involve two or more splitting parameters with complicated mutual dependencies. Predicting each splitting parameter independently would inevitably degrade the accuracy of the original GPR method. Therefore, the GPR method needs to be improved to predict multiple parameters simultaneously. Another critical component that determines the effectiveness of GPR is the kernel function [33, 35]. The original GPR method chooses the kernel function according to the problem's properties , which might produce artificial error; moreover, a kernel chosen for one class of problems may be difficult to extend to other problems. Therefore, how to automatically learn kernel functions from problems remains a challenge in predicting splitting parameters.
1.2 Our work
In this work, we focus on developing new and efficient GADI methods to solve TDLS Eq. 1. Our contributions are summarized as follows:
We prove that the GADI-KP method converges to the unique solution when all eigenvalues of one splitting matrix have positive real parts and those of another have nonnegative real parts, which does not require the positive definiteness.
We present a multitask kernel learning GPR method to predict multiple splitting parameters simultaneously and to automatically learn the kernel function.
Based on the GADI-KP method, we provide an effective preconditioner to accelerate Krylov subspace methods for solving TDLS Eq. 1. The spectrum of the preconditioned matrix is tightly clustered.
We apply our developed methods to some standard TDLSs, including two-dimensional (2D) diffusion and convection-diffusion equations, and the differential Sylvester matrix equation. Numerical results demonstrate the efficiency of our methods.
The rest of this paper is organized as follows. In Section 2, we describe the discretization of TDLS by BVMs in detail and give two concrete applications: linear differential systems and linear matrix systems. In Section 3, we propose the GADI-KP method for TDLS and prove its convergence. We also present a preconditioning strategy based on the GADI-KP method to speed up Krylov subspace methods and, to improve the GPR method, propose the multitask kernel learning GPR method. In Section 4, we show the efficiency and superiority of our methods with solid numerical experiments. In Section 5, we summarize the paper and discuss future work.
2 Discretization of TDLS
In this section, we describe the discretization of TDLS Eq. 1 by BVMs and apply it to linear differential systems and linear matrix systems.
Throughout the paper, the sets of complex and real matrices are denoted by and , respectively. If , let and denote the transpose and inverse of . and denote the spectral set and spectral radius of . is an -order identity matrix. represents the -norm of a vector. Let and ; the Kronecker product of and is defined by
We recall some properties of Kronecker products; for more background on the Kronecker product, we refer to .
Let , , then
Here, the ‘’ operator transforms matrices into vectors by stacking their columns.
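The standard identity vec(AXB) = (B^T ⊗ A) vec(X) underlies the fast algorithms later in the paper. As a quick illustration (assuming Python with NumPy, which is not part of the original text), it can be verified numerically; note that vec stacks columns, i.e. Fortran order:

```python
import numpy as np

# Numerical check of vec(A X B) = (B^T kron A) vec(X),
# where vec stacks matrix columns (Fortran order).
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
X = rng.standard_normal((4, 5))
B = rng.standard_normal((5, 2))

vec = lambda M: M.flatten(order="F")  # column stacking

lhs = vec(A @ X @ B)
rhs = np.kron(B.T, A) @ vec(X)
assert np.allclose(lhs, rhs)
```

This identity is what lets a solve with a Kronecker-structured matrix be carried out as a sequence of small matrix solves instead of one large one.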
2.2 Temporal Discretization
where is the step size, , , and are parameters. The extra initial and final equations are
where the coefficients and are chosen such that the truncation errors over all nodes are consistent.
By applying the BVM, the TDLS Eq. 1 is discretized as
where , , , the matrices have the following structures
2.3.1 Linear differential system
By appropriate spatial discretization, such as finite difference and finite element methods, we can obtain the ordinary differential equation (ODE) system
where contains approximate values of over spatial grid nodes.
is the number of degrees of freedom of the spatial discretization. and are the analogous notations corresponding to and . and are the mass and stiffness matrices, respectively. Applying the BVM to Eq. 7, we arrive at the linear system
where , , .
2.3.2 Linear matrix system
3 Our methods
In this section, we propose the GADI-KP method, a new splitting iteration method, to solve TDLS Eq. 1. Compared with existing ADI methods, it converges to the unique solution under a mild condition. More significantly, to improve the GPR method for multi-parameter prediction, we introduce the multitask kernel learning GPR method. We also present a preconditioning strategy based on the GADI-KP method to accelerate Krylov subspace methods.
3.1 GADI-KP method
In this subsection, we propose the GADI-KP method and give a fast algorithm for it using the properties of the Kronecker product. The generalized Kronecker product splitting (GKPS) method [11, 12] is
Then we obtain the GADI-KP scheme
where , the splitting parameters and . Notice that when , the GADI-KP method reduces to the GKPS scheme , and when and , it becomes the Kronecker product splitting (KPS) approach .
Eliminating the intermediate vector , we can write the GADI-KP method in a fixed point form
where the iteration matrix of the GADI-KP method is
It is easy to see that there is a unique splitting
Therefore, the GADI-KP method Eq. 11 reduces to
Using the properties of Kronecker product, we can obtain a fast implementation of the GADI-KP method (see Algorithm 1).
For each iteration of the GADI-KP method, the computational complexity of a direct solver is . Based on Algorithm 1, the computational complexity can be reduced to .
3.2 Convergence analysis of GADI-KP
In this subsection, we discuss the convergence of the GADI-KP method.
Assume that all eigenvalues of have positive real parts and all eigenvalues of have nonnegative real parts. Then for any , and , the GADI-KP method Eq. 11 converges to the unique solution of the linear system Eq. 8, and the spectral radius of the iteration matrix of the GADI-KP method satisfies
Let and be the eigenvalues of the matrices and , respectively. Then by the property of the Kronecker product, the eigenvalues of the iteration matrix Eq. 12 have the following form:
Let and , then
which is equivalent to
where , .
3.3 Krylov subspace acceleration with GADI-KP preconditioner
In this subsection, we give a preconditioning strategy based on the GADI-KP method to accelerate Krylov subspace methods.
where and the matrix is a preconditioner. This equivalent system can be solved by GMRES. Algorithm 2 gives the solution process for Eq. 8 with the matrix used as a preconditioner in GMRES. Notice that using the GADI-KP preconditioner within GMRES requires solving a linear system at each iteration; Algorithm 1 can be used to solve this preconditioned linear system economically.
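In the spirit of Algorithm 2, a splitting-based preconditioner can be supplied to GMRES as an operator that applies the two shifted solves of one splitting half-step. The sketch below assumes Python with SciPy and again uses the HSS-style splitting M ≈ (αI + H)(αI + S) as a stand-in for the GADI-KP preconditioner; the matrix and parameter are illustrative:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Illustrative splitting preconditioner for GMRES. Applying M^{-1} r
# means two shifted solves, here prefactored with sparse LU.
n = 200
main = 4.0 * np.ones(n)
off = -1.0 * np.ones(n - 1)
A = sp.diags([off, main, 0.5 * off], [-1, 0, 1], format="csc")  # nonsymmetric
b = np.ones(n)

H = 0.5 * (A + A.T)           # symmetric part
S = 0.5 * (A - A.T)           # skew-symmetric part
alpha = 1.0                   # illustrative splitting parameter
I = sp.identity(n, format="csc")
lu1 = spla.splu((alpha * I + H).tocsc())
lu2 = spla.splu((alpha * I + S).tocsc())

# Preconditioner as a LinearOperator: r -> (alpha I + S)^{-1} (alpha I + H)^{-1} r
M = spla.LinearOperator((n, n), matvec=lambda r: lu2.solve(lu1.solve(r)))

x, info = spla.gmres(A, b, M=M)
```

In the GADI-KP setting, the two LU solves would be replaced by the Kronecker-structured solves of Algorithm 1, so the preconditioner is never formed explicitly.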
Assume that all the eigenvalues of have positive real parts and all the eigenvalues of have nonnegative real parts, , and . Since , we can naturally conclude that all the eigenvalues of the preconditioned matrix are located in a circle with radius . Compared with the KPS and GKPS preconditioners (see Figure 1), the GADI-KP preconditioner can further reduce the condition number of the matrix , and the eigenvalue distribution of the preconditioned system has a tighter bound.
3.4 Multitask kernel learning GPR
In this subsection, we propose a multitask kernel learning GPR method which improves both simultaneous prediction of multiple splitting parameters and data-driven kernel selection.
3.4.1 Multitask GPR
GADI-type methods have two or more splitting parameters, and these parameters heavily affect the efficiency of GADI approaches. We now propose a practical multitask GPR method to predict these splitting parameters simultaneously. In the multitask setting, we are interested in learning related functions from training data , with , and . To avoid singularity in numerical computation, consider the following noisy model
where denotes the -th output (input) of the -th task, and is the white noise of the -th task.
Let be the output vector, and latent function be the input vector. The multitask regression problem can be presented as the Gaussian process prior over the latent function
where is a diagonal matrix with . The prediction distribution for the -th task on a new point is
. and are the -th diagonal element and the -th column of , respectively. is a covariance vector between the test point and the training points , and denotes the covariance matrix of all training points. Hyperparameters and appear in both the task and data covariance functions. To estimate the hyperparameters, we apply the L-BFGS method to maximize the marginal likelihood function in logarithmic form
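The hyperparameter estimation step above can be sketched for a single task: form the log marginal likelihood from a Cholesky factorization of the kernel matrix and hand its negative to an L-BFGS optimizer. This is a minimal Python/SciPy illustration with an RBF data kernel; the data, kernel choice, and parameterization are illustrative, not the paper's multitask setup:

```python
import numpy as np
from scipy.optimize import minimize

# Minimal single-task GPR hyperparameter fit by L-BFGS.
# theta = (log length-scale, log noise std); both are illustrative names.
rng = np.random.default_rng(2)
X = np.linspace(0.0, 1.0, 30)[:, None]
y = np.sin(4.0 * X[:, 0]) + 0.05 * rng.standard_normal(30)

def neg_log_marginal_likelihood(theta):
    ell, sigma_n = np.exp(theta)
    D2 = (X - X.T) ** 2                        # pairwise squared distances
    # RBF kernel plus noise variance (small jitter for numerical stability)
    K = np.exp(-0.5 * D2 / ell**2) + (sigma_n**2 + 1e-10) * np.eye(len(y))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # -log p(y | X, theta) up to the standard Gaussian terms
    return (0.5 * y @ alpha + np.log(np.diag(L)).sum()
            + 0.5 * len(y) * np.log(2.0 * np.pi))

res = minimize(neg_log_marginal_likelihood, x0=np.log([0.5, 0.1]),
               method="L-BFGS-B")
```

In the multitask case, K would be replaced by the task-and-data covariance of Eq. above, but the Cholesky-plus-L-BFGS pattern is unchanged.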
3.4.2 Kernel learning approach
Choosing an appropriate kernel function determines whether the GPR method succeeds. The original GPR method directly selects the kernel function from the features of the problem . This might result in artificial error and can hardly be extended to other problems. For example, Fig. 3 shows that the GPR method with a manually chosen kernel function produces a poor regression curve. How to choose kernel functions automatically is worth investigating. Here we present a data-driven method to learn the kernel function. Given a kernel library that contains basic kernel functions and their multiplicative combinations, for the -th () training task, the required kernel function is the linear combination of library elements
For the training multitask, the weighted matrix is
In GADI methods, the training data comes from smaller systems , so the data set is small. Learning a kernel function from small data with a (deep) neural network is difficult. Therefore, we predetermine the kernel library via a priori knowledge instead of learning the kernel function directly from data.
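The kernel library construction described above can be sketched in a few lines: basic kernels plus multiplicative combinations, with the learned kernel formed as their weighted linear combination. The Python sketch below uses fixed illustrative weights w; in the method, these weights are fit during training:

```python
import numpy as np

# A small kernel library: basic kernels and multiplicative combinations.
def k_linear(x, y):
    return x * y

def k_gaussian(x, y, ell=1.0):
    return np.exp(-0.5 * (x - y) ** 2 / ell**2)

def k_periodic(x, y, p=1.0, ell=1.0):
    return np.exp(-2.0 * np.sin(np.pi * (x - y) / p) ** 2 / ell**2)

library = [k_linear, k_gaussian, k_periodic,
           lambda x, y: k_linear(x, y) * k_gaussian(x, y),
           lambda x, y: k_gaussian(x, y) * k_periodic(x, y)]

# Illustrative nonnegative weights; in practice learned per training task.
w = np.array([0.1, 0.6, 0.1, 0.1, 0.1])

def learned_kernel(x, y):
    return sum(wi * k(x, y) for wi, k in zip(w, library))

grid = np.linspace(0.0, 1.0, 5)
K = np.array([[learned_kernel(xi, xj) for xj in grid] for xi in grid])
```

Nonnegative weights keep the combination a valid (positive semi-definite) kernel, since sums and products of kernels are kernels.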
4 Numerical experiments
In this section, we present three numerical examples (2D diffusion and convection-diffusion equations, and the differential Sylvester matrix equation) to show the performance of our proposed GADI-KP and GMRES with GADI-KP preconditioner (GMRES-GADIKP) methods. For comparison, we also give experimental data for the KPS, GKPS, GMRES, and GMRES with GKPS preconditioner (GMRES-GKPS) methods. All computations are carried out using PyCharm 2020.2 on a Mac laptop with a 2.3 GHz quad-core Intel Core i5. All tests start with the zero vector. All iteration methods are terminated when the relative residual error satisfies , where is the -step residual.
In the multitask kernel learning GPR method, the kernel library contains linear, Gaussian, and periodic kernels, and their multiplicative combinations. Concretely, the three basic kernel functions are