Confluent Vandermonde with Arnoldi

by   Qiang Niu, et al.
Xi'an Jiaotong-Liverpool University

In this note, we extend the Vandermonde with Arnoldi method recently advocated by P. D. Brubeck, Y. Nakatsukasa and L. N. Trefethen to dealing with the confluent Vandermonde matrix. To apply the Arnoldi process, it is critical to find a Krylov subspace which generates the column space of the confluent Vandermonde matrix. A theorem is established for such Krylov subspaces for any order derivatives. This enables us to compute the derivatives of high degree polynomials to high precision. It also makes many applications involving derivatives possible, as illustrated by numerical examples. We note that one of the approaches orthogonalizes only the function values and is equivalent to the formula given by P. D. Brubeck and L. N. Trefethen. The other approach orthogonalizes the Hermite data. About which approach is preferable to another, we made the comparison, and the result is problem dependent.


page 1

page 2

page 3

page 4


Recursive multivariate derivatives of e^f(x_1,..., x_n) of arbitrary order

High-order derivatives of nested functions of a single variable can be c...

Shifted Monic Ultraspherical Approximation for solving some of Fractional Orders Differential Equations

The purpose of this paper is to show and explain a new formula that indi...

Derivatives of partial eigendecomposition of a real symmetric matrix for degenerate cases

This paper presents the forward and backward derivatives of partial eige...

Chain Rules for Hessian and Higher Derivatives Made Easy by Tensor Calculus

Computing multivariate derivatives of matrix-like expressions in the com...

Backpropagation in matrix notation

In this note we calculate the gradient of the network function in matrix...

Fast Derivatives for Multilinear Polynomials

The article considers linear functions of many (n) variables - multiline...

Transfer function interpolation remainder formula of rational Krylov subspace methods

Rational Krylov subspace projection methods are one of successful method...

1 Introduction.

The recent work [va2021] presents a linear algebra method for the polynomial interpolation and fitting on arbitrary grids. It does not need the prior knowledge of a well-conditioned polynomial basis, and works easily on two intervals and subsets of complex plane, for example. The method starts with the ill-conditioned Vandermonde system

such that , and transforms it into a much more well-conditioned system

such that . In matrix form, the Vandermonde system , i.e.,

is transformed into where is a matrix and

is a column vector, and the approximation at

, is evaluated as where . The columns of form an orthogonal basis of the column space of . To construct , it is noted that is just the Krylov subspace with , . So, the Arnoldi process for the Krylov subspace is exploited to generate the columns of as well as a Hessenberg matrix such that (here is obtained by deleting the last column of ). The evaluation matrix has the first column all one’s and is constructed column by column such that , where and is without the -th column.

Now, we would like to employ the same idea for the Hermite interpolation and fitting. We start with the confluent Vandermonde system

or in matrix form (with as before)

We notice that the column space of (semicolon means as in MATLAB) is a Krylov subspace:

So, the Arnoldi process applies and can be used to generate a matrix with orthogonal columns and a Hessenberg matrix . Then, the evaluation matrix can be constructed with

Following [va2021], we give the MATLAB codes for the method of confluent Vandermonde with Arnoldi.

     function [d,H] = polyfitAh(x,f,fp,n)
     m = size(x,1); Q = zeros(2*m,n+1); Q(1:m,1) = 1; H = zeros(n+1,n);
     for k = 1:n
         q = [x.*Q(1:m,k); Q(1:m,k)+x.*Q(m+1:2*m,k)];
         for j = 1:k
             H(j,k) = Q(:,j)’*q/m; q = q - H(j,k)*Q(:,j);
         H(k+1,k) = norm(q)/sqrt(m); Q(:,k+1) = q/H(k+1,k);
     d = Q\[f;fp];

     function [y,yp] = polyvalAh(d, H, s)
     M = size(s,1);  n = size(d,1)-1; W = zeros(2*M,n+1); W(1:M,1) = 1;
     for k = 1:n
         w = [s.*W(1:M,k); W(1:M,k)+s.*W(M+1:2*M,k)];
         for j = 1:k
             w = w - H(j,k)*W(:,j);
         W(:,k+1) = w/H(k+1,k);
     y = W*d;  yp = y(M+1:2*M,:);  y = y(1:M,:);

To compare, we give also the naive method of confluent Vandermonde without Arnoldi.

     function c = polyfith(x,f,fp,n)
     A = x.^(0:n); o = zeros(size(x)); D = diag(1:n); A = [A; o A(:,1:n)*D]; c = A\[f;fp];

     function [y,yp] = polyvalh(c,s)
     n = length(c)-1;  B = s.^(0:n);  y = B*c;
     o = zeros(size(s));  D = diag(1:n);  yp = [o B(:,1:n)*D]*c;

Other than the Hermite scenario for both fitting and evaluation, we can also fit only the function values using polyfitA (or polyfit) from [va2021] but evaluates the function values and derivatives using polyvalAh (or polyvalh) here. This scenario is more interesting if the data of derivatives are not available for fitting. In other words, we construct first a discrete orthogonal basis for the function values by polyfitA and then the derivative matrix W(M+1:2*M,:) under the same basis by polyvalAh. We notice that an equivalent formula for doing this has been given in [lt2022], which we should detail in the Discussion section. There is a third possibility: one can still construct a discrete orthogonal basis for the Hermite data, but fit only the function values, and then construct the derivative evaluation matrix under that basis; however we found no advantage of it. The fourth possibility is to use the function values orthogonalized basis for Hermite fitting and evaluation, which we found performs equally well as using the Hermite data orthogonalized basis for Examples 1–2 in the following sections. We will see also for the other examples the comparison results of Hermite and function values orthogonalized bases.

Both the fitting and evaluation can be generalized to high-order derivatives. For example, let be the second-order derivative matrix under the standard basis. We have

It may be worthwhile to state the general case as follows.

Theorem 1.

Let be the Vandermonde matrix and be the th order derivative matrix corresponding to . Then, the matrix has the th column exactly equal to

2 Example 1: interpolation in Chebyshev points.

This example is used in [va2021] for the Lagrange interpolation. Now we interpolate the function and its derivative simultaneously (polyfitAh+polyvalAh) by a degree

(odd) polynomial in

Chebyshev points , . We also do the Lagrange interpolation in points as in [va2021] and evaluate the derivative as proposed here (polyfitA+polyvalAh). The results are shown in Figure 1. We see that with Arnoldi both and can be approximated to high accuracy with the exponential convergence rate, while without Arnoldi the method is prone to round-off errors as . The derivative evaluation enabled by the confluent Vandermonde with Arnoldi (polyvalAh) works equally well for the Hermite and Lagrange interpolation. Throughout the paper, the function norm is the maximum norm computed on a very fine grid.

Figure 1: Hermite (solid lines/points) and Lagrange (dashed lines/circles) interpolation of on Chebyshev grids of by the confluent Vandermonde without (left column) and with (right column) Arnoldi.

3 Example 2: least-squares fitting on two intervals.

For the function on , we sample the function values (and its derivatives for the Hermite version) on every interval with equally spaced points, and fit by least-squares these data with a degree polynomials. The number of sampling points on each interval is chosen as for the Hermite version and for the Lagrange version. The results are shown in Figure 2.

Figure 2: Least-squares fitting of on (dashed lines/circles) and simultaneously its derivatives (solid lines/points) in equally spaced points by the confluent Vandermonde without (left column) and with (right column) Arnoldi.

4 Example 3: Hermite Fourier extension.

This example is used in [va2021] for approximation of on by the Fourier series on the larger interval . Now we consider both the function values and its derivatives:

Following [va2021], this is realized with , () and

The last line of polyfith is changed to

     m = length(x);  c = [real(A(1:m,:))        imag(A(1:m,2:n+1)); ...
                          imag(A(m+1:2*m,:))   -real(A(m+1:2*m,2:n+1))]\[f;fp];
     c = c(1:n+1) - 1i*[0; c(n+2:2*n+1)];

where fp stores the scaled derivatives . Similar changes should be made also in polyfitAh but with instead of . In polyvalh and polyvalAh, now we take the real part for y and imaginary part for yp. Note again that the output yp stores the scaled derivatives . Following [va2021], 500 Chebyshev points on are used for the Hermite fitting. The results are shown in Figure 3. The Lagrange fitting is not present in the figure since the derivative evaluation for it gave totally wrong results.

Figure 3: Fourier extension of and simultaneously its derivatives on Chebyshev grids of by the truncated Fourier series on using the confluent Vandermonde without (left column) and with (right column) Arnoldi. Bottom right: Hermite (dots) vs function values (circles) orthogonalized bases.

5 Example 4: Laplace equation and Dirichlet-to-Neumann map.

Given a domain with boundary and a function , the Dirichlet-to-Neumann map involves first solving the Laplace equation in , on and then evaluating the outer normal derivative on . By identifying and , can be viewed as the real part of an analytic function in the complex plane, so that is automatically satisfied. Following [va2021], we approximate by a polynomial . Let (, ). Then, . To compute , we need only to fit the boundary condition on and then evaluate (since and corresponds to the unit outer normal on ), or more precisely

A concrete example is demonstrated in Figure 4, where points with the argument of equally spaced are used for the fitting. We see that the Hermite orthogonalized basis leads to a bit more accurate results than the function values orthogonalized basis does.

Figure 4: Dirichlet-to-Neumann map of the Dirichlet data at by the confluent Vandermonde without (left column) and with (right column) Arnoldi. Bottom-right: Hermite (dots) vs function values (circles) orthogonalized bases.

6 Example 5: Steklov eigenvalue problem in an ellipse.

This is a continuation of the previous section. We look for the eigenvalue


and the assoicated eigenfunction

such that . In other words, we want the real part of an analytic function in to satisfy on . Let be the matrix from the Arnoldi process on the confluent Vandermonde matrix; see the Introduction section. We partition the rows of in half: with containing the first rows that represents the sampling of basis function values and representing the sampling of derivatives of basis functions. Let the matrices be separated into real parts and imaginary parts: , , and , be diagonal matrices containing the sampled values of the unit outer normal of . Then, the discrete Steklov eigenvalue problem is to find , and such that

denoted by with (), which is a rectangular eigenvalue problem when . Following [re2021]

, we first do the economic QR decomposition

, then multiply the eigenvalue equation with to get , which is a generalized eigenvalue problem of square matrices and can be solved by the standard method (using the MATLAB command eig). For the solved example, see Figure 5. The points are sampled at equally spaced elliptic angle coordinates. The benchmark solution uses .

Figure 5: Laplace Steklov eigenvalue problem in the ellipse by the confluent Vandermonde without (left column) and with (right column) Arnoldi. Top: the th eigenfunction (), , for the eigenvalues . Bottom: errors of , (, ) and , (, ). Bottom right: Hermite vs function values (in black) orthogonalized bases.

7 Example 6: sloshing modes on top of a cup.

This example is taken from [mora2015]. The domain is . On the left, right and bottom sides, the homogeneous Neumann condition is imposed, while on the top side is the eigenvalue equation . The analytic solution is , ; see [mora2015]. Let be the identity operator on the top side and be the zero operator on the other sides. Define on piecewise equal to and . Then, we can write the boundary equation uniformly as on . Using the notation from the previous section, the discrete rectangular eigenvalue problem becomes . Now, testing the equation with the column space basis of as before is not a good idea because is rank deficient. Note that the two sides of the equation live in the sum of the column spaces of and . So, we follow [hashemi2021] to do the economic SVD: , but we keep the first columns of as to match the length of , then multiply the eigenvalue equation with to get . The results are shown in Figure 6. On each side of the square, we used Chebyshev points of the first kind for discretization of the boundary equation. From the bottom right, we can see that the function values orthogonalized basis leads to instability. Convergence of the eigenfunctions from the method with Hermite Arnoldi basis slowers down after some , which could be fixed with more sampling points. For example, for and points per side, we got the errors , for , .

Figure 6: Sloshing modes on the top side of a square cup by the confluent Vandermonde without (left column) and with (right column) Arnoldi. Top: the th eigenfunction (), , for the eigenvalues . Bottom: errors of the 5th (, ) and 10th (, ) eigenfunctions/eigenvalues. Bottom right: Hermite vs function values (in black) orthogonalized bases.

8 Example 7: indefinite integral on two intervals.

We want to find a polynomial approximation of , is a general constant. For example, if , we have , , the simplest differential equation with a point value condition. The goal is to find . The implementation is mostly the same as polyfitAh but the function header and the last line: function [d, H] = polyfitAih(x,f,n) and d = Q(m+1:2*m,2:n+1)\f; d = [0; d]; alternatively, one can orthogonalize only the function values and from the resulting Hessenberg matrix generate the derivative matrix. The solved example is shown in Figure 7. From the bottom right, we can see that, for both the Hermite Arnoldi basis and the function values Arnoldi basis, the error of can grow with in some range but still decreases to afterwards. Such transient instability does not appear for on the Chebyshev grids of .

Figure 7: Polynomial indefinite integral of on equidistant grids of by the confluent Vandermonde without (left column) and with (right column) Arnoldi. Bottom right: Hermite (dots) vs function values (circles) orthogonalized bases.

9 Discussion.

This work is greatly inspired by the fascinating paper Vandermonde with Arnoldi [va2021] by Brubeck, Nakatsukasa and Trefethen. We saw a door open to the well-conditioned bases of polynomial spaces on arbitrary grids, and to many interesting applications, e.g., the lightning solvers [lt2022, aa2021, log2021, trefethen2020]. In the spirit of Chebfun [chebfun], it is natural then to consider differential, integral and other operations of polynomials under the Arnoldi basis. Under the standard basis, the differential operation or Hermite interpolation leads to the confluent Vandermonde matrix, which is ill-conditioned [higham1990]. The Arnoldi process can come to the rescue. However, it is not so straightforward to apply the method from [va2021]. The key idea behind the new method is the discovery of a connection of the confluent Vandermonde matrix to a Krylov subspace. This enables us to represent and compute the derivatives in a stable way. Two approaches are proposed to accomplish this task. One is constructing the basis for the column space of the confluent Vandermonde matrix directly by the Confluent Vandermonde with Arnoldi method based on the confluent Krylov subspace that we discovered. Another approach is using the basis generated by Vandermonde with Arnoldi and constructing the derivative matrix of the basis by exploiting the confluent Krylov subspace. With the new functionality of computing derivatives, we are able to do the Hermite interpolation/evaluation to high precision, to accomodate the Neumann/Robin boundary conditions, to compute the Dirichlet-to-Neumann map and its eigenmodes, and to do the indefinite integral.

We realized just upon finalizing this note that a formula of the derivative matrix has been given in [log2021, lt2022]. It is based on the recursive relation of the polynomial basis from Vandermonde with Arnoldi. Recall that the Arnoldi process produces the basis matrix and the Hessenberg matrix that satisfy . If we imagine that the number of rows of and goes to infinity, then we would get the polynomial recursion or . Taking derivative gives . It can be seen that our second approach (e.g. polyfitA + polyvalAh) is equivalent to this formula.

Comparison of the Hermite orthogonalized basis and the function values orthogonalized basis has been made on each example. It is found that they perform equally well for the Hermite interpolation/evaluation task. But the function values orthogonalized basis is unstable for the example of sloshing modes, while the Hermite orthogonalized basis works stably. We also observed instability of both the approaches for the Fourier extension example and the indefinite integral example. A theoretical understanding of the comparison results and the instability phenomena could be interesting.