Code to reproduce the figures in Functional Tucker approximation using Chebyshev interpolation by S. Dolgov, D. Kressner, C.Stroessner
This work is concerned with approximating a trivariate function defined on a tensor-product domain via function evaluations. Combining tensorized Chebyshev interpolation with a Tucker decomposition of low multilinear rank yields function approximations that can be computed and stored very efficiently. The existing Chebfun3 algorithm [Hashemi and Trefethen, SIAM J. Sci. Comput., 39 (2017)] uses a similar format but the construction of the approximation proceeds indirectly, via a so called slice-Tucker decomposition. As a consequence, Chebfun3 sometimes uses unnecessarily many function evaluation and does not fully benefit from the potential of the Tucker decomposition to reduce, sometimes dramatically, the computational cost. We propose a novel algorithm Chebfun3F that utilizes univariate fibers instead of bivariate slices to construct the Tucker decomposition. Chebfun3F reduces the cost for the approximation for nearly all functions considered, typically by 75 sometimes by over 98READ FULL TEXT VIEW PDF
Code to reproduce the figures in Functional Tucker approximation using Chebyshev interpolation by S. Dolgov, D. Kressner, C.Stroessner
This work is concerned with the approximation of trivariate functions (that is, functions depending on three variables) defined on a tensor-product domain, for the purpose of performing numerical computations with these functions. Standard approximation techniques, such as interpolation on a regular grid, may require an impractical amount of function evaluations. Several techniques have been proposed to reduce the number of evaluations for multivariate functions by exploiting additional properties. For example, sparse grid interpolation  exploits mixed regularity. Alternatively, functional low-rank (tensor) decompositions, such as the spectral tensor train decomposition , the continuous low-rank decomposition [22, 23], and the QTT decomposition [31, 41], have been proposed. In this work, we focus on using the Tucker decomposition for third-order tensors, following work by Hashemi and Trefethen .
The original problem of finding separable decompositions of functions is intimately connected to low-rank decompositions of matrices and tensors [25, Chapter 7]. A trivariate function is called separable if it can be represented as a product of univariate functions: . If such a decomposition is available, it is usually much more efficient to work with the factors instead of when, e.g., discretizing the function. In practice, most functions are usually not separable, but they often can be well approximated by a sum of separable functions. Additional structure can be imposed on this sum, corresponding to different tensor formats. In this work, we consider the approximation of a function in the functional Tucker format , as in [29, 35, 44], which takes the form
with univariate functions and the so called core tensor . The minimal , for which (1) can be satisfied with equality is called multilinear rank of . It determines the number of entries in and the number of univariate functions needed to represent . For functions depending on more than three variables, a recursive format is usually preferable, leading to tree based formats [20, 38, 39] such as the hierarchical Tucker format [28, 49] and the tensor train format [42, 6, 16, 22, 23].
The existence of a good approximation of the form (1) depends, in a nontrivial manner, on properties of . It can be shown that the best approximation error in the format decays algebraically with respect to the multilinear rank of the approximation for functions in Sobolev Spaces [24, 49] and geometrically for analytic functions [27, 55]. Approximations based on the Tucker format are highly anisotropic , i.e., a rotation of the function may lead to a very different behavior of the approximation error. This can be partially overcome by adaptively subdividing the domain of the function as proposed, e.g., by Aiton and Driscoll .
The representation (1) is not yet practical because it involves continuous objects; a combination of low-rank tensor and function approximation is needed. Univariate functions can be approximated using barycentric Lagrange interpolation based on Chebyshev points [2, 5, 30]. This interpolation is fundamental to Chebfun  - a package providing tools to perform numerical computations on the level of functions . Operations with these functions are internally performed by manipulating the Chebyshev coefficients of the interpolant .
In Chebfun2 [51, 52], a bivariate function is approximated by applying Adaptive Cross Approximation (ACA) , which yields a low-rank approximation in terms of function fibers (e.g. for fixed but varying ), and interpolating these fibers. In Chebfun3, Hashemi and Trefethen  extended these ideas to trivariate functions by recursively applying ACA, to first break down the tensor to function slices (e.g., for fixed ) and then to function fibers. As will be explained in Section 3.4
, this indirect approach via slice approximations typically leads to redundant function fibers, which in turn involve unnecessary function evaluations. This is particularly problematic when the evaluation of the function is expensive, e.g. when each sample requires the solution of a partial differential equation (PDE); see Section5 for an example.
In this paper, we propose a novel algorithm aiming at computing the Tucker decomposition directly. Our algorithm is called Chebfun3F to emphasize that it is based on selecting the Fibers in the Tucker approximation (1). To compute a suitable core tensor, oblique projections based on Discrete Empirical Interpolation (DEIM) 
are used. We combine this approach with heuristics similar to the ones used in Chebfun3 for choosing the univariate discretization parameters adaptively and for the accuracy verification.
The remainder of this paper is structured as follows. In Section 2, we introduce and analyze the approximation format used in Chebfun3 and Chebfun3F. In Section 3, we briefly recall the approximation algorithm currently used in Chebfun3. Section 4 introduces our novel algorithm Chebfun3F. Finally, in Section 5, we perform numerical experiments to compare Chebfun3, Chebfun3F and sparse grid interpolation.
Given a function , we consider an approximation of the form
where is the coefficient tensor and denotes the -th Chebyshev polynomial.
using Fourier transformations. We define the transformation matricesfor as in [36, Sec. 8.3.2.]
The mapping from the function evaluations to the coefficients can now be written as
where denotes the mode- multiplication. For a tensor and a matrix it is defined as the multiplication of every mode- fiber of with , i.e.
where denotes the mode- matricization, which is the matrix containing all mode- fibers of . By construction, the interpolation condition is satisfied in all Chebyshev points.
The approximation error for Chebyshev interpolation applied to multivariate analytic functions has been studied, e.g., by Sauter and Schwab in . The following result states that the error decays exponentially with respect to the number of interpolation points in each variable.
Suppose that can be extended to an analytic function on with , where denotes the Bernstein ellipse, a closed ellipse with foci at and the sum of major and minor semi-axes equal to . Then the Chebyshev interpolant constructed above satisfies for the error bound
where denotes the uniform norm on and .
A Tucker approximation of multilinear rank for a tensor takes the form
where is called core tensor and , , are called factor matrices. If , the required storage is reduced from for to for .
Let be a Tucker approximation of the tensor obtained from evaluating in Chebyshev points. Inserted into (2), we now consider an approximation of the form
where the interpolation coefficients are computed from as in Equation (3)
Note that the application of is the mapping from function evaluations to interpolation coefficients in the context of univariate Chebyshev interpolation [3, 36]. By interpreting the values stored in the columns of as function evaluations at Chebyshev points, we can define its columnwise Chebyshev interpolant , where . Note that for fixed . Defining and analogously allows us to rewrite the approximation (4) as
where the mode- products are defined as in Section 2.1 for fixed . The goal of this paper is to compute approximations of this form.
The following lemma allows us to distinguish the interpolation error from the low-rank approximation error in the approximation (5).
By applying the triangle inequality we obtain
Tensorized interpolation is equivalent to applying univariate interpolation to each mode. The operator norm for univariate interpolation in Chebyshev points, the Lebesgue constant, satisfies . Because interpolation is a linear operation, we obtain
We consider the function
on with parameter . Let .
In this section, we show that a Chebyshev interpolation satisfying for a prescribed error bound requires polynomial degrees , which grows algebraically in . However, one can achieve with multilinear ranks . Therefore for small values of the required polynomial degree is much higher than the required multilinear rank. In this situation can achieve almost the same accuracy as , but with significantly less storage.
For the degree Chebyshev interpolant we require , which is equivalent to . By Theorem 1,
We set and extend analytically to on . By construction . Hence, we can choose to obtain the desired accuracy. Although this is only an upper bound for the polynomial degree required, numerical experiments reported below indicate that it is tight.
An a priori approximation with exponential sums is used to obtain a bound on the multilinear rank for a tensor containing function values of ; see . Given and , Braess and Hackbusch  showed that there exist coefficients and such that
Trivially, we have for the substitution with . Applying (7) yields or, equivalently,
for every when
The approximation in (8) has multilinear rank . In turn, the tensor has multilinear rank at most and satisfies .
In Figure 1
, we estimate the maximal polynomial degree required to compute a Chebyshev interpolant with accuracyfor selected fibers of , which is a lower bound for the required polynomial degrees . It perfectly matches the asymptotic behavior of the derived upper bound . In Figure 1
, we also plot the multilinear ranks from the truncated Higher Order Singular Value Decomposition (HOSVD) with tolerance applied to the tensor containing the evaluation of on a Chebyshev grid. This estimate serves as a lower bound for the multilinear rank required to approximate . Due to the limited grid size, this estimate does not fully match the asymptotic behavior , but nonetheless it clearly reflects that the multilinear ranks can be much smaller than the polynomial degrees, as predicted by , for sufficiently small .
In this section, we recall how an approximation of the form (4) is computed in Chebfun3 . As discussed in Section 2.5, there are often situations in which the multilinear rank of is much smaller than the polynomial degree. Chebfun3 benefits from such a situation by first using a coarse sample tensor to identify the fibers needed for the low-rank approximation. This allows to construct the actual approximation from a finer sample tensor by only evaluating these fibers instead of the whole tensor.
Chebfun 3 consists of three phases: preparation of the approximation by identifying fibers for a so called block term decomposition  of , refinement of the fibers, conversion and compression of the refined block term decomposition into Tucker format (5).
In Chebfun3, is initially obtained by sampling on a grid of Chebyshev points. A block term decomposition of is obtained by applying ACA  (see Algorithm 1) recursively. In the first step, ACA is applied to a matricization of , say, the mode- matricization . This results in index sets such that
where contains mode- fibers of and contains mode- slices of . For each , such a slice is reshaped into a matrix and, in the second step, approximated by again applying ACA:
denotes vectorization. Reshaping this approximation into a tensor can be viewed as a block term decomposition in the sense of[14, Definition 2.2.].
If the ratios of , and are larger than the heuristic threshold the coarse grid resolution is deemed insufficient to identify fibers. If this is the case, is increased to and Phase 1 is repeated.
The block term decomposition (11) is composed of fibers of . Such a fiber corresponds to the evaluation of a univariate function for certain fixed . Chebfun contains a heuristic to decide whether the function values in suffice to yield an accurate interpolation of . If this is not the case, the grid is refined.
In Chebfun3 this heuristic is applied to all fibers contained in (11) in order to determine the size , initially set to , of the finer sample tensor . For each , the size is repeatedly increased by setting , which leads to nested Chebyshev points, until the heuristic considers the resolution sufficient for all mode- fibers. Replacing all fibers in (11) by their refined counterparts yields an approximation of the tensor , which contains evaluations of on a Chebyshev grid. Note that might be very large and is never computed explicitly.
In the third phase of the Chebfun3 constructor, the refined block term decomposition is converted and compressed to the desired Tucker format (5), where the interpolants are stored as Chebfun objects ; see  for details. Lemma 1 guarantees a good approximation when the polynomial degrees are sufficiently large and when is well approximated by the Tucker approximation . Neither of these properties can be guaranteed in Phases 1 and 2 alone. Therefore in a final step, Chebfun3 verifies the accuracy by comparing and the approximation at Halton points . If the estimated error is too large, the whole algorithm is restarted on a finer coarse grid from Phase 1.
The Chebfun3 algorithm often requires unnecessarily many function evaluations. As we will illustrate in the following, this is due to redundancy among the mode- and mode- fibers. For this purpose we collect all (refined) mode- fibers in the block term decomposition (11) into the columns of a big matrix , where is the number of steps of the outer ACA (9). As will be demonstrated with an example below, matrix is often observed to have low numerical rank, which in turn allows to represent its column space by much fewer columns, that is, much fewer mode- fibers. As the accuracy of the column space determines the accuracy of the Tucker decomposition after the compression, this implies that the other mode- fibers in are redundant.
Let us now consider the block term decomposition111Note that the accuracy verification in Phase 3 fails once for this function. Here we only consider to block term decomposition obtained after restarting the procedure. (11) for the function
In Figure 2 the numerical rank and the number of columns of are compared. For the approximation of the slices to leads to a total of mode- fibers, the sum of the corresponding red and blue bars in Figure 2. In contrast, their numerical rank (blue bar) is only . Thus, the red bar can be interpreted as number of redundant mode- fibers. This happens since nearby slices tend to be similar.
The total block term decomposition contains slices and is compressed into a Tucker decomposition with multilinear rank . It contains redundant fibers, the refinement requires function evaluations for each of them. Note that the asymmetry in the rank of the Tucker decomposition is caused by the asymmetry of the block term decomposition.
Another disadvantage is that Chebfun3 always requires the full evaluation of in Phase 1. This becomes expensive when a large size is needed in order to properly identify suitable fibers.
In this section, we describe our novel algorithm Chebfun3F to compute an approximation of the form (5). The goal of Chebfun3F is to the avoid the redundant function evaluations observed in Chebfun3. While the structure of Chebfun3F is similar to Chebfun3, consisting of 3 phases to identify/refine fibers and compute a Tucker decomposition, there is a major difference in Phase 1. Instead of proceeding via slices, we directly identify mode- fibers of for building factor matrices. The core tensor is constructed in Phase 3.
As in Chebfun3, the coarse tensor is initially defined to contain the function values of on a Chebyshev grid. We seek to compute factor matrices , and such that the orthogonal projection of onto the span of the factor matrices is an accurate approximation of , i.e.
Additionally, we require that the columns in contain fibers of .
In the existing literature, algorithms to compute such factor matrices include the Higher Order Interpolatory Decomposition 
, which is based on a rank revealing QR decomposition, and the Fiber Sampling Tensor Decomposition, which is a generalization of the CUR decomposition. We propose a novel algorithm, which in contrast to the existing algorithms does not require the evaluation of the full tensor . We follow the ideas of TT-cross [40, 48] and its variants such as the Schur-Cross3D  and the ALS-cross .
Initially, we randomly choose index sets each containing indices. In the first step, we apply Algorithm 1 to . Note that this needs drawing only values of the function , in contrast to values in the whole tensor . The selected columns serve as a first candidate for the factor matrix . The index set is set to the row indices selected by Algorithm 1 (see Figure 3). We use the updated index set and apply Algorithm 1 to analogously, which yields and an updated . From we obtain and . We repeat this process in an alternating fashion with the updated index sets, which leads to potentially improved factor matrices. Following the ideas of Chebfun3, we check after each iteration whether the ratios , and surpass the heuristic threshold . If this is the case, we increase the size of the coarse tensor to and restart the whole process by reinitializing with random indices respectively.
It is not clear a priori how many iterations are needed to attain an approximation (12) that yields a Tucker approximation (5) which passes the accuracy verification in Phase 3. In numerical experiments, it has usually proven to be sufficient to stop after the second iteration, during which the coarse grid has not been refined, or when , or . This is formalized in Algorithm 2. In many cases, we found that the numbers of columns in the factor matrices are equal to the multilinear rank of the truncated HOSVD  of with the same tolerance.
In the final Phase of Chebfun3F, we compute a core tensor to yield an approximation .
In principle, the best approximation (with respect to the Frobenius norm) for fixed factor matrices is obtained by orthogonal projections . Such an approach comes with the major disadvantage that the full evaluation of is required. This can be circumvented by instead using oblique projections , where for an index set . Oblique projections in all three modes yield
for index sets . The choice of is crucial for the approximation quality and will be discussed later on. The computation of the core tensor only requires evaluations in Phase 3. From we construct the approximation (5) as described in Section 2.3.
In practice, instead of using the potentially ill-conditioned matrices we compute QR decompositions , , , and use the equivalent representation
in Chebfun3F. The evaluation of requires samples from .
The following lemma plays a critical role in guiding the choice of indices .
Let , , have orthonormal columns. Consider an index set of cardinality such that is invertible. Then the oblique projection satisfies
where denotes the matrix -norm.
Lemma 2 exhibits the critical role played by the quantity for oblique projections. In Chebfun3, we use the discrete empirical interpolation method (DEIM) , presented in Algorithm 3, to compute the index sets given . In practice, these index sets usually yield good approximations as tends to be small; see also Section 4.4.1.