1 Introduction
For the approximate solution of the initial value problem for a (huge) system of differential equations for the tensor $A(t) \in \mathbb{R}^{n_1 \times \cdots \times n_d}$,
(1) $\dot A(t) = F(t, A(t)), \qquad A(t_0) = A_0,$
we aim to construct an approximation $Y(t) \approx A(t)$ in an approximation manifold $\mathcal{M}$ of much smaller dimension, which in the present work will be chosen as a manifold of tree tensor networks of fixed tree rank. This provides a data-sparse computational approach to high-dimensional problems that cannot be treated by direct time integration because of both excessive memory requirements and computational cost.
A differential equation for $Y(t) \in \mathcal{M}$ is obtained by choosing the time derivative $\dot Y(t)$ as that element in the tangent space $T_{Y(t)}\mathcal{M}$ for which
$\| \dot Y(t) - F(t, Y(t)) \| = \min,$
where the norm is chosen as the Euclidean norm of the vector of the tensor entries. In the quantum physics and chemistry literature, this approach is known as the Dirac–Frenkel time-dependent variational principle, named after work by Dirac in 1930, who used the approach in the context of what is nowadays known as the time-dependent Hartree–Fock method for the multi-particle time-dependent Schrödinger equation; see, e.g.,
[14, 15]. Equivalently, this minimum-defect condition can be stated as a Galerkin condition on the state-dependent approximation space $T_{Y(t)}\mathcal{M}$: find $\dot Y(t) \in T_{Y(t)}\mathcal{M}$ such that $\langle \dot Y(t) - F(t, Y(t)), Z \rangle = 0$ for all $Z \in T_{Y(t)}\mathcal{M}$. Using the orthogonal projection $P(Y)$ onto the tangent space $T_Y\mathcal{M}$, this can be reformulated as the (abstract) differential equation on $\mathcal{M}$,
(2) $\dot Y(t) = P(Y(t))\, F(t, Y(t)), \qquad Y(t_0) = Y_0 \in \mathcal{M}.$
This equation needs to be solved numerically in an efficient and robust way. For fixed-rank matrix and tensor manifolds, the orthogonal projection turns out to be an alternating sum of subprojections, which reflects the multilinear structure of the problem. The explicit form of the tangent space projection in the low-rank matrix case as an alternating sum of three subprojections was derived in [12] and was used in [17] to derive a projector-splitting integrator for low-rank matrices. This integrator efficiently updates an SVD-like low-rank factorization in every time step, and it is robust to the typically arising small singular values that cause severe difficulties with standard integrators applied to the system of differential equations for the factors of the SVD-like decomposition of the low-rank matrices; see [11]. The projector-splitting integrator was extended to tensor trains / matrix product states in [18]; see also [9] for a description of the algorithm in a physical idiom. It was further extended to Tucker tensors of fixed multilinear rank in [16]. A reinterpretation was given in [19], in which the Tucker tensor integrator was rederived by recursively performing inexact substeps in the matrix projector-splitting integrator applied to matricizations of the tensor differential equation, followed by retensorization. This interpretation made it possible to show that the Tucker integrator inherits the favorable robustness properties of the low-rank matrix projector-splitting integrator.
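For orientation, one step of the matrix projector-splitting (KSL) integrator of [17] can be sketched in a few lines. The sketch below is our own illustration, not code from the cited works; for simplicity each substep differential equation is solved by a single explicit Euler step, whereas in practice a more accurate ODE solver is substituted.

```python
import numpy as np

def ksl_step(U, S, V, F, t, h):
    """One step of the projector-splitting (KSL) integrator for
    dA/dt = F(t, A) on the manifold of rank-r matrices A = U S V^T.
    Each substep ODE is solved here by one explicit Euler step."""
    # K-step (forward): K = U S solves dK/dt = F(t, K V^T) V
    K = U @ S
    K = K + h * F(t, K @ V.T) @ V
    U1, S_hat = np.linalg.qr(K)
    # S-step (with the characteristic negative sign):
    # dS/dt = -U1^T F(t, U1 S V^T) V
    S_hat = S_hat - h * U1.T @ F(t, U1 @ S_hat @ V.T) @ V
    # L-step (forward): L = V S^T solves dL/dt = F(t, U1 L^T)^T U1
    L = V @ S_hat.T
    L = L + h * F(t, U1 @ L.T).T @ U1
    V1, S1T = np.linalg.qr(L)
    return U1, S1T.T, V1
```

Note that only QR decompositions of small matrices appear; no inverses of possibly ill-conditioned matrices are ever formed, which is the source of the robustness to small singular values.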
In the present paper we take up such a recursive approach to derive an integrator for general tree tensor networks, which is shown to be efficiently implementable (provided that the right-hand side function $F$ can be efficiently evaluated on tree tensor networks in factorized form) and to inherit the robust convergence properties of the low-rank matrix, Tucker tensor and tensor train / matrix product state integrators shown previously in [11, 19]. The proposed integrator for tree tensor networks reduces to the well-proven projector-splitting integrators in the particular cases of Tucker tensors and tensor trains / matrix product states. We expect (but will not prove) that it can itself be interpreted as a projector-splitting integrator based on splitting the tangent space projection of the fixed-rank tree tensor network manifold.
In Section 2 we introduce notation and the formulation of tree tensor networks as multilevel-structured Tucker tensors and give basic properties, emphasizing orthonormal factorizations. The tree tensor network (TTN) is constructed from the basis matrices at the leaves and the connection tensors at the inner vertices of the tree in a multilinear recursion that passes from the leaves to the root of the tree.
In Section 3 we recall the algorithm of the Tucker tensor integrator of [19] and extend it to the case of several Tucker tensors with the same basis matrices. This extended Tucker integrator, which is nothing but the Tucker integrator for an extended Tucker tensor, is a basic building block of the integrator for tree tensor networks.
In Section 4, the main algorithmic section of this paper, we derive the recursive TTN integrator and discuss the basic algorithmic aspects. In every time step, the integrator uses a recursion that passes from the root to the leaves of the tree for the construction of initial value problems on subtree tensor networks using appropriate restrictions and prolongations, and another recursion that passes from the leaves to the root for the update of the factors in the tree tensor network. The integrator only solves low-dimensional matrix differential equations (of the dimension of the basis matrices at the leaves) and low-dimensional tensor differential equations (of the dimension of the connection tensors at the inner vertices of the tree), alternating with orthogonal matrix decompositions of such small matrices and of matricizations of the connection tensors.
In Section 5 we prove a remarkable exactness property: if $A(t)$ is, for every $t$, a tree tensor network of the specified tree rank and $F(t, Y) = \dot A(t)$, then the recursive TTN integrator for this tree rank reproduces $A(t)$ exactly. This exactness property is proved using the analogous exactness property of the Tucker tensor integrator proved in [19], which in turn was proved using the exactness property for the matrix projector-splitting integrator that was discovered and proved in [17].
In Section 6 we prove first-order error bounds that are independent of small singular values of matricizations of the connection tensors. The proof relies on the similar error bound for the Tucker integrator [19], which in turn relies on such an error bound for the fixed-rank matrix projector-splitting integrator proved in [11], in a proof that makes essential use of the exactness property. The robustness to small singular values distinguishes the proposed integrator substantially from standard integrators applied to the differential equations for the basis matrices and connection tensors derived in [24]. We note that the proposed TTN integrator foregoes the formulation of these differential equations for the factors. The ill-conditioned density matrices whose inverses appear in these differential equations are never formed, let alone inverted, in the TTN integrator.
The present paper thus completes a path to extend the low-rank matrix projector-splitting integrator of [17], together with its favorable properties, from the dynamical low-rank approximation by matrices of a prescribed rank, via Tucker tensors of prescribed multilinear rank, to general tree tensor networks of prescribed tree rank.
In Section 7 we present a numerical experiment which shows the error behaviour of the proposed integrator in accordance with the theory. We choose the example of the retraction of the sum of a tree tensor network and a tangent network, which is an operation needed in many optimization algorithms for tree tensor networks; cf. [1] for the low-rank matrix case. The corresponding example was already chosen in numerical experiments for the low-rank matrix, tensor train and Tucker tensor cases in [17, 18, 19], respectively. It is beyond the scope of this paper to present the results of numerical experiments with the recursive TTN integrator in actual applications of tree tensor networks in physics, chemistry or other sciences. We note, however, that striking numerical experiments with precisely this integrator (given in an ad hoc formulation) are already reported in [4] for the Vlasov–Poisson equation of plasma physics, for a tree tensor network where the tree is given by the separation of the position and velocity variables, which are further separated into their Cartesian coordinates.
2 Preparation: Matrices, Tucker tensors, tree tensor networks, and their ranks
2.1 Matrices of rank $r$
The singular value decomposition shows that a matrix
$A \in \mathbb{R}^{m \times n}$ is of rank $r$ if and only if it can be factorized as $A = U S V^\top$, where $U \in \mathbb{R}^{m \times r}$ and $V \in \mathbb{R}^{n \times r}$ have orthonormal columns, and $S \in \mathbb{R}^{r \times r}$ has full rank $r$. The real $m \times n$ matrices of rank (exactly) $r$ are known to form a smooth embedded manifold in $\mathbb{R}^{m \times n}$ [10].
2.2 Tucker tensors of multilinear rank $(r_1, \dots, r_d)$
For a tensor $X \in \mathbb{R}^{n_1 \times \cdots \times n_d}$, the multilinear rank is defined as the tuple $(r_1, \dots, r_d)$ of the ranks $r_i$ of the matricizations $\mathbf{Mat}_i(X) \in \mathbb{R}^{n_i \times n_{\neg i}}$ for $i = 1, \dots, d$, where $n_{\neg i} = \prod_{j \ne i} n_j$. We recall that the $i$-th matricization aligns in its $k_i$-th row (for $k_i = 1, \dots, n_i$) all entries of $X$ that have the index $k_i$ in the $i$-th position, usually ordered colexicographically. The inverse operation is retensorisation of the matrix, denoted by $\mathbf{Ten}_i$: $\mathbf{Ten}_i(\mathbf{Mat}_i(X)) = X$.
It is known from [3] that the tensor $X$ has multilinear rank $(r_1, \dots, r_d)$ if and only if it can be factorized as a Tucker tensor (we adopt the shorthand notation from [13])
(3) $X = C \mathop{\times}_{i=1}^{d} U_i := C \times_1 U_1 \times_2 U_2 \cdots \times_d U_d,$
where the basis matrices $U_i \in \mathbb{R}^{n_i \times r_i}$ have orthonormal columns and the core tensor $C \in \mathbb{R}^{r_1 \times \cdots \times r_d}$ has full multilinear rank $(r_1, \dots, r_d)$. (This requires a compatibility condition among the ranks: $r_i \le \prod_{j \ne i} r_j$. In particular, this condition is satisfied if all ranks are equal.)
A useful formula for the $i$-th matricization of Tucker tensors is
(4) $\mathbf{Mat}_i(X) = U_i \, \mathbf{Mat}_i(C) \, (U_d \otimes \cdots \otimes U_{i+1} \otimes U_{i-1} \otimes \cdots \otimes U_1)^\top,$
where $\otimes$ denotes the Kronecker product of matrices.
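To fix these conventions concretely, matricization, retensorisation and the unfolding formula (4) can be realized in a few lines of NumPy. This is an illustrative sketch with our own (hypothetical) function names, using colexicographic ordering of the remaining indices:

```python
import numpy as np
from functools import reduce

def mat_i(X, i):
    """i-th matricization: rows indexed by k_i, the remaining indices
    ordered colexicographically along the columns."""
    return np.moveaxis(X, i, 0).reshape(X.shape[i], -1, order='F')

def ten_i(M, i, shape):
    """Retensorisation, the inverse operation of mat_i."""
    rest = [shape[j] for j in range(len(shape)) if j != i]
    return np.moveaxis(M.reshape([shape[i]] + rest, order='F'), 0, i)

def tucker(C, Us):
    """Assemble X = C x_1 U_1 ... x_d U_d by successive mode products."""
    X = C
    for i, U in enumerate(Us):
        shape = [U.shape[0] if j == i else X.shape[j] for j in range(X.ndim)]
        X = ten_i(U @ mat_i(X, i), i, shape)
    return X
```

With this ordering, the Kronecker factor in (4) has the mode-1 basis matrix rightmost, so that the first remaining index varies fastest.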
The tensors of given dimensions $n_1, \dots, n_d$ and fixed multilinear rank $(r_1, \dots, r_d)$ are known to form a smooth embedded manifold in $\mathbb{R}^{n_1 \times \cdots \times n_d}$.
2.3 Orthonormal tree tensor networks of tree rank $(r_\tau)$
A tree tensor network is a multilevel-structured Tucker tensor where the configuration is described by a tree. The notion of a ‘tree tensor network’ was coined in the quantum physics literature [22], but tree tensor networks were actually already used a few years earlier in the multilayer MCTDH method of chemical physics [24]. In the mathematical literature, tree tensor networks with binary trees have been studied as ‘hierarchical tensors’ [8] and with general trees as tensors in ‘tree-based tensor format’ [5, 6]. We remark that Tucker tensors and matrix product states / tensor trains [21, 20] are particular instances of tree tensor networks, whose trees are trees of minimal height (bushes) and binary trees of maximal height, respectively. As there does not appear to exist a firmly established notation for tree tensor networks, we give a formulation from scratch that we find useful for our purposes.
[Ordered trees with distinct leaves] Let $\mathcal{L}$ be a given finite set, to which we refer as the set of leaves. We define the set $\mathcal{T}$ of trees with leaves in $\mathcal{L}$ recursively as follows:
(i) $l \in \mathcal{T}$, for each $l \in \mathcal{L}$.
(ii) If, for some $L \ge 2$, the trees $\tau_1, \dots, \tau_L \in \mathcal{T}$ have pairwise disjoint sets of leaves, then the ordered tuple $\tau = (\tau_1, \dots, \tau_L) \in \mathcal{T}$.
The graphical interpretation is that (i) leaves are trees and (ii) every other tree is formed by connecting a root to several trees with distinct leaves. (Note that $L = 1$ is excluded: the 1-tuple $(\tau)$ is considered as identical to $\tau$.) $L(\tau) \subseteq \mathcal{L}$ is the set of leaves of the tree $\tau$.
The trees $\tau_1, \dots, \tau_L$ are called the direct subtrees of the tree $\tau = (\tau_1, \dots, \tau_L)$, which together with the direct subtrees of direct subtrees etc. are called the subtrees of $\tau$. We let $T(\tau)$ be the set of subtrees of a tree $\tau$, including $\tau$ itself. More formally, we set
$T(l) = \{ l \}$ for $l \in \mathcal{L}$, and $T(\tau) = \{ \tau \} \cup \bigcup_{i=1}^{L} T(\tau_i)$ for $\tau = (\tau_1, \dots, \tau_L)$.
In the graphical interpretation, the subtrees are in a bijective correspondence with the vertices of the tree, by assigning to each subtree its root; see Figure 2.1.
On the set of trees we define a partial ordering by writing, for $\sigma, \tau \in \mathcal{T}$, $\sigma \le \tau$ if and only if $\sigma \in T(\tau)$.
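The recursive definition can be made concrete in a few lines. The encoding below (a leaf is represented by its label, every other tree by the tuple of its direct subtrees) is our own illustrative choice, not notation from the paper:

```python
def leaves(tau):
    """The set of leaves L(tau): a leaf is encoded by its label,
    every other tree by the tuple of its direct subtrees."""
    if not isinstance(tau, tuple):
        return {tau}
    return set().union(*(leaves(s) for s in tau))

def subtrees(tau):
    """T(tau): all subtrees of tau, including tau itself."""
    if not isinstance(tau, tuple):
        return [tau]
    result = [tau]
    for s in tau:
        result += subtrees(s)
    return result
```

In this encoding the subtrees correspond exactly to the vertices of the tree, in line with the graphical interpretation.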
On a given tree $\bar\tau$, we work with the following set of data:
To each leaf $l \in \mathcal{L}$ we associate a dimension $n_l$, a rank $r_l \le n_l$ and a basis matrix $U_l \in \mathbb{R}^{n_l \times r_l}$ of full rank $r_l$.
To every subtree $\tau = (\tau_1, \dots, \tau_L) \le \bar\tau$ that is not a leaf we associate a rank $r_\tau$ and a connection tensor $C_\tau \in \mathbb{R}^{r_\tau \times r_{\tau_1} \times \cdots \times r_{\tau_L}}$ of full multilinear rank $(r_\tau, r_{\tau_1}, \dots, r_{\tau_L})$. We set $r_{\bar\tau} = 1$ for the root $\bar\tau$.
This can be interpreted as associating a tensor to the root of each subtree. In this way every vertex of the given tree carries either a matrix — if it is a leaf — or else a tensor whose order equals the number of edges leaving the vertex.
With these data, a tree tensor network (TTN) is constructed in a recurrence relation that passes from the leaves to the root of the tree.
[Tree tensor network] For a given tree $\bar\tau$, we recursively define a tree tensor network as follows:
If $\tau = l \in \mathcal{L}$ is a leaf, we set $U_\tau = U_l$, the given basis matrix. Else, for $\tau = (\tau_1, \dots, \tau_L)$, we set
$X_\tau = C_\tau \mathop{\times}_{i=1}^{L} U_{\tau_i} \in \mathbb{R}^{r_\tau \times n_{\tau_1} \times \cdots \times n_{\tau_L}}, \qquad U_\tau = \mathbf{Mat}_0(X_\tau)^\top \in \mathbb{R}^{n_\tau \times r_\tau}, \qquad n_\tau = n_{\tau_1} \cdots n_{\tau_L}.$
We note that the expression on the right-hand side of the definition of $X_\tau$ can be viewed as an $r_\tau$-tuple of Tucker tensors with the same basis matrices $U_{\tau_1}, \dots, U_{\tau_L}$ but different core tensors. The vectorizations of these tensors, which are of dimension $n_\tau = n_{\tau_1} \cdots n_{\tau_L}$, form the columns of the matrix $U_\tau$. The index $0$ in $\mathbf{Mat}_0$ and $\mathbf{Ten}_0$ refers to the mode of dimension $r_\tau$ of $X_\tau$, which we count as mode $0$.
It is favorable to work with orthonormal matrices $U_\tau$, so that each tensor $X_\tau$ is in the Tucker tensor format (3).
[Orthonormal tree tensor network] A tree tensor network (more precisely, its representation in terms of the basis matrices $U_l$ and connection tensors $C_\tau$) is called orthonormal if for each subtree $\tau$, the matrix $U_\tau$ has orthonormal columns.
The following is a key lemma.
For a tree $\tau = (\tau_1, \dots, \tau_L)$, let the matrices $U_{\tau_1}, \dots, U_{\tau_L}$ have orthonormal columns. Then, the matrix $U_\tau$ has orthonormal columns if and only if the matricization $\mathbf{Mat}_0(C_\tau)^\top$ has orthonormal columns.
We have, by the definition of $X_\tau$ and $U_\tau$ and the unfolding formula (4),
$U_\tau = (U_{\tau_L} \otimes \cdots \otimes U_{\tau_1}) \, \mathbf{Mat}_0(C_\tau)^\top.$
It follows that
$U_\tau^\top U_\tau = \mathbf{Mat}_0(C_\tau) \, \big( U_{\tau_L}^\top U_{\tau_L} \otimes \cdots \otimes U_{\tau_1}^\top U_{\tau_1} \big) \, \mathbf{Mat}_0(C_\tau)^\top = \mathbf{Mat}_0(C_\tau) \, \mathbf{Mat}_0(C_\tau)^\top,$
which proves the result.
We observe that, due to the recursive definition of a tree tensor network, it thus suffices to require that for each leaf $l$ the basis matrix $U_l$ and for every other subtree $\tau$ the matrix $\mathbf{Mat}_0(C_\tau)^\top$ have orthonormal columns. Unless stated otherwise, we intend the tree tensor networks to be orthonormal in this paper.
The orthonormality condition of Lemma 2.3 is very useful, because it reduces the orthonormality condition for the large, recursively constructed and computationally inaccessible matrix $U_\tau$ to the orthonormality condition for the smaller, explicitly given matrix $\mathbf{Mat}_0(C_\tau)^\top$. To our knowledge, the first use of this important property was made in the chemical physics literature in the context of the multilayer MCTDH method [24].
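Lemma 2.3 is easy to check numerically. The sketch below (our own hypothetical names; a leaf is encoded by its label, an inner vertex by the tuple of its subtrees, with the associated matrices and tensors stored in a dictionary) builds $U_\tau$ by the recursion $U_\tau = (U_{\tau_L} \otimes \cdots \otimes U_{\tau_1})\,\mathbf{Mat}_0(C_\tau)^\top$ and verifies that orthonormal leaf matrices together with an orthonormal $\mathbf{Mat}_0(C_\tau)^\top$ yield orthonormal columns of $U_\tau$:

```python
import numpy as np
from functools import reduce

def mat0(C):
    # mode-0 matricization, remaining indices ordered colexicographically
    return C.reshape(C.shape[0], -1, order='F')

def build_U(tau, data):
    """U_tau, built from the leaves to the root:
    U_l = basis matrix at a leaf,
    U_tau = (U_{tau_L} kron ... kron U_{tau_1}) Mat_0(C_tau)^T otherwise."""
    if not isinstance(tau, tuple):
        return data[tau]
    factors = [build_U(s, data) for s in tau]
    return reduce(np.kron, factors[::-1]) @ mat0(data[tau]).T
```

Of course, forming $U_\tau$ explicitly is exactly what is computationally inaccessible in high dimensions; the point of the lemma is that this construction never has to be carried out.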
A further consequence of Lemma 2.3 is that every tree tensor network has an orthonormal representation. This is shown using QR decompositions of the non-orthonormal matrices $\mathbf{Mat}_0(C_\tau)^\top$ and including the non-orthonormal triangular factors in the connection tensor of the parent tree. The tree tensor network is of full tree rank if all these matrices are of full rank. We rely on the following property, which can be proved as in [23], where binary trees are considered (this corresponds to the case $L = 2$ above); see also [5].
The set of tree tensor networks for a tree $\bar\tau$ of given dimensions and ranks is a smooth embedded manifold in the tensor space $\mathbb{R}^{n_1 \times \cdots \times n_d}$.
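The QR-based passage to an orthonormal representation can be illustrated at a single vertex, in the Tucker-like situation of one core tensor with its child basis matrices: each non-orthonormal matrix is replaced by the Q factor of its QR decomposition, and the R factor is absorbed into the parent's tensor, leaving the represented tensor unchanged. A minimal sketch with hypothetical names:

```python
import numpy as np

def mode_product(C, A, i):
    """Mode-i product C x_i A via matricization."""
    Ci = np.moveaxis(C, i, 0)
    M = A @ Ci.reshape(Ci.shape[0], -1)
    return np.moveaxis(M.reshape((A.shape[0],) + Ci.shape[1:]), 0, i)

def orthonormalize(C, Us):
    """Replace each basis matrix by the Q factor of its QR decomposition
    and absorb the triangular R factor into the core tensor; the
    represented tensor is unchanged."""
    Qs = []
    for i, U in enumerate(Us):
        Q, R = np.linalg.qr(U)
        Qs.append(Q)
        C = mode_product(C, R, i)
    return C, Qs
```

In the tree tensor network itself, the absorbed factor travels one level up the tree, so a single leaves-to-root sweep of such QR decompositions orthonormalizes the whole representation.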
3 Extended Tucker Integrator
In Subsection 3.1 we recapitulate the algorithm of the Tucker tensor integrator of [16, 19], and in Subsection 3.2 we extend the algorithm to tuples of Tucker tensors with the same basis matrices. This will be a basic building block for the tree tensor network integrator derived in the next section.
3.1 Tucker tensor integrator
Let $\mathcal{M}$ be the manifold of Tucker tensors with fixed multilinear rank $\mathbf{r} = (r_1, \dots, r_d)$. Let us consider an approximation $Y^0 \in \mathcal{M}$ to the initial data $A_0$.
The nested Tucker integrator is a numerical procedure that gives, after its substeps, an approximation in Tucker tensor format, $Y^1 \in \mathcal{M}$, to the full solution $A(t_1)$ after a time step $h = t_1 - t_0$. The procedure is repeated over further time steps to yield approximations $Y^n$ to $A(t_n)$. At each substep of the algorithm, only one factor of the Tucker representation is updated while the others, with the exception of the core tensor, remain fixed. The essential structure of the algorithm can be summarized in the following algorithmic scheme.
The update process in the Tucker integrator is the most intensive and technical part of the algorithm. In order to reduce the complexity of the explanation we introduce the subflows $\Phi^{(i)}$, $i = 1, \dots, d$, and $\Psi$, corresponding to the update of the basis matrices $U_i$ and of the core tensor $C$, respectively.
The remaining subflow $\Psi$ describes the final update process of the core tensor.
Finally, the result of the Tucker tensor integrator after one time step can be expressed in a compact way as
(5) $Y^1 = \big( \Psi \circ \Phi^{(d)} \circ \cdots \circ \Phi^{(1)} \big)(Y^0).$
We refer the reader to [19] for a detailed derivation and major properties of this Tucker tensor integrator.
The efficiency of the implementation of this algorithm depends on the possibility to evaluate the required functions without explicitly forming the large slim matrices and the full tensors involved. This is the case if $F(t, Y)$ is a linear combination of Tucker tensors of moderate rank whose factors can be computed directly from the factors $U_i$ and $C$ of $Y$ without actually computing the entries of the Tucker tensor.
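As an illustration of such a factorized evaluation (our own toy example, not taken from the paper), consider the linear operator $F(Y) = \sum_i Y \times_i L_i$ with given matrices $L_i$, as arises, e.g., for operators acting separately in each direction. If $Y = C \times_1 U_1 \cdots \times_d U_d$, the $i$-th term is again a Tucker tensor with the same core and with $U_i$ replaced by $L_i U_i$, so the factors of $F(Y)$ are obtained without ever forming the entries of $Y$:

```python
import numpy as np

def mode_product(C, A, i):
    # mode-i product C x_i A via matricization
    Ci = np.moveaxis(C, i, 0)
    M = A @ Ci.reshape(Ci.shape[0], -1)
    return np.moveaxis(M.reshape((A.shape[0],) + Ci.shape[1:]), 0, i)

def apply_linear_factored(C, Us, Ls):
    """Terms of F(Y) = sum_i Y x_i L_i in Tucker-factored form:
    the i-th term has core C and basis matrices with U_i -> L_i U_i.
    The full tensor Y is never formed."""
    return [(C, [Ls[i] @ U if j == i else U for j, U in enumerate(Us)])
            for i in range(len(Ls))]
```

The cost of forming the new factors is that of $d$ small matrix products, independent of the full tensor size $n_1 \cdots n_d$.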
3.2 Extended Tucker Integrator
We consider the case of a $(d+1)$-dimensional Tucker tensor where the first basis matrix (of mode $0$) in the decomposition of the initial data is the identity matrix of dimension $r_0$,
$Y^0 = C^0 \times_0 I_{r_0} \times_1 U_1^0 \cdots \times_d U_d^0,$
as appears in the recursive construction of orthonormal tree tensor networks. This can be viewed as a collection of $r_0$ Tucker tensors in $\mathbb{R}^{n_1 \times \cdots \times n_d}$ with the same basis matrices $U_1^0, \dots, U_d^0$. Recalling (5), the action of the Tucker integrator after one time step can be represented as
$Y^1 = \big( \Psi \circ \Phi^{(d)} \circ \cdots \circ \Phi^{(1)} \circ \Phi^{(0)} \big)(Y^0).$
The following result holds.
The action of the subflow $\Phi^{(0)}$ on $Y^0$ is trivial, i.e. $\Phi^{(0)}(Y^0) = Y^0$.
In the first step of the subflow $\Phi^{(0)}$, we matricize the core tensor in mode $0$ and perform a QR decomposition,
We define
and we set
The next step consists of solving the differential equation
We define
and we compute the corresponding QR decomposition,
To conclude, we solve the differential equation
We now observe that, due to the presence of the negative sign, the solution of this differential equation can be computed directly in closed form.
Therefore the factors are unchanged, and we conclude that $\Phi^{(0)}(Y^0) = Y^0$.
During the update process the structure of the initial data is preserved and we can now introduce the extended Tucker integrator as follows.
4 Recursive tree tensor network integrator
We now come to the central algorithmic section of this paper. We derive an integrator for orthonormal tree tensor networks, which updates the orthonormal basis matrices at the leaves and the orthonormality-constrained connection tensors at the other vertices of the tree by a recursive TTN integrator.
4.1 Derivation
Let $\bar\tau$ be a given tree with the set of leaves $\mathcal{L}$ and a specified family of tree ranks $(r_\tau)_{\tau \le \bar\tau}$, where we assume $r_{\bar\tau} = 1$ for the root. For each subtree $\tau \le \bar\tau$ we introduce the space
(6) $\mathcal{V}_\tau = \mathbb{R}^{r_\tau \times n_{\tau_1} \times \cdots \times n_{\tau_L}}$ for $\tau = (\tau_1, \dots, \tau_L)$, and $\mathcal{V}_l = \mathbb{R}^{r_l \times n_l}$ for leaves $l$.
In the following, we associate to each subtree $\tau$ of the given tree $\bar\tau$ a tensor-valued function $F_\tau : [0, T] \times \mathcal{V}_\tau \to \mathcal{V}_\tau$. Its actual recursive construction, starting from the root with $F_{\bar\tau} = F$ and passing to the leaves, will be given in the next subsection.
Consider a tree $\tau = (\tau_1, \dots, \tau_L)$ and an extended Tucker tensor $Y_\tau^0$ associated to the tree $\tau$,
$Y_\tau^0 = C_\tau^0 \times_0 I \times_1 U_{\tau_1}^0 \cdots \times_L U_{\tau_L}^0.$
Applying the extended Tucker integrator with the function $F_\tau$ we have that
$Y_\tau^1 = \big( \Psi_\tau \circ \Phi_\tau^{(L)} \circ \cdots \circ \Phi_\tau^{(1)} \big)(Y_\tau^0).$
We recall that the subflow $\Phi_\tau^{(i)}$ gives the update process of the basis matrix $U_{\tau_i}$. The extra subscript $\tau$ indicates that the subflow is computed for the function $F_\tau$.
We have two cases:

If $\tau_i$ is a leaf, we directly apply the subflow $\Phi_\tau^{(i)}$ and update the basis matrix $U_{\tau_i}$.

Else, we apply $\Phi_\tau^{(i)}$ only approximately (but still call the procedure $\Phi_\tau^{(i)}$). We tensorize the basis matrix $U_{\tau_i}$ and construct new initial data $Y_{\tau_i}^0$ and a function $F_{\tau_i}$. We iterate the procedure in a recursive way, reducing the dimensionality of the problem at each level of the recursion.
This leads to the definition of the recursive tree tensor network (TTN) integrator. It has the same general structure as the extended Tucker integrator.
The difference to the Tucker integrator is that now the subflow $\Phi_\tau^{(i)}$ is no longer the same as for the extended Tucker integrator, but it recursively uses the TTN integrator for the subtrees. This approximate subflow is defined in close analogy to the subflow for Tucker tensors, but the first differential equation is solved only approximately unless $\tau_i$ is a leaf.
The subflow $\Psi_\tau$ is the same as for the Tucker integrator, with the function $F_\tau$ instead of $F$.
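In pseudocode, the structure of one time step of the recursive TTN integrator can be summarized as follows. This is a structural sketch only; the helper names (restriction, prolongation, the construction of the functions on the subtrees) are our own paraphrase of the description above, not an implementable interface:

```
# one step of the recursive TTN integrator on the subtree tau (sketch)
ttn_step(tau, Y_tau, F_tau, t0, t1):
    for i, tau_i in direct_subtrees(tau):
        if tau_i is a leaf:
            # subflow Phi^(i): solve a small matrix differential equation
            # for the basis matrix at the leaf, with a QR decomposition
            # and a backward substep with the negative sign
            update_leaf_basis(tau_i, ...)
        else:
            # approximate subflow Phi^(i): restrict to the subtree tau_i
            # and recurse, reducing the dimension at every level
            Y_i, F_i = restrict(Y_tau, F_tau, tau_i)
            result_i = ttn_step(tau_i, Y_i, F_i, t0, t1)
            prolongate(Y_tau, result_i, tau_i)
    # subflow Psi: solve a small tensor differential equation for the
    # connection tensor at the root of tau
    update_connection_tensor(tau, ...)
```

The root-to-leaves passage appears in the restriction calls, the leaves-to-root passage in the updates performed as the recursion unwinds.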