# Dynamically orthogonal tensor methods for high-dimensional nonlinear PDEs

We develop new dynamically orthogonal tensor methods to approximate multivariate functions and the solution of high-dimensional time-dependent nonlinear partial differential equations (PDEs). The key idea relies on a hierarchical decomposition of the approximation space obtained by splitting the independent variables of the problem into disjoint subsets. This process, which can be conveniently be visualized in terms of binary trees, yields series expansions analogous to the classical Tensor-Train and Hierarchical Tucker tensor formats. By enforcing dynamic orthogonality conditions at each level of binary tree, we obtain coupled evolution equations for the modes spanning each subspace within the hierarchical decomposition. This allows us to effectively compute the solution to high-dimensional time-dependent nonlinear PDEs on tensor manifolds of constant rank, with no need for rank reduction methods. We also propose new algorithms for dynamic addition and removal of modes within each subspace. Numerical examples are presented and discussed for high-dimensional hyperbolic and parabolic PDEs in bounded domains.

## Authors

• 4 publications
• 10 publications
07/19/2020

### Dynamic tensor approximation of high-dimensional nonlinear PDEs

We present a new method based on functional tensor decomposition and dyn...
08/26/2019

### Stability analysis of hierarchical tensor methods for time-dependent PDEs

In this paper we address the question of whether it is possible to integ...
12/10/2020

### Rank-adaptive tensor methods for high-dimensional nonlinear PDEs

We present a new rank-adaptive tensor method to compute the numerical so...
03/25/2020

### Spectral methods for nonlinear functionals and functional differential equations

We develop a rigorous convergence analysis for finite-dimensional approx...
04/23/2021

### Hierarchical adaptive low-rank format with applications to discretized PDEs

A novel compressed matrix format is proposed that combines an adaptive h...
03/02/2022

### Neural Galerkin Scheme with Active Learning for High-Dimensional Evolution Equations

Machine learning methods have been shown to give accurate predictions in...
06/12/2019

### Model Order Reduction by Proper Orthogonal Decomposition

We provide an introduction to POD-MOR with focus on (nonlinear) parametr...
##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## 1 Introduction

High-dimensional partial differential equations (PDEs) arise in many areas of engineering, physical sciences and mathematics. Classical examples are equations involving probability density functions (PDFs) such as the Fokker-Plank equation

Risken , the Liouville equation Venturi_PRS ; HeyrimJCP_2014 , or the Boltzmann equation cercignani1988 ; dimarco2014 . Other types of high-dimensional PDEs can be obtained as finite-dimensional approximations of functional differential equations venturi2018numerical , such as the Hopf equation of turbulence Hopf ; Hopf1 ; Monin2 , the Schwinger-Dyson equation of quantum mechanics Itzykson , or the Martin-Siggia-Rose formulation of classical statistical dynamics Martin ; Jensen ; Jouvet ; Phythian

. Computing the solution to high-dimensional PDEs is a challenging problem that requires approximating high-dimensional functions, i.e., the solution to the PDE, and then developing appropriate numerical schemes to compute such functions accurately. Classical numerical methods based on tensor product representations are not viable in high-dimensions, as the number of degrees of freedom grows exponentially fast with the dimension. To address this problem there have been substantial research efforts in recent years on high-dimensional numerical approximation theory. Techniques such as sparse collocation

Bungartz ; Chkifa ; Barthelmann ; Foo1 ; Akil , high-dimensional model representations (HDMR) Li1 ; CaoCG09 ; Baldeaux

and, more recently, deep neural networks

Raissi ; Raissi1 ; Zhu2019 and tensor methods khoromskij ; Bachmayr ; parr_tensor ; Hackbusch_book ; ChinestaBook ; Kolda were proposed to mitigate the exponential growth of the degrees of freedom, the computational cost and memory requirements.

In this paper, we develop a new dynamically orthogonal tensor method to approximate multivariate functions and the solution of high-dimensional time-dependent nonlinear PDEs. The key idea relies on a hierarchical decomposition of the function space in terms of a sequence of nested subspaces of smaller dimension. Such decomposition is induced by by splitting the independent variables of the problem recursively into two disjoint subsets which can conveniently be visualized by binary trees. In particular, we study two classes of trees which are analogous to the Tensor-Train (TT) OseledetsTT and Hierarchical Tucker (HT) Grasedyck2018 tensor formats. By enforcing dynamic orthogonality (DO) do or bi-orthogonality (BO) Cheng2013 conditions at each level of the TT or the HT binary tree, we obtain coupled evolution equations for the modes spanning each subspace within the hierarchy. This allows us to represent the time evolution of high-dimensional functions and compute the solution of high-dimensional time-dependent nonlinear PDEs on a tensor manifold with constant rank. This formulation has several advantages over classical numerical tensor methods. In particular, the hard-to-compute nonlinear projection Lubich2018 ; hierar that maps the solution of high-dimensional PDEs onto a tensor manifold with constant rank h_tucker_geom here is represented explicitly by the hierarchical DO/BO propagator111The hierarchical DO/BO propagator is nonlinear even for linear PDEs. Such nonlinearity implicitly represnts the projection onto a tensor manifold with constant rank., i.e., by a system of coupled one-dimensional nonlinear PDEs. In other words, there is no need to perform tensor rank reduction hsvd_tensors_grasedyk ; Grasedyck2018 ; Kressner2014 , rank-constrained temporal integration Lubich2018 ; hierar , or Riemannian optimization DaSilva2015 , when solving high-dimensional PDEs with the hierarchical subspace decomposition method we propose222Classical numerical tensor methods for high-dimensional PDEs with explicit time stepping schemes require rank-reduction to project the solution back into the tensor manifold with prescribed rank (see venturi2018numerical §5.5), the so-called retraction step DaSilva2015

. This can be achieved, e.g., by a sequence of suitable matricizations followed by hierarchical singular value decomposition

hsvd_tensors_grasedyk ; Grasedyck2018 ; Kressner2014 , or by optimization DaSilva2015 ; Kolda ; parr_tensor ; Silva ; Rohwedder ; Karlsson . Rank reduction can be computationally intensive, especially if performed at each time step. Tensor methods with implicit time stepping suffer from similar issues. In particular, the nonlinear system that yields the solution at the next time step needs to be solved on a tensor manifold with constant rank by using, e.g., Riemannian optimization algorithms DaSilva2015 ; h_tucker_geom ; Etter ..

This paper is organized as follows. In Section 2

we introduce a recursive bi-orthogonal decomposition method for time-independent multivariate functions, develop error estimates and provide simple examples of application. In Section

3 we extend the recursive bi-orthogonal decomposition to time-dependent functions, and develop a hierarchy of nested time-dependent orthogonal projections generalizing the DO and BO conditions to tensor formats with multiple levels. We also prove that the approximations resulting from the BO and the DO conditions are equivalent, in the sense that they span the same function spaces. In Section 4, we apply the recursive subspace decomposition method to compute the solution of high-dimensional nonlinear PDEs. In Section 5 we provide numerical examples demonstrating the accuracy and computational effectiveness of the recursive subspace decomposition method we propose. Specifically we study high-dimensional hyperbolic and parabolic PDEs. The main findings are summarized in Section 6.

## 2 Recursive bi-orthogonal decomposition of time-independent multivariate functions

Let be a subset of () that contains an open set333If contains an open set then ., and let

 u:Ω→R (1)

be a multivariate function which we assume to be an element of a separable Hilbert space . A possible choice of such Hilbert space is the Sobolev space

 Hk(Ω)={u∈L2(Ω):Dαu∈L2(Ω)for all|α|≤k} ,k=0,1,2,… (2)

where is a multi-index and

 Dαu=∂|α|u∂xα11…∂xαdd,|α|=α1+⋯+αd. (3)

Note that (2) includes the classical Lebesgue space . We equip (2) with the standard inner product

 ⟨f,g⟩Hk(Ω)=∑|α|≤k∫ΩDαf(x)Dαg(x)dx1…dxd. (4)

If needed, this inner product can be weighted by a non-negative separable density . Any separable Hilbert space is isomorphic to , and it can be represented as a tensor product of two Hilbert spaces (math_phys_vol1, , p.51), i.e.,

 H≅H1⊗H2. (5)

The spaces and may be specified by partitioning the spatial variables into two disjoint subsets. This is equivalent to represent the domain as a Cartesian product of two sub-domains (whenever possible). For instance, consider the partition

 Ω=Ω(1,…,p)×Ω(p+1,…,d) (6)

induced by the following splitting of the spatial variables

 (x1,x2,…,xd)inΩ=((x1,…,xp)inΩ(1,…,p),(xp+1,…,xd)inΩ(p+1,…,d)). (7)

In this setting, the Sobolev space (2) admits the following decomposition

 Hk(Ω)≅Hk(Ω(1,…,p))⊗Hk(Ω(p+1,…,d)). (8)

The inner products within each subspace and can be defined, respectively, as

 ⟨f,g⟩Hk(Ω(1,…,p))=∑α1+⋯+αp≤k∫Ω(1,…,p)∂α1+⋯+αpf∂xα11…∂xαpp∂α1+⋯+αpg∂xα11…∂xαppdx1⋯dxp, (9)

and

 ⟨f,g⟩Hk(Ω(p+1,…,d))=∑αp+1+⋯+αd≤k∫Ω(p+1,…,d)∂αp+1+⋯+αdf∂xαp+1p+1…∂xαdd∂αp+1+⋯+αdg∂xαp+1p+1…∂xαdddxp+1⋯dxd. (10)

A representation of the multivariate function (1) in the tensor product space (8) has the general form

 u(x1,…,xd)=∞∑i,j=1aijφ(1,…,p)i(x1,…,xp)φ(p+1,…,d)j(xp+1,…,xd) , (11)

where and are orthonormal basis functions in and , respectively444Orthonormality is relative to the inner products in and .. The superscripts in (11) denote which spatial components the function depends on. This will be the case throughout this paper and for notational simplicity, the spatial arguments will be often omitted when there is no ambiguity.

With the isomorphism (8) and the inner products (9)-(10) set, it is straightforward to develop an operator framework which guarantees the existence of a diagonalized bi-orthogonal representation of the field . To this end, following Aubry et. al. aubry_1 ; aubry_2 ; aubry_3 and Venturi venturi_bi_orthogonal (see also venturi2006 ; venturi2008 ), we define the integral operator

 Uu:Hk(Ω(1,…,p))→Hk(Ω(p+1,…,d)), (12) (Uuψ)(xp+1,…,xd)=⟨u,ψ⟩Hk(Ω(1,…,p)).

The formal adjoint of , denoted by , is a linear operator defined by the requirement

 ⟨Uuψ,φ⟩Hk(Ω(p+1,…,d))=⟨ψ,U†uφ⟩Hk(Ω(1,…,p)) , (13)

for all , and all . By using integration by parts and discarding boundary conditions (formal adjoint operator) we obtain

 U†u:Hk(Ω(p+1,…,d))→Hk(Ω(1,…,p)) , (14) (U†uφ)(x1,…,xp)=⟨u,φ⟩Hk(Ω(p+1,…,d)) ,

The subscript “” in and identifies the kernel of the integral operators. We will shortly define a hierarchy of such operators and it will be important to distinguish them by their kernels. Next, we introduce the following correlation operators

 Ru=UuU†u,Ru:Hk(Ω(p+1,…,d))→Hk(Ω(p+1,…,d)) , (15)

and

 Lu=U†uUu,Lu:Hk(Ω(1,…,p))→Hk(Ω(1,…,p)). (16)

Note that and are self-adjont relative to (9) and (10), respectively. Moreover, if is compact (e.g., if we consider a decomposition in ), then is compact, and therefore and are compact. Hence, by the Riesz-Schauder theorem, they have the same discrete spectra (see e.g. (kato, , p.185)). By a direct calculation, it can be show that

 (Ruφ)(xp+1,…,xd)=⟨ru,φ⟩Hk(Ω(p+1,…,d))φ∈Hk(Ω(p+1,…,d)) , (17)

where the correlation function is defined by

 ru(xp+1,…,xd,x′p+1,…,x′d)=⟨u,u⟩Hk(Ω(1,…,p)). (18)

Similarly,

 (Luψ)(x1,…,xp)=⟨lu,ψ⟩Hk(Ω(1,…,p))ψ∈Hk(Ω(1,…,p)) , (19)

where the correlation function is defined by

 lu(x1,…,xp,x′1,…,x′p)=⟨u,u⟩Hk(Ω(p+1,…,d)). (20)

It is a classical result in the spectral theory of compact operators (see e.g., aubry_1 ; aubry_3 ) that there exists a canonical decomposition of the field (1) of the form

 u=∞∑k=1λkψ(1,…,p)kψ(p+1,…,d)k , (21)

where the modes and

satisfy the eigenvalue problem

 [Uu00U†u]⎡⎣ψ(1,…,p)kψ(p+1,…,d)k⎤⎦=λk[0110]⎡⎣ψ(1,…,p)kψ(p+1,…,d)k⎤⎦. (22)

Moreover, it can be shown that , and

 ⟨ψ(1,…,p)iψ(1,…,p)j⟩Hk(Ω(1,…,p))=⟨ψ(p+1,…,d)iψ(p+1,…,d)j⟩Hk(Ω(p+1,…,d))=δij. (23)

The series (21) is usually called bi-orthogonal (or Schmidt) decomposition of the multivariate function , and it converges in norm. The modes

are eigenfunctions of the operator

with corresponding eigenvalue , while the modes are eigenfunctions of the operator with corresponding eigenvalues , i.e.,

 Luψ(1,…,p)k =λ2kψ(1,…,p)k , (24) Ruψ(p+1,…,d)k =λ2kψ(p+1,…,d)k.

In practice, since and are determined up to two unitary transformations aubry_2 , to compute (21) we can solve one of the two eigenvalue problems in (24) (the one with smaller dimension), and then use one of the dispersion relations (22), i.e.,

 ψ(p+1,…,d)k=1λkUuψ(1,…,p)k,ψ(1,…,p)k=1λkU†uψ(p+1,…,d)k. (25)

### 2.1 Hierarchical subspace decomposition and tensor formats

To obtain a series expansion of the multivariate function in terms of univariate functions, we apply the bi-orthogonal decomposition method discussed in the previous Section recursively. The way in which the variables are split in each step of the recursive procedure, i.e., the choice of in (8), can be conveniently visualized by binary trees. In Figure 1 we provide two simple examples of such binary trees corresponding to the Tensor-Train (TT) OseledetsTT and the Hierarchical Tucker (HT) Grasedyck2018 tensor formats (see also Hackbusch_book ; venturi2018numerical , and the references therein).

#### 2.1.1 Tensor Train (TT) format

The Tensor-Train format singles out one variable at a time, resulting in a binary tree with depth when decomposing -variate functions (see Figure 1). This corresponds to the following hierarchical subspace decomposition of the Sobolev space (2)

 Hk(Ω)= Hk(Ω(1))⊗Hk(Ω(2,…,d)), = Hk(Ω(1))⊗[Hk(Ω(2))⊗Hk(Ω(3,…,d))], = Hk(Ω(1))⊗[Hk(Ω(2))⊗{Hk(Ω(3))⊗Hk(Ω(4,…,d))}] , ⋯ .

in which we diagonalize each tensor product representation using the bi-orthogonal decomposition method. This yields the following TT expansion of the multivariate function

 u =∞∑i1=1λi1ψ(1)i1ψ(2,…,d)i1 , (26) ψ(2,…,d)i1 =∞∑i2=1λi1i2ψ(2)i1i2ψ(3,…,d)i1i2 , (27) ⋮ ψ(j,…,d)i1⋯ij−1 =∞∑ij=1λi1⋯ijψ(j)i1⋯ijψ(j+1,…,d)i1⋯ij , (28) ⋮ ψ(d−1,d)i1⋯id−2 =∞∑id−1=1λi1⋯id−1ψ(d−1)i1⋯id−1ψ(d)i1⋯id−1 , (29)

i.e.,

 u=∞∑i1=1∞∑i2=1⋯∞∑id−1=1λi1λi1i2⋯λi1⋯id−1ψ(1)i1ψ(2)i1i2⋯ψ(d−1)i1⋯id−1ψ(d)i1⋯id−1. (30)

Each of the bi-orthogonal modes can be obtained by solving a sequence of one-dimensional eigenvalue problems followed by projections. Specifically, the eigenvalue problems are

 Luψ(1)i1=λ2i1ψ(1)i1,Lψ(k−1)i1⋯ik−1ψ(k)i1⋯ik=λ2i1⋯ikψ(k)i1⋯ik,k=2,…,d−1, (31)

(see Eq. (19)) while the corresponding projections are defined as

 ψ(2,…,d)i1=1λi1⟨u,ψ(1)i1⟩Hk(Ω(1)),ψ(j+1,…,d)i1⋯ij=1λi1⋯ij⟨u,ψ(j)i1⋯ij⟩Hk(Ω(j+1))j=2,…,d−1. (32)

#### 2.1.2 Hierarchical Tucker (HT) format

The Hierarchical Tucker format splits variables into disjoint subsets of equal size, whenever possible. In the case -variate functions , where for some natural number , the tree is balanced. In general, the Hierarchical Tucker tree is more shallow than a Tensor Train tree for the same number of variables . In fact, the depth of the HT tree for is , while the corresponding TT tree has depth . The HT format is based on the following hierarchical decomposition of the Sobolev space

 Hk(Ω)= Hk(Ω(1,…,d/2))⊗Hk(Ω(d/2+1,…,d)), = ⋯ .

As before, we diagonalize each tensor product representation as we proceed splitting variables down the tree. This yields the following sequence of bi-orthogonal decompositions

 u =∞∑i1=1λ(1,…,d/2)i1ψ(1,…,d/2)i1ψ(d/2+1,…,d)i1 , (33) ψ(1,…,d/2)i1 =∞∑i2=1λ(1,…,d/4)i1i2ψ(1,…,d/4)i1i2ψ(d/4+1,…,d/2)i1i2 , (34) ψ(d/2+1,…,d)i1 =∞∑i2=1λ(d/2+1,…,3d/4)i1i2ψ(d/2+1,…,3d/4)i1i2ψ(3d/4+1,…,d)i1i2 , (35) ⋮ ψ(1,2)i1⋯in−1 =∞∑in=1λ(1)i1⋯inψ(1)i1⋯inψ(2)i1⋯in , (36) ⋮ ψ(d−1,d)i1⋯in−1 =∞∑in=1λ(d−1)i1⋯inψ(d−1)i1⋯inψ(d)i1⋯in , (37)

and the expansion

 u=∞∑i1=1⋯∞∑in=1λ(1,…,d2)i1⋯λ(d−1)i1⋯inψ(1)i1⋯inψ(2)i1⋯in⋯ψ(d)i1⋯in. (38)

Similar to the TT format, a sequence of eigenfunction problems followed by projections are used to obtain the modes spanning the hierarchical subspaces. However, in the HT case the eigenfunction problems are higher dimensional and not tractable for large .

##### Remark

Clearly, we may decompose a multivariate function by splitting variables in various ways at different levels of the decompositions. Any binary tree which has leaves containing one index leads to a series expansion in terms of functions of one spatial variable. Separable Hilbert spaces defined on a Cartesian product of one-dimensional domains always allow such reduction.

### 2.2 Error analysis

In this Section we develop an error analysis for the recursive biorthogonal decomposition we discussed in Section 2.1. To this end we first state a Lemma which will be useful in Section 2.4 for establishing a thresholding criterion to truncate the infinite sums in (30) and (38).

###### Lemma 2.1.

 u=∞∑i=1λiψ(1,…,p)iψ(p+1,…,d)ip∈{2,…,d−1} , (39)

then .

###### Proof.

This result follows immediately from the orthonormality of the modes and relative to the inner products (9)-(10).

Next, we analyze the error in the norm between the Tensor Train series expansion (30) and the truncated expansion

 ~u=r1∑i1=1r2(i1)∑i2=1⋯rd−1(i1,…,id−2)∑id−1=1λi1⋯λi1⋯id−1ψ(1)i1ψ(2)i1i2⋯ψ(d−1)i1⋯id−1ψ(d)i1⋯id−1 , (40)

where are truncation ranks. To simplify indexing and array bounds for truncated TT expansions such as (40), we will omit the array indices in the rank arrays and write, e.g., instead of , since the rank array indices are clear from the subscripts of the mode . In this simplified notation, the truncated TT expansion (40) can be written as

 ~u(1,…,d)=r1∑i1=1r2∑i2=1⋯rd−1∑id−1=1λi1⋯λi1⋯id−1ψ(1)i1ψ(2)i1i2⋯ψ(d−1)i1⋯id−1ψ(d)i1⋯id−1. (41)
###### Proposition 2.1.

Let . The error incurred by truncating the infinite expanion (30) to the finite expansion (41) is given by

 ∥u−~u∥2Hk(Ω)= ∞∑i1=r1+1λ2i1+r1∑i1=1∞∑i2=r2+1λ2i1λ2i1i2+⋯ (42) +r1∑i1=1r2∑i2=1⋯rd−2∑id−2=1∞∑id−1=rd−1+1λ2i1λ2i1i2⋯λ2i1⋯id−1.
###### Proof.

Let us rewrite (30) as

 u=∞∑i1=1λi1ψ(1)i1∞∑i2=1λi1i2ψ(2)i1i2⋯∞∑id−1=1λi1⋯id−1ψ(d−1)i1⋯id−1ψ(d)i1⋯id−1 (43)

and split each infinite sum into the superimposition of a finite sum and an infinite sum, i.e.,

 u= (r1∑i1=1λi1ψ(1)i1+∞∑i1=r1+1λi1ψ(1)i1)(r2∑i2=1λi1i2ψ(2)i1i2+∞∑i2=r2+1λi1i2ψ(2)i1i2)⋯ (44) ⋯⎛⎝rd−1∑id−1=1λi1⋯id−1ψ(d−1)i1⋯id−1ψ(d)i1⋯id−1+∞∑id−1=rd−1+1λi1⋯id−1ψ(d−1)i1⋯id−1ψ(d)i1⋯id−1⎞⎠.

Expanding the products in (44) yields the following expression

 u= r1∑i1=1r2∑i2=1⋯rd−1∑id−1=1λi1⋯λi1⋯id−1ψ(1)i1⋯ψ(d−1)i1⋯id−1ψ(d)i1⋯id−1 (45) +∞∑i1=r1+1∞∑i2=1⋯∞∑id−1=1λi1⋯λi1⋯id−1ψ(1)i1⋯ψ(d−1)i1⋯id−1ψ(d)i1⋯id−1 +r1∑i1=1∞∑i2=r2+1∞∑i3=1⋯∞∑id−1=1λi1⋯λi1⋯id−1ψ(1)i1⋯ψ(d−1)i1⋯id−1ψ(d)i1⋯id−1 +r1∑i1=1r2∑i2=1∞∑i3=r3+1∞∑i4=1⋯∞∑id−1=1λi1⋯λi1⋯id−1ψ(1)i1⋯ψ(d−1)i1⋯id−1ψ(d)i1⋯id−1 ⋮ +r1∑i1=1r2∑i2=1⋯rd−2∑id−2=1∞∑id−1=rd−1+1λi1⋯λi1⋯id−1ψ(1)i1⋯ψ(d−1)i1⋯id−1ψ(d)i1⋯id−1.

Using the orthogonality of each set of modes we obtain

 ∥u−~u∥2Hk(Ω)= ∞∑i1=r1+1∞∑i2=1⋯∞∑id−1=1λi1⋯λi1⋯id−1∥ψ(1)i1⋯ψ(d−1)i1⋯id−1ψ(d)i1⋯id−1∥2Hk(Ω) + r1∑i1=1∞∑i2=r2+1∞∑i3=1⋯∞∑id−1=1λi1⋯λi1⋯id−1∥ψ(1)i1⋯ψ(d−1)i1⋯id−1ψ(d)i1⋯id−1∥2Hk(Ω) + r1∑i1=1r2∑i2=1∞∑i3=r3+1∞∑i4=1⋯∞∑id−1=1λi1⋯λi1⋯id−1∥ψ(1)i1⋯ψ(d−1)i1⋯id−1ψ(d)i1⋯id−1∥2Hk(Ω) ⋮ +r1∑i1=1r2∑i2=1