# Dynamic tensor approximation of high-dimensional nonlinear PDEs

We present a new method based on functional tensor decomposition and dynamic tensor approximation to compute the solution of a high-dimensional time-dependent nonlinear partial differential equation (PDE). The idea of dynamic approximation is to project the time derivative of the PDE solution onto the tangent space of a low-rank functional tensor manifold at each time. Such a projection can be computed by minimizing a convex energy functional over the tangent space. This minimization problem yields the unique optimal velocity vector that allows us to integrate the PDE forward in time on a tensor manifold of constant rank. In the case of initial/boundary value problems defined in real separable Hilbert spaces, this procedure yields evolution equations for the tensor modes in the form of a coupled system of one-dimensional time-dependent PDEs. We apply the dynamic tensor approximation to a four-dimensional Fokker-Planck equation with non-constant drift and diffusion coefficients, and demonstrate its accuracy in predicting relaxation to statistical equilibrium.


## 1 Introduction

High-dimensional partial differential equations (PDEs) arise in many areas of engineering, physical sciences and mathematics. Classical examples are equations involving probability density functions (PDFs), such as the Fokker–Planck equation Risken (1989), the Liouville equation Venturi et al. (2012); Cho et al. (2016), or the Boltzmann equation Cercignani (1988); di Marco and Pareschi (2014); Boelens et al. (2020). More recently, high-dimensional PDEs have also become central to many new areas of application such as optimal mass transport Gangbo et al. (2019); Villani (2009), random dynamical systems Venturi and Karniadakis (2014); Venturi et al. (2012), mean field games E et al. (2019); Ruthotto et al. (2020), and functional-differential equations Venturi and Dektor (2020); Venturi (2018).

Computing the solution to high-dimensional PDEs is a challenging problem that requires approximating high-dimensional functions, i.e., the solution to the PDE, and then developing appropriate numerical schemes to compute such functions accurately. Classical numerical methods based on tensor product representations are not viable in high dimensions, as the number of degrees of freedom grows exponentially fast with the dimension. To address this problem there have been substantial research efforts in recent years on approximation theory for high-dimensional systems. Techniques such as sparse collocation Bungartz and Griebel (2004); Chkifa et al. (2014); Barthelmann et al. (2000); Foo and Karniadakis (2010); Narayan and Jakeman (2014), high-dimensional model representations Li and Rabitz (2008); Cao et al. (2009); Baldeaux and Gnewuch (2014), deep neural networks Raissi and Karniadakis (2018); Raissi et al. (2019); Zhu et al. (2019), and tensor methods Khoromskij (2015); Bachmayr et al. (2016); Rodgers and Venturi (2020); Boelens et al. (2018); Hackbusch (2012); Kolda and Bader (2009) were proposed to mitigate the exponential growth of the degrees of freedom, the computational cost and the memory requirements. In recent work Dektor and Venturi (2020), we proposed a new method for solving high-dimensional time-dependent PDEs based on dynamically orthogonal tensor series expansions. The key idea is to represent the solution in terms of a hierarchy of Schmidt decompositions and then enforce dynamic orthogonality constraints on the tensor modes. In the case of initial/boundary value problems for PDEs defined in separable geometries, this procedure yields evolution equations for the dynamic tensor modes in the form of a coupled system of one-dimensional time-dependent PDEs.

In this paper, we develop an extension of this approach based on the functional tensor train (FTT) expansion recently proposed by Bigoni, Engsig–Karup and Marzouk in Bigoni et al. (2016). In particular, we prove that FTT, combined with the set of hierarchical dynamic orthogonality constraints we introduced in Dektor and Venturi (2020), defines the best dynamic approximation of the solution to a nonlinear PDE on a smooth tensor manifold with constant rank. To describe what we mean by best dynamic approximation, consider the autonomous PDE

$$\frac{\partial u(\mathbf{x},t)}{\partial t}=N(u(\mathbf{x},t)),\qquad u(\mathbf{x},0)=u_0(\mathbf{x}),\tag{1}$$

where $u(\mathbf{x},t)$ is a $d$-dimensional (time-dependent) scalar field defined in the domain $\Omega\subseteq\mathbb{R}^d$ and $N$ is a nonlinear operator which may depend on the spatial variables and may incorporate boundary conditions. Suppose that at some fixed time $t$ the solution $u(\mathbf{x},t)$ belongs to a smooth manifold $\mathcal{M}$ embedded in a real Hilbert space $H$. The best dynamic approximation aims at approximating the solution at a later time with a point lying on the manifold $\mathcal{M}$ by determining the optimal vector $v(\mathbf{x},t)$ in the tangent plane $T_{u(\mathbf{x},t)}\mathcal{M}$ that best approximates $\partial u/\partial t$. This is achieved by solving the variational problem

$$\min_{v(\mathbf{x},t)\in T_{u(\mathbf{x},t)}\mathcal{M}}\left\|v(\mathbf{x},t)-\frac{\partial u(\mathbf{x},t)}{\partial t}\right\|_{H}=\min_{v(\mathbf{x},t)\in T_{u(\mathbf{x},t)}\mathcal{M}}\left\|v(\mathbf{x},t)-N(u(\mathbf{x},t))\right\|_{H}.\tag{2}$$

Such an approximation is an infinite-dimensional analogue of the dynamical low–rank approximation on Euclidean manifolds considered by Lubich et al. for matrices Koch and Lubich (2007); Nonnenmacher and Lubich (2008), Tucker tensors Koch and Lubich (2010), and hierarchical tensors Lubich et al. (2013, 2015).

This paper is organized as follows. In section 2 we briefly review the hierarchical Schmidt decomposition of multivariate functions (FTT format) and address its effective computation. In section 3 we prove that the set of constant-rank FTT tensors is a smooth Hilbert manifold, which therefore admits a tangent plane at each point. This result generalizes (Uschmajew and Vandereycken, 2013, Theorem 4) to tensor manifolds in infinite dimensions. In section 4 we parameterize the tangent space of the Hilbert manifold and derive a system of partial differential equations for the FTT cores corresponding to a given PDE. This system is shown to be the projection of the time derivative of the PDE solution onto the tangent space of the tensor manifold. In Section 5 we provide a numerical demonstration of the dynamic functional tensor train approximation for a four-dimensional Fokker–Planck equation with non-constant drift and diffusion coefficients. Finally, the main findings are summarized in section 6.

## 2 Functional tensor train (FTT) decomposition in real separable Hilbert spaces

Let $\Omega$ be a Cartesian product of real intervals

$$\Omega=\Omega_1\times\Omega_2\times\cdots\times\Omega_d,\tag{3}$$

$\mu$ a finite product measure on $\Omega$

$$\mu(\mathbf{x})=\prod_{i=1}^{d}\mu_i(x_i),\tag{4}$$

and

$$H=L^2_\mu(\Omega)\tag{5}$$

the standard weighted Hilbert space of square-integrable functions on $\Omega$ (the Hilbert space in equation (5) can be equivalently chosen to be a Sobolev space; see Dektor and Venturi (2020) for details). In this section we briefly review the functional tensor train decomposition Bigoni et al. (2016); Gorodetsky et al. (2019) of a multivariate function in the setting of hierarchical bi-orthogonal series expansions Aubry et al. (1991); Aubry and Lima (1995); Venturi (2006, 2011). To this end, split the domain as $\Omega=\Omega_x\times\Omega_y$, with corresponding measures $\mu_x$ and $\mu_y$, and let $u\in L^2_\mu(\Omega)$. The operator

$$T:L^2_{\mu_y}(\Omega_y)\to L^2_{\mu_x}(\Omega_x),\qquad g\mapsto\int_{\Omega_y}u(x,y)g(y)\,d\mu_y(y)\tag{6}$$

is linear, bounded, and compact, since $u(x,y)$ is a Hilbert–Schmidt kernel. The formal adjoint operator of $T$ is given by

$$T^{*}:L^2_{\mu_x}(\Omega_x)\to L^2_{\mu_y}(\Omega_y),\qquad h\mapsto\int_{\Omega_x}u(x,y)h(x)\,d\mu_x(x).\tag{7}$$

The composition operator $T^{*}T$ is self-adjoint and compact. Its spectrum, denoted by $\{\lambda_i\}_{i=1}^{\infty}$, is countable, has one accumulation point at zero, and satisfies

$$\sum_{i=1}^{\infty}\lambda_i<\infty.\tag{8}$$

The normalized eigenfunction of $T^{*}T$ corresponding to $\lambda_i$, denoted by $\varphi_i$, is an element of $L^2_{\mu_y}(\Omega_y)$. The set $\{\varphi_i\}_{i=1}^{\infty}$ is an orthonormal basis of $L^2_{\mu_y}(\Omega_y)$. The operator $TT^{*}$ is also self-adjoint and compact, and shares the same spectrum as $T^{*}T$, i.e., $\{\lambda_i\}_{i=1}^{\infty}$. Its eigenfunctions $\psi_i$ form an orthonormal basis of $L^2_{\mu_x}(\Omega_x)$. It is a classical result in functional analysis that $u$ can be expanded as (see Griebel and Li (2019); Aubry et al. (1991); Aubry and Lima (1995))

$$u(x,y)=\sum_{i=1}^{\infty}\sqrt{\lambda_i}\,\psi_i(x)\varphi_i(y).\tag{9}$$
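Numerically, the expansion (9) is obtained from the singular value decomposition of the function sampled on a tensor-product grid. The sketch below is our own illustration (with a hypothetical test function `u`, not taken from the paper): the singular values of the quadrature-weighted sample matrix approximate $\sqrt{\lambda_i}$, and truncating them gives the optimal low-rank approximant.

```python
import numpy as np

# Hypothetical smooth bivariate test function; any smooth u(x, y) will do.
def u(x, y):
    return np.exp(-(x**2 + y**2)) * np.cos(x * y)

n = 200
x = np.linspace(-3.0, 3.0, n)
y = np.linspace(-3.0, 3.0, n)
h = x[1] - x[0]                      # uniform grid spacing in x and y

# Sample u on a tensor-product grid: U[i, j] = u(x_i, y_j).
U = u(x[:, None], y[None, :])

# The SVD of the sample matrix is the discrete analogue of (9): the
# singular values of h*U approximate sqrt(lambda_i); the left and right
# singular vectors approximate psi_i and phi_i at the grid points.
Psi, s, PhiT = np.linalg.svd(U)
sqrt_lambda = h * s                  # approximate sqrt(lambda_i)

# Rank-r truncation retains the r largest singular values (optimal in L2).
r = 10
U_r = (Psi[:, :r] * s[:r]) @ PhiT[:r, :]
rel_err = np.linalg.norm(U - U_r) / np.linalg.norm(U)
print(f"sqrt(lambda_1) ~ {sqrt_lambda[0]:.3f}, rank-{r} relative error: {rel_err:.2e}")
```

For a smooth kernel the $\lambda_i$ decay rapidly, so a small rank already gives a very accurate truncation.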

The functional tensor train (FTT) decomposition recently proposed in Bigoni et al. (2016) can be developed in the setting of hierarchical bi-orthogonal expansions as follows. Let $u\in L^2_\mu(\Omega)$ and set $x=x_1$ and $y=(x_2,\dots,x_d)$ in (6)-(7) to obtain

$$u(\mathbf{x})=\sum_{\alpha_1=1}^{\infty}\sqrt{\lambda_1(\alpha_1)}\,\psi_1(x_1;\alpha_1)\varphi_1(\alpha_1;x_2,\dots,x_d).\tag{10}$$

Now we let $\Omega_x=\mathbb{N}\times\Omega_2$ and $\Omega_y=\Omega_3\times\cdots\times\Omega_d$, with $\tau$ the counting measure on $\mathbb{N}$. From the orthonormality of the eigenfunctions $\varphi_1(\alpha_1;\cdot)$ and the fact that $\sum_{\alpha_1}\lambda_1(\alpha_1)<\infty$ we have

$$\begin{aligned}&\int_{\Omega_x\times\Omega_y}\left|\sqrt{\lambda_1(\alpha_1)}\,\varphi_1(\alpha_1;x_2,\dots,x_d)\right|^{2}d\tau(\alpha_1)\,d\mu_2(x_2)\cdots d\mu_d(x_d)\\ &\qquad=\sum_{\alpha_1=1}^{\infty}\lambda_1(\alpha_1)\int_{\Omega_2\times\cdots\times\Omega_d}\left|\varphi_1(\alpha_1;x_2,\dots,x_d)\right|^{2}d\mu_2(x_2)\cdots d\mu_d(x_d)\\ &\qquad=\sum_{\alpha_1=1}^{\infty}\lambda_1(\alpha_1)<\infty,\end{aligned}\tag{11}$$

i.e., $\sqrt{\lambda_1}\,\varphi_1\in L^2_{\tau\times\mu_2\times\cdots\times\mu_d}$. Moreover, $\sqrt{\lambda_1(\alpha_1)}\,\varphi_1(\alpha_1;x_2,\dots,x_d)$ can be decomposed further by using an expansion of the form (9), i.e.,

$$\sqrt{\lambda_1(\alpha_1)}\,\varphi_1(\alpha_1;x_2,\dots,x_d)=\sum_{\alpha_2=1}^{\infty}\sqrt{\lambda_2(\alpha_2)}\,\psi_2(\alpha_1;x_2;\alpha_2)\varphi_2(\alpha_2;x_3,\dots,x_d).\tag{12}$$

Substituting this expression into (10) yields

$$u(\mathbf{x})=\sum_{\alpha_1=1}^{\infty}\sum_{\alpha_2=1}^{\infty}\sqrt{\lambda_2(\alpha_2)}\,\psi_1(x_1;\alpha_1)\psi_2(\alpha_1;x_2;\alpha_2)\varphi_2(\alpha_2;x_3,\dots,x_d).\tag{13}$$

Proceeding recursively in this manner yields the following FTT expansion

$$u(\mathbf{x})=\sum_{\alpha_1,\dots,\alpha_{d-1}=1}^{\infty}\psi_1(\alpha_0;x_1;\alpha_1)\psi_2(\alpha_1;x_2;\alpha_2)\cdots\psi_d(\alpha_{d-1};x_d;\alpha_d),\tag{14}$$

where $\alpha_0=\alpha_d=1$. By truncating the expansion (14) so that the largest singular values are retained we obtain

$$u_{TT}(\mathbf{x})=\sum_{\alpha_0,\dots,\alpha_d=1}^{r}\psi_1(\alpha_0;x_1;\alpha_1)\psi_2(\alpha_1;x_2;\alpha_2)\cdots\psi_d(\alpha_{d-1};x_d;\alpha_d),\tag{15}$$

where $\mathbf{r}=(r_0,r_1,\dots,r_d)$, with $r_0=r_d=1$, is the TT-rank (or rank, if the TT format is clear from context).

It is known that the truncated FTT expansion converges optimally with respect to the $L^2_\mu(\Omega)$ norm Bigoni et al. (2016). More precisely, for any given function $u\in L^2_\mu(\Omega)$ the FTT approximant (15) minimizes the residual $\|u-u_{TT}\|_{L^2_\mu(\Omega)}$ relative to independent variations of the functions $\psi_i$ on a tensor manifold with constant rank $\mathbf{r}$. It is convenient to write (15) in a more compact form as

$$u_{TT}(\mathbf{x})=\Psi_1(x_1)\Psi_2(x_2)\cdots\Psi_d(x_d),\tag{16}$$

where $\Psi_i(x_i)$ is an $r_{i-1}\times r_i$ matrix with entries $\left[\Psi_i(x_i)\right]_{\alpha_{i-1}\alpha_i}=\psi_i(\alpha_{i-1};x_i;\alpha_i)$. The matrix-valued functions $\Psi_i(x_i)$ will be referred to as FTT cores. The spatial dependence is clear from the subscript of each core, so we will often suppress the explicit dependence on the spatial variable and simply write $\Psi_i$. Rank-$\mathbf{r}$ FTT decompositions can be computed at quadrature points by first discretizing $u$ on a tensor product grid and then using a tensor product quadrature rule together with known algorithms for computing a discrete TT decomposition of a full tensor, as discussed in Bigoni et al. (2016).
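The grid-based construction just described can be sketched with a minimal sequential-SVD sweep in the style of the standard TT-SVD algorithm (Oseledets, 2011). The following is our own illustration on a small four-dimensional tensor of grid values, not the authors' implementation; the grid sizes and test function are arbitrary choices.

```python
import numpy as np

def tt_svd(A, eps=1e-10):
    """Sequential SVD sweep producing discrete TT cores G_k of shape
    (r_{k-1}, n_k, r_k); singular values below eps * s_max are truncated."""
    d, dims = A.ndim, A.shape
    cores, r_prev = [], 1
    C = A.reshape(dims[0], -1)
    for k in range(d - 1):
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        r = max(1, int(np.sum(s > eps * s[0])))
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))
        C = (s[:r, None] * Vt[:r]).reshape(r * dims[k + 1], -1)
        r_prev = r
    cores.append(C.reshape(r_prev, dims[-1], 1))
    return cores

def tt_contract(cores):
    """Recombine discrete TT cores into a full tensor."""
    T = cores[0]
    for G in cores[1:]:
        T = np.tensordot(T, G, axes=(-1, 0))
    return T.squeeze(axis=(0, -1))

# Small 4D tensor of grid values of u(x) = sin(x1 + x2) * cos(x3 - x4).
n = 8
g = np.linspace(0.0, 1.0, n)
X = np.meshgrid(g, g, g, g, indexing="ij")
A = np.sin(X[0] + X[1]) * np.cos(X[2] - X[3])

cores = tt_svd(A)
err = np.linalg.norm(A - tt_contract(cores)) / np.linalg.norm(A)
print("TT ranks:", [G.shape[2] for G in cores[:-1]], f"relative error: {err:.1e}")
```

Because the test function is a product of two low-rank bivariate factors, the sweep recovers very small TT ranks and reproduces the tensor to machine precision.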

At this point we summarize the main differences between the FTT series expansion (15) and the series expansions we recently developed in Dektor and Venturi (2020). With reference to the first level of the hierarchical TT decomposition, i.e., Eq. (10), we notice that in the FTT setting the functions $\varphi_1(\alpha_1;x_2,\dots,x_d)$ are not decomposed independently (for each $\alpha_1$) as in Dektor and Venturi (2020). Instead, only one bi-orthogonal decomposition is performed on the average

$$\overline{\varphi}_i(x_{i+1},\dots,x_d)=\sum_{\alpha_i=1}^{\infty}\varphi_i(\alpha_i;x_{i+1},\dots,x_d).\tag{17}$$

This follows naturally from the assumption $\sqrt{\lambda_i}\,\varphi_i\in L^2_{\tau\times\mu_{i+1}\times\cdots\times\mu_d}$, which includes a counting measure that yields the summation in (17) as part of the inner product. On the other hand, the hierarchical expansion we studied in Dektor and Venturi (2020) treats $\varphi_i(\alpha_i;x_{i+1},\dots,x_d)$ as an element of $L^2_{\mu_{i+1}\times\cdots\times\mu_d}$ for each $\alpha_i$. Hence, a bi-orthogonal decomposition is performed on $\varphi_i(\alpha_i;\cdot)$ for each $\alpha_i$. Obviously, such a decomposition requires many more computations but offers more information about the spectrum of the multivariate function at each level of the TT binary tree. Hereafter we proceed by considering the FTT decomposition (15), but note that similar theoretical results can also be developed for the hierarchical series expansions we studied in Dektor and Venturi (2020).

## 3 The manifold of constant rank FTT tensors

In this section we prove that the space of constant rank FTT tensors is a smooth manifold, which therefore admits a tangent plane at each point. The tangent plane will be used in section 4 to develop an integration theory based on dynamic tensor approximation for time-dependent nonlinear PDEs. To prove that the space of constant rank FTT tensors is a smooth manifold, we follow a similar construction as presented in Musharbash et al. (2015); Musharbash and Nobile (2018). Closely related work was presented in Chiumiento and Melgaard (2012) in relation to Slater–type variational spaces in many particle Hartree–Fock theory. Also, the discrete analogues of the infinite-dimensional tensor manifolds discussed hereafter were studied in detail in Uschmajew and Vandereycken (2013); Holtz et al. (2012).

Let $\Psi,\widetilde{\Psi}\in M_{r_1\times r_2}(L^2_\mu(\Omega))$, where $M_{r_1\times r_2}(L^2_\mu(\Omega))$ denotes the set of $r_1\times r_2$ matrices with entries in $L^2_\mu(\Omega)$. Define the matrix

$$C_{\Psi,\widetilde{\Psi}}=\left\langle\Psi^{T},\widetilde{\Psi}\right\rangle_{L^2_\mu(\Omega)}\in M_{r_2\times r_2}(\mathbb{R})\tag{18}$$

with entries (here and in (19), $\langle\cdot,\cdot\rangle_{L^2_\mu(\Omega)}$ denotes the standard inner product in $L^2_\mu(\Omega)$)

$$\left[C_{\Psi,\widetilde{\Psi}}\right]_{ij}=\sum_{k=1}^{r_1}\left\langle\psi(k;\mathbf{x};i),\widetilde{\psi}(k;\mathbf{x};j)\right\rangle_{L^2_\mu(\Omega)}.\tag{19}$$

Denote by $V^{(i)}_{r_{i-1}\times r_i}$ the set of all $\Psi_i\in M_{r_{i-1}\times r_i}(L^2_{\mu_i}(\Omega_i))$ with the property that $C_{\Psi_i,\Psi_i}$ is invertible. We are interested in the following subset of $L^2_\mu(\Omega)$ consisting of rank-$\mathbf{r}$ FTT tensors in $d$ dimensions

$$\mathcal{T}^{(d)}_{\mathbf{r}}=\left\{u\in L^2_\mu(\Omega)\,:\,u=\Psi_1\Psi_2\cdots\Psi_d,\ \Psi_i\in V^{(i)}_{r_{i-1}\times r_i},\ \forall i=1,2,\dots,d\right\}.\tag{20}$$

The set

$$V=V^{(1)}_{r_0\times r_1}\times V^{(2)}_{r_1\times r_2}\times\cdots\times V^{(d)}_{r_{d-1}\times r_d}\tag{21}$$

can be interpreted as a latent space for $\mathcal{T}^{(d)}_{\mathbf{r}}$ via the mapping

$$\pi:V\to\mathcal{T}^{(d)}_{\mathbf{r}},\qquad \pi(\Psi_1,\Psi_2,\dots,\Psi_d)=\Psi_1\Psi_2\cdots\Psi_d.\tag{22}$$

Any tensor $u\in\mathcal{T}^{(d)}_{\mathbf{r}}$ has many representations in $V$; that is, the map $\pi$ is not injective. The purpose of the following Lemma 3.1 and Proposition 3.1 is to characterize all elements of the space $V$ which have the same image under $\pi$.
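In a discrete setting this non-injectivity is easy to exhibit: inserting any invertible matrix between two consecutive cores, together with its inverse, leaves the product unchanged. The following is a minimal sketch with random cores (all sizes are arbitrary choices for illustration, not values from the paper).

```python
import numpy as np

rng = np.random.default_rng(0)
n, ranks = 5, [1, 3, 4, 3, 1]
cores = [rng.standard_normal((ranks[i], n, ranks[i + 1])) for i in range(4)]

def contract(cs):
    """Multiply discrete TT cores into a full tensor (discrete analogue of pi)."""
    T = cs[0]
    for G in cs[1:]:
        T = np.tensordot(T, G, axes=(-1, 0))
    return T.squeeze(axis=(0, -1))

# Arbitrary invertible "gauge" matrices P_1, ..., P_{d-1}.
Ps = [rng.standard_normal((r, r)) for r in ranks[1:-1]]

gauged = []
for i, G in enumerate(cores):
    if i > 0:                        # left-multiply by P_{i-1}^{-1}
        G = np.einsum("ab,bnc->anc", np.linalg.inv(Ps[i - 1]), G)
    if i < len(cores) - 1:           # right-multiply by P_i
        G = np.einsum("anb,bc->anc", G, Ps[i])
    gauged.append(G)

print("same tensor:", np.allclose(contract(cores), contract(gauged)))
```

The gauged cores are different arrays, yet they represent exactly the same tensor, which is the discrete counterpart of the equivalence classes characterized by Proposition 3.1 below.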

###### Lemma 3.1

If $\Psi$ and $\widetilde{\Psi}$ are two bases for the same finite-dimensional subspace of $L^2_\mu(\Omega)$ then the matrix $C_{\Psi,\widetilde{\Psi}}$ defined in (18) is invertible.

Proof: The matrix under consideration is given by

$$\left[C_{\Psi,\widetilde{\Psi}}\right]_{pi}=\sum_{k=1}^{r}\left\langle\psi(k;\mathbf{x};p),\widetilde{\psi}(k;\mathbf{x};i)\right\rangle_{L^2_\mu(\Omega)}.\tag{23}$$

We will show that the columns of this matrix are linearly independent. To this end, consider the linear equation

$$C_{\Psi,\widetilde{\Psi}}\,\mathbf{v}=0,\qquad\mathbf{v}=(v_1,\dots,v_r)^{T},\tag{24}$$

the $p$-th row of which reads

$$\sum_{k=1}^{r}\left\langle\psi(k;\mathbf{x};p),\sum_{i=1}^{r}v_i\,\widetilde{\psi}(k;\mathbf{x};i)\right\rangle_{L^2_\mu(\Omega)}=0,\qquad p=1,\dots,r.\tag{25}$$

If not all of the $v_i$ are equal to zero, then (25) implies that the nonzero function $\sum_{i=1}^{r}v_i\,\widetilde{\psi}(k;\mathbf{x};i)$ is orthogonal to $\psi(k;\mathbf{x};p)$ in $L^2_\mu(\Omega)$ for all $p=1,\dots,r$. This contradicts the assumption that $\Psi$ and $\widetilde{\Psi}$ span the same finite-dimensional subspace of $L^2_\mu(\Omega)$. Hence $v_i=0$ for every $i$, and the columns of $C_{\Psi,\widetilde{\Psi}}$ are linearly independent.

###### Proposition 3.1

Let $(\Psi_1,\dots,\Psi_d)$ and $(\widetilde{\Psi}_1,\dots,\widetilde{\Psi}_d)$ be elements of $V$. Then

$$\pi(\Psi_1,\dots,\Psi_d)=\pi(\widetilde{\Psi}_1,\dots,\widetilde{\Psi}_d)\tag{26}$$

if and only if there exist invertible matrices $P_i\in M_{r_i\times r_i}(\mathbb{R})$ ($i=1,\dots,d-1$) such that $\Psi_i=P_{i-1}^{-1}\widetilde{\Psi}_iP_i$, with $P_0=P_d=1$.

Proof: To prove the forward implication we proceed by induction on $d$. For $d=2$ we have that

$$\Psi_1\Psi_2=\widetilde{\Psi}_1\widetilde{\Psi}_2\tag{27}$$

implies

$$\Psi_1=\widetilde{\Psi}_1\,C_{\widetilde{\Psi}_2^{T},\Psi_2^{T}}\,C^{-1}_{\Psi_2^{T},\Psi_2^{T}}.\tag{28}$$

Set $P_1=C_{\widetilde{\Psi}_2^{T},\Psi_2^{T}}\,C^{-1}_{\Psi_2^{T},\Psi_2^{T}}$, which is invertible since it is a change-of-basis matrix. Substituting into (27) we see that

$$\widetilde{\Psi}_1P_1\Psi_2=\widetilde{\Psi}_1\widetilde{\Psi}_2,\tag{29}$$

which implies

$$\Psi_2=P_1^{-1}\widetilde{\Psi}_2.\tag{30}$$

This proves the proposition for $d=2$. Suppose that the proposition holds true for $d-1$, and that

$$\Psi_1\cdots\Psi_d=\widetilde{\Psi}_1\cdots\widetilde{\Psi}_d.\tag{31}$$

Then,

$$\Psi_1\cdots\Psi_{d-1}=\widetilde{\Psi}_1\cdots\widetilde{\Psi}_{d-1}\,C_{\widetilde{\Psi}_d^{T},\Psi_d^{T}}\,C^{-1}_{\Psi_d^{T},\Psi_d^{T}},\tag{32}$$

and by the induction hypothesis we are guaranteed the existence of invertible matrices $P_1,\dots,P_{d-2}$ such that

$$\begin{aligned}\Psi_1&=\widetilde{\Psi}_1P_1,\\ &\ \,\vdots\\ \Psi_{d-2}&=P_{d-3}^{-1}\widetilde{\Psi}_{d-2}P_{d-2},\\ \Psi_{d-1}&=P_{d-2}^{-1}\widetilde{\Psi}_{d-1}\,C_{\widetilde{\Psi}_d^{T},\Psi_d^{T}}\,C^{-1}_{\Psi_d^{T},\Psi_d^{T}}.\end{aligned}\tag{33}$$

Let $P_{d-1}=C_{\widetilde{\Psi}_d^{T},\Psi_d^{T}}\,C^{-1}_{\Psi_d^{T},\Psi_d^{T}}$. Substituting equation (33) into (31) yields

$$\begin{aligned}\widetilde{\Psi}_1\cdots\widetilde{\Psi}_{d-2}\widetilde{\Psi}_{d-1}P_{d-1}\Psi_d&=\widetilde{\Psi}_1\cdots\widetilde{\Psi}_d,\\ P_{d-1}\Psi_d&=\widetilde{\Psi}_d,\end{aligned}\tag{34}$$

from which it follows that $P_{d-1}$ is invertible and

$$\Psi_d=P_{d-1}^{-1}\widetilde{\Psi}_d.\tag{35}$$

This completes the proof.

With Proposition 3.1 in mind we define the group (in equation (36), $GL_{r\times r}(\mathbb{R})$ denotes the general linear group of $r\times r$ invertible matrices with real entries, together with the operation of ordinary matrix multiplication)

$$G=GL_{r_1\times r_1}(\mathbb{R})\times GL_{r_2\times r_2}(\mathbb{R})\times\cdots\times GL_{r_{d-1}\times r_{d-1}}(\mathbb{R})\tag{36}$$

with group operation given by component-wise matrix multiplication. Let $G$ act on $V$ by

$$(P_1,\dots,P_{d-1})\cdot(\Psi_1,\dots,\Psi_d)=\left(\Psi_1P_1,\,P_1^{-1}\Psi_2P_2,\,\dots,\,P_{d-1}^{-1}\Psi_d\right)\tag{37}$$

for all $(P_1,\dots,P_{d-1})\in G$ and $(\Psi_1,\dots,\Psi_d)\in V$. It is easy to see that this action is free and transitive, making $V$, $\mathcal{T}^{(d)}_{\mathbf{r}}$, and $\pi$ a principal $G$-bundle Rudolph and Schmidt (2017). In particular, $\mathcal{T}^{(d)}_{\mathbf{r}}$ is isomorphic to the quotient $V/G$, which allows us to equip $\mathcal{T}^{(d)}_{\mathbf{r}}$ with a manifold structure. Thus, we can define its tangent space at a point $u\in\mathcal{T}^{(d)}_{\mathbf{r}}$. We characterize this tangent space in terms of velocities of smooth curves passing through the point $u$:

$$T_u\mathcal{T}^{(d)}_{\mathbf{r}}=\left\{\gamma'(s)\big|_{s=0}\,:\,\gamma\in C^1\left((-\delta,\delta),\mathcal{T}^{(d)}_{\mathbf{r}}\right),\ \gamma(0)=u\right\}.\tag{38}$$

Here $C^1\left((-\delta,\delta),\mathcal{T}^{(d)}_{\mathbf{r}}\right)$ is the space of continuously differentiable functions from the interval $(-\delta,\delta)$ to the space of constant-rank FTT tensors $\mathcal{T}^{(d)}_{\mathbf{r}}$. We conclude this section with the following Lemma, which singles out a particular representation of $u\in\mathcal{T}^{(d)}_{\mathbf{r}}$ for which the matrices $C_{\Psi_i,\Psi_i}$ (see Eqs. (18)-(19)) are diagonal.

###### Lemma 3.2

Given any FTT tensor $u\in\mathcal{T}^{(d)}_{\mathbf{r}}$ there exist cores $(\Psi_1,\dots,\Psi_d)\in V$ such that $u=\Psi_1\Psi_2\cdots\Psi_d$ and $C_{\Psi_i,\Psi_i}$ is diagonal for all $i=1,\dots,d$.

Proof: Let us first represent $u$ relative to an arbitrary collection of tensor cores $(\Psi_1,\dots,\Psi_d)$. Since $C_{\Psi_1,\Psi_1}$ is symmetric, there exists an orthogonal matrix $Q_1$ such that $Q_1^{T}C_{\Psi_1,\Psi_1}Q_1$ is diagonal. Set $\Psi_1'=\Psi_1Q_1$ and $\Psi_2'=Q_1^{T}\Psi_2$, so that $u=\Psi_1'\Psi_2'\Psi_3\cdots\Psi_d$ and $C_{\Psi_1',\Psi_1'}=Q_1^{T}C_{\Psi_1,\Psi_1}Q_1$ is diagonal. The matrix $C_{\Psi_2',\Psi_2'}$ is symmetric, so there exists an orthogonal matrix $Q_2$ such that $Q_2^{T}C_{\Psi_2',\Psi_2'}Q_2$ is diagonal. Set $\Psi_2''=\Psi_2'Q_2$ and $\Psi_3'=Q_2^{T}\Psi_3$, so that $u=\Psi_1'\Psi_2''\Psi_3'\Psi_4\cdots\Psi_d$ and $C_{\Psi_2'',\Psi_2''}$ is diagonal. Proceed recursively in this way until the last core is reached. It is easy to check that the resulting collection of cores satisfies the conclusion of the Lemma.
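A discrete sketch of this sweep (our own illustration, not the authors' code): at each level, the Gram matrix of the left interface is diagonalized by its orthogonal eigenvector matrix $Q$, and $Q$, $Q^T$ are absorbed into the adjacent cores, so the represented tensor never changes while the Gram matrices become diagonal.

```python
import numpy as np

rng = np.random.default_rng(1)
n, ranks = 6, [1, 3, 3, 1]
cores = [rng.standard_normal((ranks[i], n, ranks[i + 1])) for i in range(3)]

def contract(cs):
    """Multiply discrete TT cores into a full tensor."""
    T = cs[0]
    for G in cs[1:]:
        T = np.tensordot(T, G, axes=(-1, 0))
    return T.squeeze(axis=(0, -1))

ref = contract(cores)
grams = []

# Sweep left to right: diagonalize the Gram matrix of each left
# interface with an orthogonal eigenvector matrix Q and absorb Q / Q^T
# into adjacent cores, so the represented tensor is unchanged.
L = cores[0].reshape(-1, ranks[1])            # left interface at level 1
for k in range(len(cores) - 1):
    C = L.T @ L                                # symmetric Gram matrix
    lam, Q = np.linalg.eigh(C)                 # C = Q diag(lam) Q^T
    cores[k] = np.einsum("anb,bc->anc", cores[k], Q)
    cores[k + 1] = np.einsum("ab,bnc->anc", Q.T, cores[k + 1])
    L = L @ Q                                  # gauged interface: diagonal Gram
    grams.append(L.T @ L)
    L = np.tensordot(L, cores[k + 1], axes=(1, 0)).reshape(-1, ranks[k + 2])

off_diag = max(np.abs(G - np.diag(np.diag(G))).max() for G in grams)
print("tensor preserved:", np.allclose(ref, contract(cores)),
      "| max off-diagonal Gram entry:", f"{off_diag:.1e}")
```

After the sweep, every stored Gram matrix is diagonal to machine precision, while the recombined tensor is identical to the original one.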

## 4 Dynamical approximation of PDEs on FTT tensor manifolds with constant rank

Computing the solution to high-dimensional PDEs has become central to many new areas of application, such as optimal mass transport Gangbo et al. (2019); Villani (2009), random dynamical systems Venturi and Karniadakis (2014); Venturi et al. (2012), mean field games E et al. (2019); Ruthotto et al. (2020), and functional-differential equations Venturi and Dektor (2020); Venturi (2018). In an abstract setting, such PDEs involve the computation of a function $u(\mathbf{x},t)$ governed by an autonomous evolution equation

$$\begin{cases}\dfrac{\partial u}{\partial t}=N(u),\\[4pt] u(\mathbf{x},0)=u_0(\mathbf{x}),\end{cases}\tag{39}$$

where $u(\mathbf{x},t)$ is a $d$-dimensional (time-dependent) scalar field defined in the domain $\Omega$ (see Eq. (3)) and $N$ is a nonlinear operator which may depend on the spatial variables and may incorporate boundary conditions.

We are interested in computing the best dynamic approximation of the solution to (39) on the tensor manifold $\mathcal{T}^{(d)}_{\mathbf{r}}$ for all $t\geq 0$. Such an approximation aims at determining the vector in the tangent plane of $\mathcal{T}^{(d)}_{\mathbf{r}}$ at the point $u$ that best approximates $\partial u/\partial t$ for each $t$. One way to obtain the optimal vector in the tangent plane is by orthogonal projection, which we now describe. For each $u\in\mathcal{T}^{(d)}_{\mathbf{r}}$ the tangent space $T_u\mathcal{T}^{(d)}_{\mathbf{r}}$ is canonically identified with a subspace of $L^2_\mu(\Omega)$. Moreover, for each $u\in\mathcal{T}^{(d)}_{\mathbf{r}}$ the normal space to $\mathcal{T}^{(d)}_{\mathbf{r}}$ at the point $u$, denoted by $N_u\mathcal{T}^{(d)}_{\mathbf{r}}$, consists of all vectors in $L^2_\mu(\Omega)$ that are orthogonal to $T_u\mathcal{T}^{(d)}_{\mathbf{r}}$ with respect to the inner product in $L^2_\mu(\Omega)$. The tangent space $T_u\mathcal{T}^{(d)}_{\mathbf{r}}$ is closed, and therefore for each $u\in\mathcal{T}^{(d)}_{\mathbf{r}}$ the space $L^2_\mu(\Omega)$ admits the decomposition

$$L^2_\mu(\Omega)=T_u\mathcal{T}^{(d)}_{\mathbf{r}}\oplus N_u\mathcal{T}^{(d)}_{\mathbf{r}}.\tag{40}$$

Assuming that the solution $u$ to the PDE (39) lies on the manifold $\mathcal{T}^{(d)}_{\mathbf{r}}$ at time $t$, its velocity $N(u)$ can be decomposed uniquely into a tangent component $v$ and a normal component $w$ with respect to $u$, i.e.,

$$N(u)=v+w,\qquad v\in T_u\mathcal{T}^{(d)}_{\mathbf{r}},\quad w\in N_u\mathcal{T}^{(d)}_{\mathbf{r}}.\tag{41}$$

The orthogonal projection we are interested in computing for the best dynamic approximation is

$$P_u:L^2_\mu(\Omega)\to T_u\mathcal{T}^{(d)}_{\mathbf{r}},\qquad N(u)\mapsto P_uN(u).\tag{42}$$

In practice, we will compute the image of such a projection by solving the following minimization problem over the tangent space of $\mathcal{T}^{(d)}_{\mathbf{r}}$ at $u$

$$\min_{v(\mathbf{x},t)\in T_{u(\mathbf{x},t)}\mathcal{T}^{(d)}_{\mathbf{r}}}\left\|v(\mathbf{x},t)-\frac{\partial u(\mathbf{x},t)}{\partial t}\right\|^{2}_{L^2_\mu(\Omega)}=\min_{v(\mathbf{x},t)\in T_{u(\mathbf{x},t)}\mathcal{T}^{(d)}_{\mathbf{r}}}\left\|v(\mathbf{x},t)-N(u(\mathbf{x},t))\right\|^{2}_{L^2_\mu(\Omega)}\tag{43}$$

for each fixed $t$. From an optimization viewpoint, the following proposition establishes the existence and uniqueness of the optimal tangent vector.

###### Proposition 4.1

If $N(u)\in L^2_\mu(\Omega)$ then there exists a unique solution to the minimization problem (43), i.e., a unique global minimum.

Proof: We first notice that the feasible set $T_u\mathcal{T}^{(d)}_{\mathbf{r}}$ is a real vector space and thus a convex set. Next we show that the functional $F[v]=\|v-N(u)\|^{2}_{L^2_\mu(\Omega)}$ is strictly convex. Indeed, take distinct $v_1,v_2\in T_u\mathcal{T}^{(d)}_{\mathbf{r}}$ and $q\in(0,1)$. Then

$$\begin{aligned}\left(F[qv_1+(1-q)v_2]\right)^{1/2}&=\left\|qv_1+(1-q)v_2-N(u)\right\|_{L^2_\mu(\Omega)}\\ &=\left\|q\left(v_1-N(u)\right)+(1-q)\left(v_2-N(u)\right)\right\|_{L^2_\mu(\Omega)}\\ &\leq q\left\|v_1-N(u)\right\|_{L^2_\mu(\Omega)}+(1-q)\left\|v_2-N(u)\right\|_{L^2_\mu(\Omega)},\end{aligned}\tag{44}$$

with equality if and only if there exists a constant $c\geq0$ such that $v_1-N(u)=c\,(v_2-N(u))$. Since $v_1\neq v_2$ we have $c\neq 1$, and this implies that $N(u)=(v_1-c\,v_2)/(1-c)$ for some real number $c$, whence $N(u)\in T_u\mathcal{T}^{(d)}_{\mathbf{r}}$. Therefore if $N(u)\notin T_u\mathcal{T}^{(d)}_{\mathbf{r}}$ then the inequality in (44) is strict and the functional $F^{1/2}$ is strictly convex. Since the function $x\mapsto x^{2}$ is strictly increasing on the image of $F^{1/2}$, it follows that $F$ is strictly convex and thus admits a unique global minimum over the feasible set $T_u\mathcal{T}^{(d)}_{\mathbf{r}}$.
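For intuition, it may help to look at the matrix case ($d=2$, discrete), where the tangent-space projection analogous to (42) has the well-known explicit form $P_Y(Z)=UU^TZ+ZVV^T-UU^TZVV^T$ for a rank-$r$ point $Y=USV^T$ with orthonormal factors $U$ and $V$ (Koch and Lubich, 2007). The sketch below, our own illustration with arbitrary random data, verifies the defining properties of the orthogonal projection.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, r = 8, 7, 3

# Orthonormal factors U, V of a hypothetical rank-r point Y = U S V^T.
U, _ = np.linalg.qr(rng.standard_normal((m, r)))
V, _ = np.linalg.qr(rng.standard_normal((n, r)))

def P(Z):
    """Orthogonal projection onto the tangent space of the rank-r matrix
    manifold at Y = U S V^T (Koch & Lubich, 2007):
    P_Y(Z) = U U^T Z + Z V V^T - U U^T Z V V^T."""
    return U @ (U.T @ Z) + (Z @ V) @ V.T - U @ (U.T @ Z @ V) @ V.T

Z = rng.standard_normal((m, n))
PZ = P(Z)

# P is idempotent, and the residual Z - P(Z) is orthogonal to the
# tangent space in the Frobenius inner product.
print("idempotent:", np.allclose(P(PZ), PZ))
print("residual orthogonal:", abs(np.sum((Z - PZ) * PZ)) < 1e-10)
```

These two properties (idempotency and orthogonality of the residual) are exactly the finite-dimensional counterparts of the decomposition (41) and the uniqueness statement of Proposition 4.1.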

It can easily be shown that the unique solution to the optimization problem (43) is $v=P_uN(u)$. Next, we will use this optimization framework for computing the best tangent vector to integrate the PDE (39) forward in time on the manifold $\mathcal{T}^{(d)}_{\mathbf{r}}$. To this end, let us first assume that the initial condition $u_0$ is an element of $\mathcal{T}^{(d)}_{\mathbf{r}}$. If not, $u_0$ can be projected onto $\mathcal{T}^{(d)}_{\mathbf{r}}$ using the methods described in section 2. In both cases, this allows us to represent $u_0$ as

$$u_0(\mathbf{x})=\Psi_1(0)\Psi_2(0)\cdots\Psi_d(0),\tag{45}$$

with $\Psi_i(0)\in V^{(i)}_{r_{i-1}\times r_i}$ for $i=1,\dots,d$. A representation of this form (with diagonal matrices $C_{\Psi_i(0),\Psi_i(0)}$) always exists thanks to Lemma 3.2. To compute the unique solution of (43), we expand an arbitrary curve $\gamma$ of class $C^1$ on the manifold $\mathcal{T}^{(d)}_{\mathbf{r}}$ passing through the point $u$ at $s=0$ in terms of $s$-dependent FTT cores. This yields

$$\begin{aligned}\gamma(s)&=\Psi_1(s)\cdots\Psi_d(s)\\ &=\sum_{\alpha_0,\dots,\alpha_d=1}^{r}\psi_1(s;\alpha_0,\alpha_1)\,\psi_2(s;\alpha_1,\alpha_2)\cdots\psi_d(s;\alpha_{d-1},\alpha_d),\end{aligned}\tag{46}$$

which allows us to represent any element of the tangent space at $u$ as

$$v=\frac{\partial}{\partial s}\Big[\Psi_1(s)\cdots\Psi_d(s)\Big]_{s=0},\tag{47}$$

with $\Psi_1(0)\Psi_2(0)\cdots\Psi_d(0)=u$. At this point we notice that minimizing the functional in (43) over the tangent space is equivalent to minimizing the same functional over the velocities of the FTT cores. For notational convenience, hereafter we omit evaluation at $s=0$ of all quantities depending on the curve parameter $s$. For example, we will write

$$\psi_i(\alpha_{i-1},\alpha_i)=\psi_i(s;\alpha_{i-1},\alpha_i)\big|_{s=0},\qquad \frac{\partial\psi_i(\alpha_{i-1},\alpha_i)}{\partial s}=\frac{\partial\psi_i(s;\alpha_{i-1},\alpha_i)}{\partial s}\bigg|_{s=0}.\tag{48}$$

With this notation, the minimization problem (43) is equivalent to

$$\min_{\frac{\partial\psi_1(\alpha_0,\alpha_1)}{\partial s},\dots,\frac{\partial\psi_d(\alpha_{d-1},\alpha_d)}{\partial s}}\left\|\frac{\partial}{\partial s}\left(\sum_{\alpha_0,\dots,\alpha_d=1}^{r}\psi_1(\alpha_0,\alpha_1)\psi_2(\alpha_1,\alpha_2)\cdots\psi_d(\alpha_{d-1},\alpha_d)\right)-N(u)\right\|^{2}_{L^2_\mu(\Omega)}.\tag{49}$$

Of course, in view of Proposition 3.1 one curve has many different expansions in terms of $s$-dependent FTT cores, which can be mapped into one another via collections of $s$-dependent invertible matrices. From Lemma 3.2 it is clear that any curve