 # Existence of dynamical low-rank approximations to parabolic problems

The existence of weak solutions of dynamical low-rank evolution for parabolic partial differential equations in two spatial dimensions is shown, covering also non-diagonal diffusion in the elliptic part. The proof is based on a variational time-stepping scheme on the low-rank manifold. Moreover, this scheme is shown to be closely related to practical methods for computing such low-rank evolutions.


## 1. Introduction

Finding hidden structure in the solutions of partial differential equations has always been a key goal in the study of such equations, whether it is for the sake of modeling or for efficient numerical approximation. In fact, exploiting structures such as low-dimensional parametrizations can be crucial for the numerical treatment of equations on high-dimensional domains to avoid the curse of dimensionality.

It has been observed that under certain conditions on the domain and the data, the solutions of elliptic and parabolic partial differential equations with a dominating “Laplacian part” exhibit low-rank approximability, that is, they can be approximated in certain low-rank tensor formats [19, 42, 11, 3]. If this is the case, then instead of working on full discretization grids, one can impose the low-rank constraint in the design of the solution method in order to take advantage of low-parametric representation. This typically results in a nonlinear approximation algorithm.

A typical approach is to discretize the partial differential equation on possibly huge, but finite grids, and then use numerical linear algebra techniques for solving the resulting linear systems in low-rank formats; see, e.g., [6, 20] for an overview and further references. How the obtained solutions behave with refinement of discretization depends strongly on the details of the considered methods. This point has been considered for methods that adjust solution ranks adaptively in each step [4, 5]. For methods based on a fixed low-rank constraint, this question is more difficult due to the nonlinearity of the resulting constrained problems and has received only limited attention in the literature. Since methods operating on fixed-rank manifolds are important algorithmic building blocks, understanding their robustness under discretization refinement is of high practical interest. A first important requirement is to study the well-posedness of the underlying low-rank problem on function spaces. While it is not so difficult to make an appropriate variational formulation for elliptic problems subject to low-rank constraints that ensures existence of solutions [6, Sec. 4], the parabolic case poses substantial difficulties. In this paper we propose such a formulation for parabolic evolution equations on low-rank manifolds in Hilbert space and prove existence of solutions via a time-stepping scheme.

Dynamical low-rank approximation is a general technique for approximating time-dependent problems under low-rank constraints by projecting the vector field onto the tangent space of the low-rank manifold. For general initial value problems $\dot Y(t) = F(t, Y(t))$ for matrices $Y(t)$, the dynamical low-rank approximation on the manifold $\mathcal{M}_r$ of rank-$r$ matrices is given by

$$\dot Y(t) = P_{Y(t)} F(t, Y(t)), \tag{1.1}$$

where $P_{Y(t)}$ is the orthogonal projector onto the tangent space $T_{Y(t)}\mathcal{M}_r$. Note that (1.1) is equivalent to the variational problem

$$\langle \dot Y(t) - F(t, Y(t)), X \rangle = 0 \quad \text{for all } X \in T_{Y(t)}\mathcal{M}_r,$$

in analogy to (1.7); this approach is also known as the Dirac-Frenkel variational principle [13, 29]. It has been adapted to several different classes of evolution problems in scientific computing, see, e.g., [40, 22, 37, 36, 34, 14] as well as  for an overview, and the monograph  on applications in quantum dynamics.
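As a finite-dimensional illustration of the equivalence between (1.1) and the variational condition, the following NumPy sketch builds the tangent-space projector at a rank-$r$ matrix from its SVD and checks that the residual $P_Y F - F$ is orthogonal to a tangent vector. The explicit projector formula (the matrix analogue of (2.7)), the helper name `tangent_projector`, the matrix sizes, and the random matrix standing in for $F(t, Y(t))$ are assumptions made for this example only.

```python
import numpy as np

def tangent_projector(Y, r):
    """Return (P, U, V): P maps Z to its orthogonal projection onto the
    tangent space of the rank-r matrices at Y (hypothetical helper)."""
    U, _, Vt = np.linalg.svd(Y, full_matrices=False)
    U, V = U[:, :r], Vt[:r, :].T  # orthonormal bases of column and row space

    def P(Z):
        ZV = Z @ V
        # P(Z) = U U^T Z + Z V V^T - U U^T Z V V^T
        return U @ (U.T @ Z) + ZV @ V.T - U @ (U.T @ ZV) @ V.T

    return P, U, V

rng = np.random.default_rng(0)
m, n, r = 8, 7, 3                      # illustrative sizes
Y = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
F = rng.standard_normal((m, n))        # stand-in for F(t, Y(t))

P, U, V = tangent_projector(Y, r)
Ydot = P(F)                            # right-hand side of (1.1)

# A generic tangent vector has the form X = A V^T + U B^T.
A = rng.standard_normal((m, r))
B = rng.standard_normal((n, r))
X = A @ V.T + U @ B.T

# Variational condition: <Ydot - F, X> = 0 in the Frobenius inner product.
residual = float(np.sum((Ydot - F) * X))
print(abs(residual))
```

The check relies only on $P$ being symmetric and idempotent with $PX = X$ for tangent vectors $X$, which is what makes the projected equation and the Galerkin-type condition coincide.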

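In this work, we develop a weak formulation of the Dirac-Frenkel principle for low-rank approximation of parabolic problems and prove the existence of solutions in a function space setting. As a model problem one may consider the two-dimensional parabolic equation on the product domain $\Omega = (0,1)^2$,

$$\begin{aligned} u_t(x,t) - \nabla \cdot \alpha(t) \nabla u(x,t) &= f(x,t) && \text{for } (x,t) \in \Omega \times (0,T), \\ u(x,t) &= 0 && \text{for } (x,t) \in \partial\Omega \times (0,T), \\ u(x,0) &= u_0(x) && \text{for } x \in \Omega. \end{aligned} \tag{1.2}$$

Here we assume that the matrix $\alpha(t) = (\alpha_{ij}(t))_{i,j=1,2}$ is symmetric for every $t \in [0,T]$, uniformly bounded, and uniformly positive definite. The problem (1.2) is typically formulated in weak form as follows: given $f \in L_2(0,T;H^{-1}(\Omega))$ and $u_0 \in L_2(\Omega)$, find

$$u \in W^1_2(0,T;H^1_0(\Omega),L_2(\Omega)) = \{ u \in L_2(0,T;H^1_0(\Omega)) : \exists\, u' \in L_2(0,T;H^{-1}(\Omega)) \}$$

such that for almost all $t \in [0,T]$,

$$\begin{aligned} \langle u'(t), v \rangle + a(u(t), v; t) &= \langle f(t), v \rangle \quad \text{for all } v \in H^1_0(\Omega), \\ u(0) &= u_0. \end{aligned} \tag{1.3}$$

Here, by $\langle \cdot, \cdot \rangle$ we denote the dual pairing of $H^{-1}(\Omega)$ and $H^1_0(\Omega)$, and the symmetric, bounded and coercive bilinear form $a(\cdot,\cdot;t)$ is defined as

$$\begin{aligned} a(u,v;t) ={}& \alpha_{11}(t) \int_\Omega \partial_1 u(x,t)\, \partial_1 v(x,t)\, dx + \alpha_{22}(t) \int_\Omega \partial_2 u(x,t)\, \partial_2 v(x,t)\, dx \\ &+ \alpha_{12}(t) \int_\Omega \partial_1 u(x,t)\, \partial_2 v(x,t)\, dx + \alpha_{21}(t) \int_\Omega \partial_2 u(x,t)\, \partial_1 v(x,t)\, dx. \end{aligned} \tag{1.4}$$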

By classical theory the problem (1.3) admits a unique solution; see, e.g., [46, Thm. 23.A].

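Since $\Omega = (0,1)^2$, we have $L_2(\Omega) = L_2(0,1) \otimes L_2(0,1)$ in the sense of tensor products of Hilbert spaces, and $H^1_0(\Omega) = \big(H^1_0(0,1) \otimes L_2(0,1)\big) \cap \big(L_2(0,1) \otimes H^1_0(0,1)\big)$ with norm

$$\| v \|^2_{H^1_0(\Omega)} = \| v \|^2_{H^1_0(0,1) \otimes L_2(0,1)} + \| v \|^2_{L_2(0,1) \otimes H^1_0(0,1)}.$$

Every function $u \in L_2(\Omega)$ can be written as

$$u(x) = u(x_1,x_2) = \sum_{k=1}^{r} u^1_k(x_1)\, u^2_k(x_2) \quad \text{a.e.,} \tag{1.5}$$

with $u^1_k, u^2_k \in L_2(0,1)$ for all $k$. By $\operatorname{rank}(u)$ we denote the smallest $r$, which may be infinite, such that such a representation exists.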

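As low-rank representations are convenient for several reasons, one may ask whether the parabolic equation (1.2) admits approximate solutions of low rank. In dynamical low-rank approximation one assumes this to be the case, and attempts to directly evolve the solution on the set

$$\mathcal{M}_r = \{ u \in L_2(\Omega) : \operatorname{rank}(u) = r \} \tag{1.6}$$

for a certain value of $r$. One can show that $\mathcal{M}_r$ is a submanifold in $L_2(\Omega)$. The dynamics are then determined by the following problem: find $u \in W^1_2(0,T;H^1_0(\Omega),L_2(\Omega))$ such that

$$u(t) \in \mathcal{M}_r \quad \text{for all } t \in [0,T],$$

and such that for almost all $t \in [0,T]$,

$$\begin{aligned} \langle u'(t), v \rangle + a(u(t), v; t) &= \langle f(t), v \rangle \quad \text{for all } v \in T_{u(t)}\mathcal{M}_r \cap H^1_0(\Omega), \\ u(0) &= u_0 \in \mathcal{M}_r, \end{aligned} \tag{1.7}$$

where $T_{u(t)}\mathcal{M}_r$ is the tangent space of the manifold $\mathcal{M}_r$ at $u(t)$. Thus, in contrast to (1.3), in (1.7) we seek a curve $t \mapsto u(t)$ on the manifold $\mathcal{M}_r$ which for almost every $t$ satisfies the weak parabolic formulation (1.3) on the tangent space only.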

Our goal in this paper is to provide an abstract framework for dealing with problems of the type (1.7), and to prove existence of solutions via a time-stepping scheme. In contrast to previous works, we do not require the diffusion matrix $\alpha(t)$ to be diagonal, which means that we allow anisotropic diffusion. If $\alpha(t)$ is diagonal, that is, $\alpha_{12} = \alpha_{21} = 0$, the problem is substantially easier; in particular, in this case the exact solution of the homogeneous equation with $f = 0$ and $u_0 \in \mathcal{M}_r$ satisfies $u(t) \in \mathcal{M}_r$ for all $t$. In the case of non-diagonal $\alpha(t)$, the unbounded operator on $L_2(\Omega)$ induced by the bilinear form no longer maps to the tangent space of the manifold, which means that previously used techniques are no longer applicable in this setting.

Our existence proof is based on a Rothe-type temporal semidiscretization using minimization problems on the low-rank manifold in each time step. Off-diagonal parts in the diffusion are treated via bounds on mixed derivatives that are always available for elements in the intersection $\mathcal{M}_r \cap H^1_0(\Omega)$, which is a remarkable aspect of the interplay between low-rank structures and regularity in function spaces. We require slightly more regularity of $f$ and $u_0$ than necessary for standard parabolic problems in linear spaces like (1.3), but still less than needed for strong solutions. Specifically, applied to the model problem (1.3), our abstract results give solutions to the dynamical low-rank formulation (1.7) under the assumptions $u_0 \in \mathcal{M}_r \cap H^1_0(\Omega)$ and $f \in L_2(0,T;L_2(\Omega))$, as long as the smallest singular values in the low-rank representation of $u(t)$ do not approach zero. Compared to previous works, we do not make use of components in low-rank representations, but treat the problem directly on the manifold. This allows for generalization to evolutions on more general manifolds. However, the strategy that we follow here does not lend itself to showing uniqueness of solutions, and in our present setting this question remains open.

Beyond the comparably well-developed analysis of dynamical low-rank approximations in finite-dimensional spaces [25, 1, 24, 17, 39], the available results for low-rank evolution problems in function spaces cover mainly Schrödinger-type equations , in particular the closely related higher-dimensional generalization of the multi-configuration time-dependent Hartree method (MCTDH) considered in [35, 26, 8, 7, 28, 16]. An important ingredient in many results is the decomposition of the operators into a Laplacian part, which maps the low-rank manifold to its tangent space, and a potential term satisfying suitable boundedness properties. A very similar decomposition with differential operators mapping to the tangent space is also assumed in the recent work  on parameter-dependent parabolic problems, where the separation of variables is done not between spatial variables as considered here, but rather between the spatial and the parametric variables. An error analysis for such an approach was presented in .

The paper is organized as follows: in Section 2 we give an abstract formulation of the problem for general evolution equations on manifolds under assumptions that reflect the main features of the model problem (1.7). In Section 3, we introduce the time-stepping scheme that is used to approximate solutions. Then we show in Section 4 that this scheme yields solutions to the continuous problem in the limit. Section 5 is devoted to questions of numerical approximation. We give an outlook on directions for further work in Section 6.

## 2. Abstract formulation

Before we switch to an abstract model for our existence proof, we highlight some particular properties of the model problem (1.7) that will motivate the assumptions made in the abstract setting. We believe that the general formulation presented in Section 2.2 will be useful to study parabolic problems on more general low-rank tensor manifolds in tensor product Hilbert spaces of higher order, for instance , as well. Low-rank tensor formats with suitable properties may include Tucker tensors , hierarchical Tucker tensors , and tensor trains .

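### 2.1. Some features of the model problem on $\Omega = (0,1)^2$

Let us first inspect the rank-$r$ manifold $\mathcal{M}_r$ defined in (1.6) in more detail. We have already mentioned that it is an embedded Hilbert submanifold of $L_2(\Omega)$, but is not closed. In fact, its closure is the set $\mathcal{M}_{\le r}$ of all $u \in L_2(\Omega)$ with $\operatorname{rank}(u) \le r$, and this closure is even weakly sequentially closed; see, e.g., [20, Lemma 8.6]. In other words,

$$\mathcal{M}_{\le r} = \mathcal{M}_{\le r-1} \cup \mathcal{M}_r = \overline{\mathcal{M}_r} = \overline{\mathcal{M}_r}^{w},$$

where the superscript $w$ indicates the weak sequential closure. Another important property of $\mathcal{M}_r$ is that it is a cone, that is, $u \in \mathcal{M}_r$ implies $su \in \mathcal{M}_r$ for all $s > 0$.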

#### 2.1.1. Tangent spaces

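For convenience let us use the notation $u^1 \otimes u^2$ for the tensor product of two functions, that is, $(u^1 \otimes u^2)(x_1,x_2) = u^1(x_1)\, u^2(x_2)$ a.e. Every $u \in \mathcal{M}_r$ admits infinitely many representations of the form (1.5), among which the singular value decomposition (SVD)

$$u = \sum_{k=1}^{r} \sigma_k\, u^1_k \otimes u^2_k \tag{2.1}$$

is of great importance for the geometric description of the manifold. In (2.1), $u^1_1,\dots,u^1_r$ and $u^2_1,\dots,u^2_r$ are both $L_2(0,1)$-orthonormal systems, and $\sigma_1 \ge \dots \ge \sigma_r > 0$ is a non-increasing, positive sequence of singular values. The existence of such a decomposition is well known in any tensor product of Hilbert spaces [20, Thm. 4.137].

Given (2.1), the tangent space to $\mathcal{M}_r$ at $u$ can be written as

$$T_u\mathcal{M}_r = \Big\{ v = \sum_{k=1}^{r} v^1_k \otimes u^2_k + u^1_k \otimes v^2_k : v^1_k, v^2_k \in L_2(0,1) \Big\}. \tag{2.2}$$

To see this, consider a curve

$$\phi(t) = \sum_{k=1}^{r} \sigma_k \big( u^1_k + t \sigma_k^{-1} v^1_k \big) \otimes \big( u^2_k + t \sigma_k^{-1} v^2_k \big) \tag{2.3}$$

in $\mathcal{M}_r$, defined for $|t|$ small enough. Then $\phi(0) = u$ and $\phi'(0)$ is of the form (2.2). One can show that every admissible curve in $\mathcal{M}_r$ through $u$ is locally of this form, using the orthogonality of the factors in the SVD. Note that $u \in T_u\mathcal{M}_r$, which is also clear due to the cone property.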

Without loss of generality, we could add the gauging conditions

$$(v^1_k, u^1_\ell)_{L_2} = 0 \quad \text{for all } k,\ell, \qquad (v^1_k, v^1_\ell)_{L_2} = 0 \quad \text{for } k \ne \ell \tag{2.4}$$

to the definition of $T_u\mathcal{M}_r$. Then the representation of tangent vectors becomes unique. With these gauging conditions it is not difficult to show that $T_u\mathcal{M}_r$ is closed in $L_2(\Omega)$ and locally homeomorphic (around zero) to a neighborhood of $u$ in $\mathcal{M}_r$, using essentially the same construction as (2.3). As a result, $\mathcal{M}_r$ is a manifold, e.g. in the sense of [45, Def. 43.10], and in fact it is infinitely smooth.

We will also use the intersection of $\mathcal{M}_r$ with smoothness spaces. As shown below, see (2.12), if $u \in \mathcal{M}_r$ belongs to $H^1_0(\Omega)$, then the factors $u^1_k, u^2_k$ in the SVD (2.1) all belong to $H^1_0(0,1)$. A similar argument shows that if a corresponding tangent vector $v \in T_u\mathcal{M}_r$, obeying the gauging conditions (2.4), belongs to $H^1_0(\Omega)$, then the $v^1_k, v^2_k$ are in $H^1_0(0,1)$ as well. Consequently, in this case the curve (2.3) yielding the tangent vector $v$ satisfies

$$\phi(t) \in \mathcal{M}_r \cap H^1_0(\Omega) \tag{2.5}$$

for all $|t|$ small enough. The same condition will be assumed in the abstract setting as well.

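A famous theorem due to Schmidt states that truncating the SVD of $u$ yields best approximations of lower rank in the $L_2(\Omega)$-norm. A particular instance of this result is that the smallest singular value $\sigma_{\min}(u) = \sigma_r$ of $u \in \mathcal{M}_r$ equals the $L_2$-distance of $u$ to the relative boundary of $\mathcal{M}_r$:

$$\sigma_{\min}(u) = \operatorname{dist}_{L_2}(u, \mathcal{M}_{\le r-1}) = \operatorname{dist}_{L_2}\big(u, \overline{\mathcal{M}_r}^{w} \setminus \mathcal{M}_r\big). \tag{2.6}$$

The smallest singular value is also related to curvature bounds for the manifold, specifically to perturbations of tangent spaces. For $u \in \mathcal{M}_r$ we denote by $P_u$ the $L_2(\Omega)$-orthogonal projection on $T_u\mathcal{M}_r$. It is given as

$$P_u = P_1 \otimes I + I \otimes P_2 - P_1 \otimes P_2 \tag{2.7}$$

where $P_1$ and $P_2$ denote the $L_2(0,1)$-orthogonal projections on the spans of $u^1_1,\dots,u^1_r$ and $u^2_1,\dots,u^2_r$, respectively. Then one can show the following: for any there exist such that for all with and all with we have

$$\| P_u - P_v \|_{L_2 \to L_2} \le \frac{M}{\sigma_{\min}(u)} \| u - v \|_{L_2}. \tag{2.8}$$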
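The identity (2.6) has a direct matrix counterpart that is easy to check numerically. The sketch below is a discrete stand-in, assuming matrices with the Frobenius norm in place of functions in $L_2(\Omega)$; the sizes and the random data are illustrative choices, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
r = 4
# A rank-r matrix playing the role of u in M_r (illustrative sizes).
U = rng.standard_normal((30, r)) @ rng.standard_normal((r, 25))

W, s, Zt = np.linalg.svd(U, full_matrices=False)
sigma_min = s[r - 1]  # smallest nonzero singular value

# Truncating the SVD after r-1 terms gives a best approximation of rank <= r-1.
U_trunc = (W[:, : r - 1] * s[: r - 1]) @ Zt[: r - 1, :]
dist = np.linalg.norm(U - U_trunc, "fro")

# Any other matrix of rank <= r-1 is at least as far away.
competitor = rng.standard_normal((30, r - 1)) @ rng.standard_normal((r - 1, 25))
dist_competitor = np.linalg.norm(U - competitor, "fro")

print(sigma_min, dist, dist_competitor)
```

In this discrete setting the truncation error is carried entirely by the discarded term $\sigma_r u^1_r \otimes u^2_r$, which is why the distance equals $\sigma_r$.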

This behavior of tangent spaces to low-rank manifolds is well known in finite dimension, even for more general tensor formats [1, 32]. In infinite-dimensional Hilbert spaces, a bound like (2.8) was obtained, for instance, for the (more general) Tucker format in . For convenience, we give a self-contained proof for (2.8) in the appendix (Lemma A.1).

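Regarding the estimate (2.8), we note that on every weakly sequentially compact subset $\mathcal{M}'_r$ of $\mathcal{M}_r$, the infimum

$$\sigma^* := \inf_{u \in \mathcal{M}'_r} \sigma_{\min}(u) = \inf_{u \in \mathcal{M}'_r} \operatorname{dist}_H(u, \mathcal{M}_{\le r-1}) = \operatorname{dist}_H\big(\mathcal{M}'_r, \overline{\mathcal{M}_r}^{w} \setminus \mathcal{M}_r\big)$$

is positive and attained by some $u^* \in \mathcal{M}'_r$. To see this consider sequences $(u_n) \subset \mathcal{M}'_r$ and $(v_n) \subset \mathcal{M}_{\le r-1}$ such that

$$\| u_n - v_n \|_H \le \sigma^* + 1/n.$$

Both sequences are bounded, and hence there exists a common weakly converging subsequence. Let $u^*$ and $v^*$ denote the limits. Then $u^* \in \mathcal{M}'_r$ and $v^* \in \mathcal{M}_{\le r-1}$ since both sets are weakly sequentially closed. Since the norm is weakly sequentially lower semicontinuous, we obtain $\| u^* - v^* \|_H \le \sigma^*$ and thus equality. This shows

$$\sigma^* = \operatorname{dist}_{L_2}(u^*, \mathcal{M}_{\le r-1}) > 0. \tag{2.9}$$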

#### 2.1.2. Elliptic operators and low-rank manifolds

Let us now discuss the interplay between the elliptic operator and the manifold in the model problem (1.7). Note that from the formulation (1.7), we will only have information on the bilinear form $a(\cdot,\cdot;t)$ on the tangent spaces $T_u\mathcal{M}_r$. One can therefore expect that it will not be possible for arbitrary bilinear forms to derive the a priori estimates vital for existence proofs. Additional structure is required.

In case of the model problem (1.7) we can split the bilinear form into two parts $a = a_1 + a_2$ with

$$a_1 = a_{11} + a_{22}, \qquad a_2 = a_{12} + a_{21},$$

where $a_{ij}$ denotes the term of (1.4) with coefficient $\alpha_{ij}$. These two parts are generated by the differential operators

$$A_1(t) = -\alpha_{11}(t)\, \partial_1^2 - \alpha_{22}(t)\, \partial_2^2, \qquad A_2(t) = -\alpha_{12}(t)\, \partial_1 \partial_2 - \alpha_{21}(t)\, \partial_2 \partial_1, \tag{2.10}$$

corresponding to divergence and mixed derivatives at time $t$, respectively. The operator $A_1(t)$ has the remarkable property that it maps sufficiently smooth functions $u \in \mathcal{M}_r$ to the tangent space $T_u\mathcal{M}_r$. Namely, given the SVD representation (2.1), we get

$$(A_1(t)u)(x_1,x_2) = -\sum_{k=1}^{r} \sigma_k(u) \Big( \alpha_{11}(t)\, \partial_1^2 u^1_k(x_1)\, u^2_k(x_2) + u^1_k(x_1)\, \alpha_{22}(t)\, \partial_2^2 u^2_k(x_2) \Big), \tag{2.11}$$

which is in $T_u\mathcal{M}_r$ by (2.2) if the second derivatives $\partial_1^2 u^1_k$ and $\partial_2^2 u^2_k$ are in $L_2(0,1)$.
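The contrast between the two operators in (2.10) can be observed in a discretized setting. The following sketch assumes a finite-difference discretization on a uniform interior grid (rows indexed by $x_1$, columns by $x_2$), a random rank-$r$ matrix in place of $u$, and constant hypothetical coefficients `a11`, `a22`, `a12`; none of these choices come from the text. It checks that the discrete analogue of $A_1 u$ lies in the tangent space, whereas the mixed-derivative part generally does not.

```python
import numpy as np

n, r = 40, 3
h = 1.0 / (n + 1)
# Finite-difference matrices on interior grid points with zero boundary values.
D2 = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
      + np.diag(np.ones(n - 1), -1)) / h**2           # second derivative
D1 = (np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)) / (2 * h)  # first derivative

rng = np.random.default_rng(2)
U = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # rank-r stand-in for u
W, _, Zt = np.linalg.svd(U, full_matrices=False)
W, Z = W[:, :r], Zt[:r, :].T

def P(X):
    """Discrete analogue of (2.7): projection onto the tangent space at U."""
    XZ = X @ Z
    return W @ (W.T @ X) + XZ @ Z.T - W @ (W.T @ XZ) @ Z.T

a11, a22, a12 = 2.0, 1.0, 0.5            # hypothetical constant coefficients
A1U = -(a11 * (D2 @ U) + a22 * (U @ D2.T))   # analogue of A_1 u
A2U = -2.0 * a12 * (D1 @ U @ D1.T)           # analogue of A_2 u with alpha_12 = alpha_21

rel_A1 = np.linalg.norm(A1U - P(A1U)) / np.linalg.norm(A1U)
rel_A2 = np.linalg.norm(A2U - P(A2U)) / np.linalg.norm(A2U)
print(rel_A1, rel_A2)
```

The first relative residual is at the level of rounding error, while the second is of order one for generic data, which mirrors why non-diagonal diffusion requires a separate treatment.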

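In order to translate this property to the generated bilinear form $a_1$, we observe that if $u \in \mathcal{M}_r \cap H^1_0(\Omega)$, then actually $u \in H^1_0(0,1) \otimes H^1_0(0,1)$. That is, a low-rank function automatically possesses mixed derivatives of order one, and all factors in the SVD (2.1) are themselves in $H^1_0(0,1)$. To see this, let $u \in \mathcal{M}_r \cap H^1_0(\Omega)$ have the SVD (2.1). Then, by orthogonality,

$$u^1_k(x_1) = \frac{1}{\sigma_k} \int_0^1 u(x_1,x_2)\, u^2_k(x_2)\, dx_2,$$

which, after differentiating under the integral and applying the Cauchy-Schwarz inequality, gives

$$\| \partial_1 u^1_k \|_{L_2} \le \frac{1}{\sigma_k(u)} \| u \|_{H^1_0}. \tag{2.12}$$

Likewise, $\| \partial_2 u^2_k \|_{L_2}$ admits precisely the same bound. Note that these bounds can be refined, since, e.g., in (2.12) only the derivative of $u$ with respect to $x_1$ is needed, but this will not be required.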
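A discrete analogue of the argument leading to (2.12) can be checked directly. In the sketch below, a random rank-$r$ matrix stands in for $u$, a forward-difference matrix `D` stands in for $\partial_1$, and the Frobenius norm of `D @ U` plays the role of the part of the $H^1_0$-norm that is actually needed; all of these are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n, r = 50, 4
U = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # rank-r stand-in for u
D = np.diag(np.ones(n - 1), 1) - np.eye(n)   # forward difference, stand-in for d/dx_1

W, s, Zt = np.linalg.svd(U, full_matrices=False)
DU_norm = np.linalg.norm(D @ U, "fro")       # discrete analogue of ||d_1 u||_{L2}

lhs, rhs, recovered = [], [], []
for k in range(r):
    # Orthogonality of the right singular vectors recovers the k-th left factor:
    # u^1_k = (1 / sigma_k) * U z_k.
    w_k = U @ Zt[k] / s[k]
    recovered.append(np.allclose(w_k, W[:, k]))
    lhs.append(np.linalg.norm(D @ w_k))      # ||D u^1_k||
    rhs.append(DU_norm / s[k])               # (1 / sigma_k) * ||D U||_F

for k in range(r):
    print(k, lhs[k], rhs[k])
```

The bound degrades as $\sigma_k$ becomes small, which is the discrete counterpart of the role played by the smallest singular value in the estimates of this section.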

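Now based on the regularity of the singular vectors one can show that if $u \in \mathcal{M}_r \cap H^1_0(\Omega)$, the tangent space projection $P_u$ given by (2.7) is also bounded in $H^1_0$-norm as a map from $H^1_0(\Omega)$ to $H^1_0(\Omega)$. One also has a curvature bound

$$\| P_u - P_v \|_{H^1_0 \to H^1_0} \le \frac{\tilde M}{\sigma_{\min}(u)} \| u - v \|_{H^1_0} \tag{2.13}$$

in this norm, see Corollary A.3 in the appendix.

As a consequence, requiring only $u \in \mathcal{M}_r \cap H^1_0(\Omega)$, we can generalize the feature that the operator $A_1(t)$ maps to the tangent space to the following property of the induced bilinear form $a_1$: for every $t \in [0,T]$,

$$a_1(u,v;t) = a_1(u, P_u v; t) \quad \text{for all } u \in \mathcal{M}_r \cap H^1_0(\Omega) \text{ and } v \in H^1_0(\Omega). \tag{2.14}$$

To see this, choose a sequence of sufficiently smooth $u_n \in \mathcal{M}_r$ converging to $u$ in $H^1_0$-norm. Then for $v \in H^1_0(\Omega)$, we have

$$a_1(u_n, v; t) = \langle A_1(t) u_n, v \rangle = \langle A_1(t) u_n, P_{u_n} v \rangle = a_1(u_n, P_{u_n} v; t)$$

since $A_1(t) u_n \in T_{u_n}\mathcal{M}_r$ by (2.11). At the same time, $a_1(u_n, v; t) \to a_1(u, v; t)$, but also

$$a_1(u_n, P_{u_n} v; t) = a_1(u_n, P_u v; t) + a_1(u_n, (P_{u_n} - P_u) v; t) \to a_1(u, P_u v; t)$$

by (2.13).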

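For the operator $A_2(t)$ on the other hand, the preceding considerations show that it actually is well defined on $\mathcal{M}_r \cap H^1_0(\Omega)$ in a strong sense: applying $\partial_1 \partial_2$ to (2.1) and using the triangle inequality we get from (2.12) that

$$\| \partial_1 \partial_2 u \|_{L_2} \le \sum_{k=1}^{r} \frac{1}{\sigma_k(u)} \| u \|^2_{H^1_0} \le \frac{r}{\sigma_{\min}(u)} \| u \|^2_{H^1_0}.$$

By (1.4), this implies that for every $t \in [0,T]$, the bilinear form $a_2(\cdot,\cdot;t)$ associated to the operator $A_2(t)$ has the following property: for fixed $u \in \mathcal{M}_r \cap H^1_0(\Omega)$, the linear functional $v \mapsto a_2(u,v;t)$ on $H^1_0(\Omega)$ is actually continuous on $L_2(\Omega)$, with its dual norm bounded by

$$\| A_2(t) u \|_{L_2} \le \frac{2r\, |\alpha_{12}(t)|}{\sigma_{\min}(u)} \| u \|^2_{H^1_0}. \tag{2.15}$$

Note that here, the inverse of the smallest singular value of $u$ enters again.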

### 2.2. Abstract formulation of the problem

The features of the model problem discussed above are now formalized.

#### 2.2.1. Standard assumptions on parabolic evolution equations

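We consider a Gelfand triplet

$$V \subseteq H \subseteq V^*$$

where the real Hilbert space $V$ is compactly embedded in the real Hilbert space $H$. Since the embedding is compact it is also continuous, that is,

$$\| u \|^2_H \lesssim \| u \|^2_V \quad \text{for all } u \in V. \tag{2.16}$$

In the case $V = H^1_0(\Omega)$ and $H = L_2(\Omega)$, (2.16) is the Poincaré inequality.

By $\langle \cdot, \cdot \rangle$ we denote the dual pairing of $V^*$ and $V$, and by $(\cdot,\cdot)$ we denote the inner product on $H$. For every $t \in [0,T]$, let $a(\cdot,\cdot;t) : V \times V \to \mathbb{R}$ be a bilinear form which is assumed to be symmetric,

$$a(u,v;t) = a(v,u;t) \quad \text{for all } u,v \in V \text{ and } t \in [0,T],$$

uniformly bounded,

$$|a(u,v;t)| \le \beta \| u \|_V \| v \|_V \quad \text{for all } u,v \in V \text{ and } t \in [0,T]$$

for some $\beta > 0$, and uniformly coercive,

$$a(u,u;t) \ge \mu \| u \|^2_V \quad \text{for all } u \in V \text{ and } t \in [0,T]$$

for some $\mu > 0$. Under these assumptions, $a(\cdot,\cdot;t)$ is an inner product on $V$ defining an equivalent norm. Furthermore, it defines a bounded operator

$$A(t) : V \to V^* \tag{2.17}$$

such that

$$a(u,v;t) = \langle A(t)u, v \rangle \quad \text{for all } u,v \in V.$$

We also assume that $a(u,v;t)$ is Lipschitz continuous with respect to $t$, in other words, there exists an $L \ge 0$ such that

$$|a(u,v;t) - a(u,v;s)| \le L \beta \| u \|_V \| v \|_V |t - s| \tag{2.18}$$

for all $u,v \in V$ and $s,t \in [0,T]$, which in the model problem corresponds to the Lipschitz continuity of the function $t \mapsto \alpha(t)$.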

#### 2.2.2. Manifolds and tangent spaces

Our aim is to deal with evolution equations on a manifold

 M⊆H.

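For present purposes, we do not have to be very strict regarding the notion of a manifold. What we essentially need is a tangent bundle: we assume that for every $u \in \mathcal{M}$ there exists a closed subspace $T_u\mathcal{M} \subseteq H$ given via a bounded $H$-orthogonal projection

$$P_u : H \to T_u\mathcal{M},$$

such that $T_u\mathcal{M}$ contains tangent vectors to $\mathcal{M}$ at $u$, that is: for every $v \in T_u\mathcal{M}$ there exists a differentiable curve $\phi : (-\varepsilon, \varepsilon) \to H$ (for some $\varepsilon > 0$) such that $\phi(t) \in \mathcal{M}$ for all $t$ and

$$\phi(0) = u, \qquad \phi'(0) = v.$$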

For our main existence result, we eventually assume that the map is locally Lipschitz continuous on as a mapping on . These assumptions do not define an (embedded) submanifold, since a set like in satisfies them, too. In particular it is not assumed that is locally homeomorphic to a neighborhood of .

It will be tacitly assumed that

• is not empty,

• for every , the space is not empty.

Indeed, in the main assumptions below we also require that is a cone, as is the case for low-rank manifolds. Then the first property implies the second, because in this case for every .

#### 2.2.3. Problem formulation and main assumptions

The abstract problem we are considering is now the following.

###### Problem 2.1.

Given and , find

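$$u \in W^1_2(0,T;V,H) = \{ u \in L_2(0,T;V) : u' \in L_2(0,T;V^*) \}$$

such that for almost all $t \in [0,T]$,

$$\begin{aligned} u(t) &\in \mathcal{M}, \\ \langle u'(t), v \rangle + a(u(t), v; t) &= \langle f(t), v \rangle \quad \text{for all } v \in T_{u(t)}\mathcal{M} \cap V, \\ u(0) &= u_0. \end{aligned} \tag{2.19}$$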

We emphasize again that the main challenge of this weak formulation is that, according to the Dirac-Frenkel principle, the test functions are from the tangent space only. To show that Problem 2.1 admits solutions we require several assumptions. These assumptions are abstractions of corresponding properties of the model problem on a low-rank manifold as discussed in Section 2.1, and hence the main results of this paper apply to this setting.


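• A1 (Cone property) The manifold $\mathcal{M}$ is a cone, that is, $u \in \mathcal{M}$ implies $su \in \mathcal{M}$ for all $s > 0$.

• A2 (Curvature bound) For every weakly sequentially compact (in $H$) subset $\mathcal{M}'$ of $\mathcal{M}$ there exists a constant $\kappa$ such that

$$\| P_u - P_v \|_{H \to H} \le \kappa \| u - v \|_H \quad \text{for all } u,v \in \mathcal{M}'.$$

• A3 (Compatibility of tangent space)

  • (a) For $u \in \mathcal{M} \cap V$ and $v \in T_u\mathcal{M} \cap V$, an admissible curve $\phi$ with $\phi(0) = u$, $\phi'(0) = v$ can be chosen such that

$$\phi(t) \in \mathcal{M} \cap V$$

  for all $|t|$ small enough.

  • (b) If $u \in \mathcal{M} \cap V$ and $v \in V$, then $P_u v \in T_u\mathcal{M} \cap V$.

• A4 (Operator splitting) The associated operator $A(t)$ in (2.17) admits a splitting

$$A(t) = A_1(t) + A_2(t)$$

such that for all $t \in [0,T]$, all $u \in \mathcal{M} \cap V$ and all $v \in V$, the following holds:

  • (a) “$A_1(t)$ maps to the tangent space”:

$$\langle A_1(t) u, v \rangle = \langle A_1(t) u, P_u v \rangle.$$

  • (b) “$A_2(t)$ is locally bounded from $\mathcal{M} \cap V$ to $H$”: For every weakly sequentially compact (in $H$) subset $\mathcal{M}'$ of $\mathcal{M}$ there exist constants $\gamma$ and $\eta$ such that

$$A_2(t) u \in H \quad \text{and} \quad \| A_2(t) u \|_H \le \gamma \| u \|^{\eta}_V \quad \text{for all } u \in \mathcal{M}'.$$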

Recall that for the model problem, A2 is stated in (2.8), taking (2.9) into account. Property A3(a) has been discussed in (2.5). With the splitting of according to (2.10), in (2.14) we have shown that A4(a) holds, and A4(b) follows (with independent of ) from (2.15), again using (2.9) and the boundedness of .

## 3. Temporal discretization

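Given the main assumptions A1–A4 stated above, we prove existence of solutions for Problem 2.1 by discretizing in time and studying a sequence of approximate solutions with time steps $h \to 0$. A backward Euler method on $\mathcal{M}$ for (2.19) takes the following form: given $u_i$ at time step $t_i$, find $u_{i+1}$ at time step $t_{i+1}$ such that

$$\Big( \frac{u_{i+1} - u_i}{t_{i+1} - t_i}, v \Big) + a(u_{i+1}, v; t_{i+1}) = \langle f_{i+1}, v \rangle \quad \text{for all } v \in T_{u_{i+1}}\mathcal{M} \cap V. \tag{3.1}$$

Here $f_{i+1}$ are the mean values of $f$ on the interval $[t_i, t_{i+1}]$, that is,

$$f_{i+1} = \frac{1}{t_{i+1} - t_i} \int_{t_i}^{t_{i+1}} f(t)\, dt. \tag{3.2}$$

As the test space depends on the solution, this equation appears quite difficult to solve. However, when $a(\cdot,\cdot;t)$ is symmetric, (3.1) is the first order optimality condition of the optimization problem

$$u_{i+1} = \operatorname*{argmin}_{u \in \overline{\mathcal{M}}^{w} \cap V} F(u), \qquad F(u) = \frac{1}{2(t_{i+1} - t_i)} \| u - u_i \|^2_H + \frac{1}{2} a(u,u;t_{i+1}) - \langle f_{i+1}, u \rangle. \tag{3.3}$$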

This is stated in the following lemma.

###### Lemma 3.1.

Let and . Then any local minimum of (defined in (3.3)) on satisfies the conditions (3.1).

###### Proof.

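Let $u_{i+1}$ be a local minimum and $v \in T_{u_{i+1}}\mathcal{M} \cap V$. By main assumption A3(a) we can find a differentiable curve $\phi$ defined for $|t|$ small enough such that $\phi(t) \in \mathcal{M} \cap V$, $\phi(0) = u_{i+1}$ and $\phi'(0) = v$. Then $t \mapsto F(\phi(t))$ has a local minimum at $t = 0$ and hence its derivative is zero there, which yields (3.1). ∎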
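The link between (3.3) and (3.1) is easiest to see when the manifold constraint is dropped. The sketch below assumes the special case where $\mathcal{M}$ is the whole (finite-dimensional) space, so that (3.1) holds for all test vectors and reduces to a plain backward Euler step; the symmetric positive definite matrix `A`, the step size `h`, and the random data are hypothetical stand-ins. It does not implement the constrained minimization over the low-rank manifold.

```python
import numpy as np

rng = np.random.default_rng(4)
n, h = 6, 0.1                              # dimension and time step (illustrative)
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)                # symmetric positive definite stand-in for a(.,.;t_{i+1})
u_i = rng.standard_normal(n)               # previous iterate
f = rng.standard_normal(n)                 # stand-in for the mean value f_{i+1}

def F(u):
    """Discrete analogue of the functional in (3.3)."""
    return np.dot(u - u_i, u - u_i) / (2 * h) + 0.5 * u @ A @ u - f @ u

def grad_F(u):
    """Gradient of F; setting it to zero is the unconstrained form of (3.1)."""
    return (u - u_i) / h + A @ u - f

# Backward Euler step: (I/h + A) u_next = u_i/h + f.
u_next = np.linalg.solve(np.eye(n) / h + A, u_i / h + f)

grad_norm = np.linalg.norm(grad_F(u_next))
# F is strictly convex here, so nearby points have larger function values.
perturbed = [F(u_next + 0.1 * rng.standard_normal(n)) for _ in range(5)]
print(grad_norm, F(u_next), min(perturbed))
```

On a genuine manifold the gradient only has to vanish along tangent directions, which is exactly the restriction of the test space in (3.1).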

Next, we consider the existence of minima of (3.3) on the set $\overline{\mathcal{M}}^{w} \cap V$. This ensures that we can generate approximate solutions at a sequence of time steps using (3.3), which will serve as the temporal discretization. It will be later ensured that for small enough time steps, we have $u_{i+1} \in \mathcal{M}$ if $u_i \in \mathcal{M}$. Note that in any case the $u_{i+1}$ are not uniquely determined from $u_i$, since in general $\overline{\mathcal{M}}^{w}$ is not convex.

Since the function $F$ in (3.3) is convex on $V$ and $\overline{\mathcal{M}}^{w}$ is weakly sequentially closed in $H$ by definition, the existence of solutions to (3.3) is more or less standard.

###### Lemma 3.2.

The optimization problem (3.3) has at least one solution.

###### Proof.

Since $F$ is convex and continuous on $V$, it is also weakly sequentially lower semicontinuous on $V$; see, e.g., [47, Sec. 2.5, Lemma 5]. Note that $F$ has bounded sublevel sets on $V$ since the bilinear form is coercive by assumption. It now follows that $F$ attains a minimum on every weakly sequentially closed subset of $V$ by the standard arguments, since the intersection with a sublevel set remains weakly sequentially compact; see, e.g., [45, Prop. 38.12(d)]. It hence remains to verify that $\overline{\mathcal{M}}^{w} \cap V$ is weakly sequentially closed in $V$. Consider a sequence $(u_n) \subset \overline{\mathcal{M}}^{w} \cap V$ converging weakly (in $V$) to $u \in V$. Since the embedding $V \subseteq H$ is continuous, weak convergence in $V$ implies weak convergence in $H$, and since $\overline{\mathcal{M}}^{w}$ is weakly sequentially closed in $H$, we get $u \in \overline{\mathcal{M}}^{w}$. This shows that this set is weakly sequentially closed in $V$. ∎

## 4. Existence of solutions

In the previous section we defined a time-stepping scheme through a sequence of optimization problems. Starting from $u_0$ and setting

$$h = T/N, \qquad t_i = ih,$$

this generates approximate solutions $u_1, \dots, u_N$ at time points $t_1, \dots, t_N$. In this section we will study the properties of these solutions, and use them to prove existence of solutions to Problem 2.1. Specifically, we construct one function by piecewise affine linear interpolation of $u_0, u_1, \dots, u_N$, and another function by piecewise constant interpolation of the same values.
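The two interpolants can be sketched as follows for scalar stand-ins of the iterates. The function names, and in particular the convention that the piecewise constant interpolant takes the value $u_0$ at $t = 0$ and $u_{i+1}$ on $(t_i, t_{i+1}]$, are assumptions made for this example.

```python
import math

def affine_interpolant(t, T, values):
    """Piecewise affine interpolation of values[i] at nodes t_i = i * T / N."""
    N = len(values) - 1
    h = T / N
    i = min(int(t // h), N - 1)          # index of the subinterval [t_i, t_{i+1}]
    theta = (t - i * h) / h              # local coordinate in [0, 1]
    return (1 - theta) * values[i] + theta * values[i + 1]

def constant_interpolant(t, T, values):
    """Piecewise constant interpolation: values[0] at t = 0 and
    values[i + 1] on (t_i, t_{i+1}] (assumed convention)."""
    N = len(values) - 1
    h = T / N
    if t <= 0:
        return values[0]
    return values[min(math.ceil(t / h), N)]

# Scalar stand-ins for u_0, ..., u_4 on [0, 1] with h = 0.25.
vals = [0.0, 1.0, 4.0, 9.0, 16.0]
print(affine_interpolant(0.125, 1.0, vals), constant_interpolant(0.125, 1.0, vals))
```

Both interpolants agree at the nodes $t_i$; between nodes their difference is controlled by the increments $u_{i+1} - u_i$, which is what makes it possible for them to share a limit as $h \to 0$.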

Our main result is then as follows.

###### Theorem 4.1.

Let the assumptions stated in Section 2.2.3 hold.

• The functions and converge, up to subsequences, weakly in and strongly in , to the same function with , while the weak derivatives converge weakly to in , again up to subsequences. We have for almost all .

• Let . There exists a constant independent of such that solves (2.19) for almost all , where we set , and for all .

Note that with implies . A possible constant in statement (b) is provided by the right-hand side of (4.2) in the energy estimates below, and thus in particular depends continuously on and . In the proof of the theorem, which is given in the following sections, we adapt standard techniques for establishing the existence of limits of time discretizations to the abstract manifold setup.

Combining Theorem 4.1 with a continuation argument, we can obtain a solution on a maximal time interval.

###### Theorem 4.2.

There exist and solving Problem 2.1 on the time interval , where either or

$$\liminf_{t \to T^*} \operatorname{dist}_H\big(u(t), \overline{\mathcal{M}}^{w} \setminus \mathcal{M}\big) = 0.$$

###### Proof.

Theorem 4.1(b) provides us with a solution of Problem 2.1 on a time interval with such that