Low-rank representation of tensor network operators with long-range pairwise interactions

09/05/2019 ∙ by Lin Lin, et al. ∙ berkeley college 0

Tensor network operators, such as the matrix product operator (MPO) and the projected entangled-pair operator (PEPO), can provide efficient representation of certain linear operators in high dimensional spaces. This paper focuses on the efficient representation of tensor network operators with long-range pairwise interactions such as the Coulomb interaction. For MPOs, we find that all existing efficient methods exploit a peculiar "upper-triangular low-rank" (UTLR) property, i.e. the upper-triangular part of the matrix can be well approximated by a low-rank matrix, while the matrix itself can be full-rank. This allows us to convert the problem of finding the efficient MPO representation into a matrix completion problem. We develop a modified incremental singular value decomposition method (ISVD) to solve this ill-conditioned matrix completion problem. This algorithm yields equivalent MPO representation to that developed in [Stoudenmire and White, Phys. Rev. Lett. 2017]. In order to efficiently treat more general tensor network operators, we develop another strategy for compressing tensor network operators based on hierarchical low-rank matrix formats, such as the hierarchical off-diagonal low-rank (HODLR) format, and the H-matrix format. Though the pre-constant in the complexity is larger, the advantage of using the hierarchical low-rank matrix format is that it is applicable to both MPOs and PEPOs. For the Coulomb interaction, the operator can be represented by a linear combination of O((N)(N/ϵ)) MPOs/PEPOs, each with a constant bond dimension, where N is the system size and ϵ is the accuracy of the low-rank truncation. Neither the modified ISVD nor the hierarchical low-rank algorithm assumes that the long-range interaction takes a translation-invariant form.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 17

page 23

page 24

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Tensor network states [White1992, Vidal2007], particularly presented by the matrix product states (MPS) [White1992, OstlundRommer1995, Schollwock2011] and projected entangled-pair states (PEPS) [VerstraeteCirac2004, VerstraeteWolfPerezEtAl2006, Orus2014], are among the most promising classes of variational methods for approximating high-dimensional functions in quantum physics. Besides their successes for treating strongly correlated quantum systems [YanHuseWhite2011, Norman2016], the MPS (also known as the tensor train method (TT) [OseledetsTyrtyshnikov2009, Oseledets2011, LubichRohwedderSchneiderEtal2013]), have become useful in a wide range of applications [RakhubaOseledets2016, StoudenmireSchwab2016, HanWangFanEtAl2018]. A core component of tensor network algorithms is the efficient representation of linear operators. In MPS the operator representation is called the matrix product operator (MPO), and in PEPS the projected entangled-pair operator (PEPO). These forms of representation are naturally designed for short-range interactions, such as interactions of nearest-neighbor type on a regular lattice. For long-range interactions, such as those naturally appearing from electronic structure calculations due to the Coulomb interaction, a straightforward representation would lead to a large MPO/PEPO rank, which can be prohibitively expensive. Therefore efficient representation of tensor network operators with long-range interactions is crucial for the success of the methods.

In this paper, we focus on the tensor network operators with long-range pairwise interactions, such as the Coulomb interaction. A tensor network operator with pairwise interaction can be defined as

(1)

Here is a symmetric matrix, called the coefficient matrix of , and is called a number operator. The precise definition of , as well as the connection to the Coulomb interaction will be discussed in Section 2.

There have been a number of proposals to treat MPOs with long-range pairwise interactions. Using the exponential fitting techniques based on finite state machines (FSM) [crosswhite2008finite, pirvu2010matrix, frowis2010tensor, ChanKeselmanEtAl2016, crosswhite2008applying], the Coulomb interaction and other translation-invariant long-range interactions on one-dimensional lattice systems can be efficiently represented. For the Coulomb interaction, the MPO rank can be bounded by , where is the number of sites and is the target accuracy. For one-dimensional systems without the translation-invariance property, recently a method based on a sequence of singular value decomposition (SVD) compression operations [ChanKeselmanEtAl2016, StoudenmireWhite2017] has been developed. This method is observed to yield a compact MPO representation for the Coulomb interaction with basis functions arising from electronic structure calculations that are not translation-invariant. On the other hand, this is a greedy method, and there is no a priori upper bound for the MPO rank. For two-dimensional and higher dimensional lattice systems, PEPOs can provide more efficient representation of linear operators than MPOs. To our knowledge, the only efficient PEPO representation for long-range interaction is given by the recently developed correlation function valued PEPO (CF-PEPO) [o2018efficient]. CF-PEPO uses a fitting method, which maps the original lattice system with translation-invariant long-range interactions to an effective system defined on a superlattice but with short-range interactions. There is no a priori upper bound for the PEPO rank. Neither is it clear whether the CF-PEPO method can be efficiently generalized to non-translation-invariant systems.

Contribution: The contribution of this paper is two-fold. First, we find that the success of both the exponential fitting method and the SVD compression method rests on the assumption that satisfies the “upper-triangular low-rank” (UTLR) property, i.e. the upper-triangular part of can be extended into a matrix that is approximately low-rank, while can be a full-rank matrix. Therefore the problem can be viewed as a special type of matrix completion problem [candes2009exact]. However, this matrix completion problem is highly ill-conditioned, and standard matrix completion algorithms suffer from numerical stability problems [cai2010singular, hu2008collaborative, mnih2008probabilistic, brand2002incremental, balzano2013grouse]. Based on the work of [brand2002incremental], we develop a modified incremental singular value decomposition method (ISVD) to simultaneously find the low-rank structure in a numerically stable fashion, and to construct the MPO representation. The algorithm yields an MPO representation equivalent to that in the SVD compression method [StoudenmireWhite2017]. The ISVD method is not restricted to translation-invariant systems, and numerical results indicate that the performance of ISVD is comparable to (in fact is marginally better than) that of the exponential fitting methods for translation-invariant systems.

Second, we propose a new way for representing long-range interactions based on the framework of hierarchical low-rank matrices. In particular, we focus on the hierarchical off-diagonal low-rank (HODLR) format [AmbikasaranDarve2013], and the -matrix format [grasedyck2003construction, hackbusch1999sparse]. The main advantage of the hierarchical low-rank format is that it allows us to construct efficient representation for both MPOs and PEPOs. For the Coulomb interaction, we can represent using a linear combination of MPOs/PEPOs, and the bond dimension of each MPO/PEPO is bounded by a constant. From a computational perspective, such a format can be more preferable than a single MPO/PEPO operator with bond dimension bounded by . Furthermore, the hierarchical low-rank format can also be applied to systems without the translation-invariance property.

Notation:

Throughout the paper all vectors and operators are defined on finite-dimensional spaces. We use hatted letters such as

to denote high-dimensional linear operators. Vectors, matrices and tensor cores are denoted by bold font letters such as . The tensor product of and is denoted by . We use the matlab notation such as to denote sub-matrices, where are index sets. Given a certain global ordering of all indices, the notation means that for any and , we have .

Organization: The rest of the paper is organized as follows. In Section 2 we briefly introduce the MPO and PEPO representation. In Section 3 we introduce and use the UTLR structure to construct MPO representation of long-range interaction for 1D systems. In Section 4 we introduce the method to construct MPO/PEPO representation using the hierarchical low-rank matrix format. Numerical results are presented in Section 6. We conclude our work in Section 7. The rules of finite state machine (FSM) for representing non-overlapping MPOs and PEPOs are given in the Appendix A and B, respectively.

2 Preliminaries

We first briefly review the construction of MPS/MPO, as well as of PEPS/PEPO. For more detailed information we refer readers to [Schollwock2011, Orus2014]. A single vector can be interpreted as an order- tensor, where the -tuple is called the size of the tensor. We assume that the index follows a lexicographic ordering (a.k.a. row-major ordering). Each entry can be accessed using multi-indices . A linear operator is an order -tensor, with each entry denoted by

The application of to gives a tensor as

which can be written in short-hand notation as .

In (1), we used as the site indices and . The sites can be organized into lattices of one, two and three dimensions. For Coulomb interaction in 1D for , and we set . In higher dimensions we use the notation to represent a lattice site. For example, on an 2D lattice we have for . For a pairwise interaction in the form of (1), the coefficient matrix has as its entries , using a row-major order, and this is written more compactly as hereafter. The Coulomb interaction takes the form , where is the Euclidean distance. Using this notation, the pairwise interaction is called translation-invariant if is a function of (in the presence of periodic boundary conditions, we may simply redefine ). In Eq. (1), we may assume for simplicity that . acts only on a single-site and should be interpreted as , where

is the identity matrix of size

and is an matrix. can be understood as a spin operator for quantum spin systems, or a number operator for quantum fermionic systems, though the precise form of is not relevant for the purpose of this paper.

2.1 Matrix product operators

A vector is represented by an MPS/TT format if each entry can be represented as as a matrix product

(2)

where each is a matrix of size . Since is a scalar, by definition . Each , called a core tensor, can be viewed as a -tensor of size , and the matrix is called the -th slice of . Since the first and third indices of are to be contracted out via matrix-multiplication operations, they are called the internal indices, while the second index is called the external index. The -tuple is called the bond dimension (a.k.a. the MPO/TT rank). Sometimes the bond dimension also refers to a single scalar .

The MPO format of a linear operator is

(3)

Each is also called a core tensor, and is of size . The first and the fourth indices are called the internal indices, and the second and third indices the external indices. The matrix is called a slice. Again . The -tuple , or sometimes simply , is called the bond dimension.

Consider the application of the linear operator on , denoted by , then the vector has an MPS representation as

Using the MPO representation, we may readily find that

The bond dimension of is .

The simplest example of the MPO is an operator in the tensor product form, i.e.

where is a matrix, i.e. is a scalar. In the component form

Clearly the MPO bond dimension is .

The second example is an operator with nearest-neighbor interaction, e.g.

where is the identity matrix of size . In the component form

(4)

can be viewed as the linear combination of MPOs in the tensor product form, and if so the resulting MPO rank would be . However, note that we may define

and

Then it can be readily verified that the MPO of the form that agrees with the component form . Hence the MPO rank is independent of and is always .

When the context is clear, we may also identify with . Then

In the MPO form, we may also omit the indices and write

The third example is an operator with a special form of long-range interaction:

We assume , and omit the component form for simplicity. The length of the interaction is characterized by . A straightforward term-by-term representation would lead to an MPO of rank . Nonetheless, using the fact that

can also be written as an MPO with rank- as

In physics literature, this is a special case of the finite state machine (FSM) [crosswhite2008applying]. For more complex settings, the FSM rule often specifies the row/column indices of each core tensor as input/output signals. We will use the language of input/output signals only in the Appendices A and B.

(a) A -tensor
(b) Tensor contraction between a -tensor and a -tensor
Figure 1: (a) Graphical representation of a -tensor . (b) Graphical representation of the tensor contraction between tensors and : .

In order to proceed with the discussion of PEPS/PEPO, it is no longer productive to keep using the component form. Instead, the graphical representation of tensors is preferred. Fig. 1 shows an example of the graphical representation of tensors and tensor contraction operations. A tensor is visualized as a vertex and each index (internal and external) is represented as an edge. When two vertices are linked by an edge, the corresponding index is contracted out. A more detailed introduction can be found in [biamonte2017tensor]. Using this representation, MPS and MPO can be represented using graphs in Fig. 2 (a) and (b) respectively.

(a) MPS
(b) MPO
Figure 2: Graphical representation of an MPS and an MPO.

2.2 Projected entangled-pair operators

The MPS and MPO impose an intrinsically one-dimensional structure on the tensor following the index . This is a very efficient representation for certain problems defined on a one-dimensional lattice. For problems defined on a two-dimensional lattice and beyond, the PEPS and PEPO often provide a more efficient representation of vectors and linear operators, respectively. The PEPS/PEPO can be most conveniently represented using the graphical representation (see Fig. 3 (a) and (b)).

In the two-dimensional case, each core tensor in the PEPS/PEPO format has up to 5/6 indices, respectively. Similar to what we have done for MPS/MPO, the (up to 4) contracted indices are called internal indices, while remaining ones are called external indices.

(a) PEPS
(b) PEPO
Figure 3: Graphical representation of a PEPS and a PEPO corresponding to a two-dimensional lattice.

3 Long-range interaction MPO in 1D systems

In this section we focus on a 1D system consisting of sites. The simplest example of constructing a long-range interaction MPO in 1D system is the exponential fitting method [pirvu2010matrix, crosswhite2008finite, frowis2010tensor]. We briefly describe this method used for the Coulomb interaction. First we approximate the inverse-distance by the sum of exponentials

(5)

Then the tensor corresponding to the Coulomb interaction can be approximated by

where . This can be done analytically through the quadrature of an integral representation [braess2009efficient], or numerically through a least squares procedure. Following the discussion of the finite state machine in Section 2.1, the operator on the right-hand side can be represented by an MPO of bond dimension , the MPO tensor of which at site has the form

(6)

where , , and . This is true for . For the tensors at the beginning and end of the MPO we only need to use the last row and first column of the above matrix respectively. Then in order to achieve a target accuracy for a system of size , we only need [beylkin2002numerical, braess2005approximation].

For the Coulomb interaction, there are two main methods for performing the exponential fitting procedure. One way is to represent as an integral, e.g.

and then approximate the right hand side with a quadrature scheme with points, such as the Gauss quadrature. This procedure results in pointwise error of in a fixed size finite interval where with upper bounded by terms [beylkin2005approximation, braess2009efficient, braess2005approximation]. Another way is to solve a nonlinear least squares problem to find the fitting parameters. However, this optimization problem can have many stationary points, making finding the global minimum difficult. We can only start from different reasonable initial guesses to increase the chance of obtaining the global minimum.

The exponential fitting method is efficient when is strictly decaying and translation-invariant. When either of these two conditions is not satisfied, this method needs some modification. There are methods to deal with the former by introducing complex exponents [pirvu2010matrix]. The generalization to non-translation-invariant systems can be more difficult.

In this work we propose a new perspective based on matrix completion, which naturally generalizes the exponential fitting method to situations where is neither strictly decaying nor translation-invariant. We define a rank matrix by

(7)

The success of the exponential fitting method rests on the fact that can agree very well with in the upper-triangular part, but can differ with arbitrarily on the diagonal part and the lower-triangular part. Indeed, the lower-triangular part of grows exponentially with respect to , while that of decays with respect to due to the symmetry of . Despite the difference, the tensor very well approximates .

More generally, we shall demonstrate that it is possible to construct an efficient MPO representation of , if the matrix satisfies the following upper triangular low rank (UTLR) property. [Upper-triangular low-rank matrix] A symmetric matrix is an upper-triangular low-rank (UTLR) matrix, if for any , there exists , and a rank- matrix , such that for any index sets ,

For a UTLR matrix , we may find its rank- approximation by solving the following optimization problem

(8)

where denotes the set of indices corresponding to the upper-triangular part of the matrix, and is an restriction operation that returns only the elements in the set . Eq. (8) is a matrix completion problem [candes2009exact].

Once the low-rank solution is obtained, where and are both matrices, we let , denote the transpose of the -th row of and respectively. Then . Therefore we can construct an MPO with the following tensor at site ():

(9)

Again the core tensor for and are given by the last row and the first column of Eq. (9). In order to recover the MPO construction in the exponential fitting method (6), we may define

The extra term is introduced to keep the notation consistent with a more general case discussed later.

3.1 An online matrix completion algorithm

The matrix completion problem (9) has several features:

  1. The sparsity pattern is always fixed to be all the upper-triangular entries.

  2. We are actually not concerned about restoring the missing data. The missing data only serves to support a low-rank structure, and they do not necessarily need to be explicitly computed.

  3. The matrix completion problem is ill-conditioned, i.e. the ratio between the largest singular value and the smallest nonzero singular value scales exponentially with respect to the system size. Therefore explicit computation of the missing entries can easily lead to an overflow error.

The last feature requires some explanation. Consider the matrix restored by the exponential fitting procedure as in (7). The lower left corner of the matrix is . Therefore this entry, and nearby entries, grow exponentially with respect to the system size. This also results in an exponential growth of the largest singular value.

For the matrix completion problem, both convex optimization-based approaches [cai2010singular] and alternating least squares (ALS)-based approaches [hu2008collaborative, mnih2008probabilistic] require some regularization term. In our case, the regularization term will become dominant even if the regularization parameter is chosen to be very small. Online matrix completion algorithms, which applies SVD compression operations incrementally [brand2002incremental, balzano2013grouse], also suffer from the exponential growth of matrix entries in the low-rank approximation. More importantly, the largest singular value, which grows exponentially with system size, also makes the matrix completion problem difficult to solve.

We introduce a modified incremental SVD (ISVD) algorithm with missing data, which avoids the computation of the SVD of the whole matrix and therefore improves the numerical stability. This method is modified from the ISVD algorithm with missing data introduced in [balzano2013grouse, brand2002incremental]. The main modification is that we add a QR factorization step which makes the method stable even for extremely ill-conditioned matrices. We also find that the resulting MPO representation is equivalent to that in [StoudenmireWhite2017]. For clarity, we first introduce how to apply the original ISVD with missing data on the present matrix completion problem in this section and Section 3.2. In Section 3.3 we introduce the modified algorithm.

The algorithm proceeds in a row-by-row fashion. Define the set of column indices corresponding to the sparsity pattern in the -th row, and correspondingly . Suppose we have already completed the first rows, and the completed matrix is . This matrix is rank- and has an SVD . Now we want to complete the first rows denoted by the matrix . Note that the first rows of are just , and on the -th row we have . Our goal is to choose the unknown entries , so that can also be approximated by a rank- matrix. Now assume , where . If , then is already a rank- matrix. Otherwise

(10)

Note that on the third line of Eq. (10), both the first and the third matrices are orthogonal matrices. Hence the smallest singular value of is upper bounded by . In order to approximate by a rank- matrix, we want to be as small as possible. This corresponds to the following optimization problem:

It is easy to see that the solution is

where denotes the Moore–Penrose inverse of . Then we obtain the truncated SVD

(11)

where is an matrix, are column orthogonal matrices of size . Thus we may approximate by

Thus letting

we can proceed to the next row. The , and we get in the end gives the SVD of the completed matrix (noting that ).

3.2 Constructing the MPO

We now apply the above procedure to the coefficient matrix . Let and , and use (9), we get an MPO that represents the interaction . However, this procedure is not numerically stable. Let us consider the Coulomb interaction approximated with exponential fitting method. Then the ratio grows exponentially with respect to the system size, and so is .

To solve this problem, we will make use of the interior part of the MPO tensor. Write

then we have

Thus

where . Therefore we have

Now define

(12)

we then have

(13)

Therefore we can construct an MPO whose tensors are of the form

(14)

to represent . This is analogous to (6) in the exponential fitting. The algorithm is summarized in Algorithm 1. Due to the fact that , as a practical algorithm we do not need to keep track of or , but only .

0:  
1:  
2:  
3:  
4:  
5:  for  do
6:     
7:     
8:     if  then
9:         {keep singular values}
10:        
11:     else
12:         {keep singular values}
13:        
14:     end if
15:     
16:     if  then
17:        ,
18:     end if
19:  end for
19:  , , for , , , and MPO according to (14).
Algorithm 1 Constructing the MPO representation via matrix completion

Note that Eq. (12) is not the only possible choice for . If the matrix is exactly of rank , then

This relation becomes approximately correct for the case when is numerically low-rank. Therefore we may set

(15)

We argue that Eq. (15) is in fact a better choice. Just like we can obtain without computing that grows exponentially with the system size, we want to do the same with so that it does not rely on computing either. And this is achieved by our new choice for the value of . Also, the new can be expected to give a more accurate approximation than the original one, because for an element of on the -th column, the error of the approximation using this only comes from the first SVDs, and does not depend on any future steps.

Furthermore, using the fact that , we do not need to explicitly fill out the missing data either, thereby avoiding computing and storing the lower triangular part of the matrix . We can avoid computing the missing part for each row as well. The pseudocode for the resulting algorithm is presented in Algorithm 1. Note that there is some abuse of notation involved in the algorithm, as we use to denote the last rows of the matrix we described previously.

3.3 A modified practical algorithm

0:  
1:  
2:  
3:  
4:  
5:  for  do
6:      QR factorization of
7:     
8:     
9:     if  then
10:         {keep singular values}
11:        
12:     else
13:         {keep singular values}
14:        
15:     end if
16:     
17:     if  then
18:        ,
19:     end if
20:  end for
20:  , , for , , , and MPO according to (14).
Algorithm 2 A modified ISVD method for robust construction of the MPO representation

Despite that Algorithm 1 computes directly, in practice it can still become unstable when becomes large. This is because the singular values of grows exponentially with respect to . Then in Algorithm 1, the SVD step (11) also increasingly ill-conditioned as increases.

Figure 4: The purple and red rectangles are matrix blocks with and 4, for which we are going to compute approximate truncated SVDs in Algorithm 2. Diagonal entries are marked with blue dots.

We now further modify the ISVD algorithm as follows. Instead of trying to approximate the SVD of for each as is done in the previous sections, we compute the approximate SVD for each off-diagonal block , which avoids the exponential growth of singular values. This can be implemented by introducing only an extra QR factorization step.

Denoting the approximate SVD obtained for as (note this is different from the one defined previously), we perform a QR factorization on . Then we have

Then we perform SVD on the matrix in the middle of the right-hand side, and use this to obtain , , . The pseudocode for the algorithm is described in Algorithm 2. The extra QR factorization step is in Line 6 in Alg. 2 We remark that the MPO representation obtained from Algorithm 2 is equivalent to that obtained in [StoudenmireWhite2017].

Below we demonstrate that one can reconstruct the UTLR low-rank factors from the , , obtained from Algorithm 2. First, note that not all ’s are square matrices, though all ’s are full-rank matrices. Each for is an matrix, and each for is an matrix. Each ’s in the middle is an matrix. Therefore the first ’s have right inverses, the last ’s have left inverses, and every in the middle has an inverse. We denote these left or right inverses as

Now we define

and it can be checked that for any . Therefore we have a upper-triangular low-rank approximation for the matrix .

4 Representing long-range interaction via the hierarchical low-rank format

In this section we present an alternative method to construct tensor network operators with long-range pairwise interactions using the hierarchical low-rank format. The advantage of this method is that it can be naturally generalized to higher dimensional tensors represented by PEPOs. Compared to the CF-PEPO approach [o2018efficient] which relies on a fitting procedure and therefore is difficult to establish the a priori error bound, we demonstrate that our algorithm yields a sum of PEPOs whose bond dimension is bounded by a constant for the Coulomb interaction.

We study the case when can be represented as a hierarchical off-diagonal low-rank (HODLR) matrix and a -matrix respectively. Furthermore, the hierarchical low-rank representation is not restricted to translation-invariant operators.

4.1 Rank-one MPOs and PEPOs

We first introduce some terminologies with respect to the MPOs and PEPOs we are going to use in this section. There are mainly two types of operators we are going to use extensively, which can be efficiently represented by MPOs and PEPOs. Consider a rank-one operator in 1D

where and are local operators defined on sites and , and as a result they commute with each other. This operator can be represented by an MPO with bond dimension 3, since we can treat it as a special case of the interaction in (2.1) in which .

The notion of rank-one operators can be easily generalized to a 2D system, for operators in the form

where is some order assigned to the system. This can be represented by a snake-shaped MPO [o2018efficient] with bond dimension 3. The coefficient matrix is a UTLR matrix of rank 1. We add some bonds with bond dimension 1 to make it a PEPO. The bond dimension of the PEPO is still 3. A graphical illustration of the PEPO is provided in Fig. 5.

Figure 5: A PEPO constructed from the snake-shaped MPO by adding bonds with bond dimension 1 (blue dotted lines).

Sometimes the operator acts trivially everywhere except for a given region (described by an index set ) in the system, we may order the sites in such a way that all sites in are labeled before the rest of the sites. Then the operator is