TensorNetwork on TensorFlow: A Spin Chain Application Using Tree Tensor Networks

05/03/2019 ∙ by Ashley Milsted, et al.

TensorNetwork is an open source library for implementing tensor network algorithms in TensorFlow. We describe a tree tensor network (TTN) algorithm for approximating the ground state of either a periodic quantum spin chain (1D) or a lattice model on a thin torus (2D), and implement the algorithm using TensorNetwork. We use a standard energy minimization procedure over a TTN ansatz with bond dimension χ, with a computational cost that scales as O(χ^4). Using bond dimension χ∈ [32,256] we compare the use of CPUs with GPUs and observe significant computational speed-ups, up to a factor of 100, using a GPU and the TensorNetwork library.







I Introduction

Tensor networks are sparse data structures originally developed to efficiently simulate complex quantum systems in condensed matter Fannes ; White ; Vidal ; Perez-Garcia ; MERA ; MERA2 ; MERAalgorithms ; Shi ; Tagliacozzo ; Murg ; PEPS1 ; PEPS2 ; PEPS3 ; rev1 ; rev2 ; rev3 ; rev4 ; rev5 . In recent years, highly successful tensor networks such as the matrix product state (MPS) Fannes ; White ; Vidal ; Perez-Garcia and the multi-scale entanglement renormalization ansatz (MERA) MERA ; MERA2 ; MERAalgorithms (see Figs. 1(a)-(b)) have found a much wider range of applications, including quantum chemistry QC1 ; QC2 ; QC3 ; QC4 , statistical mechanics CTMRG ; TRG ; TEFRG ; TNR , machine learning ML1 ; ML2 ; ML3 ; ML4 ; ML5 , quantum fields cMPS ; cMERA , and even quantum gravity and cosmology Swingle ; dS1 ; dS2 ; dS3 ; MERAgeometry .

TensorFlow TensorFlow is a free, open source software library for dataflow and differentiable programming, developed by the Google Brain team, that can be used for a range of tasks including machine learning applications such as neural networks. Recently, the open source library TensorNetwork download has been released to allow running tensor network algorithms on TensorFlow.

This paper is one of a series of papers that aim to illustrate, with examples of tensor network algorithms, the use of TensorNetwork in actual computations. Specifically, here we describe an algorithm for approximating the ground state of a periodic quantum spin chain or thin torus with a tree tensor network (TTN) Shi ; Tagliacozzo ; Murg , which is a tensor network where the tensors are connected according to a tree structure. We use a standard energy minimization algorithm, whose code can be downloaded here download . Companion papers will present other algorithms, including MPS and MERA algorithms.

Figure 1: (a) Matrix product state (MPS) for a many-body wave-function on sites. (b) Multi-scale entanglement renormalization ansatz (MERA) also for the state of a lattice system made of sites. (c) Example of a tree tensor network (TTN), where the network of tensors is organized according to a tree structure. Notice the absence of closed loops, as in the MPS and in contrast to the MERA. (d) The specific TTN considered in this paper: a regular binary tree. Like the MPS, it is loop-free. Like the MERA, it is organized in layers of tensors corresponding to different length scales.

The specific TTN for 1D quantum systems that we consider here, represented in Fig. 1(d), lies in some sense between an MPS and the MERA, depicted in Figs. 1(a) and 1(b), respectively Comparison . Like the MPS, a TTN has no closed loops, and this allows for an optimal compression of each bond index of the tensor network (and thus also of each tensor) using the Schmidt decomposition. Like the MERA, however, the TTN in Fig. 1(d) organizes the tensors in an additional (vertical) dimension corresponding to scale. One can think of this TTN as a simplified version of the MERA in which a subset of tensors, called disentanglers, have been removed. The advantage of a TTN over MERA is that the absence of disentanglers makes it conceptually simpler. TTN algorithms are also more easily generalized from 1D to 2D systems than MPS or MERA algorithms. These properties make the TTN a good starting point to demonstrate TensorNetwork.

II Tree Tensor Network for ground states of lattice models

Figure 2: (a) TTN variational ansatz for the ground state of a square lattice of quantum spins with toric boundary conditions. Notice that a single dangling leg of the TTN is an index of dimension that labels an orthonormal basis in the -dimensional Hilbert space of quantum spins. (b) Example of diagram needed in order to compute the so-called environment for an isometry , which would be placed in the empty location indicated by a grey shadow. This example corresponds to a TTN for and for a Hamiltonian term (in green) connecting effective sites and . The bond index connecting the two green circles has dimension (for the Ising Hamiltonian) instead of .

II.1 Thin torus

We use the TTN as an approximation or variational ansatz for the ground state of a periodic square lattice made of quantum spins. Here and denote the length (in units of the lattice spacing) of the lattice in the and directions. We consider lattices corresponding to a thin torus, with , which for turns into a periodic quantum spin chain. We label lattice sites with a pair of integers , with and , and assign a complex vector space of dimension , representing a spin degree of freedom, to each lattice site.

As a concrete example, we consider the Ising model with transverse magnetic field, with Hamiltonian


where , are Pauli matrices and denotes the strength of a transverse magnetic field. For concreteness, we choose , for which we find a ground state that is entangled over many length scales, see Fig. 8.

II.2 Effective quantum spin chain

As Fig. 2(a) shows for and , each open index at the bottom of the TTN is assigned a Hilbert space of dimension corresponding to the sites , , , at fixed value of the direction. Therefore from the perspective of the TTN, the lattice model is effectively a quantum spin chain with sites , with each effective site corresponding to a complex vector space of dimension and with Hamiltonian


where collects all the Hamiltonian contributions connecting effective sites and . For instance, for and we have


where (4) collects couplings between pairs of spins, with one spin in column and the other spin in column ; (5) and (6) correspond to interactions and magnetic fields of spins within column ; finally (7) and (8) correspond to interactions and magnetic fields within column . The factor is included to avoid double counting in Eq. (3).
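The decomposition into two-column terms can be illustrated with a minimal NumPy sketch. Everything specific in it is an assumption rather than the paper's definition: the names `two_column_term`, `Ly` and `h`, the convention of an X-X coupling with a Z field and overall minus signs, the prefactor 1/2 on the intra-column terms, and the restriction to at least three spins per column.

```python
from functools import reduce
import numpy as np

X = np.array([[0.0, 1.0], [1.0, 0.0]])   # Pauli X
Z = np.array([[1.0, 0.0], [0.0, -1.0]])  # Pauli Z
I2 = np.eye(2)

def op_on(n, ops):
    """Tensor product over n spins; `ops` maps a spin index to a 2x2 matrix."""
    return reduce(np.kron, [ops.get(i, I2) for i in range(n)])

def two_column_term(Ly, h):
    """Hypothetical two-column term on neighbouring columns of Ly spins each (Ly >= 3)."""
    n = 2 * Ly
    H = np.zeros((2**n, 2**n))
    for y in range(Ly):
        # Coupling between spin y of the left column and spin y of the right column.
        H -= op_on(n, {y: X, Ly + y: X})
        for col in (0, Ly):  # index offset of the left / right column
            s, t = col + y, col + (y + 1) % Ly
            # Intra-column terms carry 1/2 because every column belongs to
            # two neighbouring two-column terms in the sum over columns.
            H -= 0.5 * op_on(n, {s: X, t: X})
            H -= 0.5 * h * op_on(n, {s: Z})
    return H

h_pair = two_column_term(Ly=3, h=3.0)  # h=3.0 is an arbitrary illustrative value
print(h_pair.shape)  # (64, 64)
```

With this splitting, summing the two-column terms over all neighbouring column pairs counts each intra-column coupling and each field term exactly once.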

II.3 The tensor network

The TTN represents a pure state and is made of isometric tensors , or isometries, which are rank-3 tensors of size χ × χ × χ (here we assume, for simplicity in the explanation, that all the bond dimensions in the TTN are the same and given by χ) and components that fulfil


There is also a rank-2 tensor at the top of the TTN, which is normalized to 1,


and fixes to 1 the normalization of the wavefunction . The isometric constraint (9) and the normalization (10) are represented diagrammatically in Fig. 4(a).

We label the isometries in the TTN as where labels the scale direction, with at the bottom of the TTN and at the top, whereas labels the position within layer . There are isometries , , at the lowest layer of the TTN, isometries , , at the second lowest layer of the TTN, etc. The total number of isometries is thus . For instance, in the example of Fig. 2(a), there are isometries , , , and in the lowest layer of the TTN, and isometries and in the second lowest layer. Finally, there is also the rank-2 tensor at the top of the TTN.
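The layer structure can be made concrete with a short counting sketch. The function name `isometries_per_layer` is hypothetical, and the sketch assumes a number of effective sites that is a power of two, with a rank-2 top tensor joining the last two branches as in the example of Fig. 2(a).

```python
def isometries_per_layer(num_sites):
    """Number of isometries in each layer of a binary TTN, bottom layer first.

    Assumes num_sites is a power of two and that the tree is capped by a
    rank-2 top tensor, which is not counted as an isometry.
    """
    if num_sites < 4 or num_sites & (num_sites - 1):
        raise ValueError("num_sites must be a power of two, at least 4")
    counts = []
    n = num_sites // 2
    while n >= 2:
        counts.append(n)
        n //= 2
    return counts

layers = isometries_per_layer(8)
print(layers, sum(layers))  # [4, 2] 6
```

For eight effective sites this reproduces the four plus two isometries of the example, with the top tensor counted separately.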

Figure 3: Contraction of the tensor network in Fig. 2(b), corresponding to one contribution to the environment for a given isometry . In a first step, we use the isometric constraints of isometries in Fig. 4(a) to eliminate pairs and thus simplify the network (no actual tensor-tensor contractions need to be computed). The remaining steps can ultimately be decomposed into tensor-tensor contractions at cost O(χ^4), see Fig. 4(b).
Figure 4: (a) Isometric constraint of an isometry and normalization of the top tensor , see Eqs. (9) and (10). (b) Example of tensor-tensor contractions needed in Fig. 3, at computational cost O(χ^4). (c) Singular value decomposition (SVD) of the environment of an isometry . (d) Updated isometry in terms of the tensors and that appear in the SVD of the environment .

III Algorithm

The TTN is optimized using a standard energy minimization algorithm, as described e.g. in section IV of Ref. Tagliacozzo . The energy minimization algorithm proceeds by iteratively updating each isometry in the TTN, as outlined below. We exploit translation invariance of to set all the isometries in a given layer of the TTN to be the same, that is , and therefore the iterative update only progresses through scale, as parametrized by the integer (and not through space, parametrized by the integer ).

III.1 Environment of an isometry

In order to update an isometry of the TTN, we first need to compute its environment . Like the isometry , the environment is a rank-3 tensor of dimensions χ × χ × χ. It is defined as a sum of a number of contributions, coming from different Hamiltonian terms . An example of such a contribution is represented in Fig. 2(b).

The computation of the environment is achieved by contracting the tensor networks for all relevant contributions. Fig. 3 shows a sequence of diagrams corresponding to the contraction of the tensor network in Fig. 2(b). Such a tensor network can be contracted using the ncon function in TensorNetwork. In practice, contracting the whole network is reduced to a sequence of tensor-tensor contractions. Some of these contractions are trivial due to the isometric constraint and do not need to be implemented, whereas some contractions must be explicitly performed, see Fig. 4(a)-(b). The latter correspond, possibly after flattening the indices of the tensors, to matrix-matrix multiplications, with computational cost of at most O(χ^4) per multiplication.
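To illustrate how a tensor-tensor contraction reduces to a matrix-matrix multiplication, the following sketch uses plain NumPy rather than the ncon function, so that it stays self-contained. The tensor names, the index ordering and the small bond dimension are assumptions chosen for illustration only.

```python
import numpy as np

chi = 8  # small bond dimension, for illustration only
rng = np.random.default_rng(0)
w = rng.standard_normal((chi, chi, chi))  # rank-3 tensor, indices (a, b, c)
m = rng.standard_normal((chi, chi))       # rank-2 tensor, indices (c, d)

# Direct contraction over the shared index c.
direct = np.einsum("abc,cd->abd", w, m)

# Same contraction as a matrix product: fuse (a, b) into one index of size chi**2.
as_matmul = (w.reshape(chi * chi, chi) @ m).reshape(chi, chi, chi)

def matmul_flops(rows, inner, cols):
    """Multiply-add count of a (rows x inner) by (inner x cols) matrix product."""
    return rows * inner * cols

print(np.allclose(direct, as_matmul))     # True
print(matmul_flops(chi * chi, chi, chi))  # 4096, i.e. chi**4
```

Fusing two indices of size χ into one of size χ^2 is what produces the quartic cost of a single multiplication.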

III.2 Updated isometry

Once the environment for an isometry has been computed, we flatten the rank-3 tensor into a matrix (which we also refer to as ) and apply a singular value decomposition to it, . Then we build the matrix , which we turn into the updated rank-3 isometry by splitting its second index into two, see Fig. 4(c)-(d). The top tensor is updated similarly.
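A minimal sketch of an SVD-based isometry update is given below, for real tensors. It follows the standard prescription of replacing the singular values by minus one; the function name `update_isometry`, the choice of index 0 as the upper bond, and the overall sign convention are assumptions of this sketch and may differ from the accompanying code.

```python
import numpy as np

def update_isometry(env):
    """Return the isometry minimizing sum(env * w) for a real rank-3 environment.

    Index 0 of `env` is taken to be the upper bond; this ordering and the
    overall minus sign are assumptions of this sketch.
    """
    chi_up, chi_l, chi_r = env.shape
    env_mat = env.reshape(chi_up, chi_l * chi_r)            # flatten to a matrix
    u, s, vh = np.linalg.svd(env_mat, full_matrices=False)  # env_mat = u @ diag(s) @ vh
    w_mat = -u @ vh                                         # singular values replaced by -1
    return w_mat.reshape(chi_up, chi_l, chi_r), s           # split second index into two

rng = np.random.default_rng(1)
env = rng.standard_normal((4, 4, 4))
w, s = update_isometry(env)
w_mat = w.reshape(4, 16)
print(np.allclose(w_mat @ w_mat.T, np.eye(4)))  # True: isometric constraint holds
print(np.isclose(np.sum(env * w), -np.sum(s)))  # True: linearized energy is -sum(s)
```

The second check shows why this choice is used: among all isometries, it gives the lowest value of the energy when the environment is held fixed.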

IV Benchmark results

We consider a 2D lattice made of quantum spins or, equivalently, a 1D lattice made of effective spins, each of dimension . We choose the value , which is seen to lead to a scaling of ground state entanglement entropy compatible with being near a quantum critical point. This value is slightly below , which corresponds to the critical point in a fully 2D lattice, that is for . We approximate the ground state using a TTN for increasing values of χ in the range χ ∈ [32, 256]. For each value of χ we minimize the expectation value of the energy per site by iterating the isometry update scheme outlined above, until the energy per site changes by less than after a whole sweep of updates.

IV.1 Ground state energy

Fig. 5 shows the converged value of the ground state energy per site as a function of the bond dimension χ. The energy per site only changes by about as we increase the bond dimension from to , suggesting that the error in the energy due to using a finite value of χ might be on that order of magnitude. In Fig. 6 we then see that the energy converges to its extrapolated value roughly as .

Figure 5: Variational ground state energy per site as a function of the bond dimension χ.
Figure 6: Variational ground state energy per site as a function of . At large bond dimension χ, the energy seems to be scaling to some limit value as .

IV.2 Ground state entanglement

From the TTN it is particularly simple to extract the spectrum of eigenvalues of reduced density matrices for particular blocks of spins, and thus compute the corresponding entanglement spectrum and entanglement entropy.

Specifically, the upper bond index of the isometry corresponds to a block of sites of the 1D effective spin chain (or a rectangular block of quantum spins of the initial 2D lattice model). The spectrum of eigenvalues of the reduced density matrix on that index can then be converted into the entanglement spectrum


and the entanglement entropy


for that block of spins.
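As a hedged illustration of these two quantities, the sketch below computes them from the eigenvalues of a reduced density matrix using the common definitions, namely minus the logarithm of each eigenvalue for the entanglement spectrum and the von Neumann entropy for the entanglement entropy. The function name, the use of natural logarithms and the eigenvalue cutoff are assumptions of this sketch.

```python
import numpy as np

def entanglement_from_rho(rho, cutoff=1e-14):
    """Entanglement spectrum and entropy from a reduced density matrix.

    Uses natural logarithms and drops eigenvalues below `cutoff`; both choices
    are assumptions of this sketch.
    """
    p = np.linalg.eigvalsh(rho)
    p = p[p > cutoff]
    p = p / p.sum()                 # guard against small normalization drift
    spectrum = np.sort(-np.log(p))  # smallest value <-> largest eigenvalue
    entropy = float(-np.sum(p * np.log(p)))
    return spectrum, entropy

rho = np.diag([0.5, 0.5])  # maximally mixed state of a single two-level system
spectrum, entropy = entanglement_from_rho(rho)
print(np.isclose(entropy, np.log(2)))  # True
```

In a TTN the required density matrix lives on a single bond index, so its dimension is at most the bond dimension and the diagonalization is inexpensive.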

Fig. 7 shows the entanglement spectrum for , that is for a block of sites of the effective 1D quantum spin chain or sites of the 2D quantum Ising model on the thin torus. One can see that, as a function of the bond dimension χ, the lower part of the spectrum converges faster than the upper part. Fig. 8 then shows the scaling of entanglement entropy as a function of , for different values of χ.

Figure 7: First 30 values of the entanglement spectrum of the reduced density matrix assigned to an upper bond index of the 6th row (i.e. ) of isometries of the TTN, corresponding to a block of sites of the effective 1D spin chain. As expected, the lowest values converge faster than the larger ones with growing bond dimension χ.
Figure 8: Entanglement entropy for a block of sites of the effective 1D quantum spin chain. As we increase the bond dimension χ, the TTN variational ansatz is capable of better reproducing the entanglement structure of the ground state. We see that for the largest bond dimensions in the range χ ∈ [32, 256] the profile of entanglement entropies is already very stable.
Figure 9: Computational time as a function of the bond dimension χ. For large χ, the computational time on a single CPU (both numpy and TensorNetwork) scales as O(χ^4), as anticipated, with TensorNetwork being twice as fast as numpy for large χ. On the GPU, the computational time does not yet reach the large-χ scaling but scales instead roughly as for the largest bond dimensions we tested. Using clusters of CPUs reduces the gap with the GPU, although a GPU is still a factor faster than CPUs.

V Computational time

A highlight of TensorNetwork is that, thanks to running on top of TensorFlow, the same tensor network code download can be used on different computational resources. We used TensorFlow v1.13.1 built with the Intel math kernel library (MKL). The computations described above were carried out using real numbers at 64 bit floating-point precision. We employed Google’s cloud compute engine. For CPU computations we used Xeon Skylake with 1, 8, 16, and 32 cores. For GPU computations we used NVIDIA Tesla V100. For further reference, we also ran equivalent numpy code using a single CPU.

The computational cost of the TTN algorithm scales as O(χ^4) for sufficiently large χ. This is indeed the scaling of both tensor-tensor contractions and of the SVD of the environment required to update an isometry . There are also other steps, including permuting indices of rank-3 tensors, that scale as O(χ^3).

Fig. 9 shows the computational time required in order to update all the isometries in the TTN once (wall time per sweep). We see that for large bond dimension χ, single CPU computations with code using either the numpy library or TensorNetwork both scale as O(χ^4), as expected. However, using TensorNetwork is twice as fast. We also observe that for large bond dimension, using TensorNetwork with a GPU is about 100 times faster than with a CPU. Moreover, within the range of tested bond dimensions χ ∈ [32, 256], the cost still scales roughly as on the GPU (larger values of χ will be tested in the near future). Finally, further optimizations are still required to fully take advantage of TPU architecture (work in progress), but early experiments suggest that the performance will likely exceed that of the GPU when those optimizations are completed.
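One simple way to read a scaling exponent off timing data of this kind is a linear fit in log-log space. The sketch below does this on synthetic numbers that follow a quartic law exactly; these are not the measured wall times, and the function name `scaling_exponent` is hypothetical.

```python
import numpy as np

def scaling_exponent(chis, times):
    """Least-squares slope of log(time) versus log(chi)."""
    slope, _ = np.polyfit(np.log(chis), np.log(times), 1)
    return float(slope)

chis = np.array([32, 64, 128, 256])
synthetic_times = 1e-9 * chis.astype(float) ** 4  # made-up data, exactly chi**4
print(round(scaling_exponent(chis, synthetic_times), 6))  # 4.0
```

Applied to real timings, a fitted slope below 4 over the tested range would indicate that the asymptotic regime has not yet been reached.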

VI Conclusions

This paper described a TTN algorithm for approximating the ground state of a quantum spin lattice model on a thin torus, implemented using TensorNetwork download , an open source library that works on TensorFlow TensorFlow . The code can be found here download . We have used this sample code to find increasingly refined TTN approximations to the ground state of the transverse field Ising Hamiltonian on a periodic 2D lattice made of quantum spins, with bond dimension χ ∈ [32, 256]. Using TensorNetwork, we have seen that when running the code on a GPU, the computational time was reduced by a factor of up to 100 compared to a single CPU.

Code for other simulation algorithms for quantum systems based on tensor networks, such as MPS and MERA algorithms, will be similarly provided and discussed in subsequent papers.

Acknowledgements.— A. Milsted, M. Ganahl, and G. Vidal thank X for their hospitality. X is formerly known as Google[x] and is part of the Alphabet family of companies, which includes Google, Verily, Waymo, and others (www.x.company). Research at Perimeter Institute is supported by the Government of Canada through the Department of Innovation, Science and Economic Development Canada and by the Province of Ontario through the Ministry of Research, Innovation and Science.


Figure 10: (a) Computation of an isometry from the corresponding environment tensor , which can be regarded as a matrix. First the environment is decomposed into its singular value decomposition , at cost O(χ^4). Then is built as . (b) Alternative computation of an isometry from the corresponding environment tensor . This time the squared environment , which is a matrix, is first decomposed into its eigenvalue decomposition . Then is built as .

Compared to a CPU, both GPU and TPU appear to provide very significant computational speed-ups on the order of - for tensor-tensor contractions involving large tensors, but more modest speed-ups for matrix factorizations such as a singular value decomposition (SVD) or eigenvalue decomposition (EVD). In those tensor network algorithms, such as MERA algorithms, where the cost of the required tensor-tensor multiplications and SVD scale e.g. as and respectively, the use of GPUs and TPUs is expected to lead to massive savings in computational time. However, in a TTN, where tensor-tensor multiplications and SVD both scale as O(χ^4), GPUs and TPUs will lead to less spectacular gains.

In our current TTN algorithm, it is possible to replace the SVDs with tensor-tensor multiplications and EVDs, see Fig. 10. In this way, larger speed-ups than the ones reported in the main text are expected. However, the squaring of the environment in Fig. 10(b) leads to a loss of half of the numerical precision. In those simulations where this is not a problem (e.g. because the error due to a finite bond dimension is more important than the loss of numerical precision due to squaring the environment), it might then be convenient to use an EVD instead of an SVD.
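The equivalence of the two routes can be checked numerically with a short sketch for real matrices. The function names, the overall minus sign and the assumption that the environment matrix has full row rank are choices made here for illustration, not details taken from the accompanying code.

```python
import numpy as np

def isometry_via_svd(env_mat):
    """Isometry built from the SVD of the environment matrix."""
    u, _, vh = np.linalg.svd(env_mat, full_matrices=False)
    return -u @ vh

def isometry_via_evd(env_mat):
    """Same isometry from the eigendecomposition of env_mat @ env_mat.T.

    With env_mat = U S V^T we have env_mat @ env_mat.T = U S^2 U^T, so
    U V^T = U S^{-1} U^T env_mat. Squaring S is where precision is lost.
    """
    evals, u = np.linalg.eigh(env_mat @ env_mat.T)
    inv_s = 1.0 / np.sqrt(evals)  # assumes all eigenvalues are well above zero
    return -(u * inv_s) @ u.T @ env_mat

rng = np.random.default_rng(2)
env_mat = rng.standard_normal((4, 16))
print(np.allclose(isometry_via_svd(env_mat), isometry_via_evd(env_mat)))  # True
```

The EVD route only factorizes a small square matrix and moves the remaining work into matrix products, which is the part that benefits most from a GPU or TPU.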