Uniform tree tensor network using TensorNetwork
TensorNetwork is an open source library for implementing tensor network algorithms in TensorFlow. We describe a tree tensor network (TTN) algorithm for approximating the ground state of either a periodic quantum spin chain (1D) or a lattice model on a thin torus (2D), and implement the algorithm using TensorNetwork. We use a standard energy minimization procedure over a TTN ansatz with bond dimension χ, with a computational cost that scales as O(χ^4). Using bond dimensions χ ∈ [32, 256], we compare the use of CPUs with GPUs and observe significant computational speed-ups, up to a factor of 100, using a GPU and the TensorNetwork library.
Tensor networks are sparse data structures originally developed to efficiently simulate complex quantum systems in condensed matter [Fannes, White, Vidal, Perez-Garcia, MERA, MERA2, MERAalgorithms, Shi, Tagliacozzo, Murg, PEPS1, PEPS2, PEPS3, rev1, rev2, rev3, rev4, rev5]. In recent years, highly successful tensor networks such as the matrix product state (MPS) [Fannes, White, Vidal, Perez-Garcia] and the multi-scale entanglement renormalization ansatz (MERA) [MERA, MERA2, MERAalgorithms] (see Figs. 1(a)-(b)) have found a much wider range of applications, including quantum chemistry [QC1, QC2, QC3, QC4], statistical mechanics [CTMRG, TRG, TEFRG, TNR, ML1, ML2, ML3, ML4, ML5], quantum fields [cMPS, cMERA], and even quantum gravity and cosmology [Swingle, dS1, dS2, dS3, MERAgeometry].
TensorFlow [TensorFlow] is a free, open source software library for dataflow and differentiable programming, developed by the Google Brain team, that can be used for a range of tasks including machine learning applications such as neural networks. Recently, the open source library TensorNetwork [download] has been released to allow running tensor network algorithms on TensorFlow.
This paper is one of a series of papers that aim to illustrate, with examples of tensor network algorithms, the use of TensorNetwork in actual computations. Specifically, here we describe an algorithm for approximating the ground state of a periodic quantum spin chain or thin torus with a tree tensor network (TTN) [Shi, Tagliacozzo, Murg], which is a tensor network where the tensors are connected according to a tree structure. We use a standard energy minimization algorithm, whose code can be downloaded here [download]. Companion papers will present other algorithms, including MPS and MERA algorithms.
The specific TTN for 1D quantum systems that we consider here, represented in Fig. 1(d), lies in some sense between an MPS and the MERA, depicted in Figs. 1(a) and 1(b), respectively [Comparison]. Like the MPS, a TTN has no closed loops, and this allows for an optimal compression of each bond index of the tensor network (and thus also of each tensor) using the Schmidt decomposition. Like the MERA, however, the TTN in Fig. 1(d) organizes the tensors in an additional (vertical) dimension corresponding to scale. One can think of this TTN as a simplified version of the MERA in which a subset of tensors, called disentanglers, have been removed. The advantage of a TTN over the MERA is that the absence of disentanglers makes it conceptually simpler. TTN algorithms are also more easily generalized from 1D to 2D systems than MPS or MERA algorithms. These properties make the TTN a good starting point to demonstrate TensorNetwork.
We use the TTN as an approximation or variational ansatz for the ground state of a periodic square lattice made of N = Nx × Ny quantum spins. Here Nx and Ny denote the length (in units of the lattice spacing) of the lattice in the x and y directions. We consider lattices corresponding to a thin torus, with Nx ≫ Ny, which for Ny = 1 turns into a periodic quantum spin chain. We label lattice sites with a pair of integers (x, y), with x ∈ {1, …, Nx} and y ∈ {1, …, Ny}, and assign a complex vector space of dimension d, representing a spin degree of freedom, to each lattice site.
As a concrete example, we consider the Ising model with transverse magnetic field, with Hamiltonian

H = −∑_{⟨r, r′⟩} X_r X_{r′} − λ ∑_r Z_r,

where X_r, Z_r are Pauli matrices acting on site r, the first sum runs over pairs of nearest-neighbor sites, and λ denotes the strength of a transverse magnetic field. For concreteness, we choose a value of λ near the critical point, for which we find a ground state that is entangled over many length scales, see Fig. 8.
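As a quick illustration, the transverse-field Ising Hamiltonian above can be built as a dense matrix for a very small periodic chain. This is a minimal numpy sketch, not the paper's code; the sign and operator conventions (XX coupling, Z field) are an assumption consistent with the form given above.

```python
import numpy as np

# Pauli matrices and the 2x2 identity.
X = np.array([[0., 1.], [1., 0.]])
Z = np.array([[1., 0.], [0., -1.]])
I2 = np.eye(2)

def op_at(op, site, L):
    """Embed a single-site operator at position `site` in an L-spin chain."""
    mats = [I2] * L
    mats[site] = op
    out = mats[0]
    for m in mats[1:]:
        out = np.kron(out, m)
    return out

def tfi_hamiltonian(L, lam):
    """Dense H = -sum_r X_r X_{r+1} - lam * sum_r Z_r on a periodic chain."""
    H = np.zeros((2**L, 2**L))
    for r in range(L):
        H -= op_at(X, r, L) @ op_at(X, (r + 1) % L, L)  # periodic XX bond
        H -= lam * op_at(Z, r, L)                        # transverse field
    return H

H = tfi_hamiltonian(4, 1.0)
e0 = np.linalg.eigvalsh(H)[0]  # exact ground energy of the small chain
```

For such small chains the exact ground state is accessible by dense diagonalization, which is useful for sanity-checking a TTN implementation.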
As Fig. 2(a) shows for a small example with Ny = 2, each open index at the bottom of the TTN is assigned a Hilbert space of dimension d^Ny corresponding to the Ny sites (x, 1), …, (x, Ny) at fixed value of x. Therefore, from the perspective of the TTN, the lattice model is effectively a quantum spin chain with Nx sites, with each effective site corresponding to a complex vector space of dimension d^Ny and with Hamiltonian

H = ∑_{x=1}^{Nx} h(x, x+1),    (3)

where x + 1 is taken modulo Nx, since the chain is periodic.
Here h(x, x+1) collects all the Hamiltonian contributions connecting effective sites x and x+1. For instance, for Ny = 2 we have

h(x, x+1) = h_inter(x, x+1) + ½ h_intra(x) + ½ h_intra(x+1),

where h_inter(x, x+1) collects couplings between pairs of spins, with one spin in column x and the other spin in column x+1; h_intra(x) corresponds to interactions and magnetic fields of spins within column x; and h_intra(x+1) corresponds to interactions and magnetic fields within column x+1. The factor ½ is included to avoid double counting in Eq. (3).
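The grouping of terms into an effective two-site Hamiltonian can be sketched concretely for Ny = 2, where each effective site carries d^Ny = 4 dimensions. The code below is an illustrative assumption of how the terms might be assembled (the field strength `lam` and the exact bond content are placeholders, not the paper's values):

```python
import numpy as np

X = np.array([[0., 1.], [1., 0.]])
Z = np.array([[1., 0.], [0., -1.]])
I2 = np.eye(2)

def kron_all(*ops):
    """Kronecker product of a sequence of operators."""
    out = ops[0]
    for o in ops[1:]:
        out = np.kron(out, o)
    return out

lam = 1.0  # illustrative transverse-field strength (assumption)

# Spins ordered (x,1), (x,2), (x+1,1), (x+1,2): h acts on dim 4*4 = 16.
h = np.zeros((16, 16))

# h_inter: couplings between columns x and x+1, one horizontal bond per row y.
h -= kron_all(X, I2, X, I2) + kron_all(I2, X, I2, X)

# h_intra for each column, halved to avoid double counting in the sum over x.
for col in [(0, 1), (2, 3)]:
    ops = [I2] * 4
    ops[col[0]] = X
    ops[col[1]] = X
    h -= 0.5 * kron_all(*ops)            # vertical XX bond within the column
    for c in col:
        ops = [I2] * 4
        ops[c] = Z
        h -= 0.5 * lam * kron_all(*ops)  # transverse field, halved
```

Summing this h(x, x+1) over all x then reproduces each intra-column term exactly once, which is the point of the ½ factors.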
The TTN represents a pure state |Ψ⟩ and is made of isometric tensors w, or isometries, which are rank-3 tensors of size χ × χ × χ (here we assume, for simplicity in the explanation, that all the bond dimensions in the TTN are the same and given by χ) with components w_{abc} that fulfil the isometric constraint

∑_{ab} (w_{abc})* w_{abc′} = δ_{cc′},   i.e.   w†w = 𝟙.
There is also a rank-2 tensor t at the top of the TTN, which is normalized to 1,

∑_{ab} |t_{ab}|² = 1.
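A rank-3 isometry satisfying the constraint above can be produced by reshaping a matrix with orthonormal columns, e.g. from a QR decomposition. This is a small numpy sketch (the shapes are arbitrary illustrative choices):

```python
import numpy as np

def random_isometry(chi_in, chi_out, seed=0):
    """Random rank-3 isometry of shape (chi_in, chi_in, chi_out),
    with chi_in**2 >= chi_out, satisfying w^dagger w = identity."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((chi_in * chi_in, chi_out))
    q, _ = np.linalg.qr(a)          # q has orthonormal columns
    return q.reshape(chi_in, chi_in, chi_out)

w = random_isometry(4, 6)
# Contract the two lower indices of w* with those of w: by the isometric
# constraint, the result should be the identity on the upper index.
gram = np.tensordot(w.conj(), w, axes=([0, 1], [0, 1]))
```

Initializing all isometries this way gives a valid (if random) starting TTN for the energy minimization.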
We label the isometries in the TTN as w^(τ,s), where τ = 1, …, T labels the scale direction, with τ = 1 at the bottom of the TTN and τ = T at the top, whereas s labels the position within layer τ. There are Nx/2 isometries w^(1,s), s = 1, …, Nx/2, at the lowest layer of the TTN, Nx/4 isometries w^(2,s), s = 1, …, Nx/4, at the second lowest layer of the TTN, etc. The total number of isometries is thus Nx/2 + Nx/4 + … + 2 = Nx − 2. For instance, in the example of Fig. 2(a), there are four isometries w^(1,1), …, w^(1,4) in the lowest layer of the TTN, and two isometries w^(2,1) and w^(2,2) in the second lowest layer. Finally, there is also the rank-2 tensor t at the top of the TTN.
The TTN is optimized using a standard energy minimization algorithm, as described e.g. in section IV of Ref. [Tagliacozzo]. The energy minimization algorithm proceeds by iteratively updating each isometry in the TTN, as outlined below. We exploit translation invariance of H to set all the isometries in a given layer of the TTN to be the same, that is w^(τ,s) = w^(τ), and therefore the iterative update only progresses through scale, as parametrized by the integer τ (and not through space, as parametrized by the integer s).
In order to update an isometry w^(τ) of the TTN, we first need to compute its environment Γ^(τ). Like the isometry w^(τ), the environment Γ^(τ) is a rank-3 tensor of dimensions χ × χ × χ. It is defined as a sum of a number of contributions, coming from the different Hamiltonian terms h(x, x+1). An example of such contributions is represented in Fig. 2(b).
The computation of the environment is achieved by contracting the tensor networks for all relevant contributions. Fig. 3 shows a sequence of diagrams corresponding to the contraction of the tensor network in Fig. 2(b). Such a tensor network can be contracted using the ncon function in TensorNetwork. In practice, contracting the whole network is reduced to a sequence of tensor-tensor contractions. Some of these contractions are trivial due to the isometric constraint and do not need to be implemented, whereas some contractions must be explicitly performed, see Fig. 4(a)-(b). The latter correspond, possibly after flattening the indices of the tensors, to matrix-matrix multiplications, with a computational cost of at most O(χ^4) per multiplication.
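To make the pairwise-contraction idea concrete, the sketch below coarse-grains a two-site operator through an isometry, h′ = w† h w, as two successive tensor contractions; `np.tensordot` stands in for TensorNetwork's ncon, and all tensors are random placeholders (this is not the environment network of Fig. 2(b) itself):

```python
import numpy as np

chi = 3
rng = np.random.default_rng(1)
# Random rank-3 isometry w of shape (chi, chi, chi) via QR.
w = np.linalg.qr(rng.standard_normal((chi * chi, chi)))[0].reshape(chi, chi, chi)
# Random hermitian (here real-symmetric) two-site operator h with
# indices (out1, out2, in1, in2).
h = rng.standard_normal((chi, chi, chi, chi))
h = h + h.transpose(2, 3, 0, 1)

# Step 1: attach the two "in" legs of h to the two lower legs of w.
tmp = np.tensordot(h, w, axes=([2, 3], [0, 1]))           # (out1, out2, top)
# Step 2: close the two "out" legs with w*.
h_eff = np.tensordot(w.conj(), tmp, axes=([0, 1], [0, 1]))  # (top', top)
```

After flattening indices, each step is a matrix-matrix multiplication; in the networks of the text the index groupings keep the per-step cost at O(χ^4).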
Once the environment Γ for an isometry has been computed, we flatten the rank-3 tensor Γ into a χ × χ² matrix (which we also refer to as Γ) and apply a singular value decomposition to it, Γ = U S V†. Then we build the matrix −U V†, which we turn into the updated rank-3 isometry by splitting its second index into two, see Fig. 4(c)-(d). The top tensor t is updated similarly.
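This SVD-based update can be sketched in a few lines of numpy. The environment here is a random placeholder, and the index/flattening convention (upper index first) is an assumption; the resulting tensor is again an isometry by construction:

```python
import numpy as np

chi = 4
rng = np.random.default_rng(2)
E = rng.standard_normal((chi, chi, chi))   # placeholder environment tensor

# Flatten to a chi x chi^2 matrix and take the SVD: E = U S V^dagger.
E_mat = E.reshape(chi, chi * chi)
U, S, Vh = np.linalg.svd(E_mat, full_matrices=False)

# Updated isometry: -U V^dagger, with the second index split back in two.
w_new = (-U @ Vh).reshape(chi, chi, chi)

# Check the isometric property on the flattened pair of indices.
gram = np.tensordot(w_new, w_new.conj(), axes=([1, 2], [1, 2]))
```

A useful property of this update is that it minimizes the (linearized) energy: the overlap with the environment equals minus the sum of singular values, the smallest value achievable by any isometry.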
We consider a 2D lattice made of N = Nx × Ny quantum spins or, equivalently, a 1D lattice made of Nx effective spins, each of dimension d^Ny. We choose a value of the transverse field λ which is seen to lead to a scaling of ground state entanglement entropy compatible with being near a quantum critical point. This value is slightly below the value corresponding to the critical point in a fully 2D lattice, that is for Nx, Ny → ∞. We approximate the ground state using a TTN for increasing values of the bond dimension χ in the range [32, 256]. For each value of χ we minimize the expectation value of the energy per site by iterating the isometry update scheme outlined above, until the energy per site changes by less than a small convergence threshold after a whole sweep of updates.
Fig. 5 shows the converged value of the ground state energy per site as a function of the bond dimension χ. The energy per site changes only slightly between the largest bond dimensions considered, suggesting that the error in the energy due to using a finite value of χ might be on that order of magnitude. In Fig. 6 we then see that the energy converges to its extrapolated value roughly as a power law in 1/χ.
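The power-law extrapolation in the bond dimension can be sketched as follows. The data below is synthetic (NOT the paper's numbers), with a planted model E(χ) = E_∞ + a·χ^(−b), and for brevity we assume E_∞ is known; in practice it would itself be a fit parameter:

```python
import numpy as np

# Synthetic converged energies at the bond dimensions used in the text.
chis = np.array([32., 64., 128., 256.])
E_inf, a, b = -1.2, 0.5, 2.0              # planted synthetic model
E = E_inf + a * chis**(-b)

# Linear fit of log(E - E_inf) against log(chi) recovers the power law:
# slope = -b, exp(intercept) = a.
slope, intercept = np.polyfit(np.log(chis), np.log(E - E_inf), 1)
```

The same log-log fit applied to real converged energies gives both the extrapolated energy and an estimate of the finite-χ error.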
From the TTN it is particularly simple to extract the spectrum of eigenvalues of reduced density matrices for particular blocks of spins, and thus compute the corresponding entanglement spectrum and entanglement entropy.
Specifically, the upper bond index of the isometry w^(τ) corresponds to a block of 2^τ sites of the 1D effective spin chain (or a rectangular block of 2^τ × Ny quantum spins of the initial 2D lattice model). The spectrum of eigenvalues {p_α} of the reduced density matrix on that index can then be converted into the entanglement spectrum

ε_α = −log p_α,

and the entanglement entropy

S = −∑_α p_α log p_α,

for that block of spins.
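These two definitions translate directly into code. The sketch below computes the entanglement spectrum and entropy from the reduced-density-matrix eigenvalues {p_α}, illustrated on the maximally entangled two-qubit (Bell) state, for which S = log 2:

```python
import numpy as np

def entanglement_data(p):
    """Entanglement spectrum and von Neumann entropy from the
    eigenvalues p of a reduced density matrix."""
    p = np.asarray(p, dtype=float)
    p = p[p > 1e-14]                      # drop numerical zeros
    spectrum = -np.log(p)                 # entanglement "energies" eps_alpha
    entropy = -np.sum(p * np.log(p))      # von Neumann entropy S
    return spectrum, entropy

# Bell state: p = (1/2, 1/2)  ->  flat spectrum, S = log 2.
spec, S = entanglement_data([0.5, 0.5])
```

In the TTN, the p_α on a given bond are obtained by diagonalizing the reduced density matrix carried by that bond, so this post-processing step costs only O(χ³).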
Fig. 7 shows the entanglement spectrum at a fixed scale τ, that is, for a block of 2^τ sites of the effective 1D quantum spin chain, or 2^τ × Ny sites of the 2D quantum Ising model on the thin torus. One can see that, as a function of the bond dimension χ, the lower part of the spectrum converges faster than the upper part. Fig. 8 then shows the scaling of the entanglement entropy as a function of the block size, for different values of χ.
A highlight of TensorNetwork is that, thanks to running on top of TensorFlow, the same tensor network code [download] can be used on different computational resources. We used TensorFlow v1.13.1 built with the Intel math kernel library (MKL). The computations described above were carried out using real numbers at 64 bit floating-point precision. We employed Google’s cloud compute engine. For CPU computations we used Xeon Skylake with 1, 8, 16, and 32 cores. For GPU computations we used an NVIDIA Tesla V100. For further reference, we also ran equivalent numpy code using a single CPU.
The computational cost of the TTN algorithm scales as O(χ^4) for sufficiently large χ. This is indeed the scaling of both the tensor-tensor contractions and of the SVD of the environment required to update an isometry w. There are also other steps, including permuting the indices of rank-3 tensors, that scale as O(χ^3).
Fig. 9 shows the computational time required to update all the isometries in the TTN once (wall time per sweep). We see that for large bond dimension χ, single CPU computations with code using either the numpy library or TensorNetwork both scale as χ^4, as expected. However, using TensorNetwork is twice as fast. We also observe that for large bond dimension, using TensorNetwork with a GPU is about 100 times faster than with a CPU. Moreover, within the range of tested bond dimensions, the cost on the GPU still grows more slowly than χ^4 (larger values of χ will be tested in the near future). Finally, further optimizations are still required to fully take advantage of the TPU architecture (work in progress), but early experiments suggest that the performance will likely exceed that of the GPU when those optimizations are completed.
This paper described a TTN algorithm for approximating the ground state of a quantum spin lattice model on a thin torus, implemented using TensorNetwork [download], an open source library that works on TensorFlow [TensorFlow]. The code can be found here [download]. We have used this sample code to find increasingly refined TTN approximations to the ground state of the transverse field Ising Hamiltonian on a periodic 2D lattice of quantum spins, with bond dimensions χ ∈ [32, 256]. Using TensorNetwork, we have seen that when running the code on a GPU, the computational time was reduced by a factor of up to 100 compared to a single CPU.
Code for other simulation algorithms for quantum systems based on tensor networks, such as MPS and MERA algorithms, will be similarly provided and discussed in subsequent papers.
Acknowledgements.— A. Milsted, M. Ganahl, and G. Vidal thank X for their hospitality. X is formerly known as Google[x] and is part of the Alphabet family of companies, which includes Google, Verily, Waymo, and others (www.x.company). Research at Perimeter Institute is supported by the Government of Canada through the Department of Innovation, Science and Economic Development Canada and by the Province of Ontario through the Ministry of Research, Innovation and Science.
Compared to a CPU, both GPU and TPU appear to provide very significant computational speed-ups, on the order of the factor ∼100 reported above, for tensor-tensor contractions involving large tensors, but more modest speed-ups for matrix factorizations such as a singular value decomposition (SVD) or eigenvalue decomposition (EVD). In those tensor network algorithms, such as MERA algorithms, where the required tensor-tensor multiplications cost a higher power of χ than the SVDs, the use of GPUs and TPUs is expected to lead to massive savings in computational time. However, in a TTN, where tensor-tensor multiplications and SVDs both scale as O(χ^4), GPUs and TPUs will lead to less spectacular gains.
In our current TTN algorithm, it is possible to replace the SVDs with tensor-tensor multiplications and EVDs, see Fig. 10. In this way, larger speed-ups than the ones reported in the main text are expected. However, the squaring of the environment in Fig. 10(b) leads to a loss of half of the numerical precision. In those simulations where this is not a problem (e.g. because the error due to a finite bond dimension is more important than the loss of numerical precision due to squaring the environment), it might then be convenient to use an EVD instead of an SVD.
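The EVD-based alternative can be sketched as follows: instead of the SVD of the environment Γ, one diagonalizes the small χ × χ matrix ΓΓ†, from which −U V† can be reassembled. The environment below is a random placeholder, and the flattening convention is an assumption; note that squaring Γ squares its condition number, which is the precision loss mentioned above.

```python
import numpy as np

chi = 4
rng = np.random.default_rng(3)
E = rng.standard_normal((chi, chi * chi))   # flattened placeholder environment

# SVD route: E = U S V^dagger, isometry update -U V^dagger.
U, S, Vh = np.linalg.svd(E, full_matrices=False)
w_svd = -U @ Vh

# EVD route: E E^T = U S^2 U^T is only chi x chi, and
# -U V^dagger = -(E E^T)^{-1/2} E, which needs no SVD of E itself.
evals, Ue = np.linalg.eigh(E @ E.T)
s_inv = 1.0 / np.sqrt(evals)                # assumes E has full rank
w_evd = -(Ue * s_inv) @ Ue.T @ E            # basis- and sign-independent
```

Both routes yield the same updated isometry in exact arithmetic; the EVD route trades the O(χ^4) SVD for a matrix multiplication (fast on GPU/TPU) plus a cheap O(χ³) EVD, at the cost of working with the squared environment.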