## 1 Introduction

Tensor networks have found a wide range of applications within mathematics [1, 2], physics and chemistry, in particular as matrix product states (MPS), projected entangled pair states (PEPS) or the multiscale entanglement renormalization ansatz (MERA) for strongly correlated quantum systems [3, 4, 5]. While tensor networks and the associated operations are conveniently represented as graphical diagrams, a subsequent implementation of these operations is often tedious, especially if one has to keep track of the arrangement of many indices. On the other hand, fundamental operations like finding the (quasi-)optimal contraction order and performing the partial contraction of a given tensor network are available as software packages [6, 7, 8] or via NumPy's `einsum` command [9, 10]. To bridge the gap between graphical representation and implementation, we introduce a graphical user interface (GUI) for constructing arbitrary tensor networks and specifying common operations on them, like contractions or splitting via QR or SVD decompositions. Our software framework then instantly generates source code for these operations; currently Python/NumPy is supported, with additional programming languages planned for the future. We use JavaScript and the D3.js library to make the GUI conveniently available via web browsers.

## 2 Description of the GUI

The GUI represents each tensor as a node with an arbitrary number of legs, corresponding to the number of dimensions (the rank) of the tensor. The ordering of the dimensions is indicated by labels. Fig. 1 visualizes a single tensor as it appears in the GUI. Note that this abstract representation leaves the actual dimensions open, i.e., it does not differentiate between, say, a 2 × 3 × 4 tensor and a 10 × 10 × 10 tensor, since both have rank 3.
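For concreteness, a minimal NumPy illustration (the shapes are arbitrary examples):

```python
import numpy as np

# Two tensors with different dimensions but the same rank (number of legs):
# in the GUI, both would be drawn as a node with three legs.
A = np.zeros((2, 3, 4))
B = np.zeros((10, 10, 10))
assert A.ndim == B.ndim == 3
```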

The user interacts with the GUI mainly via drag-and-drop gestures, to add tensors to the network or attach legs to a tensor, and to specify operations like contractions and tensor splitting; see below for more details. The GuiTeNet framework visualizes the current tensor network, and simultaneously generates source code that implements the sequence of user actions performed so far. For example, the generated Python code for the contraction of three tensors followed by a QR splitting reads:

```python
import numpy as np

def f(T0, T1, T2):
    T3 = np.einsum(T0, (0, 1, 2), T1, (3, 2), T2, (0, 4, 5), (1, 3, 4, 5))
    T4 = np.transpose(T3, (3, 0, 2, 1))
    T5, T6 = np.linalg.qr(
        T4.reshape((np.prod(T4.shape[:2]), np.prod(T4.shape[2:]))),
        mode='reduced')
    T5 = T5.reshape(T4.shape[:2] + (T5.shape[1],))
    T6 = T6.reshape((T6.shape[0],) + T4.shape[2:])
    return (T5, T6)
```
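As a usage sketch, the generated function can be called with any tensors whose dimensions are consistent with the contraction labels; the concrete shapes below are hypothetical:

```python
# Hypothetical dimensions, consistent with the einsum labels above:
# label 2 is shared between T0 and T1, label 0 between T0 and T2.
T0 = np.random.rand(2, 3, 4)   # labels (0, 1, 2)
T1 = np.random.rand(5, 4)      # labels (3, 2)
T2 = np.random.rand(2, 6, 7)   # labels (0, 4, 5)
T5, T6 = f(T0, T1, T2)
print(T5.shape, T6.shape)      # (7, 3, 21) and (21, 6, 5)
```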

Details of the code generation are provided in section 3.

### Creating tensors

A new tensor is added to the network by a drag-and-drop gesture. The user drags a special “create tensor” symbol (the blue circle in Fig. 2(a)) to the desired location. When “dropping” the symbol, a new tensor (black circle) appears there. Initially it has zero legs. The tensors are automatically labeled to provide a unique identifier. The “create tensor” symbol reappears at its default location after this operation, and can then be used to add another tensor to the network.

### Attaching tensor legs

Each leg represents one dimension of the tensor. The user creates a new leg by “pulling” it out of the tensor (i.e., drag-and-drop on the tensor) while simultaneously holding the Control key. Each tensor and its legs can still be freely moved around within the GUI window.

### Tensor contractions

Tensor contractions are specified by connecting the tips of tensor legs. The tips snap to each other when brought into close contact. The actual contraction (possibly of several tensors) is executed when pressing the “Contract” button of the GUI; see Fig. 3(a) for an example.

### Splitting a tensor

The splitting of a tensor by QR or singular value decomposition (SVD) is a ubiquitous operation in tensor network algorithms, in particular for reducing “bond dimensions” by imposing a singular value cut-off tolerance, and a prerequisite for working with left- and right-orthogonal tensors in the MPS framework [3]. The first step for decomposing a tensor is its “matricization”: a subset of legs is grouped together into one “fat” leg and the remaining (complementary) legs into a second “fat” leg. The two fat legs are interpreted as the rows and columns of a matrix, which is then decomposed. Fig. 4(a) illustrates this process (as it appears in the GUI) for the QR decomposition of a tensor with initially 5 legs. (An analogous splitting via SVD is currently still under development.) The user first right-clicks on a tensor to initiate the splitting operation. An overlay window then asks for the ordering and partitioning of the dimensions attributed to the rows and columns in the matricization process. In the example, three of the five dimensions form the “row” (in a user-specified order) and the remaining two the “column”. After the decomposition, the resulting Q and R matrices are finally reshaped to restore the original dimensions, with an additional dimension for the shared bond (the last dimension of Q, the first dimension of R); the leading dimensions of Q thus match the original “row” dimensions, and the trailing dimensions of R the original “column” dimensions. The initial reordering of dimensions becomes a separate “elementary transposition operation”, as described in section 3 below. The generated code uses a temporary tensor for this purpose; in Fig. 4(a), this temporary tensor receives the next free index, and the resulting Q and R tensors are labeled consecutively after it.

After this reordering, the partitioning is simply a reinterpretation of the data stored in the tensor, since the “row” group now consists of the first n leading dimensions, and the “column” group of the remaining r − n trailing dimensions, where the rank r is the total number of dimensions.
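A minimal NumPy sketch of this reinterpretation, assuming a rank-5 tensor with n = 3 “row” dimensions:

```python
import numpy as np

T = np.random.rand(2, 3, 4, 5, 6)   # rank r = 5 tensor, already reordered
n = 3                               # number of "row" dimensions
# Grouping into a matrix is a plain reshape, i.e., a reinterpretation
# of the same data without any reshuffling:
M = T.reshape((np.prod(T.shape[:n]), np.prod(T.shape[n:])))
assert M.shape == (2 * 3 * 4, 5 * 6)
```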

## 3 Elementary tensor network operations

Somewhat analogous to an intermediate representation in source code compilation, we decompose the actions supported by the GUI into the following elementary operations on tensor networks:

### (i) Elementary contraction of tensors

The GuiTeNet framework supports general contraction operations on a tensor network. An *elementary* contraction acts on a subset of tensors such that these tensors are joined (directly or indirectly) by shared legs, yielding a single tensor after the contraction. Note that “multi-bond” contractions, i.e., the simultaneous contraction of multiple legs as in Fig. 3(a), are explicitly allowed. In principle, a contraction of several tensors could be decomposed into a sequence of pairwise contractions, e.g., computing the matrix product A B C by first “contracting” A with B to obtain A B, and then multiplying with C. However, in general the optimal order of these pairwise contractions poses a delicate optimization problem [7] and is not straightforwardly applicable to multi-bond contractions. Hence we regard the contraction of (possibly more than two) tensors as an elementary operation, and leave the optimized implementation to backend software packages.
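For instance, a multi-bond contraction of two tensors sharing two legs can be expressed with repeated `einsum` labels (the shapes are illustrative):

```python
import numpy as np

A = np.random.rand(3, 4, 5)
B = np.random.rand(4, 5, 6)
# Labels 1 and 2 occur twice and are contracted simultaneously:
C = np.einsum(A, (0, 1, 2), B, (1, 2, 3), (0, 3))
assert C.shape == (3, 6)
```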

On the other hand, a sequence of tensor network operations can be optimized by merging subsequent elementary contractions into a single elementary contraction. As a simple (toy model) illustration of why this might be useful, consider the contraction C = A B (matrix-matrix multiplication) followed by the contraction y = C x (matrix-vector product). Merging these two contractions leads to y = A B x, for which a backend algorithm would naturally choose the order y = A (B x).
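A sketch of this toy example in NumPy; the `optimize` flag of `einsum` lets the backend choose the contraction order for the merged expression:

```python
import numpy as np

n = 200
A = np.random.rand(n, n)
B = np.random.rand(n, n)
x = np.random.rand(n)

y_seq = (A @ B) @ x   # fixed order: O(n^3) for the matrix-matrix product
# Merged contraction: the optimizer can pick the cheaper order A (B x):
y_merged = np.einsum('ij,jk,k->i', A, B, x, optimize=True)
assert np.allclose(y_seq, y_merged)
```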

To uniquely specify a contraction operation, we follow the convention of NumPy's `einsum` command, called in the form `einsum(T0, s0, T1, s1, ..., sout)`. Here the `Ti` refer to tensors, and the `si` are lists of integer labels for the corresponding dimensions, with multiply occurring labels to be summed over. The last argument `sout` determines the ordering of dimensions in the output tensor after the contraction. For the example in Fig. 3(a) with four tensors, `s0 = (0, 1, 2)`, `s1 = (0, 1, 3)`, `s2 = (0, 4)`, `s3 = (4, 5)` and `sout = (2, 3, 5)`. Thus, the three dimensions of tensor `T0` are labeled 0, 1, 2, the three dimensions of tensor `T1` are labeled 0, 1, 3, etc. The dimensions labeled 0, 1 and 4 will be contracted since they appear multiple times, and the remaining dimensions are ordered as (2, 3, 5) in the output tensor. The generated Python source code follows exactly this scheme and reads explicitly

```python
T4 = np.einsum(T0, (0, 1, 2), T1, (0, 1, 3), T2, (0, 4), T3, (4, 5), (2, 3, 5))
```

### (ii) General transposition of a tensor

Formally, a tensor transposition is a permutation of dimensions, generalizing the usual transposition of matrices. For example, applying the permutation (1, 2, 0) to a d0 × d1 × d2 tensor T yields a d1 × d2 × d0 tensor, such that the (i0, i1, i2)-th entry of T is the (i1, i2, i0)-th entry of the transposed tensor. Since the tensor elements are typically stored as a contiguous array in memory, a transposition implies a reshuffling of the array elements. Thus, while a transposition does not involve arithmetic calculations besides computing memory addresses, its cost can still be significant, in particular due to the inherent “cache-unfriendliness”.

Specifying a transposition only requires designating the permutation of dimensions. We follow the convention of NumPy's `transpose` function.
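A small example of this convention (shapes arbitrary):

```python
import numpy as np

T = np.random.rand(3, 4, 5)
Tt = np.transpose(T, (1, 2, 0))   # permutation of dimensions
assert Tt.shape == (4, 5, 3)
assert Tt[1, 2, 0] == T[0, 1, 2]  # entry (i0, i1, i2) of T sits at (i1, i2, i0)
# np.transpose returns a view; materializing the permuted layout
# (e.g., via np.ascontiguousarray) is what incurs the reshuffling cost.
Tc = np.ascontiguousarray(Tt)
```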

Regarding transpositions as separate elementary operations (instead of, say, the first step of splitting a tensor) facilitates additional optimizations. A plausible scenario is integrating the transposition into a preceding contraction operation [11], generalizing (A B)^T = B^T A^T for matrices A and B.
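A sketch of this scenario with `einsum`: the permuted output ordering is requested directly from the contraction, instead of a separate transposition afterwards:

```python
import numpy as np

A = np.random.rand(3, 4)
B = np.random.rand(4, 5)
# Contract, then transpose the result ...
C1 = np.transpose(np.einsum(A, (0, 1), B, (1, 2), (0, 2)))
# ... versus requesting the transposed output ordering directly:
C2 = np.einsum(A, (0, 1), B, (1, 2), (2, 0))
assert np.allclose(C1, C2)
```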

### (iii) QR decomposition of a tensor

The elementary QR decomposition considered here does not involve any reordering of dimensions. Thus, as described in section 2, it is uniquely specified by the number n of leading tensor dimensions to be interpreted as the “row” dimension in the matricization process, with the remaining trailing dimensions correspondingly forming the “column” dimension.

To illustrate, the generated Python code (up to renaming of variables) for the elementary QR decomposition of a tensor `T` with n = 3 leading dimensions reads as follows:

```python
Q, R = np.linalg.qr(
    T.reshape((np.prod(T.shape[:3]), np.prod(T.shape[3:]))),
    mode='reduced')
Q = Q.reshape(T.shape[:3] + (Q.shape[1],))
R = R.reshape((R.shape[0],) + T.shape[3:])
```

The `reshape` functions implement the matricization before and “de-matricization” after the actual QR decomposition, `T.shape` stores the tensor dimensions, `np.prod` computes the product of the leading and trailing dimensions, respectively, and `np.linalg.qr` implements the conventional QR decomposition of matrices.
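As a quick consistency check (with a hypothetical rank-5 tensor T and n = 3), contracting the de-matricized Q and R over the shared bond restores T:

```python
import numpy as np

T = np.random.rand(2, 3, 4, 5, 6)   # hypothetical rank-5 tensor, n = 3
Q, R = np.linalg.qr(
    T.reshape((np.prod(T.shape[:3]), np.prod(T.shape[3:]))),
    mode='reduced')
Q = Q.reshape(T.shape[:3] + (Q.shape[1],))
R = R.reshape((R.shape[0],) + T.shape[3:])
# Contracting the shared bond (label 3) recovers the original tensor:
assert np.allclose(np.einsum(Q, (0, 1, 2, 3), R, (3, 4, 5), (0, 1, 2, 4, 5)), T)
```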

### (iv) Singular-value decomposition of a tensor

The (de-)matricization process for an elementary singular value decomposition (SVD) of a tensor is analogous to the elementary QR decomposition. The output now consists of three tensors, corresponding to the U, S and V matrices of the matrix SVD, with the diagonal matrix S storing the singular values. Additional parameters (compared to the QR decomposition) are a cut-off tolerance for the singular values, and optionally the maximum allowed number of singular values (the maximal “bond” dimension).
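Since the SVD splitting is still under development in the GUI (see section 2), the following is only a minimal sketch of what such generated code could look like; the parameter names `svd_tol` and `max_bond` are our own assumptions:

```python
import numpy as np

def split_svd(T, n, svd_tol=0.0, max_bond=None):
    """Hypothetical sketch: split T between the n leading and the trailing dimensions."""
    A = T.reshape((np.prod(T.shape[:n]), np.prod(T.shape[n:])))
    U, S, Vh = np.linalg.svd(A, full_matrices=False)
    keep = S > svd_tol * S[0]          # relative cut-off tolerance for singular values
    if max_bond is not None:
        keep[max_bond:] = False        # cap the maximal bond dimension
    U, S, Vh = U[:, keep], S[keep], Vh[keep, :]
    U = U.reshape(T.shape[:n] + (U.shape[1],))
    Vh = Vh.reshape((Vh.shape[0],) + T.shape[n:])
    return U, S, Vh
```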

## 4 Strategies for optimization

Based on the elementary tensor network operations, several high-level optimization strategies are conceivable; these require only the rank of each tensor, not the actual dimensions.

A natural representation for the sequence of user actions is a directed acyclic graph (DAG), storing an elementary operation or input tensor at each node. Such a representation clarifies dependencies, and allows one to determine which operations can be executed in parallel.
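A hypothetical sketch of such a DAG representation (the `Node` structure and field names are illustrative, not GuiTeNet's actual data model):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str                                    # "input", "contraction", "transposition", "qr", ...
    deps: list = field(default_factory=list)   # nodes this operation depends on

t0, t1, t2 = Node("input"), Node("input"), Node("input")
c = Node("contraction", deps=[t0, t1])   # ready once t0 and t1 are available
q = Node("qr", deps=[c])                 # must wait for the contraction
# c and any operation involving only t2 share no dependencies,
# so they could be executed in parallel.
```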

A more subtle optimization strategy tailored to tensor networks is the merging of subsequent contractions, i.e., if the tensor resulting from a contraction is immediately used in another contraction. A toy model example (as mentioned in section 3) consists of merging C = A B followed by y = C x into y = A B x, which can then be evaluated in the order A (B x). Note that the optimized computational cost for the merged contraction cannot be higher than performing the contractions sequentially (since the latter restricts the allowed contraction order), but actually determining the optimal order for the merged contraction (by a backend software package) is in general more difficult [7].

Another optimization strategy is avoiding explicit transpositions (i.e., permutations of tensor dimensions), and aiming for an advantageous dimension ordering. As mentioned, the transposition of a tensor resulting from a contraction can be integrated into the contraction (see also [11]), generalizing (A B)^T = B^T A^T for matrices. Transpositions may also be pushed through the computational graph; for example, instead of permuting the leading dimensions of the Q tensor resulting from a QR decomposition, one could already permute these dimensions in the input tensor, or vice versa.

## 5 Conclusions and outlook

In its present form, the GuiTeNet software framework is well suited to handle a relatively small number of tensors, but manually constructing a network with hundreds of tensors is cumbersome. Instead, generating code for subroutines or blocks inside loops is a plausible scenario for employing GuiTeNet in larger software projects. As a specific example, rather than instantiating all tensors of a matrix product state, the GuiTeNet framework could be used to generate the local contraction operation required during a left-right sweep over the chain.

We also want to point out the pedagogical value which GuiTeNet might offer, including the seamless transition from vectors and matrices to general tensors.

Nevertheless, there are many desirable features left for future work, including code generation for other programming languages and software libraries, or a timeline of previous network states (e.g., before a contraction) with associated *Undo* functionality. Tensors with special properties (like the orthogonal tensors resulting from an SVD or QR decomposition) should be marked, e.g., using a different symbol, and ideally such properties should be exploited in the generated code. Furthermore, one could take U(1)-symmetries into account by endowing the legs with additive quantum numbers (like particle number or spin) and a directional arrow. Conceptually, the sum of quantum numbers flowing into a tensor must be equal to the sum of quantum numbers leaving the tensor, enforcing a block sparsity structure of the tensor. Another worthwhile goal is incorporating more exotic tensor network operations, like “loop skeletonization” [12].

An interesting open question is how GuiTeNet could inspire or profit from software and hardware architectures tailored to tensor operations, like contractions beyond conventional BLAS routines [11] or Google's Tensor Processing Units (TPUs) [13] employed in the TensorFlow machine learning framework.

We encourage active contributions to and further development of GuiTeNet; see guitenet.org for details.

### Acknowledgments

CM would like to thank Lexing Ying for inspiring discussions.

## References

- [1] W. Hackbusch and S. Kühn. A new scheme for the tensor representation. J. Fourier Anal. Appl., 15:706–722, 2009.
- [2] W. Hackbusch. Numerical tensor calculus. Acta Numerica, 23:651–742, 2014.
- [3] U. Schollwöck. The density-matrix renormalization group in the age of matrix product states. Ann. Phys., 326:96–192, 2011.
- [4] F. Verstraete, V. Murg, and J. I. Cirac. Matrix product states, projected entangled pair states, and variational renormalization group methods for quantum spin systems. Adv. Phys., 57:143–224, 2008.
- [5] G. Vidal. Class of quantum many-body states that can be efficiently simulated. Phys. Rev. Lett., 101:110501, 2008.
- [6] R. N. C. Pfeifer, G. Evenbly, S. Singh, and G. Vidal. NCON: A tensor network contractor for MATLAB. arXiv:1402.0939, 2014.
- [7] R. N. C. Pfeifer, J. Haegeman, and F. Verstraete. Faster identification of optimal contraction sequences for tensor networks. Phys. Rev. E, 90:033315, 2014.
- [8] G. Evenbly and R. N. C. Pfeifer. Improving the efficiency of variational tensor network algorithms. Phys. Rev. B, 89:245118, 2014.
- [9] S. van der Walt, S. C. Colbert, and G. Varoquaux. The NumPy array: A structure for efficient numerical computation. Comput. Sci. Eng., 13:22–30, 2011.
- [10] NumPy documentation for `numpy.einsum`: https://docs.scipy.org/doc/numpy/reference/generated/numpy.einsum.html.
- [11] D. Matthews. High-performance tensor contraction without transposition. SIAM J. Sci. Comput., 40:C1–C24, 2018.
- [12] L. Ying. Tensor network skeletonization. Multiscale Model. Sim., 15:1423–1447, 2017.
- [13] N. P. Jouppi, C. Young, N. Patil, et al. In-datacenter performance analysis of a Tensor Processing Unit. In Proceedings of the 44th Annual International Symposium on Computer Architecture (ISCA '17), 2017.