The study of the structural and computational aspects of tensors (multi-dimensional arrays of numbers over a field) is deeply connected to key research areas in mathematics, physics and computer science. Examples of these are the research on sunflowers and cap sets in combinatorics [EG17, Tao16], understanding quantum entanglement [DVC00, LQWD18], tensor networks, secant varieties of Segre varieties in algebraic geometry [Lan12], scaling algorithms [BGO18, BFG18], the complexity of matrix multiplication [BCS97, Blä13], and the complexity of arithmetic formulas [Raz13, Blä14]. This work advances and makes novel connections between two recent developments in the study of tensors:
the quantum functionals of complex tensors [CVZ18], a family of multiplicative monotones (similar in spirit and utility to multiplicative monotones in other settings, like the Lovász theta function in graph theory or the entropy function in information theory) which moved forward the research direction initiated by Strassen on the asymptotic properties of tensors [Str86, Str87, Str88, Str91]. (In this paper, the term “asymptotic” refers to considering large powers of tensors, which is motivated by many applications including the complexity of matrix multiplication, understanding quantum entanglement, and the sunflower and cap set problem in combinatorics. The power is taken under the tensor Kronecker product, which naturally generalizes the matrix Kronecker product.)
As a central component of this paper, we introduce a new notion of rank for tensors called weighted slice rank. This notion generalizes several existing notions of rank for tensors, most notably the slice rank and the non-commutative rank. In this paper, we:
prove basic properties of the (asymptotic) weighted slice ranks,
prove, via a minimax argument, a correspondence between the asymptotic weighted slice ranks and the quantum functionals, characterizing each as an optimization problem in terms of the other (over $\mathbb{C}$),
as a consequence, obtain a characterization of asymptotic tripartite-to-bipartite entanglement transformation (the task in which Alice, Bob and Charlie, given many copies of a joint tripartite quantum state, together try to create as much entanglement as possible between Alice and Bob by stochastic local operations and classical communication (SLOCC) [DVC00]) and equivalently of the notion of asymptotic non-commutative rank and asymptotic commutative rank (over $\mathbb{C}$),
propose a notion of quantum functionals over fields other than the complex numbers via our minimax correspondence,
and develop the quantum functionals with respect to the min-entropy instead of the entropy, which leads to a connection to the G-stable rank introduced in [Der20].
A crucial gap in previous work is that the usual construction of the quantum functionals is not suitable for extension to other fields. The notion of quantum functionals over finite fields and other fields that we propose is obtained by applying our minimax correspondence to the asymptotic weighted slice ranks, which are defined over any field. We conjecture that these new functions indeed retain the properties of the complex quantum functionals. As evidence for this, we prove the conjecture for the subclass of tight tensors over arbitrary fields. Finite fields have proven to be crucial in the study of combinatorial problems like the cap set problem and algorithmic problems like matrix multiplication barriers.
Our results give a partial resolution to the fundamental structural and computational problem of understanding the asymptotic tensor restriction problem. This problem is about the interplay of two tensor notions:
reductions between tensors, called restriction, a concept similar to reductions between computational problems in complexity theory. (A tensor is a restriction of another tensor if, as multilinear maps, the first can be obtained from the second by precomposing with linear maps. In other words, restriction corresponds to applying linear transformations to the legs of the tensor. In the application of tensors to the study of matrix multiplication this notion of restriction indeed amounts to a reduction between computational problems. Restriction will be denoted by $\leq$.)
large powers of tensors, a regime similar to parallel repetition problems, Shannon capacity, or self-reducible computational problems, like matrix multiplication. (We take powers under the tensor Kronecker product. This product naturally generalizes the Kronecker product for matrices.)
The asymptotic tensor restriction problem asks whether, given two tensors, a large power of the first tensor is a restriction of a marginally larger power of the second tensor. While many tensor parameters are known to be NP-hard to compute, there is an intriguing possibility that the asymptotic tensor restriction problem has an efficiently computable answer. (This would have considerable consequences, as, for example, computing the value of the matrix multiplication exponent is a special case.) The asymptotic weighted slice ranks as well as the quantum functionals provide limits on which asymptotic restrictions between tensors are possible.
In the remainder of this introduction we discuss the main concepts and results of the paper in detail. The full results, technical lemmas and proofs are given in the sections that follow the introduction.
1.1 Weighted slice rank
We introduce a new notion of rank for tensors called the weighted slice rank. In this paper a tensor is any element in a tensor space $V_1 \otimes V_2 \otimes V_3$ for finite-dimensional vector spaces $V_1, V_2, V_3$ over some field $\mathbb{F}$. In coordinates, relative to the standard product basis of $\mathbb{F}^{n_1} \otimes \mathbb{F}^{n_2} \otimes \mathbb{F}^{n_3}$, a tensor $T$ is defined by a three-dimensional array of field elements $T_{i,j,k}$, so that we have the expansion $T = \sum_{i,j,k} T_{i,j,k}\, e_i \otimes e_j \otimes e_k$. Our results also hold for tensors of order higher than three, but for simplicity of exposition we only discuss tensors of order three here.
Weighted slice rank generalizes several important notions of rank for tensors. To define the weighted slice rank we first need the concept of flattenings and slice decomposition of a tensor.
Definition 1 (Flattenings).
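For any tensor $T \in V_1 \otimes V_2 \otimes V_3$ we define the flattenings of $T$ by regrouping the tensor factors: $T^{(1)} \in V_1 \otimes (V_2 \otimes V_3)$, $T^{(2)} \in V_2 \otimes (V_1 \otimes V_3)$ and $T^{(3)} \in V_3 \otimes (V_1 \otimes V_2)$.
In coordinates, for a tensor $T = (T_{i,j,k})$ the three flattenings of $T$ are given by the three matrices $(T_{i,j,k})_{i,(j,k)}$, $(T_{i,j,k})_{j,(i,k)}$, and $(T_{i,j,k})_{k,(i,j)}$, whose rows are indexed by one coordinate and whose columns are indexed by the pair of remaining coordinates.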
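The coordinate description of the flattenings can be made concrete with a short sketch. The following is a minimal illustration in plain Python, assuming a tensor is stored as a nested list `T[i][j][k]` with rational entries; the helper names `flattenings` and `matrix_rank` are hypothetical and not taken from the paper.

```python
from fractions import Fraction

def flattenings(T):
    """Return the three flattenings of a 3-tensor stored as a nested list T[i][j][k]."""
    n1, n2, n3 = len(T), len(T[0]), len(T[0][0])
    # Rows are indexed by one leg, columns by the pair of remaining legs.
    M1 = [[T[i][j][k] for j in range(n2) for k in range(n3)] for i in range(n1)]
    M2 = [[T[i][j][k] for i in range(n1) for k in range(n3)] for j in range(n2)]
    M3 = [[T[i][j][k] for i in range(n1) for j in range(n2)] for k in range(n3)]
    return M1, M2, M3

def matrix_rank(M):
    """Rank over the rationals, via Gaussian elimination in exact arithmetic."""
    rows = [[Fraction(x) for x in row] for row in M]
    ncols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

# Diagonal tensor of size 2: the entry is 1 exactly when i == j == k.
unit2 = [[[1 if i == j == k else 0 for k in range(2)] for j in range(2)] for i in range(2)]
flattening_ranks = [matrix_rank(M) for M in flattenings(unit2)]
```

For this diagonal example each flattening is a 2 x 4 matrix of rank 2. The sketch works over the rationals only; over a finite field the elimination step would need modular arithmetic instead.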
Definition 2 (Slice decompositions).
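For every $i \in \{1,2,3\}$, we call $T$ an $i$-slice if the flattening $T^{(i)}$ has rank one as a matrix. For every $r_1, r_2, r_3 \in \mathbb{N}$ we say that $T$ has a slice decomposition of size $(r_1, r_2, r_3)$ if there are tensors $T_{i,j}$ ($i \in \{1,2,3\}$, $j \in \{1, \ldots, r_i\}$) such that
$T = \sum_{i=1}^{3} \sum_{j=1}^{r_i} T_{i,j}$
and such that each tensor $T_{i,j}$ is an $i$-slice.
In other words, $T$ has a slice decomposition of size $(r_1, r_2, r_3)$ if and only if there are tensors $T_1, T_2, T_3$ such that $T = T_1 + T_2 + T_3$ and for every $i$ the flattening $T_i^{(i)}$ has rank at most $r_i$.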
We define the weighted slice rank by weighing the numbers that appear in the definition of slice rank, as follows. It will become clear later why the particular kind of weighing that we use is the appropriate one.
Definition 3 (Weighted slice rank).
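Let $\xi = (\xi_1, \xi_2, \xi_3) \in \mathbb{R}_{\geq 0}^3$. For any tensor $T$ we define the $\xi$-weighted slice rank of $T$ as
$\mathrm{SR}_\xi(T) = \min \{ r_1^{1/\xi_1} + r_2^{1/\xi_2} + r_3^{1/\xi_3} : T \text{ has a slice decomposition of size } (r_1, r_2, r_3) \}.$
If any of the $\xi_i$ equals $0$, then we require the value of the corresponding number $r_i$ to be $0$ and we omit the term $r_i^{1/\xi_i}$.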
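To illustrate how the weights enter, here is a minimal sketch of the quantity that is minimized in Definition 3. It assumes that a decomposition of size $(r_1, r_2, r_3)$ is weighed as the sum of the terms $r_i^{1/\xi_i}$, which is consistent with the convention for zero weights stated above but should be checked against the displayed definition. The function name `weighted_size` is hypothetical.

```python
def weighted_size(r, xi):
    """Weighted cost of a slice decomposition of size r = (r1, r2, r3).

    Assumption: the cost is the sum of r_i ** (1 / xi_i) over the nonzero weights.
    A zero weight forces the corresponding r_i to be zero and drops the term.
    """
    total = 0.0
    for r_i, xi_i in zip(r, xi):
        if xi_i == 0:
            if r_i != 0:
                raise ValueError("a zero weight requires the corresponding r_i to be zero")
            continue
        total += r_i ** (1.0 / xi_i)
    return total

# Weight (1, 1, 1): the cost is r1 + r2 + r3, as in the ordinary slice rank.
slice_cost = weighted_size((2, 2, 2), (1, 1, 1))
# Weight (1, 0, 0): only slices of the first kind are allowed, as in a flattening rank.
flattening_cost = weighted_size((3, 0, 0), (1, 0, 0))
```

The weighted slice rank itself is the minimum of this cost over all sizes for which a slice decomposition exists; the sketch only evaluates the cost of one given size and does not search for decompositions.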
The weighted slice rank naturally generalizes the following three known notions of rank for tensors.
Slice rank: The weighted slice rank with weight $\xi = (1,1,1)$ equals the ordinary slice rank of $T$.
Flattening rank: The matrix ranks of the flattenings of $T$ are called the flattening ranks. The flattening ranks are equal to the weighted slice ranks with all weight on one of the numbers $r_i$, that is, with weight $(1,0,0)$, $(0,1,0)$ or $(0,0,1)$.
Non-commutative ranks: The non-commutative ranks of a tensor are equal to the weighted slice ranks with weight on two out of the three numbers $r_i$, that is, with weight $(1,1,0)$, $(1,0,1)$ or $(0,1,1)$.
Obviously, slice rank is at most the non-commutative ranks, and the non-commutative ranks are in turn upper bounded by the flattening ranks according to the following Hasse diagram:
Generally, for any weightings , if the inequality holds elementwise, then we have that holds on all tensors.
We conclude that the weighted slice ranks naturally generalize and interpolate between the slice rank, the flattening ranks and the non-commutative ranks. Each of these ranks plays its own specialized role in various settings. The weighted slice ranks for the other possible weightings likewise play their own role, as we will see.
1.2 Asymptotic weighted slice rank
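We are interested in the behaviour of weighted slice rank under taking large powers. The product under which we take powers is the tensor Kronecker product, the natural generalization of the matrix Kronecker product to tensors. For tensors $S \in V_1 \otimes V_2 \otimes V_3$ and $T \in W_1 \otimes W_2 \otimes W_3$ the tensor Kronecker product is a tensor $S \boxtimes T \in (V_1 \otimes W_1) \otimes (V_2 \otimes W_2) \otimes (V_3 \otimes W_3)$. In coordinates, if $S = (S_{i,j,k})$ and $T = (T_{a,b,c})$ then the coefficients of $S \boxtimes T$ are given by all pairwise products of coefficients of $S$ and coefficients of $T$, that is, $(S \boxtimes T)_{(i,a),(j,b),(k,c)} = S_{i,j,k}\, T_{a,b,c}$.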
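The coordinate description of the tensor Kronecker product translates directly into code. The sketch below assumes tensors are stored as nested lists and that a pair of indices such as $(i, a)$ is flattened to a single index in row-major order; the function names `kronecker` and `unit` are hypothetical.

```python
def kronecker(S, T):
    """Tensor Kronecker product of two 3-tensors stored as nested lists."""
    a1, a2, a3 = len(S), len(S[0]), len(S[0][0])
    b1, b2, b3 = len(T), len(T[0]), len(T[0][0])
    # The index pair (i, a) becomes the single index i * b1 + a, and likewise
    # (j, b) becomes j * b2 + b and (k, c) becomes k * b3 + c.
    return [[[S[i][j][k] * T[a][b][c]
              for k in range(a3) for c in range(b3)]
             for j in range(a2) for b in range(b2)]
            for i in range(a1) for a in range(b1)]

def unit(n):
    """Diagonal tensor of size n: the entry is 1 exactly when all three indices agree."""
    return [[[1 if i == j == k else 0 for k in range(n)] for j in range(n)] for i in range(n)]

# The Kronecker product of two diagonal tensors of size 2 is the diagonal tensor of size 4.
product = kronecker(unit(2), unit(2))
```

Powers of a tensor are obtained by repeated application of `kronecker`; the dimensions multiply in each leg, so the n-th power of a tensor of format $n_1 \times n_2 \times n_3$ has format $n_1^n \times n_2^n \times n_3^n$.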
The behaviour of weighted slice rank under taking large powers is captured by the asymptotic weighted slice rank.
Definition 4 (Asymptotic weighted slice rank).
For we define the asymptotic -weighted slice rank of as
We will frequently make use of the logarithm of which we denote by , that is,
Since is defined as a lim sup, arguably does not capture all of the asymptotic behaviour of the weighted slice rank. For example, we could also consider the corresponding lim inf. However, it will turn out that in the two settings that we will consider (namely, complex tensors and so-called tight tensors), the lim sup and the lim inf coincide, and may equivalently be replaced by a limit.
The weighted slice rank is a flattening rank and is therefore multiplicative, and so . The same is true for and . For general choices of , however, the weighted slice rank will be different from the asymptotic weighted slice rank . Interesting special cases of the asymptotic weighted slice rank are the asymptotic slice rank and the asymptotic non-commutative ranks , and .
Our first main result is a dual characterization of the asymptotic weighted slice ranks for complex tensors. This dual characterization is phrased as an optimization problem over a convex polytope called the moment polytope. For every complex tensor , the moment polytope is defined as follows. Recall that for any tensor we denote the flattenings in , , by , , and , respectively. For a nonzero tensor letof the matrix , that is,
The moment polytope of a nonzero tensor is defined as the set of triples of probability vectors
In the above definition the set $\overline{G \cdot T}$ is the orbit closure (closure in, equivalently, the Zariski or the Euclidean topology) of $T$ under the action of the group $G = \mathrm{GL}(V_1) \times \mathrm{GL}(V_2) \times \mathrm{GL}(V_3)$. This is the same as the closure of the set of all tensors $S$ such that $S$ is a restriction of $T$. For the characterization of the asymptotic weighted slice rank we furthermore need the Shannon entropy of a probability vector $p = (p_1, \ldots, p_n)$, which is defined as $H(p) = -\sum_{i=1}^{n} p_i \log_2 p_i$.
Let be a tensor over . For every , the limsup in the definition of is a limit and
This result extends the result in [CVZ18] that gave the same dual description only for the asymptotic slice rank (i.e., for ).
For tensors over arbitrary fields (i.e. different from $\mathbb{C}$) we prove a dual description for the asymptotic weighted slice rank for an important subclass of all tensors, namely tensors with a tight support. (The support of a tensor is the set of all triples of coordinates $(a, b, c)$ with nonzero coefficient. We call such a support tight if there are injective maps $\alpha_i : [n_i] \to \mathbb{Z}$ such that for every $(a, b, c)$ in the support it holds that $\alpha_1(a) + \alpha_2(b) + \alpha_3(c) = 0$. A tensor in $V_1 \otimes V_2 \otimes V_3$ is called tight if its support is tight for some choice of bases for the vector spaces $V_1, V_2, V_3$.) Examples of tensors with a tight support have appeared in the applications to matrix multiplication and combinatorics. In this version of the dual description, the moment polytope (which is only available over $\mathbb{C}$) is replaced by the polytope of triples of marginals of probability vectors on the support of the tensor.
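Tightness of a support is witnessed by three injective integer-valued maps, and checking a proposed witness is mechanical. The following sketch assumes the standard formulation in which the three map values must sum to zero on every element of the support; the function name `is_tight_witness` and the dictionary encoding of the maps are hypothetical choices made for illustration.

```python
def is_tight_witness(support, alpha1, alpha2, alpha3):
    """Check that three maps (given as dicts index -> integer) witness tightness.

    The maps must be injective, and their values must sum to zero on every
    triple (a, b, c) in the support.
    """
    maps = (alpha1, alpha2, alpha3)
    injective = all(len(set(m.values())) == len(m) for m in maps)
    sums_vanish = all(alpha1[a] + alpha2[b] + alpha3[c] == 0 for (a, b, c) in support)
    return injective and sums_vanish

# Support of the tensor e0*e0*e1 + e0*e1*e0 + e1*e0*e0 (often called the W tensor).
w_support = [(0, 0, 1), (0, 1, 0), (1, 0, 0)]
w_is_tight = is_tight_witness(
    w_support,
    {0: 0, 1: 1},
    {0: 0, 1: 1},
    {0: -1, 1: 0},
)
```

The sketch only verifies a given witness in a fixed basis. Deciding whether a witness exists, or whether a tensor becomes tight after a change of basis, is a separate question that it does not address.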
Let be a tensor with tight support. For every , the limit in the definition of exists and
1.3 Quantum functionals
The quantum functionals were introduced in [CVZ18] to study the asymptotic behaviour of complex tensors, advancing a line of research initiated by Strassen in the context of the arithmetic complexity of matrix multiplication. The quantum functionals are a family of functions parametrized by the probability simplex
Each quantum functional is defined as an optimization problem over the moment polytope that we defined previously. Let . For any nonzero complex tensor we define
where, as before, denotes the Shannon entropy of a probability vector. The functions are called the quantum functionals.
The main theorem on the quantum functionals that was proved in [CVZ18] is that for any two complex tensors and the following properties hold:
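Monotonicity: if $S \leq T$, then $F^\theta(S) \leq F^\theta(T)$. (Recall that $S \leq T$ means that $S$ is a restriction of $T$, i.e. that $S$ can be obtained from $T$ by applying linear maps to the tensor legs.)
Multiplicativity: $F^\theta(S \otimes T) = F^\theta(S)\, F^\theta(T)$, where the product on the left is the tensor Kronecker product.
Additivity: $F^\theta(S \oplus T) = F^\theta(S) + F^\theta(T)$.
Normalization: $F^\theta(\langle n \rangle) = n$ for any $n \in \mathbb{N}$, where $\langle n \rangle$ is the unit tensor.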
Tensor parameters that satisfy the above four properties were called universal spectral points in [Str88]. In particular, the quantum functionals (and universal spectral points in general) have the following useful application: if an asymptotic inequality holds for some rate , then for all . Therefore, contrapositively, if there exists a such that then the asymptotic inequality cannot hold. (Considering all currently available results and examples, it is possible that holds if and only if for all .)
The above properties of the quantum functionals were used to prove barriers for square matrix multiplication in [CVZ19], which used the quantum functional with the uniform . These barriers were concurrently obtained in [Alm19] using slice rank in an important line of work [AW18a, AW18b, BCC17b, BCC17a] improving and extending the earliest barriers in [AFLG15]. Barriers for rectangular matrix multiplication were obtained in [CGLZ20], which used the quantum functionals with non-uniform .
Our second main result is a correspondence between the quantum functionals and the asymptotic weighted slice ranks. This correspondence allows us to compute the asymptotic weighted slice ranks in terms of the quantum functionals and vice versa.
For every complex tensor and every we have
Conversely, for every complex tensor and every we have
While the quantum functionals are only defined over the complex numbers, the asymptotic weighted slice ranks are defined over any field. We conjecture that via the correspondence in Theorem 7 we can extend the quantum functionals to other fields, as follows.
Define for every the function on tensors over a field via
The following properties follow directly from properties of as we will see:
For every field and every , the function is monotone and normalized.
We make the following conjecture:
For every field and every , the function is additive and multiplicative.
Thus Conjecture 9 (if true), together with Proposition 8, implies that the are universal spectral points over . We provide two pieces of evidence for Conjecture 9, namely that the conjecture is true for (since then equals the quantum functionals for tensors over by Theorem 7) and that the conjecture is true for the subclass of tight tensors over arbitrary fields .
As another consequence of the dual description in Theorem 7 and the super-multiplicativity and super-additivity of the quantum functionals, we will obtain via a general argument the following properties of the asymptotic weighted slice rank.
Over the complex numbers the asymptotic -weighted slice rank is:
In the process of proving the above results we obtain a characterization of an important class of tensors in terms of the asymptotic weighted slice rank. A tensor is defined to be semistable, under the action of the group , if the closure of the orbit does not contain the element 0.
A tensor is semistable under the action of the group if and only if for every .
We also prove a version of Theorem 11 for tensors over arbitrary fields, with the condition that is semistable replaced by the condition that all powers of are semistable.
1.4 Asymptotic non-commutative ranks
For every complex tensor we have that
Non-commutative rank has received much interest in recent years [IQS15, DM18]. In quantum information theory, non-commutative rank corresponds to tripartite-to-bipartite entanglement transformations [CDS10], and formulas for asymptotic non-commutative rank have previously been obtained for special cases [LQWD18, Jen19].
1.5 Min-entropy quantum functionals
Finally, we explore a new variation on the quantum functionals by replacing the Shannon entropy with a smaller notion of entropy called the min-entropy. The min-entropy of a probability vector $p$ is defined as $H_\infty(p) = -\log_2 \max_i p_i$. Thus the min-entropy is the number of bits required to describe the largest coefficient in a probability vector. It is the smallest among all of the Rényi entropies and in particular satisfies $H_\infty(p) \leq H(p)$ for every $p$. For every $\theta$ in the probability simplex, we define the min-entropy quantum functional as
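The relation between the two entropies is easy to check numerically. The sketch below assumes logarithms to base 2, in line with the description in terms of bits, and the usual convention that zero entries contribute nothing to the Shannon entropy; the function names are hypothetical.

```python
import math

def shannon_entropy(p):
    """Shannon entropy in bits; zero entries are skipped (convention 0 * log 0 = 0)."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def min_entropy(p):
    """Min-entropy in bits: minus the base-2 logarithm of the largest entry."""
    return -math.log2(max(p))

p = [0.5, 0.25, 0.25]
h = shannon_entropy(p)
h_min = min_entropy(p)
```

For this vector the min-entropy is 1 bit and the Shannon entropy is 1.5 bits. The two agree exactly on uniform distributions, where every entry equals the largest one.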
It follows directly from that . Expanding the singular value definition of the moment polytope we have
where, as before, denote the ordered singular values of the matrix . In other words, we may write in terms of the Frobenius norm and the spectral norm of matrices as
We prove the following basic properties of the min-entropy quantum functionals.
The min-entropy quantum functionals are monotone, normalized and super-multiplicative.
Next we prove a minimax correspondence to a second family of functions, just like we did for the quantum functionals. We define for the function
For every and every tensor we have
For every and every tensor we have
The function is similar to the G-stable rank that was introduced in [Der20], but subtly different, as our kind of weighing by is different.
For every the function is monotone, normalized and super-multiplicative.
1.6 Organization of the paper
This concludes the introduction. In the remaining sections we discuss the results and technical lemmas in detail and provide the proofs for all claims. In Section 2 we discuss the basic properties of the weighted slice rank. In Section 3 we characterize the asymptotic weighted slice rank for arbitrary complex tensors and for tight tensors over arbitrary fields. In Section 4 we discuss a general correspondence for what we call dual pair families, of which the quantum functionals and asymptotic weighted slice ranks are an example. In Section 5 we discuss the correspondence between the quantum functionals and the asymptotic weighted slice ranks, and we discuss the min-entropy quantum functionals.
2 Weighted slice rank
We introduced in Section 1.1 for the -weighted slice rank of a tensor as the minimum value of such that there is a slice decomposition of of size . In this section we will discuss its most basic properties.
2.1 Alternative characterization
We begin with an alternative characterization of the notion of a slice decomposition that we defined in Definition 2, the straightforward proof of which we leave to the reader.
A tensor has a slice decomposition of size if and only if there are subspaces of dimension such that
In other words, a tensor has a slice decomposition of size if and only if there is a choice of bases for the such that the support of has a block structure of the form
where the missing block has dimensions . Thus a slice decomposition corresponds to a block triangularization for tensors.
As a direct consequence of Lemma 16 we have the following alternative characterization of weighted slice rank:
For every tensor and the -weighted slice rank is the minimum over such that there are subspaces such that and
2.2 Basic properties
There are three basic properties of slice decompositions which we will discuss now, followed by a discussion of basic properties of the weighted slice ranks. Recall that we defined the restriction preorder on tensors by saying that if and only if there are (not necessarily invertible) matrices such that .
Let and be tensors.
If and has a slice decomposition of size , then there are nonnegative integers such that has a slice decomposition of size .
If has a slice decomposition of size and has a slice decomposition of size , then has a slice decomposition of size .
Every tensor has a slice decomposition of size , of size and of size , where .
1 If is an -slice in the slice decomposition of , and , then is either an -slice or zero.
2 This follows from adding the two slice decompositions.
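The third item of the lemma, that every tensor has a slice decomposition using slices of a single kind, can be illustrated by an explicit construction. The sketch below assumes the size in question is $(n_1, 0, 0)$ with $n_1$ the first dimension, and splits a tensor into the parts supported on a single value of the first index; each part has a first flattening with at most one nonzero row, so it is a 1-slice or zero. The function names are hypothetical.

```python
def trivial_slice_decomposition(T):
    """Split T[i][j][k] into len(T) tensors; the s-th keeps only the entries with i == s."""
    n1, n2, n3 = len(T), len(T[0]), len(T[0][0])
    return [[[[T[i][j][k] if i == s else 0 for k in range(n3)]
              for j in range(n2)]
             for i in range(n1)]
            for s in range(n1)]

def add_tensors(tensors):
    """Entrywise sum of a nonempty list of 3-tensors of equal format."""
    first = tensors[0]
    return [[[sum(t[i][j][k] for t in tensors) for k in range(len(first[0][0]))]
             for j in range(len(first[0]))]
            for i in range(len(first))]

T = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
parts = trivial_slice_decomposition(T)
recombined = add_tensors(parts)
```

The analogous constructions along the second and third index give the decompositions of the other two sizes.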
Let and be tensors and let .
If , then .
For every tensor we have where .
If elementwise, then .
If a tensor has slice rank equal to , then for all .
Example 20 (Diagonal tensors and the square matrix multiplication tensor).
Example 21 (Rectangular matrix multiplication tensor).
3 Asymptotic weighted slice rank
In this section we analyze the asymptotic behaviour of the weighted slice rank when we take large powers of a tensor. More precisely, we study the asymptotic weighted slice ranks defined in Definition 4. This section is divided into three parts corresponding to three regimes:
In Section 3.1 we give a characterization of the tensors for which the asymptotic weighted slice rank is maximal. This characterization is in terms of semistable tensors. This part works for tensors over arbitrary fields. An important tool that we introduce in this part is a semistability test based on work of Kempf, which we will use in the other parts of this section.
In Section 3.2 we go deeper than just characterizing maximality and give a description of the value of the asymptotic weighted slice rank in terms of moment polytopes. This part works for tensors over the complex numbers.
Finally, in Section 3.3 we work over an arbitrary field again and give an upper bound on the value of the asymptotic weighted slice rank in terms of the polytope of probability distributions on the support of the tensors, the torus version of the moment polytope. For the subclass of tensors with tight support (still over arbitrary fields) we prove that this upper bound is tight.
the Schur–Weyl decomposition and
the weight decomposition.
The first is precisely described by the moment polytope and the second by its torus version.
3.1 Semistable tensors over arbitrary fields
In this section we characterize when the asymptotic weighted slice rank is maximal. Let be an arbitrary field and let be vector spaces over . For any tensor it holds that where , by the upper bound on the weighted slice rank (Lemma 19, item 2). We will characterize for which tensors this upper bound is an equality in terms of a notion called semistability from the field of geometric invariant theory. This characterization generalizes the connection between semistability and slice rank for cubic tensors (i.e., tensors of format $n \times n \times n$) explored in [BCC17a, BGO18] to tensors of non-cubic format (i.e., tensors of format $n_1 \times n_2 \times n_3$ with the $n_i$’s potentially different).
Semistability can be defined in high generality (for actions of reductive algebraic groups on schemes), but we are only interested in the natural action of the group on the tensor space .
Let $V_1$, $V_2$, $V_3$ be vector spaces over an algebraically closed field $\mathbb{F}$. A tensor $T \in V_1 \otimes V_2 \otimes V_3$ is called semistable if the Zariski closure of the orbit of $T$ under the group $\mathrm{SL}(V_1) \times \mathrm{SL}(V_2) \times \mathrm{SL}(V_3)$ does not contain $0$. (Over $\mathbb{C}$ this coincides with the closure in the Euclidean topology; see, e.g., [Wal17, Lem. 3.1].) Otherwise the tensor is called unstable. When $\mathbb{F}$ is not algebraically closed, we call a tensor semistable (unstable) if it becomes semistable (unstable) after extending the field to the algebraic closure $\overline{\mathbb{F}}$.
The following semistability test, which follows from the results of Kempf [Kem78], and the proof of which we defer to Appendix A, gives a simple but powerful sufficient condition for a tensor to be semistable.
Lemma 23 (Semistability test).
Let $V_1, V_2, V_3$ be irreducible representations of a group $\Gamma$. Let $\Gamma$ act diagonally on the tensor product $V_1 \otimes V_2 \otimes V_3$. For any nonzero tensor $T \in V_1 \otimes V_2 \otimes V_3$, if $T$ is invariant under the action of $\Gamma$, then $T$ is semistable under the action of the group $\mathrm{SL}(V_1) \times \mathrm{SL}(V_2) \times \mathrm{SL}(V_3)$. (We stress that the semistability of tensors is always taken with respect to the action of this group and we will not mention this explicitly anymore. The group $\Gamma$ in Lemma 23 could be any group, as long as its action on the $V_i$ is irreducible.) The condition in the semistability test is sufficient for semistability but not necessary, as the following example shows. It can be shown that the direct sum is semistable using the fact that this tensor is tight and the ideas of Section 3.3. However, by considering the stabilizer it follows that this tensor does not satisfy the condition of the semistability test.
We will use the semistability test in proofs in Section 3.2 and Section 3.3. For now we use the test to give simple explicit examples of semistable tensors, so that the reader has some tensors to work with in the rest of this subsection.
The matrix multiplication tensor , where , is semistable. Indeed, the matrix spaces , , are irreducible representations of the group and the matrix multiplication tensor is invariant under the resulting action of this group, which is sometimes called the sandwiching action. Applying the semistability test (Lemma 23), we find that the matrix multiplication tensor is semistable. (In fact, from the stronger Lemma 63 that we discuss in Appendix A it follows that this tensor is polystable, meaning that its orbit is closed. For the connection to slice rank we need semistability and therefore we will discuss polystability only in Appendix A.)
The diagonal tensor is semistable. An argument using the semistability test is as follows. Let be an integer. Denote by the representation of on given by . If , then this representation is irreducible, since every subspace invariant under is spanned by a subset of the standard basis and the only such subspaces invariant under are and . The diagonal tensor