1. Introduction
Determining the asymptotic algebraic complexity of matrix multiplication is a central open problem in algebraic complexity theory. Recent “barrier results” show various limitations for certain approaches to yield fast matrix multiplication algorithms [AFLG15, BCC17a, BCC17b, AW18a, AW18b]. Constructions of fast matrix multiplication algorithms typically (and this is in particular true for the most successful approaches) consist of two components: an efficient reduction of matrix multiplication to an intermediate problem and an efficient algorithm for the intermediate problem. We introduce a general barrier for such constructions based on a new notion called irreversibility, providing stronger limitations than in previous work. The barrier we introduce in this paper is built on the framework of Strassen developed in [Str87, Str88, Str91, CVZ18].
The asymptotic algebraic complexity of matrix multiplication is succinctly represented by the matrix multiplication exponent $\omega$, which is the infimum over all real numbers $\beta$ such that $n \times n$ matrices can be multiplied with $O(n^\beta)$ algebraic operations. The state of the art is $2 \leq \omega < 2.3728639$, and there has been tremendous effort to obtain better upper bounds on $\omega$ [CW90, Sto10, Wil12, LG14, CU03, CU13].
An intuitive explanation of our barrier is as follows. In the language of tensors, the matrix multiplication exponent is the optimal “rate of transformation” from the “unit tensor” to the “matrix multiplication tensor”,
(1) |
The rate of transformation naturally satisfies a triangle inequality and thus upper bounds on can be obtained by combining the rate of transformation from the unit tensor to some intermediate tensor and the rate of transformation from the intermediate tensor to the matrix multiplication tensor; this is the two-component approach alluded to earlier,
(2) |
We define the irreversibility of the intermediate tensor as the necessary “loss” that will occur when transforming the unit tensor to the intermediate tensor followed by transforming the intermediate tensor back to the unit tensor. It is well known that the transformation rate from the matrix multiplication tensor to the unit tensor is $1/2$, so we can extend (2) to
(3) |
We thus see that $\omega$ is directly related to the irreversibility of the intermediate tensor, and hence the irreversibility of the intermediate tensor provides limitations on the upper bounds on $\omega$ that can be obtained from (2). In particular, any fixed irreversible intermediate tensor cannot show $\omega = 2$ via (2), since the matrix multiplication tensor is reversible when $\omega = 2$.
To exemplify our barrier we show that the support functionals [Str91] and quantum functionals [CVZ18] give (so far, the best) lower bounds on the irreversibility of the following families of tensors:
-
the small Coppersmith–Winograd tensors
-
the big Coppersmith–Winograd tensors
-
the reduced polynomial multiplication tensors
which for small parameters leads to the following explicit barriers:
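Small Coppersmith–Winograd tensors:
parameter | barrier
2 | 2
3 | 2.02..
4 | 2.06..
5 | 2.09..
6 | 2.12..
7 | 2.15..

Big Coppersmith–Winograd tensors:
parameter | barrier
1 | 2.16..
2 | 2.17..
3 | 2.19..
4 | 2.20..
5 | 2.21..
6 | 2.23..

Reduced polynomial multiplication tensors:
parameter | barrier
1 | 2.17..
2 | 2.16..
3 | 2.15..
4 | 2.15..
5 | 2.14..
6 | 2.14..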
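Indeed, as suggested by the values in the above tables, the barriers for the small and the big Coppersmith–Winograd tensors increase with the parameter (converging to 3), whereas the barrier for the reduced polynomial multiplication tensors decreases with the parameter (converging to 2).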
Compared to Ambainis, Filmus and Le Gall [AFLG15] our barriers are valid for a larger class of approaches (and naturally we obtain lower barriers). Compared to Alman and Williams [AW18b] our barriers are valid for a larger class of approaches but our barriers are also higher. As a variation on our barrier we introduce a “monomial” version. Compared to Blasiak, Church, Cohn, Grochow, Naslund, Sawin and Umans [BCC17a], and Blasiak, Church, Cohn, Grochow and Umans [BCC17b] our monomial barriers are valid for a larger class of approaches. We have not tried to optimise the barriers that we obtain here, but focus instead on introducing the barrier itself.
It will become clear to the reader during the development of our ideas that they not only apply to the problem of fast matrix multiplication, but extend to give barriers for the more general problem of constructing fast rectangular matrix multiplication algorithms or even transformations between arbitrary powers of tensors. Such transformations may represent, for example, asymptotic SLOCC (stochastic local operations and classical communication) reductions among multipartite quantum states [BPR00, DVC00, VDDMV02, HHHH09].
2. Irreversibility
We begin by introducing some standard notation and terminology. Then we discuss a useful notion called the relative exponent and we define the irreversibility of a tensor. After that we introduce the monomial versions of these ideas and discuss so-called balanced tensors.
2.1. Standard definitions
We assume familiarity with tensors and with the tensor Kronecker product and direct sum. All our tensors will be 3-tensors over some fixed but arbitrary field . For two tensors and we write and say restricts to if there are linear maps such that . For we define the diagonal tensor (also called the rank- unit tensor) . The tensor rank of is defined as (this coincides with the definition that is the smallest size of any decomposition of into a sum of simple tensors) and the subrank of is defined as . The asymptotic rank of is defined as
(4) |
and the asymptotic subrank of is defined as
(5) |
The above limits exist and equal the respective infimum and supremum by Fekete’s lemma. For the matrix multiplication tensor is defined as
(6) |
The matrix multiplication exponent is defined as . The meaning of in terms of algorithms is: for any there is an algorithm that for any multiplies two matrices using scalar additions and multiplications. The difficulty of determining the asymptotic rank of is to be contrasted with the situation for the asymptotic subrank; to put it in Strassen’s words: Unlike the cynic, who according to Oscar Wilde knows the price of everything and the value of nothing, we can determine the asymptotic value of precisely [Str88],
(7) |
2.2. Relative exponent
For a clean exposition of our barrier we will use the notion of relative exponent, which we will define in this section. This notion is inspired by the notion of rate from information theory and alternatively can be seen as a versatile version of the notion of the asymptotic preorder for tensors of Strassen. In the context of tensors, the relative exponent previously appeared in [VC15].
Assumption 1.
To avoid irrelevant technicalities, we will from now on, without further mentioning, only consider tensors that are not of the form .
Definition 2.
For two tensors and we define the relative exponent from to as
(8) | ||||
(9) |
The limit is a supremum by Fekete’s lemma. Let us briefly relate the relative exponent to the basic notions and results stated earlier. The reader verifies directly that the identities
(10) | |||
(11) |
hold. By definition of the matrix multiplication exponent $\omega$, we have
(12) |
We know from (7) that
(13) |
The relative exponent has the following two basic properties that the reader verifies directly.
Proposition 3.
Let , and be tensors.
-
.
-
(triangle inequality).
2.3. Irreversibility
Our barrier framework relies crucially on the irreversibility of a tensor, a new notion that we define now.
Definition 4.
We define the irreversibility of a tensor as the product of the relative exponent from to and the relative exponent from to , i.e.
(14) |
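Thus the irreversibility $\mathrm{i}(T)$ measures the extent to which the asymptotic conversion from the unit tensor to $T$ is irreversible, explaining the name. Equivalently, the irreversibility is the ratio of the logarithms of the asymptotic rank $\tilde{R}(T)$ and the asymptotic subrank $\tilde{Q}(T)$, i.e.
(15) $\mathrm{i}(T) = \frac{\log \tilde{R}(T)}{\log \tilde{Q}(T)}$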
The inequality $\mathrm{i}(T) \geq 1$ follows directly from the basic properties of the relative exponent (Proposition 3).
Proposition 5.
For any tensor $T$ we have
(16) $\mathrm{i}(T) \geq 1$
Definition 6.
Let $T$ be a tensor.
-
If $\mathrm{i}(T) = 1$, then we say that $T$ is reversible.
-
If $\mathrm{i}(T) > 1$, then we say that $T$ is irreversible.
For example, for any the diagonal tensor is reversible. In fact, we do not know of any other reversible tensors.
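The reversibility of the diagonal tensor can be checked directly from the characterisation of irreversibility as a ratio of logarithms. The following worked computation is a sketch under assumed notation: $R$ and $Q$ for rank and subrank, $\tilde{R}$ and $\tilde{Q}$ for their asymptotic versions, $\mathrm{i}$ for irreversibility, and $n \geq 2$ so that the logarithms are nonzero.

```latex
\begin{align*}
\langle n\rangle^{\otimes k} &\cong \langle n^k\rangle
  \quad\Longrightarrow\quad
  R\bigl(\langle n\rangle^{\otimes k}\bigr) = Q\bigl(\langle n\rangle^{\otimes k}\bigr) = n^k, \\
\tilde{R}(\langle n\rangle) &= \lim_{k\to\infty} \bigl(n^k\bigr)^{1/k} = n,
  \qquad
  \tilde{Q}(\langle n\rangle) = \lim_{k\to\infty} \bigl(n^k\bigr)^{1/k} = n, \\
\mathrm{i}(\langle n\rangle) &= \frac{\log \tilde{R}(\langle n\rangle)}{\log \tilde{Q}(\langle n\rangle)}
  = \frac{\log n}{\log n} = 1.
\end{align*}
```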
2.4. Monomial relative exponent and monomial irreversibility
The following restrained version of relative exponent and irreversibility will be relevant. For two tensors and we write and say monomially restricts to if there are linear maps , the corresponding matrices of which are generalised sub-permutation matrices in the standard basis, such that [Str87, Section 6]. Replacing the preorder by in Section 2 gives the notions of monomial subrank , monomial asymptotic subrank and monomial relative exponent . (For simplicity we will use monomial restriction here, but our results will also hold with replaced by monomial degeneration defined in [Str87, Section 6].) Note that the notions and only depend on the support of the tensor, and not on the particular values of the nonzero coefficients. We define the monomial irreversibility of as the product of the (normal) relative exponent from to and the monomial relative exponent from to ,
(17) |
Equivalently, we have
(18) |
(This notion may depend on the tensor and not only on the support.)
Proposition 7.
Let , and be tensors.
-
.
-
(triangle inequality).
-
.
-
.
Definition 8.
Let be a tensor.
-
If , then we say that is monomially reversible.
-
If , then we say that is monomially irreversible.
There exist tensors that are reversible and monomially irreversible. For example, let be the structure tensor of the algebra in the natural basis,
(19) |
Then we have , and (this is proven in [EG17, Tao16], see also [CVZ18] for the connection to [Str91]), so that and
With regard to matrix multiplication, the standard construction for (13) in fact shows that
(20) |
2.5. Balanced tensors
We finish this section with a general comment on upper bounds on irreversibility. A tensor with is called balanced if the corresponding maps , and (called flattenings) are full-rank and for each there is an element such that has full-rank [Str88, page 121]. For any tensor space with cubic format over an algebraically closed field , being balanced is a generic condition, i.e. almost all elements in such a space are balanced. Balanced tensors are called 1-generic tensors in [LM17]. Let be balanced. Then [Str88, Proposition 3.6]
(21) | |||
(22) |
and so
(23) |
If moreover , then
(24) |
3. Barriers through irreversibility
With the new notion of irreversibility available, we present a barrier for approaches to upper bound via an intermediate tensor .
3.1. The irreversibility barrier
For any tensor the inequality
(25) |
holds by the triangle inequality. Any such approach to upper bound respects the following barrier in terms of the irreversibility of .
Theorem 9.
For any tensor $T$ we have
(26) |
Proof.
By the triangle inequality (Proposition 3),
(27) |
Therefore, using the fact from (13), we have
(28) |
This proves the claim. ∎
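The argument above can be written out as a single chain of inequalities. The notation in this sketch is an assumption: $\omega(S,T)$ denotes the relative exponent from $S$ to $T$, $\langle 2\rangle$ the rank-2 unit tensor, $\langle 2,2,2\rangle$ the $2 \times 2$ matrix multiplication tensor, and $\mathrm{i}(T)$ the irreversibility of $T$. The value $1/2$ is the relative exponent from the matrix multiplication tensor to the unit tensor, as in (13).

```latex
\begin{align*}
\omega(\langle 2\rangle, T)\,\omega(T, \langle 2,2,2\rangle)\,\omega(\langle 2,2,2\rangle, \langle 2\rangle)
  &\geq \omega(\langle 2\rangle, T)\,\omega(T, \langle 2\rangle) = \mathrm{i}(T)
  && \text{(triangle inequality)}, \\
\omega(\langle 2,2,2\rangle, \langle 2\rangle) &= \tfrac{1}{2}
  && \text{(from (13))}, \\
\omega(\langle 2\rangle, T)\,\omega(T, \langle 2,2,2\rangle) &\geq 2\,\mathrm{i}(T).
\end{align*}
```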
Theorem 9, in particular, implies that if , then , i.e. one cannot prove via any fixed irreversible intermediate tensor. (Of course one can consider sequences of intermediate tensors with irreversibility converging to 1.)
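To make the barrier concrete, here is a minimal Python sketch that turns bounds on the asymptotic rank and asymptotic subrank of an intermediate tensor into a barrier value. It assumes that Theorem 9 has the form "upper bound obtained via the tensor is at least twice its irreversibility", which is consistent with the remark that an irreversible tensor cannot yield exponent 2. The function names `irreversibility_lower_bound` and `omega_barrier` are hypothetical helpers introduced only for illustration.

```python
import math


def irreversibility_lower_bound(asymptotic_rank_lower: float,
                                asymptotic_subrank_upper: float) -> float:
    """Lower bound on irreversibility from a lower bound on the asymptotic
    rank and an upper bound on the asymptotic subrank.

    Irreversibility is the ratio log(asymptotic rank) / log(asymptotic subrank),
    and it is always at least 1 (Proposition 5), so the result is clamped at 1.
    """
    if asymptotic_rank_lower < 1.0:
        raise ValueError("a rank bound must be at least 1")
    if asymptotic_subrank_upper <= 1.0:
        raise ValueError("the subrank bound must exceed 1 to take a ratio of logs")
    ratio = math.log(asymptotic_rank_lower) / math.log(asymptotic_subrank_upper)
    return max(1.0, ratio)


def omega_barrier(asymptotic_rank_lower: float,
                  asymptotic_subrank_upper: float) -> float:
    """Assumed form of the barrier: no bound on the exponent obtained via this
    intermediate tensor can be below twice its irreversibility."""
    return 2.0 * irreversibility_lower_bound(asymptotic_rank_lower,
                                             asymptotic_subrank_upper)
```

For a tensor whose asymptotic rank and asymptotic subrank coincide, such as a diagonal tensor, `omega_barrier` returns 2, so the barrier is vacuous, as expected for a reversible tensor.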
3.2. Better barriers through more structure
Naturally, we should expect that imposing more structure on the approach to upper bound leads to stronger barriers. In this section we impose that the final step of the approach is an application of the Schönhage -theorem. The Schönhage -theorem implies that for any and any tensor holds that
(29) |
We prove the following barrier in terms of and the irreversibility of .
Theorem 10.
For any tensor and holds
(30) |
Proof.
By the triangle inequality,
(31) |
Therefore,
(32) |
Subtracting , dividing by and using that (Proposition 5) gives the barrier
(33) |
This proves the claim. ∎
As a corollary of the above theorem we present a barrier on any approach of the following form. The Schönhage -theorem implies that for any and any tensor holds
(34) |
We prove the following barrier in terms of , and the irreversibility of the cyclically symmetrized .
Corollary 11.
For any tensor and and holds
(35) | ||||
(36) |
One verifies that . If is cyclically symmetric, then and we have the equality .
Proof.
One verifies directly that and
(37) |
Note that we are using rational powers here, which is justified by taking large enough powers of the relevant tensors. Using both inequalities and then applying Theorem 10 gives
(38) | ||||
(39) |
This proves the corollary. ∎
Remark 12.
For cyclically symmetric tensors our Corollary 11 implies the lower bound
(40) |
on the parameter (and the “universal” version ) studied in [AW18b], which is a significant improvement over the barrier
(41) |
proven in [AW18b, Theorem IV.1].
3.3. Better barriers through monomial irreversibility
Finally, we impose as an extra constraint that the transformation from the intermediate tensor to the matrix multiplication tensor happens via monomial restriction (Section 2.4), i.e. we consider the approach
(42) |
and the more structured approaches
(43) |
and
(44) |
The proofs in the previous sections can be directly adapted to prove:
Theorem 13.
For any tensor $T$ we have
(45) |
Theorem 14.
For any tensor and holds
(46) |
Corollary 15.
For any tensor and and holds
(47) | ||||
(48) |
4. Explicit irreversibility lower bounds
We have seen how barriers arise from lower bounds on irreversibility. In this section we compute lower bounds on the irreversibility of two well-known intermediate tensors that play a crucial role in the best upper bounds on : the small and big Coppersmith–Winograd tensors.
4.1. Irreversibility and the asymptotic spectrum of tensors
We begin with a general discussion of how to compute irreversibility. The asymptotic spectrum of tensors is the set of -monotone semiring homomorphisms from the semiring of tensors (with tensor product and direct sum as multiplication and addition) to the nonnegative reals,
(49) |
Strassen proves in [Str88] that and and he also proves (implicitly) that . From this we directly obtain:
Proposition 16.
Let be a tensor. Then
(50) |
In an ideal world we would know and use it to compute (or better, we would use it to compute ). In practice we currently only have partial knowledge of . This partial knowledge is easiest to describe in terms of the best known lower bounds on and the best known upper bounds on . The best known lower bounds on are simply the matrix ranks of each of the three flattenings of as described in Section 2.5. For arbitrary fields, the best general upper bounds on that we are aware of are the Strassen upper support functionals from [Str91], which we will define and use in the next section. They relate asymptotically to slice rank via [CVZ18]
(51) |
We are not aware of any example for which any of the inequalities in (51) is strict. For oblique tensors the right inequality is an equality [CVZ18] and for tight tensors both inequalities are equalities [Str91]. We thus have:
Proposition 17.
Let be a tensor. Then
(52) |
For complex tensors we have a deeper understanding of the theory of upper bounds on the asymptotic subrank, via the quantum functionals introduced in [CVZ18]. The quantum functionals satisfy and their minimum equals the asymptotic slice rank [CVZ18], i.e.
(53) |
For free tensors the right inequality in (53) is an equality [CVZ18]. We thus have:
Proposition 18.
If is complex, then
(54) |
4.2. Irreversibility of Coppersmith–Winograd tensors
We now compute lower bounds for the irreversibility of the Coppersmith–Winograd tensors. As mentioned, we will use the support functionals of Strassen [Str91] in our computation to upper bound the asymptotic subrank. For any with the upper support functional is defined as
(55) | |||
(56) |
where the minimum is over all tensors $S$ isomorphic to $T$, the maximum is over all probability distributions $P$ on the support of $S$ in the standard basis, and $H(P_i)$ denotes the Shannon entropy of the $i$th marginal of $P$. Strassen proves in [Str91] that $\tilde{Q}(T) \leq \zeta^\theta(T)$. (Besides the Strassen support functionals, upper bounds on the asymptotic subrank of complex tensors may be obtained from the quantum functionals. For the tensors in Theorem 19 and Theorem 21, however, this will give the same bound, since these tensors are free [CVZ18, Section 4.3].)
Theorem 19 (Small Coppersmith–Winograd tensors [CW90, Section 6]).
For the small Coppersmith–Winograd tensor
(57) |
the lower bound
(58) |
holds.
Proof.
The rank of each flattening of equals . Therefore, . To upper bound the asymptotic subrank one can upper bound the Strassen upper support functional with as in [CVZ18, Example 4.22] by
(59) |
We find that
(60) |
This proves the theorem. ∎
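The two ingredients of this proof (flattening rank as a lower bound on the asymptotic rank, and the support functional with uniform weights as an upper bound on the asymptotic subrank) can be evaluated numerically. The following Python sketch makes several assumptions that are not spelled out in the text above: the parameter is called `q`, the tensor has format $(q+1) \times (q+1) \times (q+1)$ with support $\{(0,i,i), (i,0,i), (i,i,0) : 1 \leq i \leq q\}$ in the standard basis, and the resulting barrier is twice the irreversibility lower bound. The function name `small_cw_barrier` is hypothetical.

```python
import math


def small_cw_barrier(q: int) -> float:
    """Barrier 2 * log2(q + 1) / H for the small Coppersmith-Winograd tensor.

    H is the entropy (in bits) of one marginal of the uniform distribution on
    the assumed support {(0,i,i), (i,0,i), (i,i,0) : 1 <= i <= q}. The support
    is symmetric and entropy is concave, so the uniform distribution maximises
    the average marginal entropy, and 2**H upper-bounds the asymptotic subrank.
    """
    if q < 1:
        raise ValueError("q must be a positive integer")
    # Each flattening has rank q + 1, which lower-bounds the asymptotic rank.
    log_rank_lower = math.log2(q + 1)
    # Marginal of the uniform distribution: mass 1/3 on index 0 and
    # mass 2/(3q) on each of the indices 1..q.
    p_zero = 1.0 / 3.0
    p_other = 2.0 / (3.0 * q)
    marginal_entropy = -(p_zero * math.log2(p_zero)
                         + q * p_other * math.log2(p_other))
    return 2.0 * log_rank_lower / marginal_entropy


if __name__ == "__main__":
    for q in range(2, 8):
        print(q, small_cw_barrier(q))
```

Under these assumptions the values for parameters 2 to 7 agree, after truncation to two decimals, with the table for the small Coppersmith–Winograd tensors in Section 1.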
Remark 20.
If , then the right-hand side of (58) is at least See the table in Section 1 for more values. If , however, then the right-hand side of (58) equals 2. Theorem 19 thus does not rule out using to prove that . Indeed, as observed in [CW90, Section 11], if , then .
Theorem 21 (Big Coppersmith–Winograd tensors [CW90, Section 7]).
For the big Coppersmith–Winograd tensor
(63) |
the lower bound
(64) |
holds, where
(65) |
Proof.
The rank of each flattening of equals , which coincides with the well-known border rank upper bound . Therefore, .
To upper bound the asymptotic subrank we use the Strassen upper support functional with . In the standard basis, the support of is the set
(66) |
The symmetry implies that we can assign probability to each of , and , and to , and . This leads to an average marginal entropy of as defined in the theorem statement. The maximum of is attained at
(67) |
This proves the theorem. ∎
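The one-parameter maximisation in this proof can also be carried out numerically. The Python sketch below rests on assumptions that are not stated explicitly above: the parameter is called `q`, the support in the standard basis is $\{(0,i,i), (i,0,i), (i,i,0) : 1 \leq i \leq q\} \cup \{(0,0,q+1), (0,q+1,0), (q+1,0,0)\}$, each flattening has rank $q+2$, and a symmetric distribution places total mass `a` on the first group of support points and `1 - a` on the last three. The maximum over `a` is found by ternary search, which is a choice made here for simplicity and not necessarily how the closed form in the theorem was derived. All function names are hypothetical.

```python
import math


def entropy_bits(probabilities) -> float:
    """Shannon entropy in bits, ignoring zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0.0)


def big_cw_marginal_entropy(q: int, a: float) -> float:
    """Entropy of one marginal when total mass a is spread uniformly over the
    3q points (0,i,i), (i,0,i), (i,i,0) and mass 1 - a over the 3 extra points."""
    p_zero = a / 3.0 + 2.0 * (1.0 - a) / 3.0   # index 0
    p_middle = 2.0 * a / (3.0 * q)             # each index 1..q
    p_last = (1.0 - a) / 3.0                   # index q + 1
    return entropy_bits([p_zero] + [p_middle] * q + [p_last])


def big_cw_barrier(q: int, iterations: int = 200) -> float:
    """Barrier 2 * log2(q + 2) / max_a H(a) for the big Coppersmith-Winograd
    tensor, with the maximum located by ternary search on [0, 1]."""
    if q < 1:
        raise ValueError("q must be a positive integer")
    lo, hi = 0.0, 1.0
    # The marginal entropy is concave in a, so ternary search finds its maximum.
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if big_cw_marginal_entropy(q, m1) < big_cw_marginal_entropy(q, m2):
            lo = m1
        else:
            hi = m2
    max_entropy = big_cw_marginal_entropy(q, (lo + hi) / 2.0)
    return 2.0 * math.log2(q + 2) / max_entropy


if __name__ == "__main__":
    for q in range(1, 7):
        print(q, big_cw_barrier(q))
```

Under these assumptions the values for small parameters agree, after truncation to two decimals, with the first entries of the table for the big Coppersmith–Winograd tensors in Section 1.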
Remark 22.
Remark 23.
A lower bound on the irreversibility of the reduced polynomial multiplication tensors mentioned in the introduction follows directly from the results in [Str91, Theorem 6.7].
4.3. Monomial irreversibility of structure tensors of finite group algebras
We now discuss irreversibility and monomial irreversibility in the context of the group-theoretic approach developed in [CU03]. This approach produces upper bounds on via intermediate tensors that are structure tensors of complex group algebras of finite groups. Let denote the structure tensor of the complex group algebra of the finite group , in the standard basis. The group-theoretic approach (in particular [CU03, Theorem 4.1]) produces an inequality of the form
(68) |
which ultimately leads to the bound
(69) |
where and are the monomial restriction and monomial relative exponent defined in Section 2.4.
Now the monomial irreversibility barrier from Section 3.3 comes into play. Upper bounds on the monomial asymptotic subrank of have (using different terminology) been obtained in [BCC17a, BCC17b, Saw17]. Those upper bounds imply that is monomially irreversible for every nontrivial finite group . Together with our results in Section 3.3 and the fact that the tensor is symmetric up to a permutation of the basis of one of the tensor legs, this directly leads to nontrivial barriers for the left-hand side of (69) for any fixed nontrivial group , thus putting the work of [BCC17a, BCC17b, Saw17] in a broader context. We have not tried to numerically optimise the monomial irreversibility barriers for group algebras.
Finally we mention that the irreversibility barrier (rather than the monomial irreversibility barrier) does not rule out obtaining via . Namely, is isomorphic to a direct sum of matrix multiplication tensors, and, therefore, . Thus, if , then is reversible,
Acknowledgements
MC acknowledges financial support from the European Research Council (ERC Grant Agreement No. 337603) and VILLUM FONDEN via the QMATH Centre of Excellence (Grant No. 10059). PV acknowledges support from the Hungarian National Research, Development and Innovation Office (NKFIH) grant no. K124152 and the Hungarian Academy of Sciences Lendület-Momentum grant for Quantum Information Theory, no. 96 141. This research was supported by the National Research, Development and Innovation Fund of Hungary within the Quantum Technology National Excellence Program (Project Nr. 2017-1.2.1-NKP-2017-00001). This material is based upon work supported by the National Science Foundation under Grant No. DMS-1638352.
References
- [AFLG15] Andris Ambainis, Yuval Filmus, and François Le Gall. Fast matrix multiplication: limitations of the Coppersmith-Winograd method (extended abstract). In STOC’15—Proceedings of the 2015 ACM Symposium on Theory of Computing, pages 585–593. ACM, New York, 2015.
- [AW18a] Josh Alman and Virginia Vassilevska Williams. Further Limitations of the Known Approaches for Matrix Multiplication. In Anna R. Karlin, editor, 9th Innovations in Theoretical Computer Science Conference (ITCS 2018), volume 94 of Leibniz International Proceedings in Informatics (LIPIcs), pages 25:1–25:15, Dagstuhl, Germany, 2018. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik.
- [AW18b] Josh Alman and Virginia Vassilevska Williams. Limits on All Known (and Some Unknown) Approaches to Matrix Multiplication. In 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS), pages 580–591, Oct 2018. arXiv:1810.08671.
- [BCC17a] Jonah Blasiak, Thomas Church, Henry Cohn, Joshua A. Grochow, Eric Naslund, William F. Sawin, and Chris Umans. On cap sets and the group-theoretic approach to matrix multiplication. Discrete Anal., 2017. arXiv:1605.06702.
- [BCC17b] Jonah Blasiak, Thomas Church, Henry Cohn, Joshua A Grochow, and Chris Umans. Which groups are amenable to proving exponent two for matrix multiplication? arXiv, 2017. arXiv:1712.02302.
- [BPR00] Charles H. Bennett, Sandu Popescu, Daniel Rohrlich, John A. Smolin, and Ashish V. Thapliyal. Exact and asymptotic measures of multipartite pure-state entanglement. Phys. Rev. A, 63(1):012307, 2000.
- [CU03] Henry Cohn and Christopher Umans. A group-theoretic approach to fast matrix multiplication. In Foundations of Computer Science, 2003. Proceedings. 44th Annual IEEE Symposium on, pages 438–449. IEEE, 2003.
- [CU13] Henry Cohn and Christopher Umans. Fast matrix multiplication using coherent configurations. In Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1074–1086. SIAM, 2013.
- [CVZ18] Matthias Christandl, Péter Vrana, and Jeroen Zuiddam. Universal points in the asymptotic spectrum of tensors. In Proceedings of 50th Annual ACM SIGACT Symposium on the Theory of Computing (STOC’18). ACM, New York, 2018. arXiv:1709.07851.
- [CW90] Don Coppersmith and Shmuel Winograd. Matrix multiplication via arithmetic progressions. J. Symbolic Comput., 9(3):251–280, 1990.
- [DVC00] Wolfgang Dür, Guifré Vidal, and Juan Ignacio Cirac. Three qubits can be entangled in two inequivalent ways. Phys. Rev. A (3), 62(6):062314, 12, 2000.
- [EG17] Jordan S. Ellenberg and Dion Gijswijt. On large subsets of $\mathbb{F}_q^n$ with no three-term arithmetic progression. Ann. of Math. (2), 185(1):339–343, 2017.
- [HHHH09] Ryszard Horodecki, Paweł Horodecki, Michał Horodecki, and Karol Horodecki. Quantum entanglement. Rev. Modern Phys., 81(2):865–942, 2009.
- [LG14] François Le Gall. Powers of tensors and fast matrix multiplication. In ISSAC 2014—Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation, pages 296–303. ACM, New York, 2014.
- [LM17] J.M. Landsberg and Mateusz Michałek. Abelian tensors. Journal de Mathématiques Pures et Appliquées, 108(3):333 – 371, 2017. arXiv:1504.03732.
- [Saw17] Will Sawin. Bounds for matchings in nonabelian groups. arXiv, 2017. arXiv:1702.00905.
- [Sto10] Andrew James Stothers. On the complexity of matrix multiplication. PhD thesis, University of Edinburgh, 2010. http://hdl.handle.net/1842/4734.
- [Str87] Volker Strassen. Relative bilinear complexity and matrix multiplication. J. Reine Angew. Math., 375/376:406–443, 1987.
- [Str88] Volker Strassen. The asymptotic spectrum of tensors. J. Reine Angew. Math., 384:102–152, 1988.
- [Str91] Volker Strassen. Degeneration and complexity of bilinear maps: some asymptotic spectra. J. Reine Angew. Math., 413:127–180, 1991.
- [Tao16] Terence Tao. A symmetric formulation of the Croot–Lev–Pach–Ellenberg–Gijswijt capset bound. https://terrytao.wordpress.com, 2016.
- [VC15] Péter Vrana and Matthias Christandl. Asymptotic entanglement transformation between W and GHZ states. J. Math. Phys., 56(2):022204, 12, 2015.
- [VDDMV02] F. Verstraete, J. Dehaene, B. De Moor, and H. Verschelde. Four qubits can be entangled in nine different ways. Phys. Rev. A (3), 65(5, part A):052112, 5, 2002.
- [Wil12] Virginia Vassilevska Williams. Multiplying matrices faster than Coppersmith-Winograd. Extended abstract. In STOC’12—Proceedings of the 2012 ACM Symposium on Theory of Computing, pages 887–898. ACM, New York, 2012.