# An Optimal Inverse Theorem

We prove that the partition rank and the analytic rank of tensors are equivalent up to a constant, over any large enough finite field (independently of the number of variables). The proof constructs rational maps computing a partition rank decomposition for successive derivatives of the tensor, on a carefully chosen subset of the kernel variety associated with the tensor. Proving the equivalence between these two quantities is the main question in the "bias implies low rank" line of work in higher-order Fourier analysis, and was reiterated by multiple authors.


## 1 Introduction

The interplay between the structure and randomness of polynomials is a recurrent theme in combinatorics, analysis, computer science, and other fields. A basic question in this line of research asks: if a polynomial is biased—in the sense that its output distribution deviates significantly from uniform—must it be the case that it is structured, in the sense that it is a function of a small number of lower-degree polynomials? Quantifying the trade-off between “biased” and “structured” for polynomials has been the topic of many works, and has applications to central questions in higher-order Fourier analysis, additive combinatorics, effective algebraic geometry, number theory, coding theory, and more.

Let us state the question formally. As is often done, we will only consider polynomials that are multilinear forms; it is known that the original question for degree-k polynomials reduces to the case of k-linear forms, by symmetrization, with little loss in the parameters, over any field of characteristic 0 or greater than k (see, e.g., [6, 9]). A k-linear form, or a k-tensor, over a field F is a function T: V_1 × ⋯ × V_k → F, with V_1, …, V_k finite-dimensional linear spaces over F, that is separately linear in each of its k arguments; equivalently, T is a degree-k homogeneous polynomial of the form T(x_1, …, x_k) = ∑_{i_1,…,i_k} T_{i_1,…,i_k} x_{1,i_1} ⋯ x_{k,i_k} with T_{i_1,…,i_k} ∈ F. One may also identify a k-tensor with the k-dimensional matrix (T_{i_1,…,i_k}), or with an element of the tensor space V_1 ⊗ ⋯ ⊗ V_k. Structure and randomness for tensors are defined using the following two notions:

• The partition rank PR(T) of T over a field F is the smallest r such that T is a sum of r tensors of the form T′ ⊗ T″, where T′ and T″ are tensors over F in complementary, nonempty subsets of the variables (i.e., T′: ∏_{i∈S} V_i → F and T″: ∏_{i∈S̄} V_i → F for some partition [k] = S ⊔ S̄ with S, S̄ ≠ ∅).

• The analytic rank AR(T) of T over a finite field F is

  AR(T) := −log_{|F|} E_{x∈V_1×⋯×V_k} χ(T(x)),

where χ: F → ℂ is any nontrivial additive character (the definition is independent of the choice of χ).

Partition rank was introduced by Naslund [18] (a similar notion for polynomials was considered earlier by Schmidt [19], and more recently by Ananyan and Hochster [2]). Analytic rank was introduced by Gowers and Wolf [6] in the context of higher-order Fourier analysis. It is known that AR(T) ≤ PR(T) (see Kazhdan and Ziegler [10] and Lovett [16]). The structure-vs-randomness question is the “inverse” problem, asking to bound PR(T) from above in terms of AR(T).
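As a concrete sanity check on the definitions, here is a brute-force computation of the analytic rank of a 2-tensor, i.e. a matrix, over a small prime field (an illustrative sketch of ours, not part of the paper; the helper name analytic_rank is hypothetical). For matrices both the partition rank and the analytic rank coincide with the usual rank, consistent with AR(T) ≤ PR(T):

```python
import cmath
import itertools
import math

def analytic_rank(A, q):
    """Brute-force analytic rank of the bilinear form T(x, y) = x^T A y over F_q,
    using the additive character chi(t) = exp(2*pi*i*t/q) (q prime)."""
    m, n = len(A), len(A[0])
    total = 0j
    for x in itertools.product(range(q), repeat=m):
        for y in itertools.product(range(q), repeat=n):
            t = sum(x[i] * A[i][j] * y[j] for i in range(m) for j in range(n)) % q
            total += cmath.exp(2j * cmath.pi * t / q)
    bias = abs(total) / q ** (m + n)   # |E chi(T(x, y))|
    return -math.log(bias, q)

# The 2x2 identity matrix over F_3 has rank 2, and indeed its analytic rank is 2.
ar = analytic_rank([[1, 0], [0, 1]], 3)
```

Here the bias comes out to exactly 1/9 = 3^(−2), since the inner sum over y vanishes unless x = 0.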

Following Green and Tao [7], Bhowmick and Lovett [3] proved that PR(T) can be bounded in terms of AR(T) alone, with a bound of Ackermann-type growth. Much better bounds were obtained for the specific cases k = 3 and k = 4 by Haramaty and Shpilka [8]; Lampert [14] improved the bound for k = 4 to a polynomial one. Back to k = 3, a recent result of the authors [4] gives a linear bound when the field is large enough; Adiprasito, Kazhdan, and Ziegler [1] independently obtained a similar result using related ideas. However, the problem of obtaining a reasonable bound for general k-tensors remained elusive, until the groundbreaking works of Milićević [17], and of Janzer [9] for small fields, who obtained polynomial bounds: PR(T) ≤ C_k·AR(T)^{D_k} with constants C_k, D_k depending only on k, thus proving a conjecture of Kazhdan and Ziegler [11].

Multiple authors (e.g., [1, 8, 12, 16]) have asked whether, or conjectured that, PR(T) and AR(T) are in fact equal up to a constant: that structure and randomness of polynomials are two sides of the same coin.

###### Conjecture 1.

For every k there is C = C(k) such that the following holds for every finite field F. For every k-tensor T over F,

 PR(T)≤C⋅AR(T).

In this paper we prove the conjecture, over any large enough field.

###### Theorem 1.1.

For every k and r there is q = q(k, r) such that the following holds for every finite field F with |F| ≥ q. For every k-tensor T over F with AR(T) ≤ r,

  PR(T) ≤ (2^{k−1} − 1)·AR(T) + 1.

We note that Theorem 1.1 applies in arbitrary characteristic. Moreover, the bound q(k, r) on the field size is quite effective (for fixed k, no more than an iterated exponential in r; possibly double exponential).

### 1.1 Proof overview

We begin our proof, as mathematicians often do, by moving beyond the given setting in Conjecture 1 of a finite field F, and instead working over the algebraic closure F̄. This enables us to use tools from algebraic geometry. The main drawback of this approach, as one might expect, is that a partition rank decomposition over F̄ (that is, one whose summands are tensors over F̄) need not yield a partition rank decomposition over F, which is what Conjecture 1 calls for.

The analogue of analytic rank over an algebraically closed field is the geometric rank. To define it, we view a k-tensor T (where V_1, …, V_k are now finite-dimensional vector spaces over F̄) as a (k−1)-linear map T: V_1 × ⋯ × V_{k−1} → V_k; this is done by considering the “slices” of T, say along the k-th axis, as (k−1)-linear forms (for k = 2 this is the familiar correspondence between matrices and linear maps). The geometric rank is the codimension of the kernel variety of T,

  GR(T) := codim ker T,  where ker T = {x ∈ V_1 × ⋯ × V_{k−1} ∣ T(x) = 0}.

Geometric rank was introduced in [13], where in particular it was shown to be independent of the choice of axis to slice along; for k = 2 this is the statement that column rank equals row rank. (A similar definition over the complex numbers was used by Schmidt [19] in the context of number theory.) That GR is the analogue of AR over F̄ is suggested by the identity AR(T) = dim(V_1 × ⋯ × V_{k−1}) − log_{|F|}|ker T(F)|, where ker T(F) is the set of F-rational points in ker T. Thus, moving to the algebraic closure recasts Conjecture 1 as asking to bound an algebraic notion of complexity (partition rank) by a geometric notion of complexity (the codimension of the kernel variety).
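For matrices (k = 2), the identity AR(T) = dim − log_{|F|}|ker T(F)| can be checked directly by counting F-rational kernel points (again an illustrative sketch with names of our choosing, not from the paper):

```python
import itertools
import math

def kernel_points(A, q):
    """Count the F_q-rational points of ker A = {x : Ax = 0}, for A over F_q."""
    m, n = len(A), len(A[0])
    return sum(
        1
        for x in itertools.product(range(q), repeat=n)
        if all(sum(A[i][j] * x[j] for j in range(n)) % q == 0 for i in range(m))
    )

q, n = 3, 3
A = [[1, 0, 1], [0, 1, 2]]      # a 2 x 3 matrix of rank 2 over F_3
count = kernel_points(A, q)     # q^(n - rank) = 3 kernel points
ar = n - math.log(count, q)     # AR = dim - log_q |ker(F)| = rank = 2
```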

In order to circumvent the main drawback of moving to F̄ mentioned above, an important idea in our proof is that we find a partition rank decomposition not only for T but also for a whole family of related tensors: the total derivatives of T of all orders, at all points in an open subset of the kernel variety ker T. Specifically, for each derivative we construct a rational map that takes each point from that open subset to some partition rank decomposition. To find such a rational (partition rank) decomposition, we consider the tangent space of the variety at each of these points, parameterize it using a rational map, and take a derivative along the tangent vectors.

We note that the total derivative of a polynomial map is given by the Jacobian matrix, which makes it a polynomial matrix, meaning a matrix whose entries are polynomials; more generally, an order-a total derivative is a polynomial (a+1)-tensor. At the core of our result is a proof that if a polynomial tensor has a rational decomposition on an open subset of a variety X, then its total derivative has a rational decomposition, on an open subset of X, of size bounded in terms of the codimension of X. Since the total derivative of a polynomial tensor is again a polynomial tensor, we are back in the same setting, enabling an iterative process. We apply this process starting with the polynomial matrix Df, where f is a slicing of T, and with the kernel variety ker T. To begin the process, however, we must first obtain a rational decomposition of Df by other means. This turns out to require a rational map computing a rank factorization of a polynomial matrix, which we explicitly construct. We then apply the iterative process so that, by the end of it, we have a rational decomposition, on an open subset of ker T, for the polynomial k-tensor D^{k−1}f. Since the total derivative retains all the partial derivatives throughout the iterations, we can reconstruct T from D^{k−1}f by evaluating the latter at suitable points. Now, one can obtain a partition rank decomposition over F̄ by simply evaluating our rational decomposition at any point from the open subset of ker T on which it is defined. This gives the following bound.¹

¹ As we later learned through [1], Theorem 1.2 improves a previous upper bound of Schmidt [19]. Schmidt’s proof is over the complex numbers, and it is unclear whether it can be used to obtain a partition rank decomposition over finite fields.

###### Theorem 1.2.

For every k-tensor T over any algebraically closed field,

  PR(T) ≤ (2^{k−1} − 1)·GR(T).

Crucially, our proof method is flexible enough to yield a partition rank decomposition over the finite field F itself, as long as we can guarantee that our open subset of ker T has an F-rational point. In the last part of the proof of Theorem 1.1, we first observe that, in order to prove a bound in terms of the analytic rank of T, it suffices to apply our iterative process to any subvariety of ker T that contains all of its F-rational points. We then show, using results from effective algebraic geometry, that one can indeed find such a subvariety that, moreover, can be guaranteed to have at least one F-rational point in the domain of our rational decomposition. Evaluating our rational decomposition at that point yields a small partition rank decomposition over F.

As is evident from the overview above, our proof is self-contained and does not apply results from additive combinatorics, nor any “regularity lemma” for polynomials or notions of quasi-randomness. This makes our proof quite different from arguments previously applied in this line of research.

Finally, we note that, among other applications, Theorem 1.1 gives the best known bounds, over large enough finite fields, for the inverse theorem for Gowers norms in the special case of polynomial phase functions (see, e.g., Green and Tao [7] and Janzer [9]).

### 1.2 Paper organization

In Section 2 we specify the space in which our decompositions live, and relate it to partition rank decompositions. In Section 3 we define the total derivative of rational maps, and prove that a total derivative of a tensor of sufficiently high order contains enough information to reconstruct the original tensor. In Section 4 we give a rational map for a rank factorization of a matrix (used in the base case of our inductive proof) and a rational map parameterizing the kernel of a matrix (used in the induction step). In Section 5 we define tangent spaces, and obtain a rational map parameterizing the tangent space of a variety at non-singular points. In Section 6 we give our inductive proof, which yields a partition rank decomposition assuming a non-singular F-rational point on the kernel variety of the tensor. Finally, in Section 7 we complete the proof of Theorem 1.1 by showing that, over large enough fields, such a non-singular F-rational point can always be found.

### 1.3 Notation

An F-polynomial is a polynomial with coefficients in F (i.e., an element of F[x_1, …, x_n]). A rational map over F is a tuple of rational functions over F (i.e., quotients of F-polynomials). Note that a rational map can be viewed as a function f: F̄ⁿ → F̄ᵐ whose domain is the set of inputs for which all its rational functions are defined. We will use the convention that f can also be evaluated at points x ∈ Fⁿ, in which case f(x) ∈ Fᵐ (when defined).

We will need some basic algebro-geometric terminology. We say that a variety V ⊆ F̄ⁿ is defined over F if it can be cut out by F-polynomials f_1, …, f_m, meaning

  V = {x ∈ F̄ⁿ ∣ f_1(x) = ⋯ = f_m(x) = 0}.

The ideal of V is I(V) = {f ∈ F̄[x_1, …, x_n] ∣ f(x) = 0 for every x ∈ V}. Any variety can be uniquely written as a finite union of irreducible varieties, where a variety is said to be irreducible if it cannot be written as the union of two strictly contained varieties. The dimension of a variety V, denoted dim V, is the maximal length d of a chain of irreducible varieties ∅ ≠ V_0 ⊊ V_1 ⊊ ⋯ ⊊ V_d ⊆ V. The codimension of V is simply codim V = n − dim V. A point x ∈ V is said to be F-rational if x ∈ Fⁿ; we denote by V(F) the set of F-rational points in V.

Crucial to our proof is the simultaneous use of different aspects of a k-tensor. Henceforth, we fix bases for our vector spaces, so that we will freely pass between a vector space V and its dual V*, between matrices in F^{m×n} and linear maps Fⁿ → Fᵐ, and more generally, between k-tensors in V_1 ⊗ ⋯ ⊗ V_k, k-linear forms V_1 × ⋯ × V_k → F, and (k−1)-linear maps V_1 × ⋯ × V_{k−1} → V_k.

## 2 Constructing Space

Let V = V_1 ⊗ ⋯ ⊗ V_k be a space of k-tensors over a field F. The r-constructing space over V is the vector space

  C_r(V) = ⨁_{∅≠S⊆[k−1]} (V_{⊗S} × V_{⊗S̄})^r,

where V_{⊗S} = ⨂_{i∈S} V_i for S ⊆ [k] (we take the tensor product in order; that is, if S = {i_1 < ⋯ < i_t} then V_{⊗S} = V_{i_1} ⊗ ⋯ ⊗ V_{i_t}), and S̄ = [k] ∖ S. Note that the direct summands correspond to the unordered, non-trivial partitions of [k] (i.e., [k] = S ⊔ S̄ with S, S̄ ≠ ∅). For example, the order-2 and order-3 r-constructing spaces are

  C_r(V_1 ⊗ V_2) = (V_1 × V_2)^r,
  C_r(V_1 ⊗ V_2 ⊗ V_3) = (V_1 × (V_2 ⊗ V_3))^r ⊕ (V_2 × (V_1 ⊗ V_3))^r ⊕ ((V_1 ⊗ V_2) × V_3)^r.

We will frequently identify C_r(V) with the isomorphic linear space

  C_r¹(V) × C_r²(V) = (⨁_{∅≠S⊆[k−1]} V_{⊗S})^r × (⨁_{∅≠S⊆[k−1]} V_{⊗S̄})^r.

For example, for k = 3,

  C_r¹(V) × C_r²(V) = (V_1 ⊕ V_2 ⊕ (V_1 ⊗ V_2))^r × ((V_2 ⊗ V_3) ⊕ (V_1 ⊗ V_3) ⊕ V_3)^r.

Let IP be the inner product function,

 IP(u1,…,un,v1,…,vn)=∑i∈[n]ui⊗vi.

Observe that applying IP on the constructing space C_r(V) gives a map into the tensor space V:

  IP: (u^{(i,S)}, v^{(i,S̄)})_{i∈[r], ∅≠S⊆[k−1]} ↦ ∑_{i∈[r]} ∑_{∅≠S⊆[k−1]} u^{(i,S)} ⊗ v^{(i,S̄)} ∈ V,

where u^{(i,S)} ∈ V_{⊗S} and v^{(i,S̄)} ∈ V_{⊗S̄}. Note that, if we identify each vector space V_i with some F^{n_i}, then IP is a polynomial (in fact bilinear) map into V; explicitly,

  IP: (x^{(i,S)}_I, y^{(i,S̄)}_J)_{i∈[r], ∅≠S⊆[k−1], I∈n_S, J∈n_{S̄}} ↦ ( ∑_{i∈[r]} ∑_{∅≠S⊆[k−1]} x^{(i,S)}_{I_S} y^{(i,S̄)}_{I_{S̄}} )_{I∈n_{[k]}}    (2.1)

where n_S = ∏_{i∈S} [n_i] for S ⊆ [k], and I_S denotes the projection of I ∈ n_{[k]} onto the coordinates in S, meaning I_S = (I_i)_{i∈S}.

Since the number of sets ∅ ≠ S ⊆ [k−1] is 2^{k−1} − 1, we have, immediately from the definition, that if a k-tensor lies in the image of IP on the r-constructing space then its partition rank is at most (2^{k−1} − 1)·r.

###### Fact 2.1.

Let T ∈ V = V_1 ⊗ ⋯ ⊗ V_k with V_1, …, V_k vector spaces over F. If T ∈ IP(C_r(V)) then PR(T) ≤ (2^{k−1} − 1)·r.
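For k = 2 there is a single summand (S = {1}), and Fact 2.1 reduces to the familiar statement that a sum of r outer products is a matrix of rank at most r. A quick numerical illustration (our sketch, not from the paper):

```python
import numpy as np

r = 2
rng = np.random.default_rng(0)
# A point of the r-constructing space for k = 2: r pairs (u_i, v_i).
us = rng.integers(-3, 4, size=(r, 4)).astype(float)
vs = rng.integers(-3, 4, size=(r, 5)).astype(float)

# IP maps it to the 2-tensor sum_i u_i (x) v_i, a 4 x 5 matrix.
T = sum(np.outer(u, v) for u, v in zip(us, vs))

# Fact 2.1 for k = 2: PR(T) <= (2^(k-1) - 1) * r = r; for matrices PR = rank.
rank_T = np.linalg.matrix_rank(T)
```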

## 3 Total Derivative

The total derivative of a rational map f = (f_1, …, f_m): F̄ⁿ → F̄ᵐ is the rational map Df: F̄ⁿ → F̄^{m×n} given by

  Df(x) = (∂_j f_i(x))_{i∈[m], j∈[n]}.

That is, the total derivative maps each point to the Jacobian matrix of the rational functions f_1, …, f_m evaluated at that point. In particular, Df is a matrix whose every entry is a rational function. The directional derivative of f at x along v ∈ F̄ⁿ is thus given by

  (Df(x))v = (∂_1 f_i(x)·v_1 + ⋯ + ∂_n f_i(x)·v_n)_{i∈[m]} ∈ F̄ᵐ.

Observe that we may iterate the total derivative operator. For any integer a ≥ 1, the order-a total derivative of a rational map f: F̄ⁿ → F̄ᵐ is the rational map D^a f: F̄ⁿ → F̄^{m×n^a} given by

  D^a f(x) = (∂_{j_1,…,j_a} f_i(x))_{i∈[m], j_1,…,j_a∈[n]}.

In particular, D^a f(x) is a multi-dimensional matrix, or (a+1)-tensor, whose every entry is a rational function of x. We will need the following properties of the total derivative:

• Sum rule: D(f + g) = Df + Dg.

• Product rule: D(f·g) = f·Dg + g·Df (for scalar-valued f and g).

• Quotient rule: D(p/q) = (q·Dp − p·Dq)/q².

• Chain rule: D(f ∘ g)(x) = (Df)(g(x)) Dg(x).

• The rational maps f and Df have the same domain.

Note that if f is a single polynomial of degree d then D^a f is an n^a-tensor whose every component is a polynomial of degree d − a (interpreting a polynomial of negative degree as zero). In particular, D^d f is a constant n^d-tensor.
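The rules above can be checked symbolically; the following sketch (using sympy, with variable names of our choosing) verifies the product rule via Jacobians and checks that the order-3 total derivative of a degree-3 polynomial is constant:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
X = sp.Matrix([x1, x2])

f = x1**2 * x2                   # a single polynomial of degree d = 3 in n = 2 variables
g = x1 + 3 * x2
Df = sp.Matrix([f]).jacobian(X)  # order-1 total derivative, a 1 x 2 row of polynomials

# Product rule: D(f g) = f Dg + g Df.
lhs = sp.Matrix([f * g]).jacobian(X)
rhs = f * sp.Matrix([g]).jacobian(X) + g * Df
product_rule_ok = sp.simplify(lhs - rhs) == sp.zeros(1, 2)

# D^3 f is a constant 2 x 2 x 2 tensor: every third partial derivative is a constant.
third = [sp.diff(f, a, b, c) for a in X for b in X for c in X]
all_constant = all(t.free_symbols == set() for t in third)
```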

Henceforth, a slicing of a k-tensor T is the tuple f = (T_1, …, T_{n_i}) of (k−1)-tensor slices of T along one of the axes i ∈ [k], which we view as a (k−1)-linear map into F^{n_i}. Recall that for any a and x, the order-a total derivative D^a f(x) is an (a+1)-tensor.

###### Claim 3.1.

Let T be a k-tensor over F, and let f be a slicing of T. Then T ⊑ D^{k−1}f(x) for every x, in the sense defined in the proof below.

###### Proof.

If f is the slicing of T along the axis i then, by identifying the k-linear form T with the (k−1)-linear map f, we see that it suffices to prove the following claim (with d = k − 1): For every d-linear map p and every x we have that p ⊑ D^d p(x), by which we mean that p can be obtained from the tensor D^d p(x) by viewing it as a multilinear map and evaluating it at suitable points. Clearly, it suffices to prove, for each of the coordinate d-linear forms p of the d-linear map, that p ⊑ D^d p(x) for every x. Write

  p(x) = p(x_1, …, x_d) = ∑_{J=(j_1,…,j_d)} c_J x_{1,j_1} ⋯ x_{d,j_d}

with c_J ∈ F. The d-tensor D^d p(x) is given by

  D^d p(x) = (∂_{(i_1,j_1),…,(i_d,j_d)} p(x))_{(i_1,j_1),…,(i_d,j_d)},  where the entry equals c_{j_1,…,j_d} if i_1, …, i_d ∈ [d] are distinct, and 0 otherwise.

Therefore, the corresponding d-linear form is given by

  (D^d p(x))(y^{(1)}, …, y^{(d)}) = ∑_{j_1,…,j_d} ∑_{i_1,…,i_d distinct} c_{j_1,…,j_d} y^{(1)}_{i_1,j_1} ⋯ y^{(d)}_{i_d,j_d},

and thus evaluates, at the block vectors below, to

  (D^d p(x))((y_1, 0, …, 0), (0, y_2, 0, …, 0), …, (0, …, 0, y_d)) = p(y).

Therefore, p ⊑ D^d p(x) for every x. As explained above, this completes the proof. ∎
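For d = 2, the computation in the proof says that the (constant) Hessian of a bilinear form p, viewed as a bilinear form in the 2n flattened variables, recovers p when evaluated at the block vectors (y_1, 0) and (0, y_2). A numerical illustration (our sketch):

```python
import numpy as np

# A bilinear form p(x_1, x_2) = x_1^T C x_2 on F^2 x F^2 with coefficient matrix C.
C = np.array([[1.0, 2.0], [3.0, 5.0]])

# Its order-2 total derivative in the 4 variables (x_{1,1}, x_{1,2}, x_{2,1}, x_{2,2})
# is the constant Hessian, the block matrix [[0, C], [C^T, 0]].
H = np.block([[np.zeros((2, 2)), C], [C.T, np.zeros((2, 2))]])

y1 = np.array([1.0, -2.0])
y2 = np.array([3.0, 4.0])

# Evaluate the bilinear form D^2 p at the block vectors (y1, 0) and (0, y2).
u = np.concatenate([y1, np.zeros(2)])
w = np.concatenate([np.zeros(2), y2])
recovered = u @ H @ w          # equals p(y1, y2) = y1^T C y2
expected = y1 @ C @ y2
```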

## 4 Rational Linear Algebra

For a matrix A, we denote by A_{r×s} the submatrix of A consisting of the first r rows and the first s columns. A rank factorization of a matrix A ∈ F^{m×n} of rank r is a pair of matrices B ∈ F^{m×r}, C ∈ F^{r×n} such that A = BC. We next show that rank factorization can be done via a rational map, mapping (an open subset of the) matrices of rank exactly r to a rank-r factorization thereof.

###### Lemma 4.1.

Let r ≤ m, n and let F be any field. There is a rational map Φ: F^{m×n} → F^{m×r} × F^{r×n} such that for any matrix A of rank r with A_{r×r} invertible, Φ(A) = (B, C) is a rank factorization of A. Moreover, C has the identity matrix I_r as a submatrix (its first r columns).

###### Proof.

We first show that if rank(A) = r and A_{r×r} is invertible then we have the matrix identity

  A = A_{m×r} (A_{r×r})^{−1} A_{r×n}.    (4.1)

Since A_{r×r} is invertible, the first r columns of A are linearly independent; since the column rank of A equals r, they are moreover a basis for the column space of A. Therefore, there is a matrix X such that

  A_{m×r} X = A.

Restricting this equality to the first r rows, we have A_{r×r} X = A_{r×n}. Since A_{r×r} is invertible, we obtain

  X = (A_{r×r})^{−1} A_{r×n}.

Combining the above gives the identity (4.1).

Let Φ be the rational map

  Φ(A) = ( A_{m×r},  det(A_{r×r})^{−1} adj(A_{r×r}) A_{r×n} ),

where we recall that the adjugate matrix adj(M) has the property that each of its entries is a polynomial in the entries of M (a cofactor). We have (via Cramer’s rule) that det(A_{r×r})^{−1} adj(A_{r×r}) = (A_{r×r})^{−1}. Thus, by (4.1), Φ(A) is a rank factorization for any rank-r matrix A with A_{r×r} invertible. Finally, for the “moreover” part, note that (A_{r×r})^{−1} A_{r×n} contains as its first r columns the identity matrix (A_{r×r})^{−1} A_{r×r} = I_r. ∎
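Numerically, the factorization of the proof can be sketched as follows (the function name is ours; the matrix inverse stands in for the adjugate-over-determinant formula, which is what makes the map rational):

```python
import numpy as np

def rank_factorization(A, r):
    """Sketch of the rational map of Lemma 4.1: for A of rank r with the
    leading r x r block invertible, return (B, C) with A = B C, where
    C = (A_{r x r})^{-1} A_{r x n} has I_r as its first r columns."""
    B = A[:, :r]                               # A_{m x r}
    C = np.linalg.inv(A[:r, :r]) @ A[:r, :]    # (A_{r x r})^{-1} A_{r x n}
    return B, C

# A 4 x 5 matrix of rank 2 whose leading 2 x 2 block is invertible.
B0 = np.array([[1.0, 0], [0, 1], [1, 1], [2, 1]])
C0 = np.array([[1.0, 0, 1, 2, 0], [0, 1, 1, 0, 3]])
A = B0 @ C0

B, C = rank_factorization(A, 2)
factorization_ok = np.allclose(B @ C, A)
identity_block_ok = np.allclose(C[:, :2], np.eye(2))
```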

Next, we show that projecting onto the kernel of a matrix can be done by a rational map. Recall that a matrix P ∈ F^{n×n} is a projection matrix onto a subspace W ⊆ Fⁿ if P² = P and the column space of P is W.

###### Lemma 4.2.

Let r ≤ n and let F be any field. There is a rational map Ψ: F^{r×n} → F^{n×n} such that for any matrix A ∈ F^{r×n} of rank r with A_{r×r} invertible, Ψ(A) is a projection matrix onto ker A; moreover, in the projection matrix all but the last n − r columns are zero.

###### Proof.

Let Φ be the rank-factorization rational map given by Lemma 4.1. Let A ∈ F^{r×n} be a matrix of rank r with A_{r×r} invertible, and write Φ(A) = (B, C). Then A = BC and C = (I_r ∣ X) for some X ∈ F^{r×(n−r)}. Let

  P = ( 0   −X
        0   I_{n−r} )_{n×n}.    (4.2)

Note that P² = P. Furthermore, for the last n − r columns of P we have that

  C · ( −X
        I_{n−r} ) = −X + X = 0.

It follows that AP = BCP = 0. Thus, the nonzero columns of P lie in ker A, and since they are linearly independent and dim ker A = n − r, we deduce that the column space of P is ker A. This means P is a projection matrix onto ker A. Now, observe that the function that maps any matrix A (provided A is of rank r with A_{r×r} invertible) to the matrix P in (4.2) is given by a rational map Ψ: F^{r×n} → F^{n×n}. This completes the proof. ∎

## 5 Tangent spaces

The tangent space of a variety X ⊆ F̄ⁿ at a point x ∈ X is the linear space

  T_x X = {v ∈ F̄ⁿ ∣ (Dg(x))v = 0 for every g ∈ I(X)}.

Equivalently, if g_1, …, g_s is a generating set for the ideal I(X) then the tangent space to X at x is the kernel ker J(x), where J(x) is the Jacobian matrix

  ( (∂_{x_1} g_1)(x)  ⋯  (∂_{x_n} g_1)(x)
          ⋮            ⋱          ⋮
    (∂_{x_1} g_s)(x)  ⋯  (∂_{x_n} g_s)(x) )_{s×n}.

A point x on an irreducible variety X is non-singular if dim T_x X = dim X. We have the following basic fact about tangent spaces (see, e.g., Theorem 2.3 in [20]).

###### Fact 5.1.

For every irreducible variety X and x ∈ X, dim T_x X ≥ dim X.

Recall that a field F is perfect if it has characteristic 0, or it has characteristic p > 0 and each element has a p-th root. In particular, algebraically closed fields and finite fields are perfect. We will use the following known fact (which can be obtained from Lemma 9.5.19 in [22]).

###### Fact 5.2.

If an irreducible variety X is defined over a perfect field F then I(X) has a generating set of F-polynomials.

We next show that over a perfect field, the tangent spaces—at non-singular points in a nontrivial open subset—can be parameterized by a rational map.

###### Lemma 5.3.

Let X ⊆ F̄ⁿ be an irreducible variety defined over a perfect field F, and x₀ a non-singular point on X. There is a rational map P over F that is defined at x₀, such that for every x ∈ X at which P is defined, P(x) is a projection matrix onto the tangent space T_x X; moreover, only dim X of the columns of each such projection matrix are nonzero.

###### Proof.

Since F is a perfect field, Fact 5.2 implies that the ideal I(X) has a generating set of F-polynomials, g_1, …, g_s. Let J be the corresponding Jacobian matrix, viewed as a polynomial map J: F̄ⁿ → F̄^{s×n}. By definition, ker J(x) = T_x X for every x ∈ X.

Put c = codim X. Since x₀ is a non-singular point on X, we have that rank J(x₀) = n − dim X = c. In particular, the submatrix of J(x₀) with rows I and columns I′ is invertible for some I ⊆ [s] and I′ ⊆ [n] with |I| = |I′| = c. Apply Lemma 4.2 with r = c, together with appropriate row and column permutations, to obtain a rational map Ψ such that for any matrix M, with rank M = c and the (I, I′) submatrix of M invertible, Ψ(M) is a projection matrix onto ker M. By construction, Ψ is defined at J(x₀). Let P be the rational map given by the composition P = Ψ ∘ J. Then P is defined at x₀; moreover, for every x ∈ X where rank J(x) = c and the (I, I′) submatrix of J(x) is invertible, we have that P(x) is a projection matrix onto ker J(x) = T_x X. This completes the proof. ∎

## 6 Partition-Rank Decomposition

Recall that a polynomial matrix is a polynomial map into matrices (equivalently, a matrix whose every entry is a polynomial). For a polynomial matrix A, we say that a point x on a variety X is A-singular if rank A(x) < max_{y∈X} rank A(y). Note that this extends the notion of a singular point, which is obtained by taking A to be the Jacobian of a generating set of I(X). In this section we prove the following bound on the partition rank, over any perfect field.

###### Theorem 6.1.

Let T be a k-tensor over a perfect field F, and let f be a slicing of T. Let X ⊆ ker T be a variety defined over F which has an F-rational point that is neither singular nor Df-singular. Then

  PR(T) ≤ (2^{k−1} − 1)·codim X.

When the field in Theorem 6.1 is algebraically closed, taking X = ker T and a generic point on it immediately gives the bound PR(T) ≤ (2^{k−1} − 1)·codim ker T = (2^{k−1} − 1)·GR(T), thus proving Theorem 1.2.

The proof of Theorem 6.1 starts with a slicing f of T, obtains a rational decomposition for the polynomial matrix Df using the rank-factorization map from Lemma 4.1, then proceeds to iteratively construct a rational decomposition for each higher-order derivative D^a f by parameterizing the tangent spaces to X and taking derivatives along tangent vectors, and finally evaluates the rational decomposition at the guaranteed non-singular F-rational point to obtain a partition rank decomposition over F for T itself.

The following theorem gives the inductive step in the proof of Theorem 6.1. For rational maps f and g we write f ≡ g if f(x) = g(x) for every x for which both f and g are defined.

###### Theorem 6.2.

Let X be an irreducible variety defined over a perfect field F, x₀ a non-singular point on X, and f a polynomial map into a space of tensors V. If there exists a rational map H into a constructing space over V that is defined at x₀ and satisfies f ≡ IP ∘ H, then there exists a rational map H′ into a constructing space over the corresponding space of higher-order tensors that is defined at x₀ and satisfies Df ≡ IP ∘ H′, with the order of the new constructing space bounded in terms of that of H and codim X.

Theorem 6.2 can be illustrated using commutative diagrams (figure omitted).

The proof of Theorem 6.2 proceeds by splitting the derivative into two parts: a main term, which is mapped into the portion of the constructing space that only involves tensors on the tangent spaces of X, and a remainder term, controlled by the codimension of X, which is mapped into the complementary portion of the constructing space.

Before going into the proof, we remind the reader that if G is a rational map into a space of k-tensors, then the total derivative DG(x) is a (k+1)-tensor, while the directional derivative of G at x along any vector v is again a k-tensor, (DG(x))v.

###### Proof.

For a rational map H into the constructing space we henceforth write H = (H₁, H₂) under the identification C_r(V) ≅ C_r¹(V) × C_r²(V). Recall from (2.1) that IP on the constructing space can be viewed as a bilinear map. We claim that for every (x₁, x₂) and (y₁, y₂) in C_r¹(V) × C_r²(V) we have

  (D IP(x₁, x₂))(y₁, y₂) = IP(x₁, y₂) + IP(x₂, y₁).

To see this, note that each component of the map IP is of the form b(x₁, x₂) = ∑_{i∈[ℓ]} x_{1,i} x_{2,i} (where the sum is over a subset of the variables on each side). Using the product rule for the total derivative, we have

  (Db(x₁, x₂))(y₁, y₂) = (x_{2,1}, …, x_{2,ℓ}, x_{1,1}, …, x_{1,ℓ}) · (y_{1,1}, …, y_{1,ℓ}, y_{2,1}, …, y_{2,ℓ})
                       = ∑_{i=1}^{ℓ} x_{2,i} y_{1,i} + ∑_{i=1}^{ℓ} x_{1,i} y_{2,i} = b(x₁, y₂) + b(x₂, y₁),

which proves our claim. Using the chain rule for the total derivative,

  D(IP ∘ H)(x) = (D IP)(H(x)) DH(x) = (D IP)(H₁(x), H₂(x)) (DH₁(x), DH₂(x)) = IP(H₁(x), DH₂(x)) + IP(H₂(x), DH₁(x)).    (6.1)

Put U = X ∩ dom(H). Since H is defined at x₀ ∈ X, the set U is a non-empty open subset of the irreducible variety X, and thus its Zariski closure is U̅ = X. Taking the total derivative at any x ∈ U along a tangent vector v ∈ T_x X, we claim that

  (Df(x))v = (D(IP ∘ H)(x))v.    (6.2)

Indeed, if p/q (with p and q polynomials) is any one of the components of f − IP ∘ H then, since f ≡ IP ∘ H, we have p/q = 0 on U. Thus, p(x) = 0 for every x ∈ U, and so p ∈ I(U̅) = I(X). Since (Dp(x))v = 0 by the definition of a tangent space, we have

  D(p/q)(x) = (1/q(x)²)(q(x)·Dp(x) − p(x)·Dq(x)) = (1/q(x))·Dp(x),

which means (D(p/q)(x))v = (1/q(x))(Dp(x))v = 0, and so