    # Closure of VP under taking factors: a short and simple proof

In this note, we give a short, simple and almost completely self-contained proof of a classical result of Kaltofen [Kal86, Kal87, Kal89] which shows that if an $n$-variate degree-$d$ polynomial $f$ can be computed by an arithmetic circuit of size $s$, then each of its factors can be computed by an arithmetic circuit of size at most $\mathrm{poly}(s, n, d)$. However, unlike Kaltofen's argument, our proof does not directly give an efficient algorithm for computing the circuits for the factors of $f$.


## 1 Introduction

Polynomial factorization is a fundamental problem at the intersection of algebra and computation and has been intensively studied in algebraic complexity theory. Given a multivariate polynomial $f$, the goal is to find the irreducible factors of $f$. The nature of these algorithms as well as their efficiency varies depending on how the input polynomial is given. Two natural representations which are often used in this context are the monomial representation (where the polynomial is given as a sum of its monomials) and the circuit representation (where the polynomial is given as an arithmetic circuit). In this note, we focus on the latter. In this setting, we are given an arithmetic circuit computing a multivariate polynomial, and the goal is to output arithmetic circuits for all its irreducible factors. The problem has been studied in both the whitebox setting (where we have access to the internal wirings of the input circuit) and in the blackbox setting (where we only have query access to the input circuit). In a sequence of extremely influential results in the 1980's [Kal86, Kal87, Kal89, KT90], Kaltofen (and Kaltofen and Trager [KT90]) gave efficient randomized algorithms for this problem. A consequence of these results which has had extremely interesting applications in algebraic complexity theory [KI04] is that if an $n$-variate degree-$d$ polynomial has an arithmetic circuit of size $s$, then each of its factors has an arithmetic circuit of size $\mathrm{poly}(s, n, d)$. In other words, the complexity class $\mathsf{VP}$ of polynomials is closed under taking factors.

In addition to being natural mathematical questions on their own, these closure results for polynomial factorization seem crucial to our current understanding of hardness-randomness tradeoffs in algebraic complexity [KI04, DSY09, CKS18a]. In this note, we give a short, simple and almost completely self-contained proof of the closure of $\mathsf{VP}$ under taking factors. More formally, we give a new (as far as we know) proof of the following result of Kaltofen.

###### Theorem 1.1 (Kaltofen).

Let $f$ be an $n$-variate degree-$d$ polynomial which can be computed by an arithmetic circuit of size $s$. Let $g$ be a polynomial such that $g$ divides $f$. Then, $g$ can be computed by an arithmetic circuit of size at most $\mathrm{poly}(s, n, d)$.

The original proof of Theorem 1.1 relies on some beautiful and neat mathematical ideas like Hensel lifting and the effective Hilbert Irreducibility Theorem, which are useful and interesting on their own. For our proof, we only rely on a simple and natural multivariate version of the classical Newton Iteration technique, and the fact that the resultant of two univariate polynomials tells us exactly when they have a non-trivial greatest common divisor (GCD). We hope that this simpler proof can shed some more insight on this closure result (and hopefully some others, which are yet to be discovered), and is more accessible to readers with a less detailed background in algebra. A cost of this simplicity is that, unlike in the work of Kaltofen, we do not get an algorithm for factoring multivariate polynomials given by arithmetic circuits.

Besides Kaltofen’s original proof, there is a considerably simpler proof due to Bürgisser [Bür04] showing that $\mathsf{VP}$ is closed under taking factors. Bürgisser uses the classical univariate Newton Iteration to obtain a power series approximation of a root of a multivariate polynomial when it is viewed as a univariate in one of the variables. This power series approximation of the root, to sufficiently high accuracy, is then used to obtain an appropriate irreducible factor of the input polynomial. This step requires setting up and solving an appropriate system of linear equations. A variant of this argument is also present in the works of Dvir et al. [DSY09], of Oliveira [Oli16], of Dutta et al. [DSS17] and an earlier work of the authors [CKS18a, CKS18b]. At a high level, these proofs go via an iterative step to approximate a root (or many roots), and a clean up step where a factor is recovered from this approximation.

In our proof, we directly recover the factors at the end of the slightly more complicated iterative step (a multivariate Newton iteration as opposed to a univariate one), and the clean up is essentially trivial.

Our proof follows immediately from the following two lemmas.

###### Lemma 1.2.

Let $f$ be an $n$-variate degree-$d$ polynomial which can be computed by an arithmetic circuit of size at most $s$. If $g$ and $h$ are polynomials of degree at least $1$ such that $f = g \cdot h$ and $\gcd(g, h) = 1$, then $g$ and $h$ have a circuit of size at most $\mathrm{poly}(s, n, d)$.

###### Lemma 1.3.

Let $f$ be an $n$-variate degree-$d$ polynomial which can be computed by an arithmetic circuit of size at most $s$. If there is a polynomial $g$ and a positive integer $e$ such that $f = g^e$, then $g$ has a circuit of size at most $\mathrm{poly}(s, n, d)$.

The proofs of Lemma 1.2 and Lemma 1.3 are given in Section 3 and Section 4, respectively.

## 2 Notations and Preliminaries

• Throughout the paper, $\mathbb{F}$ is a field of characteristic zero or of sufficiently large characteristic.

• For a positive integer $n$, $[n]$ denotes the set $\{1, 2, \ldots, n\}$.

• We use boldface letters to denote ordered tuples of objects; for instance, $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ or $\mathbf{a} = (a_1, a_2, \ldots, a_n)$. The length of these tuples and the precise indexing is defined before the specific notation is invoked. The sum of two such tuples of the same length is their coordinate-wise sum.

• We say that a function $\phi$ of parameters $a_1, \ldots, a_k$ taking values in $\mathbb{N}$ is $\mathrm{poly}(a_1, \ldots, a_k)$ if there is a polynomial $P$ such that for all sufficiently large values of $a_1, \ldots, a_k$, $\phi(a_1, \ldots, a_k)$ is upper bounded by $P(a_1, \ldots, a_k)$.

### 2.1 Arithmetic Circuits

Arithmetic circuits (also historically referred to as straight line programs) provide a succinct and compact representation for multivariate polynomials. Formally, they are defined as follows.

###### Definition 2.1 (Arithmetic Circuit).

Let $\mathbb{F}$ be a field and $\mathbf{x} = (x_1, \ldots, x_n)$ be a tuple of variables. An arithmetic circuit $C$ over $\mathbb{F}$ and $\mathbf{x}$ is a directed acyclic graph whose vertices are called gates. Every gate with in-degree zero is an input gate and is labeled by a single variable from $\mathbf{x}$ or a field element from $\mathbb{F}$. The other gates are labeled by $+$ (sum gates) or $\times$ (product gates). The gates with out-degree zero are called output gates.

Each gate in the circuit computes a polynomial in $\mathbb{F}[\mathbf{x}]$ in a natural and inductive way. For an input gate, the polynomial it computes is the corresponding variable or field element. A $+$ (resp. $\times$) gate $v$ computes the sum (resp. product) of the polynomials computed at the gates which have a directed edge to $v$. The size of an arithmetic circuit $C$ is defined as the number of edges in $C$.
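To make Definition 2.1 concrete, here is a minimal sketch of how a circuit computes a polynomial and how its size is counted. The gate encoding (tuples tagged `inp`, `add`, `mul`, listed in topological order) is our own illustration, not notation from this note.

```python
def eval_circuit(gates, point):
    """Evaluate every gate of an arithmetic circuit given as a list of
    ('inp', label) / ('add', children) / ('mul', children) tuples,
    where children are indices of earlier gates."""
    val = []
    for op, arg in gates:
        if op == 'inp':
            # input gate: a variable (looked up in `point`) or a field constant
            val.append(point[arg] if isinstance(arg, str) else arg)
        elif op == 'add':
            val.append(sum(val[c] for c in arg))
        else:  # 'mul'
            prod = 1
            for c in arg:
                prod *= val[c]
            val.append(prod)
    return val

def circuit_size(gates):
    """Size = number of edges, i.e. the total fan-in over all non-input gates."""
    return sum(len(arg) for op, arg in gates if op != 'inp')
```

For instance, the circuit for $(x_1 + x_2) \cdot x_1$ has four edges, and evaluating it at a point evaluates the polynomial it computes.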

The following structural lemma about arithmetic circuits will be useful for our proof.

###### Lemma 2.2 (Homogenization).

Let $C$ be a multi-output arithmetic circuit of size $s$ with outputs $f_1, \ldots, f_m$. Then, for every $d \in \mathbb{N}$, there is a homogeneous circuit $C'$ of size at most $O(s \cdot d^2)$ which outputs the homogeneous components of degree at most $d$ of $f_1, \ldots, f_m$.

We refer the reader to any standard resource (such as the survey by Shpilka and Yehudayoff [SY10]) for a proof of Lemma 2.2 and for a general overview of arithmetic circuit complexity.
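Lemma 2.2 is a statement about circuits; as a small illustration of what its outputs compute, the following sketch extracts homogeneous components from a polynomial in the (inefficient) monomial representation, encoded as a dict from exponent tuples to coefficients. The encoding is our own choice for this example.

```python
def homogeneous_components(poly, d):
    """Return [H_0, ..., H_d], where H_k collects the monomials of poly
    of total degree exactly k; monomials of degree > d are discarded."""
    comps = [dict() for _ in range(d + 1)]
    for mono, coeff in poly.items():
        k = sum(mono)  # total degree of the monomial
        if k <= d:
            comps[k][mono] = coeff
    return comps
```

The circuit version of this statement produces the same components, but with only an $O(d^2)$ multiplicative overhead in circuit size rather than by listing monomials.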

### 2.2 Multivariate Taylor’s Expansion

We use the following lemma which is an easy consequence of the classical multivariate Taylor expansion for polynomials.

###### Lemma 2.3 (Truncated Multivariate Taylor’s Expansion).

Let $f \in \mathbb{F}[x_1, \ldots, x_m]$ and $\mathbf{a} \in \mathbb{F}^m$. Then, we have

$$f(\mathbf{x} + \mathbf{a}) \equiv f(\mathbf{a}) + \sum_{i=1}^{m} \frac{\partial f}{\partial x_i}(\mathbf{a}) \cdot x_i \mod \langle \mathbf{x} \rangle^2.$$
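As a one-variable sanity check of the lemma (an example of our own), take $m = 1$ and $f(x) = x^2 + 3x$:

$$f(x + a) = (x + a)^2 + 3(x + a) = \underbrace{(a^2 + 3a)}_{f(a)} + \underbrace{(2a + 3)}_{f'(a)} \cdot x + x^2 \equiv f(a) + f'(a) \cdot x \mod \langle x \rangle^2,$$

since the only discarded term, $x^2$, lies in $\langle x \rangle^2$.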

In the proof of Theorem 1.1, we need a variant of this lemma where the point of expansion is a tuple of polynomials from $\mathbb{F}[\mathbf{x}]$ instead of a point in $\mathbb{F}^m$. We state it as a corollary of Lemma 2.3.

###### Corollary 2.4.

Let $Q \in \mathbb{F}[\mathbf{x}][z_1, \ldots, z_m]$, and let $\mathbf{f} = (f_1, \ldots, f_m)$ and $\mathbf{p} = (p_1, \ldots, p_m)$ be tuples of polynomials in $\mathbb{F}[\mathbf{x}]$, where each $p_i$ is homogeneous of degree equal to $k$ for some $k \geq 1$. Then, we have

$$Q(\mathbf{f} + \mathbf{p}) \equiv Q(\mathbf{f}) + \sum_{i \in [m]} \frac{\partial Q}{\partial z_i}(\mathbf{f}) \cdot p_i \mod \langle \mathbf{x} \rangle^{k+1}.$$
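To see the role of the homogeneity assumption (again in an example of our own), take a single variable $z$ and $Q(z) = z^2$. For $f, p \in \mathbb{F}[\mathbf{x}]$ with $p$ homogeneous of degree exactly $k \geq 1$,

$$Q(f + p) = f^2 + 2fp + p^2 \equiv Q(f) + \frac{\partial Q}{\partial z}(f) \cdot p \mod \langle \mathbf{x} \rangle^{k+1},$$

since $p^2 \in \langle \mathbf{x} \rangle^{2k} \subseteq \langle \mathbf{x} \rangle^{k+1}$. The same cancellation of all quadratic and higher terms in $\mathbf{p}$ is what drives the general statement.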

### 2.3 Jacobian Matrix

The Jacobian matrix is a matrix that contains the partial derivatives of a vector of multivariate functions.

###### Definition 2.5 (Jacobian Matrix).

Let $\mathbf{Q} = (Q_1, \ldots, Q_d)$ be a tuple of polynomials in the variables $\mathbf{z} = (z_1, \ldots, z_m)$. The Jacobian matrix of $\mathbf{Q}$ with respect to $\mathbf{z}$ is the $d \times m$ matrix, denoted $J_{\mathbf{z}}(\mathbf{Q})$, whose $(\ell, i)$-th entry is defined as $\frac{\partial Q_\ell}{\partial z_i}$ for each $\ell \in [d]$ and $i \in [m]$.

### 2.4 GCD and Resultant

For any two polynomials $g, h \in \mathbb{F}[\mathbf{x}]$, we can define their greatest common divisor (GCD) as follows.

###### Definition 2.6 (GCD).

Let $\mathbb{F}$ be a field and $g, h \in \mathbb{F}[\mathbf{x}]$. The greatest common divisor $\gcd(g, h)$ of $g$ and $h$ is $p$ if $p$ divides both $g$ and $h$, and for any $q$ that divides both $g$ and $h$, we have $q$ dividing $p$. For any variable $y$, we define $\gcd_y(g, h)$ to be the greatest common divisor of $g$ and $h$ with respect to $y$, i.e., when $g$ and $h$ are viewed as univariates in $y$.

It turns out that there is a clean and useful mathematical condition to check whether the GCD of two polynomials is non-constant, using the resultant.

###### Definition 2.7 (Resultant).

Let $g = \sum_{i=0}^{d_1} g_i y^i$ and $h = \sum_{j=0}^{d_2} h_j y^j$ be polynomials in $\mathbb{F}[\mathbf{x}][y]$ such that $g_{d_1} \neq 0$ and $h_{d_2} \neq 0$ for some $d_1, d_2 > 0$. The resultant $\mathrm{Res}_y(g, h)$ is the determinant of the following $(d_1 + d_2) \times (d_1 + d_2)$ matrix, called the Sylvester matrix $S(g, h)$.

$$S(g, h) = \begin{pmatrix}
g_0 & 0 & \cdots & 0 & h_0 & 0 & \cdots & 0 \\
g_1 & g_0 & \cdots & 0 & h_1 & h_0 & \cdots & 0 \\
g_2 & g_1 & \ddots & \vdots & h_2 & h_1 & \ddots & \vdots \\
\vdots & \vdots & \ddots & g_0 & \vdots & \vdots & \ddots & h_0 \\
g_{d_1} & \vdots & & g_1 & h_{d_2} & \vdots & & h_1 \\
0 & g_{d_1} & & g_2 & 0 & h_{d_2} & & h_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & g_{d_1} & 0 & 0 & \cdots & h_{d_2}
\end{pmatrix}.$$

Specifically, for $i \in [d_2]$, the $i$-th column of $S(g, h)$ is equal to $(0, \ldots, 0, g_0, g_1, \ldots, g_{d_1}, 0, \ldots, 0)^{T}$, where there are $i - 1$ zeroes in the prefix. For $j \in [d_1]$, the $(d_2 + j)$-th column of $S(g, h)$ equals $(0, \ldots, 0, h_0, h_1, \ldots, h_{d_2}, 0, \ldots, 0)^{T}$, where there are $j - 1$ zeroes in the prefix.

The following lemma shows that $\mathrm{Res}_y(g, h) = 0$ if and only if $g$ and $h$ have a common non-constant factor.

###### Lemma 2.8 (Capturing the GCD via the Resultant).

Let $g, h \in \mathbb{F}[\mathbf{x}][y]$ such that $\deg_y(g) = d_1 > 0$ and $\deg_y(h) = d_2 > 0$. Then, $\mathrm{Res}_y(g, h) = 0$ if and only if $\gcd_y(g, h)$ has degree at least $1$ in $y$.

To keep this note short, we refer the reader to any standard resource (such as the lecture notes by Sudan [Sud98]) for a proof of Lemma 2.8.
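A hedged sketch of Definition 2.7 and Lemma 2.8 in code: the function below builds the Sylvester matrix of two univariate polynomials over $\mathbb{Q}$ (given as coefficient lists in increasing degree, an encoding of our own) and computes its determinant by Gaussian elimination. The names `sylvester`, `det` and `resultant` are ours.

```python
from fractions import Fraction

def sylvester(g, h):
    """Sylvester matrix of g (degree d1) and h (degree d2): the first d2
    columns are down-shifted copies of g's coefficient vector, the last d1
    columns down-shifted copies of h's."""
    d1, d2 = len(g) - 1, len(h) - 1
    n = d1 + d2
    S = [[Fraction(0)] * n for _ in range(n)]
    for col in range(d2):
        for i, gi in enumerate(g):
            S[col + i][col] = Fraction(gi)
    for col in range(d1):
        for j, hj in enumerate(h):
            S[col + j][d2 + col] = Fraction(hj)
    return S

def det(M):
    """Determinant via Gaussian elimination over the rationals."""
    M = [row[:] for row in M]
    n, sign, result = len(M), 1, Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            sign = -sign
        result *= M[c][c]
        for r in range(c + 1, n):
            if M[r][c] != 0:
                factor = M[r][c] / M[c][c]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[c])]
    return sign * result

def resultant(g, h):
    return det(sylvester(g, h))
```

Consistent with Lemma 2.8, the resultant vanishes exactly when the two inputs share a non-constant factor.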

## 3 Proof of Lemma 1.2

We have polynomials $f$, $g$ and $h$ such that $f = g \cdot h$, $\gcd(g, h) = 1$, and $f$ has an arithmetic circuit of size at most $s$. The goal is to show that $g$ and $h$ have circuits of size at most $\mathrm{poly}(s, n, d)$. Let $d_1$ and $d_2$ be the degrees of $g$ and $h$ respectively, and let $d = d_1 + d_2$ be the degree of $f$. By taking a random $\boldsymbol{\alpha}$ (from a large enough grid) and replacing $x_i$ by $x_i + \alpha_i \cdot y$, where $y$ is a distinguished variable, we can guarantee that the coefficient of $y^{d}$ in $f$, of $y^{d_1}$ in $g$ and of $y^{d_2}$ in $h$ are all non-zero field elements. Without loss of generality, we assume that these constants are all $1$ (or else we scale everything by a constant). In the rest of this section, we view the identity $f = g \cdot h$ as an identity in $\mathbb{F}[\mathbf{x}][y]$. Note that at the end of the above transformation, $\gcd_y(g, h)$ continues to be $1$ when viewing $g$ and $h$ as univariates in $y$. We know that $f$ has a small circuit, and the goal is to show that $g$ and $h$ have small circuits.

Let $f_0, \ldots, f_{d-1}$, $g_0, \ldots, g_{d_1 - 1}$ and $h_0, \ldots, h_{d_2 - 1}$ be polynomials in $\mathbb{F}[\mathbf{x}]$ such that

$$f = y^{d} + \sum_{i=0}^{d-1} f_i y^i, \qquad g = y^{d_1} + \sum_{i=0}^{d_1 - 1} g_i y^i \qquad \text{and} \qquad h = y^{d_2} + \sum_{i=0}^{d_2 - 1} h_i y^i.$$

Now, comparing the coefficients of $y^i$ on both sides of the equality $f = g \cdot h$ gives us a system of $d$ polynomial equations in the $f_i$, $g_i$ and $h_j$, as follows.

$$\begin{aligned}
f_0 &= g_0 \cdot h_0 \\
f_1 &= g_0 \cdot h_1 + g_1 \cdot h_0 \\
&\;\,\vdots \\
f_i &= \sum_{j=0}^{\min\{i, d_1\}} g_j \cdot h_{i-j} \\
&\;\,\vdots \\
f_{d-1} &= g_{d_1 - 1} + h_{d_2 - 1},
\end{aligned}$$

where we follow the convention that $g_{d_1} = h_{d_2} = 1$ and $h_m = 0$ for $m < 0$ or $m > d_2$.
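The system above is just polynomial multiplication read coefficient by coefficient; a quick check with integer coefficient vectors of our own choosing (leading $1$s included):

```python
def poly_mul(a, b):
    """Multiply two univariate polynomials given as coefficient lists (low degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def rhs(g, h, i):
    """Right-hand side of the i-th equation: sum over j of g_j * h_{i-j}."""
    return sum(g[j] * h[i - j] for j in range(len(g)) if 0 <= i - j < len(h))
```

With $g = 3 + 2y + y^2$ and $h = 5 + y$, every coefficient of $f = g \cdot h$ matches the corresponding right-hand side.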

Let $\mathbf{u} = (u_0, u_1, \ldots, u_{d_1 - 1})$ and $\mathbf{w} = (w_0, w_1, \ldots, w_{d_2 - 1})$ be new sets of variables. For $\ell \in \{0, 1, \ldots, d - 1\}$, define the polynomials

$$Q_\ell(\mathbf{u}, \mathbf{w}) := \sum_{j=0}^{\min\{\ell, d_1\}} u_j \cdot w_{\ell - j} - f_\ell,$$

with the same convention that $u_{d_1} = 1$, $w_{d_2} = 1$ and $w_m = 0$ for $m < 0$ or $m > d_2$.

We view each $Q_\ell$ as a polynomial in $(\mathbf{u}, \mathbf{w})$ with coefficients coming from the ring $\mathbb{F}[\mathbf{x}]$. In this sense, $(g_0, \ldots, g_{d_1 - 1}, h_0, \ldots, h_{d_2 - 1})$ is a common zero of $Q_0, \ldots, Q_{d-1}$. Our goal is to essentially solve the system of equations given by $\{Q_\ell = 0\}$ to recover circuits for each $g_i$ and $h_j$ and prove an upper bound on their size. Note that this would not be an efficient algorithmic procedure, but we will be able to argue about the circuit complexity of the solution. To this end, we first observe some elementary properties of this system of polynomial equations.

###### Observation 3.1.

For every $\ell \in \{0, 1, \ldots, d - 1\}$, $Q_\ell$ can be computed by a circuit of size at most $\mathrm{poly}(s, n, d)$.

###### Proof.

Since $f$ has a circuit of size at most $s$ and has degree $d$, each $f_\ell$ can be computed by a circuit of size at most $\mathrm{poly}(s, d)$ by an easy application of Lemma 2.2. This immediately gives a circuit of this size for each $Q_\ell$. ∎

###### Lemma 3.2.

Let $J(\mathbf{u}, \mathbf{w})$ be the Jacobian of $(Q_0, \ldots, Q_{d-1})$ with respect to $(\mathbf{u}, \mathbf{w})$. If $\gcd(g, h) = 1$, then $J(g_0, \ldots, g_{d_1 - 1}, h_0, \ldots, h_{d_2 - 1})$ is a non-singular matrix.

###### Proof.

The key observation here is that the Jacobian matrix $J(\mathbf{u}, \mathbf{w})$ (see Definition 2.5), evaluated at $(\mathbf{g}, \mathbf{h}) = (g_0, \ldots, g_{d_1-1}, h_0, \ldots, h_{d_2-1})$, is the same as the Sylvester matrix $S(g, h)$ (see Definition 2.7) up to a permutation of rows and columns. Concretely, let $R \in \mathbb{F}[\mathbf{x}]$ be the resultant of $g$ and $h$ when they are viewed as univariates in $y$. Since $\gcd_y(g, h)$ is equal to $1$, by Lemma 2.8 their resultant $R$ is a non-zero polynomial in $\mathbb{F}[\mathbf{x}]$. Recall that $R$ is the determinant of the Sylvester matrix $S(g, h)$, whose columns are the shifted coefficient vectors of $g$ and $h$ described in Definition 2.7. We now write down the matrix $J(\mathbf{u}, \mathbf{w})$.

$$J(\mathbf{u}, \mathbf{w}) = \begin{pmatrix}
\frac{\partial Q_0}{\partial u_0} & \cdots & \frac{\partial Q_0}{\partial u_{d_1-1}} & \frac{\partial Q_0}{\partial w_0} & \cdots & \frac{\partial Q_0}{\partial w_{d_2-1}} \\
\frac{\partial Q_1}{\partial u_0} & \cdots & \frac{\partial Q_1}{\partial u_{d_1-1}} & \frac{\partial Q_1}{\partial w_0} & \cdots & \frac{\partial Q_1}{\partial w_{d_2-1}} \\
\vdots & & \vdots & \vdots & & \vdots \\
\frac{\partial Q_{d-1}}{\partial u_0} & \cdots & \frac{\partial Q_{d-1}}{\partial u_{d_1-1}} & \frac{\partial Q_{d-1}}{\partial w_0} & \cdots & \frac{\partial Q_{d-1}}{\partial w_{d_2-1}}
\end{pmatrix}$$

Plugging in the expressions for the partial derivatives, we get $\frac{\partial Q_\ell}{\partial u_i} = w_{\ell - i}$ and $\frac{\partial Q_\ell}{\partial w_j} = u_{\ell - j}$ (with out-of-range indices read via the conventions above). Therefore, after substituting $(\mathbf{u}, \mathbf{w}) = (\mathbf{g}, \mathbf{h})$, the columns of $J(\mathbf{g}, \mathbf{h})$ are precisely the same as the columns of $S(g, h)$, up to a permutation of rows and columns. In other words, their ranks are equal. We know that $S(g, h)$ is non-singular (its determinant is the non-zero polynomial $R$), so it follows that $J(\mathbf{g}, \mathbf{h})$ is also non-singular. ∎

###### Remark 3.3.

For the rest of the proof, we assume without loss of generality that $J(\mathbf{g}, \mathbf{h})(\mathbf{0})$, the Jacobian evaluated at $\mathbf{x} = \mathbf{0}$, is non-singular. This follows from the fact that since $J(\mathbf{g}, \mathbf{h})$ is non-singular, there is an $\mathbf{a} \in \mathbb{F}^n$ such that $J(\mathbf{g}, \mathbf{h})(\mathbf{a})$ is non-singular, and up to a translation of the coordinate axes, we can assume that $\mathbf{a} = \mathbf{0}$.

### 3.1 Newton Iteration for many variables

We now show that given the constant term $g_i(\mathbf{0})$ (resp. $h_j(\mathbf{0})$) for each polynomial in $(\mathbf{g}, \mathbf{h})$, we can recover the polynomials completely. The argument is via a natural and well known multivariate analog of the standard Newton Iteration. Clearly, the constant terms have small circuits (trivial circuits of size $O(1)$), and we show that in this iterative process, we can recover multioutput circuits for $(\mathbf{g}, \mathbf{h})$ of size $\mathrm{poly}(s, n, d)$.

###### Lemma 3.4 (One step of Newton Iteration).

Let $k \geq 1$ be any integer. Let $C_k$ be a multioutput circuit of size at most $s_k$ computing polynomials $\tilde{g}_{i,k}$ and $\tilde{h}_{j,k}$ such that for every $i \in \{0, \ldots, d_1 - 1\}$ and $j \in \{0, \ldots, d_2 - 1\}$,

$$\tilde{g}_{i,k} \equiv g_i \mod \langle \mathbf{x} \rangle^{k},$$

and

$$\tilde{h}_{j,k} \equiv h_j \mod \langle \mathbf{x} \rangle^{k}.$$

Then, there is a constant $c$ independent of $k$ such that the following is true: there is a multioutput circuit $C_{k+1}$ of size at most $s_k + (snd)^{c}$ which computes the polynomials $\tilde{g}_{i,k+1}$ and $\tilde{h}_{j,k+1}$, such that for every $i \in \{0, \ldots, d_1 - 1\}$ and $j \in \{0, \ldots, d_2 - 1\}$,

$$\tilde{g}_{i,k+1} \equiv g_i \mod \langle \mathbf{x} \rangle^{k+1},$$

and

$$\tilde{h}_{j,k+1} \equiv h_j \mod \langle \mathbf{x} \rangle^{k+1}.$$
###### Proof.

For every $i \in \{0, \ldots, d_1 - 1\}$ and $j \in \{0, \ldots, d_2 - 1\}$, let $p_i$ and $q_j$ be homogeneous polynomials of degree equal to $k$ such that $g_i \equiv \tilde{g}_{i,k} + p_i \mod \langle \mathbf{x} \rangle^{k+1}$ and $h_j \equiv \tilde{h}_{j,k} + q_j \mod \langle \mathbf{x} \rangle^{k+1}$. Let $\mathbf{p}$ and $\mathbf{q}$ be the tuples associated to the $p_i$ and the $q_j$ respectively. Our goal is to show that these polynomials $p_i$ and $q_j$ have small circuits. This would complete the proof of the lemma. To this end, we set up a system of linear equations in the $p_i$ and the $q_j$ and show that this system has a unique solution.

Let $\mathbf{g}_{k+1}$ (resp. $\mathbf{h}_{k+1}$) denote the tuple $\tilde{\mathbf{g}}_k + \mathbf{p}$ (resp. $\tilde{\mathbf{h}}_k + \mathbf{q}$). For each $\ell$, since $Q_\ell(\mathbf{g}, \mathbf{h}) = 0$ and $(\mathbf{g}_{k+1}, \mathbf{h}_{k+1}) \equiv (\mathbf{g}, \mathbf{h}) \mod \langle \mathbf{x} \rangle^{k+1}$, it follows that

$$Q_\ell(\mathbf{g}_{k+1}, \mathbf{h}_{k+1}) \equiv 0 \mod \langle \mathbf{x} \rangle^{k+1}.$$

By their definition, and the hypothesis of the lemma, we know that $\mathbf{p}$ and $\mathbf{q}$ must satisfy

$$Q_\ell(\tilde{\mathbf{g}}_k + \mathbf{p}, \tilde{\mathbf{h}}_k + \mathbf{q}) \equiv 0 \mod \langle \mathbf{x} \rangle^{k+1}.$$

Since each $p_i$ and $q_j$ is a homogeneous polynomial of degree equal to $k$, via the multivariate Taylor expansion for polynomials (see Corollary 2.4) for $Q_\ell$ around the point $(\tilde{\mathbf{g}}_k, \tilde{\mathbf{h}}_k)$, we get

$$Q_\ell(\tilde{\mathbf{g}}_k + \mathbf{p}, \tilde{\mathbf{h}}_k + \mathbf{q}) \equiv Q_\ell(\tilde{\mathbf{g}}_k, \tilde{\mathbf{h}}_k) + \sum_{i=0}^{d_1-1} \frac{\partial Q_\ell}{\partial u_i}(\tilde{\mathbf{g}}_k, \tilde{\mathbf{h}}_k) \cdot p_i + \sum_{j=0}^{d_2-1} \frac{\partial Q_\ell}{\partial w_j}(\tilde{\mathbf{g}}_k, \tilde{\mathbf{h}}_k) \cdot q_j \mod \langle \mathbf{x} \rangle^{k+1}.$$

Note that we used the fact that $p_i$ and $q_j$ are homogeneous polynomials of degree equal to $k$, and hence their squares and higher powers vanish modulo $\langle \mathbf{x} \rangle^{k+1}$. Moreover, since each $p_i$ and $q_j$ is homogeneous of degree $k$, the only monomials of degree at most $k$ in $\frac{\partial Q_\ell}{\partial u_i}(\tilde{\mathbf{g}}_k, \tilde{\mathbf{h}}_k) \cdot p_i$ are those contributed by the constant term of $\frac{\partial Q_\ell}{\partial u_i}(\tilde{\mathbf{g}}_k, \tilde{\mathbf{h}}_k)$. We also know from the hypothesis that $(\tilde{\mathbf{g}}_k(\mathbf{0}), \tilde{\mathbf{h}}_k(\mathbf{0}))$ is equal to $(\mathbf{g}(\mathbf{0}), \mathbf{h}(\mathbf{0}))$. Applying these simplifications to the Taylor expansion for $Q_\ell$, we get

$$-Q_\ell(\tilde{\mathbf{g}}_k, \tilde{\mathbf{h}}_k) \equiv \sum_{i=0}^{d_1-1} \frac{\partial Q_\ell}{\partial u_i}(\mathbf{g}(\mathbf{0}), \mathbf{h}(\mathbf{0})) \cdot p_i + \sum_{j=0}^{d_2-1} \frac{\partial Q_\ell}{\partial w_j}(\mathbf{g}(\mathbf{0}), \mathbf{h}(\mathbf{0})) \cdot q_j \mod \langle \mathbf{x} \rangle^{k+1}.$$

Let $\mathbf{v} = (v_0, \ldots, v_{d_1-1})$ and $\mathbf{t} = (t_0, \ldots, t_{d_2-1})$ be new sets of variables. For the various values of $\ell$, let us consider the affine constraints on these variables given by the equations

$$-\left( Q_\ell(\tilde{\mathbf{g}}_k, \tilde{\mathbf{h}}_k) \bmod \langle \mathbf{x} \rangle^{k+1} \right) = \sum_{i=0}^{d_1-1} \frac{\partial Q_\ell}{\partial u_i}(\mathbf{g}(\mathbf{0}), \mathbf{h}(\mathbf{0})) \cdot v_i + \sum_{j=0}^{d_2-1} \frac{\partial Q_\ell}{\partial w_j}(\mathbf{g}(\mathbf{0}), \mathbf{h}(\mathbf{0})) \cdot t_j.$$

So, we get a system of non-homogeneous linear equations of the form $A \mathbf{y} = \mathbf{b}$, where the matrix $A$ equals the matrix $J(\mathbf{g}, \mathbf{h})(\mathbf{0})$, which by Lemma 3.2 (and Remark 3.3) is non-singular. Thus, this system has a unique solution, which is given by $\mathbf{y} = A^{-1} \mathbf{b}$. From our set up above, $(\mathbf{p}, \mathbf{q})$ is a solution to this system of equations, and thus by the uniqueness of the solution, we get that there are field constants $\beta_{i,\ell}$ and $\gamma_{j,\ell}$ in $\mathbb{F}$ such that for every $i$ and $j$,

$$p_i = \sum_{\ell=0}^{d-1} \beta_{i,\ell} \left( Q_\ell(\tilde{\mathbf{g}}_k, \tilde{\mathbf{h}}_k) \bmod \langle \mathbf{x} \rangle^{k+1} \right),$$

$$q_j = \sum_{\ell=0}^{d-1} \gamma_{j,\ell} \left( Q_\ell(\tilde{\mathbf{g}}_k, \tilde{\mathbf{h}}_k) \bmod \langle \mathbf{x} \rangle^{k+1} \right).$$

In other words,

$$p_i = \left( \sum_{\ell=0}^{d-1} \beta_{i,\ell} \, Q_\ell(\tilde{\mathbf{g}}_k, \tilde{\mathbf{h}}_k) \right) \bmod \langle \mathbf{x} \rangle^{k+1},$$

and

$$q_j = \left( \sum_{\ell=0}^{d-1} \gamma_{j,\ell} \, Q_\ell(\tilde{\mathbf{g}}_k, \tilde{\mathbf{h}}_k) \right) \bmod \langle \mathbf{x} \rangle^{k+1}.$$

Now, recall that for every $\ell$, $Q_\ell(\tilde{\mathbf{g}}_k, \tilde{\mathbf{h}}_k)$ is a polynomial which is zero modulo $\langle \mathbf{x} \rangle^{k}$. Thus, the polynomial $\tilde{g}_{i,k} + \sum_{\ell} \beta_{i,\ell} Q_\ell(\tilde{\mathbf{g}}_k, \tilde{\mathbf{h}}_k)$ agrees with $g_i$ at all monomials of degree less than $k$, and by the above discussion we are adding to it the correct homogeneous polynomial $p_i$ of degree equal to $k$. Thus, we define $\tilde{g}_{i,k+1}$ and $\tilde{h}_{j,k+1}$ as

$$\tilde{g}_{i,k+1} := \tilde{g}_{i,k} + \sum_{\ell=0}^{d-1} \beta_{i,\ell} \, Q_\ell(\tilde{\mathbf{g}}_k, \tilde{\mathbf{h}}_k),$$

and

$$\tilde{h}_{j,k+1} := \tilde{h}_{j,k} + \sum_{\ell=0}^{d-1} \gamma_{j,\ell} \, Q_\ell(\tilde{\mathbf{g}}_k, \tilde{\mathbf{h}}_k).$$

All that remains now is to argue that there is a small circuit computing the $\tilde{g}_{i,k+1}$ and the $\tilde{h}_{j,k+1}$. We can obtain such a circuit from the circuit $C_k$ computing the $\tilde{g}_{i,k}$ and $\tilde{h}_{j,k}$ by adding a copy of circuits for $Q_0, \ldots, Q_{d-1}$ (see Observation 3.1) at the top, and a layer of addition gates above it with appropriate edge weights. The size therefore increases additively by a fixed polynomial in $s$, $n$ and $d$ in each step. ∎

We now complete the proof of Lemma 1.2.

###### Proof of Lemma 1.2.

Observe that modulo $\langle \mathbf{x} \rangle$, the $g_i$ and the $h_j$ trivially have circuits of size $O(1)$, since they are just constants. Now, using this as the base case, we apply Lemma 3.4 $d$ times to obtain a multioutput circuit $C$ of size at most $\mathrm{poly}(s, n, d)$, which computes polynomials $\tilde{g}_{i,d+1}$ and $\tilde{h}_{j,d+1}$ such that for every $i$ and $j$,

$$\tilde{g}_{i,d+1} \equiv g_i \mod \langle \mathbf{x} \rangle^{d+1} \qquad \text{and} \qquad \tilde{h}_{j,d+1} \equiv h_j \mod \langle \mathbf{x} \rangle^{d+1}.$$

But each $g_i$ and $h_j$ is of degree at most $d$. Thus, we get

$$\tilde{g}_{i,d+1} \bmod \langle \mathbf{x} \rangle^{d+1} = g_i \qquad \text{and} \qquad \tilde{h}_{j,d+1} \bmod \langle \mathbf{x} \rangle^{d+1} = h_j.$$

So, to recover a circuit for each $g_i$ and each $h_j$, we homogenize (see Lemma 2.2) the circuit $C$ to get a homogeneous circuit $C'$, which has an output gate for each homogeneous component of degree at most $d$ of every output of $C$. This incurs an additional multiplicative blow up of $O(d^2)$ on the size of $C$. From the circuit $C'$, we can just read off the polynomials $g = y^{d_1} + \sum_{i} g_i y^i$ (resp. $h = y^{d_2} + \sum_{j} h_j y^j$) by taking appropriate linear combinations of the outputs of $C'$, which only incurs an additive blow up in the size of the circuit. This completes the proof of the lemma. ∎
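The whole argument of this section can be exercised numerically. The sketch below is our own illustration (with bivariate $f \in \mathbb{Q}[x][y]$, monic in $y$, and dense coefficient lists; the names `lift_factors`, `series_mul` and `solve` are ours). Starting from the factorization of $f(0, y)$, it runs the iteration of Lemma 3.4: at each step $k$, it solves the linear system whose matrix is the Jacobian at $\mathbf{x} = \mathbf{0}$ to find the degree-$k$ corrections. It is not an efficient factoring algorithm, matching the remark in Section 1.

```python
from fractions import Fraction

def series_mul(a, b, trunc):
    """Multiply two truncated power series in x (coefficient lists)."""
    out = [Fraction(0)] * trunc
    for i, ai in enumerate(a[:trunc]):
        for j, bj in enumerate(b[:trunc - i]):
            out[i + j] += ai * bj
    return out

def solve(A, b):
    """Solve the square system A y = b by Gauss-Jordan elimination over Q."""
    n = len(A)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        M[c] = [v / M[c][c] for v in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                M[r] = [vr - M[r][c] * vc for vr, vc in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]

def lift_factors(f, d1, d2, g0, h0, trunc):
    """Given f = g*h monic in y with gcd(g(0,y), h(0,y)) = 1, and the constant
    terms g0[i] = g_i(0), h0[j] = h_j(0), recover the power series (in x) of the
    coefficients g_i, h_j up to x^(trunc-1), one degree per Newton step."""
    d = d1 + d2
    # current approximations; G[d1] and H[d2] hold the monic leading 1
    G = [[Fraction(c)] + [Fraction(0)] * (trunc - 1) for c in list(g0) + [1]]
    H = [[Fraction(c)] + [Fraction(0)] * (trunc - 1) for c in list(h0) + [1]]
    # Jacobian of (Q_0, ..., Q_{d-1}) at (g(0), h(0)); up to permutation this is
    # the Sylvester matrix of g(0,y) and h(0,y), hence non-singular (Lemma 3.2)
    A = [[(H[l - i][0] if 0 <= l - i <= d2 else Fraction(0)) for i in range(d1)] +
         [(G[l - j][0] if 0 <= l - j <= d1 else Fraction(0)) for j in range(d2)]
         for l in range(d)]
    for k in range(1, trunc):
        b = []
        for l in range(d):  # coefficient of x^k in -Q_l = f_l - sum_j g_j h_{l-j}
            conv = [Fraction(0)] * trunc
            for j in range(min(l, d1) + 1):
                if 0 <= l - j <= d2:
                    prod = series_mul(G[j], H[l - j], trunc)
                    conv = [c + p for c, p in zip(conv, prod)]
            b.append(Fraction(f[l][k]) - conv[k])
        sol = solve(A, b)  # the degree-k homogeneous corrections p_i, q_j
        for i in range(d1):
            G[i][k] = sol[i]
        for j in range(d2):
            H[j][k] = sol[d1 + j]
    return G[:d1], H[:d2]
```

For example, for $f = (y + x)(y + 1 + x)$ the iteration recovers $g_0 = x$ and $h_0 = 1 + x$ from the coprime factorization $f(0, y) = y \cdot (y + 1)$.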

## 4 Proof of Lemma 1.3

We have polynomials $f$ and $g$ such that $f = g^e$, where $f$ has a circuit of size at most $s$. Since $d$ is the degree of $f$, both $e$ and the degree of $g$ are upper bounded by $d$. The goal is to show that $g$ has a circuit of size at most $\mathrm{poly}(s, n, d)$. Now, consider the following polynomial

$$\tilde{f} := z^e - f = z^e - g^e,$$

where $z$ is a new variable. Note that $\tilde{f}$ has a circuit of size at most $s + O(\log e)$. Next, we decompose $\tilde{f}$ as follows.

$$\tilde{f} = (z - g) \cdot \left( z^{e-1} + z^{e-2} g + \cdots + g^{e-1} \right).$$

Observe that if the characteristic of the field is zero or large enough, then the $\gcd$ of the polynomial $(z - g)$ and the polynomial $\left( z^{e-1} + z^{e-2} g + \cdots + g^{e-1} \right)$ is $1$. The reason is that $(z - g)$ is irreducible and does not divide $z^{e-1} + z^{e-2} g + \cdots + g^{e-1}$: substituting $z = g$ in the latter gives $e \cdot g^{e-1}$, which is non-zero when the characteristic of the field is zero or large enough. Finally, by Lemma 1.2, $(z - g)$ has a circuit of size at most $\mathrm{poly}(s, n, d)$, and thus $g$ also has a circuit of size $\mathrm{poly}(s, n, d)$ (for instance, set $z = 0$ in the circuit for $z - g$ and negate the output). ∎
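As a tiny worked instance (our own), take $g = x_1 + x_2$ and $e = 2$, so $f = (x_1 + x_2)^2$. Then

$$\tilde{f} = z^2 - (x_1 + x_2)^2 = \big( z - (x_1 + x_2) \big) \cdot \big( z + (x_1 + x_2) \big),$$

and the two factors are coprime precisely because substituting $z = x_1 + x_2$ in the second factor gives $2(x_1 + x_2) \neq 0$ when $\operatorname{char}(\mathbb{F}) \neq 2$. Lemma 1.2 then gives a small circuit for $z - (x_1 + x_2)$, and setting $z = 0$ and negating recovers $g$.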

## Acknowledgements

We thank Vishwas Bhargav, Swastik Kopparty, Ramprasad Saptharishi and Srikanth Srinivasan for various insightful discussions and for encouraging us to write the proof up.

## References

• [Bür04] Peter Bürgisser. The complexity of factors of multivariate polynomials. Foundations of Computational Mathematics, 4(4):369–396, 2004.
• [CKS18a] Chi-Ning Chou, Mrinal Kumar, and Noam Solomon. Closure results for polynomial factorization. In 33rd Computational Complexity Conference (CCC 2018), volume 102 of LIPIcs, pages 13:1–13:17, 2018.
• [CKS18b] Chi-Ning Chou, Mrinal Kumar, and Noam Solomon. Some closure results for polynomial factorization and applications. To appear in Theory of Computing. Preprint arXiv:1803.05933, 2018.
• [DSS17] Pranjal Dutta, Nitin Saxena, and Amit Sinhababu. Discovering the roots: Uniform closure results for algebraic classes under factoring. CoRR, abs/1710.03214, 2017.
• [DSY09] Zeev Dvir, Amir Shpilka, and Amir Yehudayoff. Hardness-randomness tradeoffs for bounded depth arithmetic circuits. SIAM J. Comput., 39(4):1279–1293, 2009.
• [Kal86] Erich Kaltofen. Uniform closure properties of p-computable functions. In Proceedings of the 18th Annual ACM Symposium on Theory of Computing (STOC 1986), pages 330–337, 1986.
• [Kal87] Erich Kaltofen. Single-factor Hensel lifting and its application to the straight-line complexity of certain polynomials. In Proceedings of the 19th Annual ACM Symposium on Theory of Computing (STOC 1987), pages 443–452, 1987.
• [Kal89] Erich Kaltofen. Factorization of polynomials given by straight-line programs. In Randomness and Computation, volume 5 of Advances in Computing Research, pages 375–412, 1989.
• [KI04] Valentine Kabanets and Russell Impagliazzo. Derandomizing polynomial identity tests means proving circuit lower bounds. Computational Complexity, 13(1-2):1–46, 2004. Preliminary version in the 35th Annual ACM Symposium on Theory of Computing (STOC 2003).
• [KT90] Erich Kaltofen and Barry M. Trager. Computing with polynomials given by black boxes for their evaluations: Greatest common divisors, factorization, separation of numerators and denominators. J. Symb. Comput., 9(3):301–320, 1990.
• [Oli16] Rafael Oliveira. Factors of low individual degree polynomials. Computational Complexity, 25(2):507–561, 2016.
• [Sud98] Madhu Sudan. Algebra and computation. Lecture notes, MIT, 1998.
• [SY10] Amir Shpilka and Amir Yehudayoff. Arithmetic circuits: A survey of recent results and open questions. Foundations and Trends in Theoretical Computer Science, 5(3-4):207–388, 2010.