Proving that there are explicit polynomials which are hard to compute is the template of many open problems in algebraic complexity theory. Various instances of this problem involve different definitions of explicitness, hardness and computation.
In the most general form, this is the well-known $\mathsf{VP}$ vs. $\mathsf{VNP}$ question, which asks whether every “explicit” polynomial has a polynomial-size algebraic circuit. An algebraic circuit is a very natural (and the most general) algebraic computational model. Informally, it is a computational device which is given a set of indeterminates $x_1, \ldots, x_n$, and it can use additions and multiplications (as well as field scalars) to compute a polynomial $f$ in these indeterminates. The complexity of the circuit is then measured by the number of operations the circuit performs.
It is trivial to give an explicit $n$-variate polynomial which requires circuits of size $\Omega(n)$. It is also not hard to show that a degree-$d$ polynomial requires circuits of size $\Omega(\log d)$, since the degree can at most double in each operation. Thus, one trivially obtains an $\Omega(n + \log d)$ lower bound for an $n$-variate degree-$d$ polynomial.
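The degree-doubling argument above can be illustrated with a small sketch. The function names below are hypothetical and chosen only for this illustration: repeated squaring reaches degree $2^k$ with exactly $k$ multiplications, and since no single operation can more than double the degree, at least $\lceil \log_2 d \rceil$ operations are needed to reach degree $d$.

```python
def power_by_squaring(x, k):
    """Return (x ** (2 ** k), number of multiplications used)."""
    value = x
    multiplications = 0
    for _ in range(k):
        value = value * value  # one multiplication doubles the degree in x
        multiplications += 1
    return value, multiplications


def min_ops_for_degree(d):
    """Smallest s with 2 ** s >= d.

    Starting from degree-1 inputs, each operation at most doubles the
    degree, so fewer than s operations cannot reach degree d.
    """
    s = 0
    while 2 ** s < d:
        s += 1
    return s
```

For instance, `power_by_squaring(x, 4)` computes $x^{16}$ with four multiplications, matching `min_ops_for_degree(16)`.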
A major result of Baur and Strassen [Str73a, BS83] gives an explicit $n$-variate degree-$d$ polynomial which requires circuits of size at least $\Omega(n \log d)$. On the one hand, this is quite impressive since when $d = \mathrm{poly}(n)$, this gives a lower bound which is super-linear in $n$. Such lower bounds for explicit functions in the analogous model of boolean circuits are a long-standing and important open problem in boolean circuit complexity. On the other hand, this lower bound is barely super-linear, whereas ideally one would hope to prove super-polynomial or even exponential lower bounds (indeed, it can be proved that “most” polynomials require circuits of size exponential in $n$).
Despite decades of work, this lower bound has not been improved, even though it has been reproved using different techniques [Smo97, Ben83]. Most works thus deal with restricted models of algebraic computation. For some of these models, exponential or at least super-polynomial lower bounds are known. For other, more powerful models, only improved polynomial lower bounds are known. We refer the reader to [Sap15] for a comprehensive survey of lower bounds in algebraic complexity.
One such restricted model of computation for which we have better lower bounds is algebraic formulas. Formulas are simply circuits whose underlying graph is a tree. Kalorkoti [Kal85] has shown how to adapt Nechiporuk’s method [Nec66], originally developed for boolean formulas, to prove an almost quadratic lower bound for an $n$-variate polynomial.¹ This is also the best lower bound obtainable using this technique.
¹ In his paper, Kalorkoti proves an $\Omega(m^3)$ lower bound for the $m \times m$ determinant, which has $m^2$ variables, so the lower bound is not quadratic in the number of variables. However, it is possible to get the statement claimed here using a straightforward application of his techniques.
1.1 Algebraic Branching Programs
Algebraic Branching Programs (ABPs, for short), defined below, are an intermediate model between algebraic formulas and algebraic circuits. To within polynomial factors, algebraic formulas can be simulated by ABPs, and ABPs can be simulated by circuits. It is believed that each of the reverse transformations requires a super-polynomial blow-up in the size (for some restricted models of computation, this is a known fact [Nis91, Raz06, RY08, DMPY12, HY16]).
Polynomial families which can be efficiently computed by algebraic branching programs form the complexity class $\mathsf{VBP}$, and the determinant is a complete polynomial for this class under an appropriate notion of reductions. Thus, the famous Permanent vs. Determinant problem, unbeknownst to many, is in fact equivalent to showing a super-polynomial lower bound for ABPs. In this paper, we focus on the question of proving lower bounds on the size of algebraic branching programs for explicit polynomial families. We start by formally defining an algebraic branching program.
Definition 1.1 (Algebraic Branching Programs).
An Algebraic Branching Program (ABP) is a layered graph where each edge is labeled by an affine linear form and the first and the last layer have one vertex each, called the “start” and the “end” vertex respectively.
The polynomial computed by an ABP is equal to the sum of the weights of all paths from the start vertex to the end vertex in the ABP, where the weight of a path is equal to the product of the labels of all the edges on it.
The size of an ABP is the number of vertices in it.
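As a minimal sketch of the definition above, the following evaluates a layered ABP at a point by accumulating, layer by layer, the sum of the weights of all paths from the start vertex. The data layout (a list of layers, and a dictionary mapping each edge to an affine form written as `(constant, {variable: coefficient})`) and all names are assumptions made for this illustration, not notation from the paper.

```python
def eval_affine(form, point):
    """Evaluate an affine form (constant, {variable: coefficient}) at a point."""
    const, coeffs = form
    return const + sum(c * point[var] for var, c in coeffs.items())


def eval_abp(layers, edges, point):
    """Sum of the weights of all start-to-end paths, evaluated at `point`."""
    start, end = layers[0][0], layers[-1][0]
    value = {start: 1}  # the empty path from the start vertex has weight 1
    for prev_layer, cur_layer in zip(layers, layers[1:]):
        for v in cur_layer:
            # Every path to v ends with an edge (u, v) from the previous layer.
            value[v] = sum(
                value[u] * eval_affine(edges[(u, v)], point)
                for u in prev_layer
                if (u, v) in edges
            )
    return value[end]


def abp_size(layers):
    """Size of the ABP: the number of vertices."""
    return sum(len(layer) for layer in layers)


# Hypothetical example: s -> {a, b} -> t computes x1*x2 + (x2 + 1)*x1.
LAYERS = [["s"], ["a", "b"], ["t"]]
EDGES = {
    ("s", "a"): (0, {"x1": 1}),
    ("s", "b"): (1, {"x2": 1}),
    ("a", "t"): (0, {"x2": 1}),
    ("b", "t"): (0, {"x1": 1}),
}
```

Evaluating at a point rather than expanding symbolically keeps the sketch dependency-free; the same recurrence works verbatim with polynomial-valued labels.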
While Definition 1.1 is quite standard, there are some small variants of it in the literature which we now discuss. These distinctions make no difference as far as super-polynomial lower bounds are concerned, since it can be easily seen that each variant can be simulated by the other to within polynomial factors, and thus the issues described here are usually left unaddressed. However, it seems that we are very far from proving super-polynomial lower bounds for general algebraic branching programs, and in this paper we focus on proving polynomial (yet still super-linear) lower bounds. In this setting, those issues do affect the results.
Layered vs. Unlayered.
In Definition 1.1, we have required the graph to be layered. We also consider in this paper ABPs whose underlying graphs are unlayered, which we call unlayered ABPs. We are able to prove super-linear (but weaker) lower bounds for this model as well.
One motivation for considering layered graphs as the “standard” model is given by the following interpretation. From the definition, it can be observed that any polynomial computable by an ABP with $\ell$ layers and $w_i$ vertices in the $i$-th layer can be written as the (only) entry of the $1 \times 1$ matrix given by the product $M_1 M_2 \cdots M_{\ell-1}$, where $M_i$ is a $w_i \times w_{i+1}$ matrix with affine forms as entries. One natural complexity measure of such a representation is the total number of non-zero entries in those matrices, which is the number of edges in the ABP. Another natural measure, which can only be smaller, is the sum of the dimensions of the matrices involved in the product, which is the same as the number of vertices in the underlying graph.
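The matrix-product view can be sketched as follows. This is an illustration under assumed conventions, not code from the paper: each matrix is a list of rows, an entry is either `None` (no edge) or an affine form `(constant, {variable: coefficient})`, and the first matrix has a single row while the last has a single column.

```python
def eval_affine(form, point):
    """Evaluate an affine form (constant, {variable: coefficient}) at a point."""
    const, coeffs = form
    return const + sum(c * point[var] for var, c in coeffs.items())


def eval_matrix_product(matrices, point):
    """Evaluate the single entry of M_1 * M_2 * ... at `point`."""
    row = [1]  # running 1 x w_i row vector
    for M in matrices:
        row = [
            sum(
                row[i] * eval_affine(M[i][j], point)
                for i in range(len(M))
                if M[i][j] is not None
            )
            for j in range(len(M[0]))
        ]
    return row[0]


def count_edges(matrices):
    """Number of non-zero (here: non-None) entries, i.e. edges of the ABP."""
    return sum(1 for M in matrices for r in M for e in r if e is not None)


def count_vertices(matrices):
    """Sum of layer widths: rows of the first matrix plus columns of each matrix."""
    return len(matrices[0]) + sum(len(M[0]) for M in matrices)


# Hypothetical example with layer widths 1, 2, 2, 1.
X1, X2, ONE = (0, {"x1": 1}), (0, {"x2": 1}), (1, {})
MATRICES = [
    [[X1, X2]],
    [[X1, ONE], [None, X1]],
    [[X2], [ONE]],
]
```

In this example the product has 7 non-zero entries (edges) but only 6 vertices, illustrating that the two size measures can differ.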
Branching programs are also prevalent in boolean complexity theory, and in particular in the context of derandomizing the class $\mathsf{RL}$. In this setting again it only makes sense to talk about layered graphs.
Unlayered ABPs can also be thought of as (a slight generalization of) skew circuits. These are circuits in which on every multiplication gate, at least one of the operands is a variable (or more generally, a linear function).
In Definition 1.1 we have allowed each edge to be labeled by an arbitrary affine linear form in the variables. This is again quite standard, perhaps inspired by Nisan’s characterization of the ABP complexity of a non-commutative polynomial as the rank of an associated coefficients matrix [Nis91], which requires this freedom. A more restrictive definition would only allow each edge to be labeled by a linear function in 1 variable. On the other hand, an even more general definition, which we sometimes adopt, is to allow every edge to be labeled by an arbitrary polynomial of degree at most $\Delta$. In this case we refer to the model as an ABP with edge labels of degree at most $\Delta$. Thus, the common case is $\Delta = 1$, but our results are meaningful even when $\Delta = \omega(1)$. Note that this is quite a powerful model, which is allowed to use polynomials with super-polynomial standard circuit complexity “for free”.
We will recall some of these distinctions in Section 1.3, where we discuss previous results, some of which apply to several of the variants discussed here.
1.2 Lower bounds for algebraic branching programs.
Our main result is an almost quadratic lower bound on the size of any algebraic branching program computing some explicit polynomial.
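Theorem 1.2.
Let $\mathbb{F}$ be a field and $n \in \mathbb{N}$ such that $\operatorname{char}(\mathbb{F}) \nmid n$. Then any algebraic branching program over $\mathbb{F}$ computing the polynomial $\sum_{i=1}^{n} x_i^n$ is of size at least $\Omega(n^2)$.
When the ABP’s edge labels are allowed to be polynomials of degree at most $\Delta$, our lower bound is $\Omega(n^2/\Delta)$.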
For the unlayered case, we prove a weaker (but still superlinear) lower bound.
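Theorem 1.3.
Let $\mathbb{F}$ be a field and $n \in \mathbb{N}$ such that $\operatorname{char}(\mathbb{F}) \nmid n$. Then any unlayered algebraic branching program over $\mathbb{F}$ with edge labels of degree at most $\Delta$ computing the polynomial $\sum_{i=1}^{n} x_i^n$ is of size at least $\Omega\left(\frac{n \log n}{\log\log n + \log \Delta}\right)$.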
1.3 Previous work
The best lower bound known for ABPs prior to this work is a lower bound of $\Omega(n \log n)$ on the number of edges for the same polynomial $\sum_{i=1}^{n} x_i^n$. This follows from the classical lower bound of $\Omega(n \log n)$ by Baur and Strassen [Str73a, BS83] on the number of multiplication gates in any algebraic circuit computing the polynomial $\sum_{i=1}^{n} x_i^n$ and the observation that when converting an ABP to an algebraic circuit, the number of product gates in the resulting circuit is at most the number of edges in the ABP. Theorem 1.2 improves upon this bound quantitatively, and also qualitatively, since the lower bound is on the number of vertices in the ABP.
For the case of homogeneous ABPs,² a quadratic lower bound for the polynomial $\sum_{i=1}^{n} x_i^n$ was shown by Kumar [Kum19], and the proofs in this paper build on the ideas in [Kum19]. In a nutshell, the result in [Kum19] is equivalent to a lower bound for ABPs computing the polynomial $\sum_{i=1}^{n} x_i^n$ when the number of layers in the ABP is at most $n$. In this work, we generalize this to proving essentially the same lower bound as in [Kum19] for ABPs with an unbounded number of layers.
² An ABP is homogeneous if the polynomial computed between the start vertex and any other vertex is a homogeneous polynomial. This condition is essentially equivalent to assuming that the number of layers in the ABP is upper bounded by the degree of the output polynomial.
In general, an ABP computing an $n$-variate homogeneous polynomial of degree $d$ can be homogenized with a polynomial blow-up in size. This is proved in a similar manner to the standard classical result which shows this statement for algebraic circuits [Str73b]. Thus, much like the discussion following Definition 1.1, homogeneity is not an issue when one considers polynomial vs. super-polynomial sizes, but becomes relevant when proving polynomial lower bounds. In other contexts in algebraic complexity this distinction is even sharper. For example, exponential lower bounds for homogeneous depth-$3$ circuits are well known and easy to prove [NW97], but strong enough exponential lower bounds for non-homogeneous depth-$3$ circuits would separate $\mathsf{VP}$ from $\mathsf{VNP}$ [GKKS16].
For unlayered ABPs, the situation is more complex. If the edge labels are only functions of one variable, it is possible to adapt Nechiporuk’s method [Nec66] in order to obtain a lower bound of (for a different polynomial than we consider). This is an argument attributed to Pudlák and sketched by Karchmer and Wigderson [KW93] for the boolean model of parity branching programs, but can be applied to the algebraic setting. However, this argument does not extend to the case where the edge labels are arbitrary linear or low-degree polynomials in the variables. The crux of Nechiporuk’s argument is to partition the variables into disjoint sets, to argue (using counting or dimension arguments) that the number of edges labeled by variables from each set must be somewhat large,³ and then to sum the contributions over all sets. This is hard to implement in models where a single edge can have “global” access to all variables, since it is not clear how to avoid over-counting in this case.
³ This is usually guaranteed by constructing a function or a polynomial with the property that given a fixed set $S$ in the partition, there are many subfunctions or subpolynomials on the variables of $S$ that can be obtained by different restrictions of the variables outside of $S$.
As mentioned above, the lower bound of Baur-Strassen does hold in the unlayered case, assuming the edge labels are linear functions in the variables. When we allow edge labels of degree at most $\Delta$ for some $\Delta \ge 2$, their technique does not seem to carry over. Indeed, even if we equip the circuit with the ability to compute such low-degree polynomials “for free”, a key step in the Baur-Strassen proof is the claim that if a polynomial has a circuit of size $s$, then there is a circuit of size $O(s)$ which computes all its first order partial derivatives, and this statement does not seem to hold in this new model.
It is possible to get an $\Omega(n \log n)$ lower bound for this model, for a different polynomial, by suitably extending the techniques of Ben-Or [Ben83, Ben94]. Our lower bounds are weaker by at most a doubly-logarithmic factor; however, the techniques are completely different. Ben-Or’s proofs rely as a black-box on strong modern results in algebraic geometry, whereas our proofs are much more elementary.
1.4 Proof Overview
The first part in the proof of Theorem 1.2 is an extension of the lower bound proved in [Kum19] for ABPs with at most $n$ layers. This straightforward but conceptually important adaptation shows that a similar lower bound holds for any polynomial of the form
$$\sum_{i=1}^{n} x_i^n + \varepsilon(\mathbf{x}),$$
where the suggestively named $\varepsilon(\mathbf{x})$ should be thought of as an “error term” which is “negligible” as far as the proof of [Kum19] is concerned. The exact structure we require is that $\varepsilon(\mathbf{x})$ is of the form $\sum_{i=1}^{r} P_i Q_i + \delta$, where $P_i, Q_i$ are polynomials with no constant term and $\deg(\delta) \le n-1$. The parameter $r$ measures the “size” of the error, which we want to keep small, and the lower bound holds if, e.g., $r$ is at most a small constant fraction of $n$.
To argue about ABPs with layers, with , we show that unless the size of the ABP is too large to begin with (in which case there is nothing to prove), it is possible to find a small set of vertices (of size about ) whose removal adds a small error term as above with at most summands, but also reduces the depth of the ABP by a constant factor. Repeatedly applying this operation times eventually gives an ABP of depth at most while ensuring that we have not accumulated too much “error”,⁴ so that we can apply the lower bound from the previous paragraph.
⁴ It takes some care in showing that the total number of error terms accumulated is at most as opposed to the obvious upper bound of . In particular, we observe that the number of error terms can be upper bounded by a geometric progression with first term roughly and common ratio being a constant less than $1$.
In the full proof we have to be a bit more careful when arguing about the ABP along the steps of the proof above. The details are presented in Section 3.
The proof of Theorem 1.3 follows the same strategy, although the main impediment is that general undirected graphs can have much more complex structure than layered graphs. One of the main ingredients in our proof is (a small variant of) a famous lemma of Valiant [Val77], which shows that for every graph of depth $2^k$ with $m$ edges, it is possible to find a set of edges, of size at most $m/k$, whose removal reduces the depth of the graph to $2^{k-1}$. This lemma helps us identify a small set of vertices which can reduce the depth of the graph by a constant factor while again accumulating small error terms.
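A depth-reduction step of this kind can be sketched with the standard bit-labelling argument; this is an illustration of one common proof of such a lemma, not necessarily the exact variant used in this paper. Assumptions: vertices are the integers `0..n-1` in topological order (every edge `(u, v)` has `u < v`), the depth of a vertex is the length of a longest path ending at it, and all function names are hypothetical. Each edge is assigned the most significant bit in which the depths of its endpoints differ; if all depths fit in $k$ bits, some class has at most $m/k$ edges, and deleting it leaves a graph whose depths fit in $k-1$ bits.

```python
def vertex_depths(n_vertices, edges):
    """Longest-path depth of every vertex (vertices are topologically ordered)."""
    depth = [0] * n_vertices
    # Sorting by tail guarantees depth[u] is final before edges out of u are used.
    for u, v in sorted(edges):
        depth[v] = max(depth[v], depth[u] + 1)
    return depth


def reduce_depth(n_vertices, edges):
    """Return (removed_edges, remaining_edges) for one depth-halving step."""
    depth = vertex_depths(n_vertices, edges)
    k = max(depth).bit_length()  # all depths are < 2 ** k
    if k == 0:
        return set(), list(edges)  # no edges on any path: nothing to do
    classes = {bit: [] for bit in range(k)}
    for u, v in edges:
        # Most significant bit where the two depths differ.
        bit = (depth[u] ^ depth[v]).bit_length() - 1
        classes[bit].append((u, v))
    # By pigeonhole, the smallest of the k classes has at most m / k edges.
    smallest = min(classes, key=lambda b: len(classes[b]))
    removed = set(classes[smallest])
    remaining = [e for e in edges if e not in removed]
    return removed, remaining


# Hypothetical examples: a path and a complete DAG on 8 vertices (depth 7).
PATH_EDGES = [(i, i + 1) for i in range(7)]
DENSE_EDGES = [(i, j) for i in range(8) for j in range(i + 1, 8)]
```

After removing the chosen class, deleting the corresponding bit from every depth label still gives labels that strictly increase along each remaining edge, which is why the new depth is below $2^{k-1}$.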
Interestingly, Valiant originally proved this lemma in a different context, where he showed that linear algebraic circuits of depth $O(\log n)$ and size $O(n)$ can be reduced to a special type of depth-$2$ circuits (and thus strong lower bounds on such circuits imply super-linear lower bounds on circuits of depth $O(\log n)$). This lemma can also be used to show that boolean circuits of depth $O(\log n)$ and size $O(n)$ can be converted to depth-$3$ circuits of size $2^{o(n)}$, and thus again strong lower bounds on depth-$3$ circuits will imply super-linear lower bounds on circuits of depth $O(\log n)$. Both of these questions continue to be well-known open problems in algebraic and boolean complexity, and to the best of our knowledge, our proof is the first time Valiant’s lemma is successfully used in order to prove circuit lower bounds for explicit functions or polynomials.
2 Notations and Preliminaries
All logarithms in the paper are base 2.
We use some standard graph theory terminology: If $G$ is a directed graph and $(u, v)$ is an edge, $v$ is called the head of the edge and $u$ the tail. Our directed graphs are always acyclic with designated source vertex $s$ and sink vertex $t$. The depth of a vertex $v$, denoted $\operatorname{depth}(v)$, is the length (in edges) of a longest path from $s$ to $v$. The depth of the graph, denoted by $\operatorname{depth}(G)$, is the depth of $t$.
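For any two vertices $u$ and $v$ in an ABP, the polynomial computed between $u$ and $v$ is the sum of weights of all paths between $u$ and $v$ in the ABP. We denote this by $[u, v]$.
The formal degree of a vertex $v$ in an ABP, denoted $\operatorname{fdeg}(v)$, is defined inductively as follows: If $v$ is the start vertex of the ABP, $\operatorname{fdeg}(v) = 0$. If $v$ is a vertex with incoming edges from $u_1, \ldots, u_k$, labeled by non-zero polynomials $\ell_1, \ldots, \ell_k$, respectively, then
$$\operatorname{fdeg}(v) = \max_{i \in [k]} \left\{ \operatorname{fdeg}(u_i) + \deg(\ell_i) \right\}.$$
It follows by induction that for every vertex $v$, $\deg([s, v]) \le \operatorname{fdeg}(v)$, where $s$ is the start vertex (however, cancellations can allow for arbitrary gaps between the two). The formal degree of the ABP is the maximal formal degree of any vertex in it.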
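A minimal sketch of the inductive computation of formal degrees follows. It assumes the usual rule that the formal degree of the start vertex is $0$ and that the formal degree of any other vertex is the maximum, over its incoming edges, of the formal degree of the tail plus the degree of the edge label. The input format (a topological order and, per vertex, a list of `(tail, label_degree)` pairs) is hypothetical.

```python
def formal_degrees(order, in_edges):
    """Formal degree of every vertex.

    order:    vertices in topological order; order[0] is the start vertex.
    in_edges: in_edges[v] is a non-empty list of (u, d), one per incoming
              edge of v, where u is the tail and d the degree of its label.
    """
    fdeg = {order[0]: 0}
    for v in order[1:]:
        fdeg[v] = max(fdeg[u] + d for u, d in in_edges[v])
    return fdeg


# Hypothetical example with a cancellation: edges s->a labeled x, s->b
# labeled -x, and a->t, b->t both labeled by the constant 1. The polynomial
# computed at t is x - x = 0, yet the formal degree of t is 1.
ORDER = ["s", "a", "b", "t"]
IN_EDGES = {
    "a": [("s", 1)],
    "b": [("s", 1)],
    "t": [("a", 0), ("b", 0)],
}
```

The example shows why the formal degree only upper bounds the actual degree: it depends on the degrees of the labels alone, not on the polynomials they multiply out to.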
We sometimes denote by
the vector of variables, where is understood from the context. Similarly we use to denote the -dimensional vector .
2.1 A decomposition lemma
The following lemma gives a decomposition of a (possibly unlayered) ABP in terms of the intermediate polynomials it computes. Its proof closely resembles that of Lemma 3.5 of [Kum19]. For completeness we prove it here for a slightly more general model.
Lemma 2.1 ([Kum19]).
Let be a (possibly unlayered) algebraic branching program whose edge labels are arbitrary polynomials of degree at most , which computes a degree polynomial , and has formal degree . Set .
For any , let be the set of all vertices in whose formal degree is in the interval .
Then, there exist polynomials and , each of degree at most such that
Fix as above and set as above (observe that since each edge label is of degree at most , is non empty). Further suppose, without loss of generality, that the elements of are ordered such that there is no directed path from to for .
Consider the unlayered ABP obtained from by erasing all incoming edges to , and multiplying all the labels of the outgoing edges from by a new variable . The ABP now computes a polynomial of the form
where . is the polynomial obtained from by setting to zero, or equivalently, removing and all its outgoing edges. We continue in the same manner with to obtain
Indeed, observe that since there is no path from to for , removing does not change . The bound on the degrees of is immediate from the fact that the formal degree of the ABP is at most and . It remains to argue the .
The polynomial is obtained from by erasing all the vertices in and the edges touching them. We will show that every path in the corresponding ABP computes a polynomial of degree at most . Let be such a path, which is also a path in . Let be the minimal vertex in the path whose degree (in ) is at least (if no such exists, the proposition follows). As , the formal degree of is at most . The degree of the polynomial computed by this path is thus at most , where is the degree of product of the labels on the path . To complete the proof, it remains to be shown that .
Indeed, if then since the degree of is at least , there would be in a path of formal degree at least , contradicting the assumption on . ∎
3 A lower bound for Algebraic Branching Programs
In this section we prove Theorem 1.2. We start by restating it.
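Theorem 3.1.
Let $n \in \mathbb{N}$ and let $\mathbb{F}$ be a field such that $\operatorname{char}(\mathbb{F}) \nmid n$. If $\mathcal{A}$ is an algebraic branching program with edge labels of degree at most $\Delta$ that computes the polynomial $\sum_{i=1}^{n} x_i^n$, then the size of $\mathcal{A}$ is at least $\Omega(n^2/\Delta)$.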
For technical reasons, we work with a slightly more general model which we call multilayered ABPs, which we now define.
Definition 3.2 (Multilayered ABP).
Let be ABPs with layers and vertices, respectively. A multilayered ABP , denoted by , is the ABP obtained by placing in parallel and identifying their start and end vertices respectively. Thus, the polynomial computed by is , where is the polynomial computed by .
The number of layers of is . The size of is the number of vertices in , and thus equals
This model is an intermediate model between (layered) ABPs and unlayered ABPs: given a multilayered ABP of size it is straightforward to construct an unlayered ABP of size which computes the same polynomial.
3.1 A robust lower bound for ABPs of formal degree at most
In this section, we prove a lower bound for the case where the formal degree of every vertex in the ABP is at most . In fact, Kumar [Kum19] has already proved a quadratic lower bound for this case.
Theorem 3.3 ([Kum19]).
Let and let be a field such that . Then any algebraic branching program of formal degree at most which computes the polynomial has at least vertices.
However, to prove Theorem 3.1, we need the following more “robust” version of Theorem 3.3, which gives a lower bound for a larger class of polynomials. For completeness, we also sketch an argument for the proof which is a minor variation of the proof of Theorem 3.3.
Theorem 3.4.
Let and let be a field such that . Let and be polynomials such that for every , and is a polynomial of degree at most . Then, any algebraic branching program over , of formal degree at most and edge labels of degree at most , which computes the polynomial
has at least vertices.
Let , and let be an algebraically closed field such that . Let be a set of polynomials in such that the set of their common zeros
is non-empty. Finally, suppose is a polynomial in of degree at most , such that
Since , (see, e.g., Section 2.8 of [Smi14]). Thus, the set of zeros with multiplicity two of
has dimension at least . In other words, if is the set of common zeros of the set of all first order partial derivatives of , then . Up to scaling by (which is non-zero in , by assumption), the set of all first order partial derivatives of is given by
Thus, the statement of this lemma immediately follows from the following claim.
Claim 3.6 (Lemma 3.2 in [Kum19]).
Let be an algebraically closed field, and a positive natural number. For every choice of polynomials of degree at most , the dimension of the variety
Indeed, the above claim shows that , and so . This completes the proof of Lemma 3.5. ∎
Proof of Theorem 3.4.
Let be an algebraic branching program of formal degree at most , edge labels of degree at most , and with start vertex and end vertex , which computes
We may assume without loss of generality that is algebraically closed, by interpreting as an ABP over the algebraic closure of , if necessary.
Let , fix , and let be the set of all vertices in whose formal degree lies in the interval . Letting , by Lemma 2.1, there exist polynomials and , each of degree at most such that
Let , be the constant terms in , respectively. Then by defining
we have that
Here, . We now have that for every , the constant terms of , are zero and . Let
Then , and so . Thus by Lemma 3.5, we know that .
Finally, for , if . Thus, the number of vertices in must be at least
3.2 A lower bound for the general case
The following lemma shows how we can obtain, given an ABP with layers which computes a polynomial , a multilayered ABP, whose number of layers is significantly smaller, which computes plus a small “error term”.
Let be an ABP over a field with layers, which computes the polynomial and has vertices. Let and be the start and end vertices of respectively, and let be the set of vertices in the -th layer of . For every , let and be the constant terms of and respectively. Furthermore, let and be polynomials such that and .
Then, there is a multilayered ABP , with at most layers and size at most that computes the polynomial
Let be the vertices in as described, so that
Further, for every , and , where the constant terms of and are zero (by definition). Having set up this notation, we can thus express the polynomial computed by as
On further rearrangement, this gives
This is equivalent to the following expression.
Now, observe that the polynomial is computable by an ABP with layers, obtained by just keeping the vertices and edges within first layers of and the end vertex , deleting all other vertices and edges, and connecting the vertex in the -th layer to by an edge of weight . Similarly, the polynomial is computable by an ABP with at most layers, whose set of vertices is along the vertices in the layers of . From the definition of and , it follows that the multilayered ABP obtained by taking the sum of and has at most layers.
We are almost done with the proof of the lemma, except for the upper bound on the number of vertices of the resulting multilayered ABP , and the fact that the upper bound on the depth is slightly weaker than claimed. Both these issues can be solved simultaneously.
The vertices in appear in both the ABP and the ABP and are counted twice in the size of . However, every other vertex is counted exactly once. Hence,
In order to fix this issue, we first observe that the edges between the vertices in the -th layer of and the end vertex are labeled by , all of which are field constants. In the following claim, we argue that for ABPs with this additional structure, the last layer is redundant and can be removed.
Let be an ABP over with layers and edge labels of degree at most such that the labels of all the edges between the -th layer of and its end vertex are scalars in . Then, there is an ABP with layers computing the same polynomial as , with edge labels of degree at most , such that
where is the set of vertices in the -th layer of .
An analogous statement, with an identical proof, is true if we assume that all edge labels between the first and second layer are scalars in .
We first use Claim 3.9 to complete the proof of the lemma. As observed above, the edge labels between the last layer of and its end vertex are all constants. Hence, by Claim 3.9, there is an ABP which computes the same polynomial as such that , and has only layers. Similarly, we can obtain an ABP with at most layers.
Proof of Claim 3.9.
For the proof of the claim, we focus on the -th and -st layer of . To this end, we first set up some notation. Let be the set of vertices in the -th layer of , be the set of vertices in -st layer of , and , denote the start and the end vertices of respectively. Then, the polynomial computed by , can be decomposed as
Note that is an edge in the ABP. Similarly, the polynomial can be written as
Combining the two expressions together, we get
which on further rearrangement, gives us
From the hypothesis of the claim, we know that for every , the edge label is a field constant, and the edge label is a polynomial of degree at most . Thus, for every , the expression is a polynomial of degree at most .
This gives us the following natural construction for the ABP from . We delete the vertices in (and hence, all edges incident to them), and for every , we connect the vertex with the end vertex using an edge with label . The upper bound on the size and the number of layers of is immediate from the construction, and that it computes the same polynomial as follows from Equation 3.10. ∎
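The construction in this proof can be sketched as follows, under assumed conventions that are not from the paper: affine forms are pairs `(constant, {variable: coefficient})`, `labels_into_last[u][v]` is the label of the edge from vertex `u` of the second-to-last inner layer to vertex `v` of the last inner layer (or `None` if absent), and `scalars_to_end[v]` is the field constant on the edge from `v` to the end vertex. The merged label of the new edge from `u` to the end vertex is the scalar combination of the old labels, so its degree does not increase.

```python
def scale(form, c):
    """Multiply an affine form by the scalar c."""
    const, coeffs = form
    return (c * const, {var: c * a for var, a in coeffs.items()})


def add(f, g):
    """Add two affine forms."""
    coeffs = dict(f[1])
    for var, a in g[1].items():
        coeffs[var] = coeffs.get(var, 0) + a
    return (f[0] + g[0], coeffs)


def eval_affine(form, point):
    """Evaluate an affine form at a point."""
    const, coeffs = form
    return const + sum(c * point[var] for var, c in coeffs.items())


def merge_last_layer(labels_into_last, scalars_to_end):
    """Labels of the new edges that go directly to the end vertex."""
    merged = []
    for row in labels_into_last:
        acc = (0, {})
        for form, c in zip(row, scalars_to_end):
            if form is not None:
                acc = add(acc, scale(form, c))  # contribution of path u -> v -> end
        merged.append(acc)
    return merged


# Hypothetical example: two vertices feeding a last inner layer of width two.
X1 = (0, {"x1": 1})
EXAMPLE_LABELS = [[X1, (1, {"x2": 1})], [(3, {}), X1]]
EXAMPLE_SCALARS = [2, 5]
```

After the merge, the vertices of the last inner layer can be deleted, which removes one layer without changing the computed polynomial.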
We now state and prove a simple generalization of Lemma 3.7 for a multilayered ABP.
Let be a multilayered ABP with layers over a field computing the polynomial , such that each is an ABP with layers. Also, let be the number of vertices in the -th layer of ( if has fewer than layers), and .
Then, there is a multilayered ABP with at most layers and size at most that computes a polynomial of the form
where is a set of non-constant polynomials with constant term zero and .
Let be the natural number which minimizes the quantity , and let be the set of all indices such that has at least layers. Let and . Thus,
Here, is a multilayered ABP with at most layers. Moreover, .
To complete the proof of this lemma, we will now apply Lemma 3.7 to every ABP in . For every , we know that there exist some polynomials with constant terms zero and a constant , such that
can be computed by a multilayered ABP. Let us denote this multilayered ABP by . From Lemma 3.7, we know that has at most layers and size at most . Taking a sum over all and re-indexing the summands, we get that there exist polynomials with constant terms zero and a constant such that the polynomial
is computable by a multilayered ABP with at most layers and size at most . Now, by combining the multilayered ABPs and , we get that the polynomial
is computable by a multilayered ABP with at most layers and size at most . ∎
Proof of Theorem 3.1.
Let be a multilayered ABP with layers which computes the polynomial . As before we may assume without loss of generality that the underlying field is algebraically closed. Note that if is at most , then by Theorem 3.4, we know that is at least and we are done. Also, if , then again we have our lower bound since each layer of must have at least one vertex. Thus, we can assume that .
The proof idea is to iteratively make changes to till we get a multilayered ABP of formal degree at most that computes a polynomial of the type
where and are polynomials such that for every , and has degree at most . Once we have this, we can invoke Theorem 3.4 and get the required lower bound.
We now explain how to iteratively obtain from . In one step, we ensure the following.
Let be a multilayered ABP with edge labels of degree at most , layers and size at most that computes a polynomial of the form where are polynomials such that for every , and has degree at most .
If , then there exists a multilayered ABP with at most layers and size at most which computes a polynomial of the form
such that and are polynomials such that for every , and has degree at most .
If , the statement of the theorem follows. Otherwise, we apply Claim 3.12 iteratively times, as long as the number of layers is more than , to eventually get a multilayered ABP with layers. Let denote the number of layers in each ABP in this sequence, so that , and for . is an ABP with at most layers and size at most , which by induction, computes a polynomial of the form
where are polynomials such that for every , and has degree at most . Further, the number of error terms, , is at most
Since , we have that for all , so that
At this point, since the formal degree is at most , using Theorem 3.4 we get
Proof of Claim 3.12.
Let , and for , let be the number of vertices in layer of . Recall that if the number of layers in is strictly less than , then we set . Let