Is there an efficient computational task that cannot be perfectly parallelized? Equivalently, is $\mathbf{NC}^1 \ne \mathbf{P}$? The answer is still unknown. The question can be rephrased as follows: is there a function in $\mathbf{P}$ that does not have a (De Morgan) formula of polynomial size?
The history of formula lower bounds for functions in $\mathbf{P}$ goes back to the 1960s, with the seminal result of Subbotovskaya [Sub61], which introduced the technique of random restrictions. Subbotovskaya showed that the Parity function on $n$ variables requires formulas of size at least $\Omega(n^{1.5})$. Khrapchenko [Khr72], using a different proof technique, showed that in fact the Parity function on $n$ variables requires formulas of size $\Omega(n^2)$. Later, Andreev [And87] came up with a new explicit function (now known as the Andreev function) for which he was able to obtain an $n^{2.5-o(1)}$ size lower bound. This lower bound was subsequently improved by [IN93, PZ93, Hås98, Tal14] to $n^{3-o(1)}$.
The line of work initiated by Subbotovskaya and Andreev relies on the shrinkage of formulas under $p$-random restrictions. A $p$-random restriction is a randomly chosen partial assignment to the inputs of a function. Set a parameter $p \in (0,1)$. We fix each variable independently with probability $1-p$ to a uniformly random bit, and we keep the variable alive with probability $p$. Under such a restriction, formulas shrink (in expectation) by a factor that is significantly better than the trivial factor of $p$. Subbotovskaya showed that De Morgan formulas shrink to at most $O(p^{1.5})$ times their original size, whereas subsequent works of Impagliazzo and Nisan [IN93] and Paterson and Zwick [PZ93] improved the bound to $O(p^{1.55})$ and $O(p^{1.63})$, respectively. Finally, Håstad [Hås98] showed that the shrinkage exponent of De Morgan formulas is $2$, or in other words, that De Morgan formulas shrink by a factor of $p^{2-o(1)}$ under $p$-random restrictions. Tal [Tal14] improved the shrinkage factor to $O(p^2)$, obtaining a tight result, as exhibited by the Parity function.
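To make $p$-random restrictions concrete, here is a small Python sketch (our own illustration, not taken from the paper): a De Morgan formula is a nested tuple, a $p$-random restriction keeps each variable alive with probability $p$ and fixes it to a random bit otherwise, and `restrict` applies the restriction with straightforward constant propagation. The quadratic-size parity formula below is the standard recursive construction.

```python
import random

def size(phi):
    """Number of leaves of a De Morgan formula given as nested tuples."""
    if phi[0] == 'lit':
        return 1
    if phi[0] == 'const':
        return 0
    return size(phi[1]) + size(phi[2])

def restrict(phi, rho):
    """Apply a restriction rho (dict: var -> 0, 1, or '*' for alive)
    and simplify by constant propagation."""
    if phi[0] == 'const':
        return phi
    if phi[0] == 'lit':
        _, v, neg = phi
        if rho[v] == '*':
            return phi
        return ('const', rho[v] ^ neg)
    op, l, r = phi
    l, r = restrict(l, rho), restrict(r, rho)
    absorbing = 0 if op == 'and' else 1   # 0 absorbs AND, 1 absorbs OR
    if ('const', absorbing) in (l, r):
        return ('const', absorbing)
    if l == ('const', 1 - absorbing):
        return r
    if r == ('const', 1 - absorbing):
        return l
    return (op, l, r)

def parity(vs):
    """Formulas for the parity of vs and its negation; size ~ |vs|^2."""
    if len(vs) == 1:
        return ('lit', vs[0], 0), ('lit', vs[0], 1)
    ap, an = parity(vs[:len(vs) // 2])
    bp, bn = parity(vs[len(vs) // 2:])
    pos = ('or', ('and', ap, bn), ('and', an, bp))
    neg = ('or', ('and', ap, bp), ('and', an, bn))
    return pos, neg

def p_random_restriction(vars_, p, rng):
    """Keep each variable alive w.p. p; else fix to a uniform bit."""
    return {v: '*' if rng.random() < p else rng.randint(0, 1) for v in vars_}
```

Averaging `size(restrict(pos, ...))` over many samples gives an empirical view of shrinkage; note that this crude simplification only propagates constants, so it yields an upper bound on the fully simplified size.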
In a nutshell, shrinkage results are useful for proving lower bounds as long as the explicit function being analyzed maintains structure under such restrictions and does not trivialize. For example, the Parity function does not become constant as long as at least one variable remains alive. Thus any formula that computes Parity must be of at least quadratic size, or else the formula under a restriction, keeping each variable alive with probability $p$, would likely become a constant function, whereas Parity would not. Andreev's idea is similar, though he manages to construct a function such that under a random restriction keeping only $O(\log n)$ of the variables alive, the restricted formula size should still be at least $\Omega(n/\log n)$ (in expectation). This ultimately gives the nearly cubic lower bound.
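To see how the parameters in Andreev's argument combine, here is a back-of-the-envelope version of the calculation (the constants and exact log factors are indicative only):

```latex
\[
  \underbrace{O(p^2)\cdot L(\varphi)}_{\text{expected size after shrinkage}}
  \;\ge\; \Omega\!\Big(\frac{n}{\log n}\Big)
  \quad\text{with}\quad p = \Theta\!\Big(\frac{\log n}{n}\Big)
  \;\Longrightarrow\;
  L(\varphi) \;\ge\; \Omega\!\Big(\frac{n^3}{\log^3 n}\Big) \;=\; n^{3-o(1)}.
\]
```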
The KRW Conjecture.
Despite much effort, proving $\mathbf{P} \ne \mathbf{NC}^1$, and even just breaking the cubic barrier in formula lower bounds, has remained a challenge for more than two decades. An approach to solving the $\mathbf{P}$ versus $\mathbf{NC}^1$ problem was suggested by Karchmer, Raz and Wigderson [KRW95]. They conjectured that when composing two Boolean functions, $f$ and $g$, the formula size of the resulting function, $f \circ g$, is (roughly) the product of the formula sizes of $f$ and $g$. (More precisely, the original KRW conjecture [KRW95] concerns depth complexity rather than formula complexity. The variant of the conjecture for formula complexity, which is discussed above, was posed in [GMWW17].) We will refer to this conjecture as the "KRW conjecture". Under the KRW conjecture (and even under weaker variants of it), [KRW95] constructed a function in $\mathbf{P}$ with no polynomial-size formulas. It remains a major open challenge to settle the KRW conjecture.
A few special cases of the KRW conjecture are known to be true. The conjecture holds when either $f$ or $g$ is the AND or the OR function. Håstad's result [Hås98] and its improvement [Tal14] show that the conjecture holds when the inner function $g$ is the Parity function and the outer function $f$ is any function. This gives an alternative explanation of the $n^{3-o(1)}$ lower bound for the Andreev function. Indeed, the Andreev function is at least as hard as the composition $f \circ g$ of a maximally-hard function $f$ on $\log n$ bits and $g = \oplus_{n/\log n}$, where the formula size of $f$ is $\widetilde{\Theta}(n)$ and the formula size of $g$ is $\widetilde{\Theta}(n^2)$. Since the KRW conjecture holds for this special case, the formula size of the Andreev function is at least $n^{3-o(1)}$. In other words, the state-of-the-art formula size lower bounds for explicit functions follow from a special case of the KRW conjecture: the case in which $g$ is the Parity function. Moreover, this special case follows from the shrinkage of De Morgan formulas under $p$-random restrictions.
Bottom-Up versus Top-Down Techniques.
Whereas random restrictions are a "bottom-up" proof technique [HJP95], a different line of work suggested a "top-down" approach using the language of communication complexity. The connection between formula size and communication complexity was introduced in the seminal work of Karchmer and Wigderson [KW90]. They defined, for any Boolean function $f$, a two-party communication problem $KW_f$: Alice gets an input $x$ such that $f(x)=1$, and Bob gets an input $y$ such that $f(y)=0$. Their goal is to identify a coordinate $i$ on which $x_i \ne y_i$, while minimizing their communication. It turns out that there is a one-to-one correspondence between protocol trees solving $KW_f$ and formulas computing the function $f$. Since protocols naturally traverse the tree from root to leaf, proving lower bounds on their size or depth is usually done in a top-down fashion. This framework has proven to be very useful in proving formula lower bounds in the monotone setting (see, e.g., [KW90, GH92, RW92, KRW95, RM99, GP18, PR17]) and in studying the KRW conjecture (see, e.g., [KRW95, EIRS01, HW93, GMWW17, DM18, KM18, Mei20, dRMN20, MS20]). Moreover, a recent work by Dinur and Meir [DM18] was able to reprove Håstad's cubic lower bound using the framework of Karchmer and Wigderson. As Dinur and Meir's proof showed that top-down techniques can replicate Håstad's cubic lower bound, a natural question (which motivated this project) arose:
Are top-down techniques superior to bottom-up techniques?
Towards answering this question, we focused on a candidate problem: prove a cubic lower bound for an explicit function in $\mathbf{AC}^0$. (Recall that $\mathbf{AC}^0$ is the class of functions computed by constant-depth polynomial-size circuits composed of AND and OR gates of unbounded fan-in, with variables or their negations at the leaves.) Based on the work of Dinur and Meir [DM18], we suspected that such a lower bound could be achieved using top-down techniques. We were also certain that the problem cannot be solved using the random restriction technique. Indeed, in order to prove a lower bound on a function $f$ using random restrictions, one should argue that $f$ remains hard under a random restriction; however, it is well known that functions in $\mathbf{AC}^0$ trivialize under $p$-random restrictions [Ajt83, FSS84, Yao85, Hås86]. Based on this intuition, surely random restrictions cannot show that a function in $\mathbf{AC}^0$ requires cubic size. Our intuition turned out to be false.
1.2 Our results
In this work, we construct an explicit function in $\mathbf{AC}^0$ which requires De Morgan formulas of size $n^{3-o(1)}$. Surprisingly, our proof is conducted via the bottom-up technique of random projections, which is a generalization of random restrictions (more details below).
Theorem 1.1. There exists a family of Boolean functions $h_n \colon \{0,1\}^n \to \{0,1\}$, for $n \in \mathbb{N}$, such that:
$h_n$ can be computed by uniform constant-depth unbounded fan-in formulas of polynomial size.
The formula size of $h_n$ is at least $n^{3-o(1)}$.
Our hard function is a variant of the Andreev function. More specifically, recall that the Andreev function is based on the composition $f \circ \oplus$, where $f$ is a maximally-hard function and $\oplus$ is the Parity function. Since Parity is not in $\mathbf{AC}^0$, we cannot take the inner function to be Parity in our construction. Instead, our hard function is obtained by replacing the Parity function with the Surjectivity function of [BM12].
As in the case of the Andreev function, we establish the hardness of our function by proving an appropriate special case of the KRW conjecture. To this end, we introduce a generalization of the complexity measure of Khrapchenko [Khr72], called the min-entropy Khrapchenko bound. We prove the KRW conjecture for the special case in which the outer function $f$ is any function, and the inner function $g$ is a function whose formula complexity is bounded tightly by the min-entropy Khrapchenko bound. We then obtain Theorem 1.1 by applying this version of the KRW conjecture to the case where $g$ is the Surjectivity function. We note that our KRW result also implies the known lower bounds in the cases where $g$ is the Parity function [Hås98] and the Majority function [GTN19].
Our KRW result in fact applies more generally, to functions whose formula complexity is bounded tightly by the "soft-adversary method", which is a generalization of Ambainis' unweighted adversary method [Amb02] (see Section 6.2).
Our proof of the special case of the KRW conjecture follows the methodology of Håstad [Hås93], who proved the special case in which $g$ is Parity on $m$ variables. Håstad proved that De Morgan formulas shrink by a factor of (roughly) $p^2$ under $p$-random restrictions. Choosing $p \approx 1/m$ shrinks a formula for $f \circ g$ by a factor of roughly $1/m^2$, which coincides with the formula complexity of $g$. On the other hand, on average each copy of $g$ simplifies to a single input variable, and so $f \circ g$ simplifies to $f$. This shows that $L(f \circ g) \gtrsim m^2 \cdot L(f) \approx L(g) \cdot L(f)$.
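The argument above can be summarized in one chain of approximate inequalities (up to constant factors and lower-order terms); the middle quantity is the expected size of the restricted formula:

```latex
\[
  p^2 \cdot L(f \circ g)
  \;\gtrsim\; \mathbb{E}_{\boldsymbol{\rho}}\big[L\big((f\circ g)|_{\boldsymbol{\rho}}\big)\big]
  \;\gtrsim\; L(f),
  \qquad p \approx \tfrac{1}{m}
  \;\Longrightarrow\;
  L(f \circ g) \;\gtrsim\; m^2 \cdot L(f) \;\approx\; L(g)\cdot L(f).
\]
```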
Our main technical contribution is a new shrinkage theorem that works in a far wider range of scenarios than just $p$-random restrictions. Given a function $g$ with soft-adversary bound $A(g)$, we construct a random projection (a projection is a mapping from the set of variables $\{x_1,\dots,x_n\}$ to the set $\{y_1,\bar y_1,\dots,y_m,\bar y_m,0,1\}$, where $y_1,\dots,y_m$ are formal variables) which, on the one hand, shrinks De Morgan formulas by a factor of roughly $1/A(g)$, and on the other hand, simplifies $f \circ g$ to $f$. We thus show that $L(f \circ g) \gtrsim A(g) \cdot L(f)$, and in particular, if $A(g) \approx L(g)$, then $L(f \circ g) \gtrsim L(g) \cdot L(f)$, just as in Håstad's proof. Our random projections are tailored specifically to the structure of the function $g$, ensuring that $f \circ g$ simplifies to $f$ under projection. This enables us to overcome the aforementioned difficulty. In contrast, $p$-random restrictions that do not respect the structure of $g$ would likely result in a restricted function that is much simpler than $f$, and in fact would be a constant function with high probability.
Our shrinkage theorem applies more generally to two types of random projections, which we call fixing projections and hiding projections. Fixing projections are random projections in which fixing the value of a variable results in a projection which is much more probable. Hiding projections are random projections in which fixing the value of a variable hides which coordinates it appeared on. We note that our shrinkage theorem for fixing projections captures Håstad’s result for -random restrictions as a special case.
The proof of our shrinkage theorem is based on Håstad's proof [Hås98], but also simplifies it. In particular, we take the simpler argument that Håstad uses for the special case of completely balanced trees, and adapt it to the general case. As such, our proof avoids a complicated case analysis, at the cost of slightly worse bounds. Using our bounds, it is nevertheless easy to obtain the nearly cubic lower bound for the Andreev function. Therefore, one can see the specialization of our shrinkage result to $p$-random restrictions as an exposition of Håstad's cubic lower bound.
An example: our techniques when specialized to Majority.
To illustrate our choice of random projections, we present its instantiation in the special case of $f \circ \mathrm{MAJ}_k$, where $f$ is non-constant and $\mathrm{MAJ}_k$ is the Majority function on $k$ variables, for some odd integer $k$. In this case, the input variables to $f \circ \mathrm{MAJ}_k$ are composed of $m$ disjoint blocks, $B_1,\dots,B_m$, each containing $k$ variables. We use the random projection that, for each block $B_i$, picks one variable in the block uniformly at random, projects this variable to the new variable $y_i$, and fixes the rest of the variables in the block in a balanced way so that the number of zeros and ones in the block is equal (i.e., we have exactly $(k-1)/2$ zeros and $(k-1)/2$ ones). It is not hard to see that under this choice, $f \circ \mathrm{MAJ}_k$ simplifies to $f$. On the other hand, we show that this choice of random projections shrinks the formula complexity by a factor of roughly $1/k^2$. Combining the two together, we get that $L(f \circ \mathrm{MAJ}_k) \gtrsim k^2 \cdot L(f)$. Note that in this distribution of random projections, the different coordinates are not independent of one another, and this feature allows us to maintain structure.
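The following Python sketch (our own toy illustration; all names are ours) samples this block-structured distribution and checks the key property: because the fixed part of each block is balanced, each composed copy of $\mathrm{MAJ}_k$ collapses to the single surviving variable.

```python
import random

def sample_maj_projection(m, k, rng):
    """Sample the tailored projection for f o MAJ_k (k odd): in each of
    the m blocks of k variables, one uniformly chosen variable is
    projected to the fresh variable y_i; the remaining k-1 variables
    are fixed with exactly (k-1)//2 zeros and (k-1)//2 ones."""
    assert k % 2 == 1
    proj = {}  # maps (block, position) -> ('y', block) or 0/1
    for i in range(m):
        alive = rng.randrange(k)
        fills = [0] * ((k - 1) // 2) + [1] * ((k - 1) // 2)
        rng.shuffle(fills)
        it = iter(fills)
        for j in range(k):
            proj[(i, j)] = ('y', i) if j == alive else next(it)
    return proj

def maj_after_projection(proj, i, k, y_i):
    """Evaluate MAJ_k on block i once the projected variable is set to y_i."""
    vals = [y_i if proj[(i, j)] == ('y', i) else proj[(i, j)]
            for j in range(k)]
    return int(sum(vals) > k // 2)
```

Since the $k-1$ fixed variables split evenly, the majority of the block always equals the surviving variable $y_i$, which is exactly why $f \circ \mathrm{MAJ}_k$ simplifies to $f$ under this projection.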
1.3 Related work
Our technique of using tailor-made random projections was inspired by the celebrated result of Rossman, Servedio, and Tan [RST15, HRST17] that proved an average-case depth hierarchy theorem. In fact, the idea to use tailor-made random restrictions goes back to Håstad's thesis [Hås87, Chapter 6.2]. Similarly to our case, in [Hås87, RST15, HRST17], $p$-random restrictions are too crude to separate depth-$d$ from depth-$(d-1)$ circuits. Given a circuit $C$ of depth $d$, the main challenge is to construct a distribution of random restrictions or projections (tailored to the circuit $C$) that on the one hand maintains structure for $C$, but on the other hand simplifies any circuit $C'$ of depth $d-1$.
The paper starts with brief preliminaries in Section 2. We prove our shrinkage theorem for fixing projections in Section 3, and our shrinkage theorem for hiding projections in Section 4. In Section 5 we provide a brief interlude on concatenation of projections. Khrapchenko’s method, the quantum adversary bound and their relation to hiding projections are discussed in Section 6. Finally, Section 7 contains a proof of Theorem 1.1, as a corollary of a more general result which is a special case of the KRW conjecture. In the same section we also rederive the cubic lower bound on Andreev’s function, and the cubic lower bound on the Majority-based variant considered in [GTN19].
Throughout the paper, we use bold letters to denote random variables. For any $n \in \mathbb{N}$, we denote by $[n]$ the set $\{1,\dots,n\}$. Given a bit $b \in \{0,1\}$, we denote its negation by $\bar b$. We assume familiarity with the basic definitions of communication complexity (see, e.g., [KN97]). All logarithms in this paper are base $2$.
A (De Morgan) formula (with bounded fan-in) is a binary tree whose leaves are labeled with literals from the set $\{x_1,\bar x_1,\dots,x_n,\bar x_n\}$, and whose internal vertices are labeled as AND ($\wedge$) or OR ($\vee$) gates. The size of a formula $\varphi$, denoted $|\varphi|$, is the number of leaves in the tree. The depth of the formula is the depth of the tree. A formula with unbounded fan-in is defined similarly, but every internal vertex in the tree can have any number of children. Unless stated explicitly otherwise, whenever we say "formula" we refer to a formula with bounded fan-in.
A formula computes a Boolean function in the natural way. The formula complexity of a Boolean function $f$, denoted $L(f)$, is the size of the smallest formula that computes $f$. The depth complexity of $f$, denoted $D(f)$, is the smallest depth of a formula that computes $f$. For convenience, we define the size and depth of the constant functions to be zero.
A basic property of formula complexity is that it is subadditive:
For every two functions $f, g$ it holds that $L(f \wedge g) \le L(f) + L(g)$ and $L(f \vee g) \le L(f) + L(g)$.
The following theorem shows that every small formula can be “balanced” to obtain a shallow formula.
For every constant $\varepsilon > 0$, the following holds: for every formula $\varphi$ of size $s$, there exists an equivalent formula $\varphi'$ of depth at most $O_\varepsilon(\log s)$ and size at most $s^{1+\varepsilon}$.
With a slight abuse of notation, we will often identify a formula $\varphi$ with the function it computes. In particular, the notation $L(\varphi)$ denotes the formula complexity of the function computed by $\varphi$, and not the size of $\varphi$ (which is denoted by $|\varphi|$).
Given a Boolean variable $x$, we denote by $x^1$ and $x^0$ the literals $x$ and $\bar x$, respectively. In other words, $x^b$ is the literal that evaluates to $1$ if and only if $x = b$.
Given a literal $\ell$, we define $\mathrm{var}(\ell)$ to be the underlying variable, that is, $\mathrm{var}(x) = \mathrm{var}(\bar x) = x$.
Let $\Pi$ be a deterministic communication protocol that takes inputs from $\mathcal{X} \times \mathcal{Y}$, and recall that the leaves of the protocol induce a partition of $\mathcal{X} \times \mathcal{Y}$ into combinatorial rectangles. For every leaf $\ell$ of $\Pi$, we denote by $\mathcal{X}_\ell \times \mathcal{Y}_\ell$ the combinatorial rectangle that is associated with $\ell$.
We use the framework of Karchmer–Wigderson relations [KW90], which relates the complexity of a function $f$ to the complexity of a related communication problem $KW_f$.
Let $f \colon \{0,1\}^n \to \{0,1\}$ be a Boolean function. The Karchmer–Wigderson relation of $f$, denoted $KW_f$, is the following communication problem: the inputs of Alice and Bob are strings $x \in f^{-1}(1)$ and $y \in f^{-1}(0)$, respectively, and their goal is to find a coordinate $i \in [n]$ such that $x_i \ne y_i$. Note that such a coordinate must exist since $f(x) \ne f(y)$ and hence $x \ne y$.
Let $f \colon \{0,1\}^n \to \{0,1\}$. The communication complexity of $KW_f$ is equal to $D(f)$, and the minimal number of leaves in a protocol that solves $KW_f$ is $L(f)$.
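As a quick illustration of the formula-to-protocol direction of this correspondence (our own toy code, not from the paper): walking down a formula for $f$, Alice speaks at OR gates (pointing to a child that evaluates to $1$ on $x$) and Bob speaks at AND gates (pointing to a child that evaluates to $0$ on $y$); the leaf reached is a literal on which $x$ and $y$ must differ.

```python
def evaluate(phi, z):
    """Evaluate a nested-tuple formula over variables 0..n-1 on input z."""
    op = phi[0]
    if op == 'lit':
        _, i, neg = phi
        return z[i] ^ neg
    _, left, right = phi
    vals = (evaluate(left, z), evaluate(right, z))
    return min(vals) if op == 'and' else max(vals)

def kw_protocol(phi, x, y):
    """Solve KW_f given a formula phi for f, x in f^{-1}(1), y in f^{-1}(0).
    Invariant: the current subformula is 1 on x and 0 on y.
    Returns a coordinate i with x[i] != y[i]."""
    assert evaluate(phi, x) == 1 and evaluate(phi, y) == 0
    while phi[0] != 'lit':
        op, left, right = phi
        if op == 'or':   # Alice points to a child that is 1 on x
            phi = left if evaluate(left, x) == 1 else right
        else:            # Bob points to a child that is 0 on y
            phi = left if evaluate(left, y) == 0 else right
    _, i, _ = phi
    return i
```

The number of rounds equals the depth of the formula, and the number of reachable leaves equals its size, matching the correspondence stated above.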
We use the following two standard inequalities.
Fact (the AM-GM inequality).
For every two non-negative real numbers $a, b$ it holds that $\sqrt{ab} \le \frac{a+b}{2}$.
Fact (special case of Cauchy-Schwarz inequality).
For every non-negative real numbers $a, b$ it holds that $\sqrt{a} + \sqrt{b} \le \sqrt{2(a+b)}$.
It holds that , as required. ∎
3 Shrinkage theorem for fixing projections
In this section we prove our main result on the shrinkage of De Morgan formulas under fixing projections, which we define below. We start by defining projections and the relevant notation.
Let $X = \{x_1,\dots,x_n\}$ and $Y = \{y_1,\dots,y_m\}$ be sets of Boolean variables. A projection from $X$ to $Y$ is a function $\pi$ from the set $\{x_1,\dots,x_n\}$ to the set $\{y_1,\bar y_1,\dots,y_m,\bar y_m,0,1\}$. Given such a projection $\pi$ and a Boolean function $f$ over the variables $X$, we denote by $f|_\pi$ the function obtained from $f$ by substituting each input variable $x_i$ with $\pi(x_i)$ in the natural way. Unless stated explicitly otherwise, all projections in this section are from $X$ to $Y$, and all functions from $\{0,1\}^n$ to $\{0,1\}$ are over the variables $X$. A random projection is a distribution over projections.
Let $\pi$ be a projection. For every $j \in [m]$ and bit $b \in \{0,1\}$, we denote by $\pi_{y_j \to b}$ the projection that is obtained from $\pi$ by substituting $y_j$ with $b$ (that is, every variable that $\pi$ maps to $y_j$ is mapped to $b$, and every variable that $\pi$ maps to $\bar y_j$ is mapped to $\bar b$).
With a slight abuse of notation, if a projection maps all the variables to constants in $\{0,1\}$, we will sometimes treat it as a binary string in $\{0,1\}^n$.
We use a new notion of random projections, which we call $p$-fixing projections. Intuitively, a $p$-fixing projection is a random projection $\boldsymbol{\rho}$ in which, for every variable $y_j$, the probability that $\boldsymbol{\rho}$ maps a variable to a literal of $y_j$ is not much larger than the probability that it fixes that literal to a constant, regardless of the values that $\boldsymbol{\rho}$ assigns to the other variables. This property is essentially the minimal property that is required in order to carry out the argument of Håstad [Hås98]. Formally, we define $p$-fixing projections as follows.
Let $p > 0$. We say that a random projection $\boldsymbol{\rho}$ is a $p$-fixing projection if for every projection $\sigma$, every bit $b \in \{0,1\}$, and every variable $y_j$ that appears in the image of $\sigma$, it holds that
$$\Pr[\boldsymbol{\rho} = \sigma] \;\le\; p \cdot \Pr[\boldsymbol{\rho} = \sigma_{y_j \to b}]. \tag{1}$$
For shorthand, we say that $\boldsymbol{\rho}$ is a fixing projection if it is a $p$-fixing projection for some $p > 0$.
If needed, one can consider without loss of generality only projections $\sigma$ in the support of $\boldsymbol{\rho}$, as otherwise Equation (1) holds trivially, with the left-hand side equaling zero.
In order to gain intuition for the definition of fixing projections, let us examine how this definition applies to random restrictions. In our terms, a restriction is a projection from $X$ to $X$ that maps every variable either to itself or to a constant in $\{0,1\}$. Suppose that $\boldsymbol{\rho}$ is any distribution over restrictions, and that $\sigma$ is some fixed restriction. In this case, the condition of being $p$-fixing can be rewritten as follows:
$$\Pr[\boldsymbol{\rho} = \sigma] \;\le\; p \cdot \Pr[\boldsymbol{\rho} = \sigma_{x \to b}].$$
Denote by $\boldsymbol{\rho}_{-x}$ and $\sigma_{-x}$ the restrictions obtained from $\boldsymbol{\rho}$ and $\sigma$ by truncating $x$ (i.e., restricting them to the variables other than $x$). Using this notation, we can rewrite the foregoing equation as
$$\Pr[\boldsymbol{\rho}(x) = \sigma(x) \wedge \boldsymbol{\rho}_{-x} = \sigma_{-x}] \;\le\; p \cdot \Pr[\boldsymbol{\rho}(x) = b \wedge \boldsymbol{\rho}_{-x} = \sigma_{-x}].$$
Now, observe that it is always the case that $\sigma_{x \to b}(x) = b$, and therefore the requirement is non-trivial only if $\sigma(x) = x$. Hence, we can restrict ourselves to the latter case, and the foregoing equation can be rewritten again as
$$\Pr[\boldsymbol{\rho}(x) = x \wedge \boldsymbol{\rho}_{-x} = \sigma_{-x}] \;\le\; p \cdot \Pr[\boldsymbol{\rho}(x) = b \wedge \boldsymbol{\rho}_{-x} = \sigma_{-x}].$$
Finally, if we divide both sides by $\Pr[\boldsymbol{\rho}_{-x} = \sigma_{-x}]$, we obtain the following intuitive condition:
$$\Pr[\boldsymbol{\rho}(x) = x \mid \boldsymbol{\rho}_{-x} = \sigma_{-x}] \;\le\; p \cdot \Pr[\boldsymbol{\rho}(x) = b \mid \boldsymbol{\rho}_{-x} = \sigma_{-x}].$$
This condition informally says the following: $\boldsymbol{\rho}$ is a fixing projection if the probability of leaving $x$ unfixed is at most $p$ times the probability of fixing it to $b$, and this holds regardless of what the restriction assigns to the other variables.
In particular, it is now easy to see that the classic random restrictions are fixing projections. Recall that a $p$-random restriction fixes each variable independently with probability $1-p$ to a random bit. Due to the independence of the different variables, the foregoing condition simplifies to
$$\Pr[\boldsymbol{\rho}(x) = x] \;\le\; p' \cdot \Pr[\boldsymbol{\rho}(x) = b],$$
and it is easy to see that this condition is satisfied for $p' = \frac{2p}{1-p}$, since $\Pr[\boldsymbol{\rho}(x) = x] = p$ and $\Pr[\boldsymbol{\rho}(x) = b] = \frac{1-p}{2}$. In other words, a $p$-random restriction is a $\frac{2p}{1-p}$-fixing projection.
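As a sanity check on this calculation (our own illustration; the inequality checked below is the fixing condition in the form we use, with parameter $p' = 2p/(1-p)$), one can enumerate the full distribution of a $p$-random restriction on a few variables and verify that the probability of every restriction leaving a variable alive is at most $p'$ times the probability of the same restriction with that variable fixed to a given bit:

```python
from itertools import product

def restriction_prob(sigma, p):
    """Probability of a fixed restriction sigma under the p-random
    restriction: each variable is alive ('*') w.p. p, else each of
    the two constants has probability (1-p)/2."""
    prob = 1.0
    for v in sigma:
        prob *= p if v == '*' else (1 - p) / 2
    return prob

def check_fixing(n, p):
    """Check Pr[rho = sigma] <= p' * Pr[rho = sigma_{x->b}] for every
    sigma leaving x alive, every coordinate x and bit b,
    with p' = 2p/(1-p)."""
    p_prime = 2 * p / (1 - p)
    for sigma in product('01*', repeat=n):
        for x in range(n):
            if sigma[x] != '*':
                continue  # the condition concerns alive variables only
            for b in '01':
                fixed = sigma[:x] + (b,) + sigma[x + 1:]
                lhs = restriction_prob(sigma, p)
                rhs = p_prime * restriction_prob(fixed, p)
                if lhs > rhs + 1e-12:
                    return False
    return True
```

The inequality is tight: for a $p$-random restriction both sides are equal, which is why $\frac{2p}{1-p}$ is the right parameter.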
We prove the following shrinkage theorem for -fixing projections, which is analogous to the shrinkage theorem of [Hås98] for random restrictions in the case of balanced formulas.
Theorem 3.1 (Shrinkage under fixing projections).
Let $\varphi$ be a formula of size $s$ and depth $d$, and let $\boldsymbol{\rho}$ be a $p$-fixing projection. Then
$$\mathbb{E}\big[L(\varphi|_{\boldsymbol{\rho}})\big] \;\le\; O\big(p^2 d^2 s\big).$$
Our shrinkage theorem has somewhat worse parameters compared to the theorem of [Hås98]: specifically, the factor of $d^2$ does not appear in [Hås98]. The reason is that the proof of [Hås98] uses a fairly complicated case analysis in order to avoid losing that factor, and we chose to skip this analysis in order to obtain a simpler proof. We did not check whether the factor of $d^2$ in our result can be avoided by using a similar case analysis. By applying formula balancing (Theorem 2.1) to our shrinkage theorem, we can obtain the following result, which is independent of the depth of the formula.
Let $f$ be a function with formula complexity $L(f)$, and let $\boldsymbol{\rho}$ be a $p$-fixing projection. Then
$$\mathbb{E}\big[L(f|_{\boldsymbol{\rho}})\big] \;\le\; O(p^2) \cdot L(f)^{1+o(1)}.$$
3.1 Proof of Theorem 3.1
In this section, we prove our main shrinkage theorem, Theorem 3.1. Our proof is based on the ideas of [Hås98], but the presentation is different. Fix a formula $\varphi$ of size $s$ and depth $d$, and let $\boldsymbol{\rho}$ be a $p$-fixing projection. We would like to upper-bound the expectation of $L(\varphi|_{\boldsymbol{\rho}})$. As in [Hås98], we start by upper-bounding the probability that the projection shrinks a formula to size $1$. Specifically, we prove the following lemma in Section 3.2.
Let $f$ be a Boolean function, and let $\boldsymbol{\rho}$ be a $p$-fixing projection. Then,
$$\Pr\big[L(f|_{\boldsymbol{\rho}}) = 1\big] \;\le\; O\big(p \cdot \sqrt{L(f)}\big).$$
Next, we show that to upper-bound the expectation of $L(\varphi|_{\boldsymbol{\rho}})$, it suffices to upper-bound the probability that the projection shrinks two formulas to size $1$ simultaneously. In order to state this claim formally, we introduce some notation.
Let $v$ be a gate of $\varphi$. We denote the depth of $v$ in $\varphi$ by $d(v)$ (the root has depth $0$), and omit $\varphi$ if it is clear from context. If $v$ is an internal node, we denote the sub-formulas that are rooted in its left and right children by $\varphi_{v,1}$ and $\varphi_{v,2}$, respectively.
We prove the following lemma, which says that in order to upper-bound $\mathbb{E}[L(\varphi|_{\boldsymbol{\rho}})]$ it suffices to upper-bound, for every internal gate $v$, the probability that $\varphi_{v,1}$ and $\varphi_{v,2}$ shrink to size $1$ under $\boldsymbol{\rho}$.
For every projection $\sigma$ it holds that
We would like to use the two lemmas above to prove the shrinkage theorem. As a warm-up, let us make the simplifying assumption that for every two functions $f_1, f_2$, the events $L(f_1|_{\boldsymbol{\rho}}) = 1$ and $L(f_2|_{\boldsymbol{\rho}}) = 1$ are independent. If this were true, we could have upper-bounded $\mathbb{E}[L(\varphi|_{\boldsymbol{\rho}})]$ as follows:
The last sum counts every leaf $\ell$ of $\varphi$ once for each internal ancestor of $\ell$, so the last expression is equal to
which is the bound we wanted. However, the above calculation only works under our simplifying assumption, which is false: the events $L(\varphi_{v,1}|_{\boldsymbol{\rho}}) = 1$ and $L(\varphi_{v,2}|_{\boldsymbol{\rho}}) = 1$ will often be dependent. In particular, in order for the foregoing calculation to work, we need the following inequality to hold:
Let $\boldsymbol{\rho}$ be a $p$-fixing projection. Let $f_1, f_2$ be Boolean functions, let $b \in \{0,1\}$, and let $y$ be a variable. Then,
Intuitively, the last lemma breaks the dependency between the events $L(f_1|_{\boldsymbol{\rho}}) = 1$ and $L(f_2|_{\boldsymbol{\rho}}) = 1$ by fixing in $f_2$ the single literal to which $f_1$ has shrunk. We would now like to use this lemma to prove the theorem. To this end, we prove an appropriate variant of it, which allows using the projection $\boldsymbol{\rho}_{y \to b}$ rather than $\boldsymbol{\rho}$ in the second function. This variant is motivated by the following "one-variable simplification rules" of [Hås98], which are easy to verify.
Fact (one-variable simplification rules).
Let $f$ be a function over the variables $x_1,\dots,x_n$, and let $b \in \{0,1\}$. We denote by $f|_{x_i \to b}$ the function obtained from $f$ by setting $x_i$ to the bit $b$. Then:
The function $x_i^b \vee f$ is equal to the function $x_i^b \vee f|_{x_i \to \bar b}$.
The function $x_i^b \wedge f$ is equal to the function $x_i^b \wedge f|_{x_i \to b}$.
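These rules are easy to check exhaustively. The following snippet (ours, purely for verification) confirms them for all $256$ functions on three variables:

```python
from itertools import product

def check_simplification_rules(n=3, i=0):
    """Verify the one-variable simplification rules:
      x_i^b OR  f == x_i^b OR  f|_{x_i -> not b}
      x_i^b AND f == x_i^b AND f|_{x_i -> b}
    for every function f on n bits, every bit b, over all inputs."""
    inputs = list(product((0, 1), repeat=n))
    for table in product((0, 1), repeat=2 ** n):
        f = dict(zip(inputs, table))
        for b in (0, 1):
            for z in inputs:
                lit = 1 if z[i] == b else 0   # the literal x_i^b on input z
                f_set = lambda v: f[z[:i] + (v,) + z[i + 1:]]
                if (lit | f[z]) != (lit | f_set(1 - b)):
                    return False
                if (lit & f[z]) != (lit & f_set(b)):
                    return False
    return True
```

The point of the rules is that on the only inputs where the literal does not already determine the gate's value, the variable $x_i$ is forced, so it may be substituted inside $f$.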
In order to use the simplification rules, we define, for every internal gate $v$ of $\varphi$ and projection $\sigma$, an event $\mathcal{E}_v(\sigma)$ as follows: if $v$ is an OR gate, then $\mathcal{E}_v(\sigma)$ is the event that there exists some literal $y^b$ (for a variable $y$ and bit $b$) such that $\varphi_{v,1}|_\sigma = y^b$ and $L(\varphi_{v,2}|_{\sigma_{y \to \bar b}}) = 1$. If $v$ is an AND gate, then $\mathcal{E}_v(\sigma)$ is defined similarly, except that we replace $\sigma_{y \to \bar b}$ with $\sigma_{y \to b}$. We have the following lemma, which is proved in Section 3.4.
For every projection $\sigma$ it holds that
We can now use the following corollary of the lemma above to replace our simplifying assumption.
For every internal gate $v$ of $\varphi$ it holds that
3.2 Proof of the shrinkage-to-a-literal lemma
Let $f$ be a Boolean function, and let $S$ be the set of projections $\sigma$ such that $L(f|_\sigma) = 1$. We prove that the probability that $\boldsymbol{\rho} \in S$ is at most $O(p \sqrt{L(f)})$. Our proof closely follows the proof of [Hås98, Lemma 4.1].
Let $\Pi$ be a protocol that solves $KW_f$ and has $L(f)$ leaves (such a protocol exists by Theorem 2.2). Let $S_1$ and $S_0$ be the sets of projections $\sigma$ for which $f|_\sigma$ is the constant $1$ and the constant $0$, respectively. We extend the protocol $\Pi$ to take inputs from $S_1 \times S_0$ as follows: when Alice and Bob are given as inputs the projections $\sigma_1$ and $\sigma_0$, respectively, they construct strings $x$ and $y$ from $\{0,1\}^n$ by substituting $0$ in all the variables $y_j$, and invoke $\Pi$ on the inputs $x$ and $y$. Observe that $x$ and $y$ are indeed legal inputs for $\Pi$ (since $f(x) = 1$ and $f(y) = 0$). Moreover, recall that the protocol $\Pi$ induces a partition of $f^{-1}(1) \times f^{-1}(0)$ into combinatorial rectangles, and that we denote the rectangle of the leaf $\ell$ by $\mathcal{X}_\ell \times \mathcal{Y}_\ell$ (see Section 2).
Our proof strategy is the following: we associate with every projection $\sigma \in S$ a leaf of $\Pi$, denoted $\ell(\sigma)$. We consider the two disjoint events that correspond to the event that $f|_\sigma$ is a single positive literal or a single negative literal, respectively, and show that for every leaf $\ell$ it holds that
Together, the two inequalities imply that
The desired bound on $\Pr[\boldsymbol{\rho} \in S]$ will follow by summing the latter bound over all the leaves of $\Pi$.
We start by explaining how to associate a leaf $\ell(\sigma)$ with every projection $\sigma \in S$. Let $\sigma \in S$. Then, it must be the case that $f|_\sigma = y^b$ for some variable $y$ and bit $b$. We define the projections $\sigma_1 = \sigma_{y \to b}$ and $\sigma_0 = \sigma_{y \to \bar b}$, and observe that $\sigma_1 \in S_1$ and $\sigma_0 \in S_0$. We now define $\ell(\sigma)$ to be the leaf to which $\Pi$ arrives when invoked on the inputs $\sigma_1$ and $\sigma_0$. Observe that the output of $\Pi$ at $\ell(\sigma)$ must be a coordinate $i$ on which the two constructed strings differ, and thus $\sigma$ maps $x_i$ to a literal of $y$.
Next, fix a leaf $\ell$. We prove the first of the two inequalities. Let $i$ be the output of the protocol $\Pi$ at $\ell$. Then,
Similarly, a matching bound can be proved for the other side of the rectangle. Together, the two bounds imply that
for every leaf $\ell$ of $\Pi$. We define $\ell(\sigma)$ for projections $\sigma$ in the negative-literal case in an analogous way, and then a similar argument shows that
It follows that
Finally, let $\mathcal{L}$ denote the set of leaves of $\Pi$. It holds that
(by the Cauchy–Schwarz inequality; see Section 2)
We conclude the proof by showing that . To this end, let