Shrinkage under Random Projections, and Cubic Formula Lower Bounds for 𝐀𝐂^0

12/03/2020 ∙ by Yuval Filmus, et al. ∙ Technion ∙ berkeley college ∙ University of Haifa

Håstad showed that any De Morgan formula (composed of AND, OR and NOT gates) shrinks by a factor of O(p^2) under a random restriction that leaves each variable alive independently with probability p [SICOMP, 1998]. Using this result, he gave an Ω(n^3) formula size lower bound for the Andreev function, which, up to lower order improvements, remains the state-of-the-art lower bound for any explicit function. In this work, we extend the shrinkage result of Håstad to hold under a far wider family of random restrictions and their generalization – random projections. Based on our shrinkage results, we obtain an Ω(n^3) formula size lower bound for an explicit function computed in 𝐀𝐂^0. This improves upon the best known formula size lower bounds for 𝐀𝐂^0, that were only quadratic prior to our work. In addition, we prove that the KRW conjecture [Karchmer et al., Computational Complexity 5(3/4), 1995] holds for inner functions for which the unweighted quantum adversary bound is tight. In particular, this holds for inner functions with a tight Khrapchenko bound. Our random projections are tailor-made to the function's structure so that the function maintains structure even under projection – using such projections is necessary, as standard random restrictions simplify 𝐀𝐂^0 circuits. In contrast, we show that any De Morgan formula shrinks by a quadratic factor under our random projections, allowing us to prove the cubic lower bound. Our proof techniques build on the proof of Håstad for the simpler case of balanced formulas. This allows for a significantly simpler proof at the cost of slightly worse parameters. As such, when specialized to the case of p-random restrictions, our proof can be used as an exposition of Håstad's result.

1 Introduction

1.1 Background

Is there an efficient computational task that cannot be perfectly parallelized? Equivalently, is $\mathbf{P} \not\subseteq \mathbf{NC}^1$? The answer is still unknown. The question can be rephrased as follows: is there a function in $\mathbf{P}$ that does not have a (De Morgan) formula of polynomial size?

The history of formula lower bounds for functions in $\mathbf{P}$ goes back to the 1960s, with the seminal result of Subbotovskaya [Sub61] that introduced the technique of random restrictions. Subbotovskaya showed that the Parity function on $n$ variables requires formulas of size at least $\Omega(n^{1.5})$. Khrapchenko [Khr72], using a different proof technique, showed that in fact the Parity function on $n$ variables requires formulas of size $\Theta(n^2)$. Later, Andreev [And87] came up with a new explicit function (now known as the Andreev function) for which he was able to obtain an $n^{2.5-o(1)}$ size lower bound. This lower bound was subsequently improved by [IN93, PZ93, Hås98, Tal14] to $n^{3-o(1)}$.

The line of work initiated by Subbotovskaya and Andreev relies on the shrinkage of formulas under $p$-random restrictions. A $p$-random restriction is a randomly chosen partial assignment to the inputs of a function. Set a parameter $p \in (0,1)$. We fix each variable independently with probability $1-p$ to a uniformly random bit, and we keep the variable alive with probability $p$. Under such a restriction, formulas shrink (in expectation) by a factor more significant than $p$. Subbotovskaya showed that De Morgan formulas shrink to at most $O(p^{1.5})$ times their original size, whereas subsequent works of [IN93, PZ93] improved the bound to $O(p^{1.55})$ and $O(p^{1.63})$, respectively. Finally, Håstad [Hås98] showed that the shrinkage exponent of De Morgan formulas is $2$, or in other words, that De Morgan formulas shrink by a factor of $p^{2-o(1)}$ under $p$-random restrictions. Tal [Tal14] improved the shrinkage factor to $O(p^2)$, obtaining a tight result, as exhibited by the Parity function.
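The sampling procedure just described can be sketched in a few lines of Python. This is only an illustration: the names `random_restriction` and `restrict`, and the use of `None` to mark a live variable, are choices made for this sketch and do not come from the paper.

```python
import random


def random_restriction(n, p, rng):
    """Sample a p-random restriction on n variables.

    Each variable stays alive (None) with probability p, and is otherwise
    fixed to a uniformly random bit.
    """
    rho = []
    for _ in range(n):
        if rng.random() < p:
            rho.append(None)            # variable stays alive
        else:
            rho.append(rng.randrange(2))  # variable fixed to a random bit
    return rho


def restrict(f, rho):
    """Return (g, live): g is f restricted by rho, taking only the live inputs."""
    live = [i for i, v in enumerate(rho) if v is None]

    def g(*ys):
        x = list(rho)
        for i, y in zip(live, ys):
            x[i] = y
        return f(x)

    return g, live


def parity(x):
    return sum(x) % 2


# Parity restricted to its live variables is again Parity, up to a constant shift.
g, live = restrict(parity, [None, 1, 0, None])
```

In this example the restricted function still depends on both live variables, which is the sense in which Parity "maintains structure" as long as some variable remains alive.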

In a nutshell, shrinkage results are useful for proving lower bounds as long as the explicit function being analyzed maintains structure under such restrictions and does not trivialize. For example, the Parity function does not become constant as long as at least one variable remains alive. Thus any formula that computes Parity must be of at least quadratic size, or else the formula under restriction, keeping each variable alive with probability $\Theta(1/n)$, would likely become a constant function, whereas Parity would not. Andreev's idea is similar, though he manages to construct a function such that under a random restriction keeping only $\Theta(\log n)$ of the variables, the formula size should be at least $\tilde{\Omega}(n)$ (in expectation). This ultimately gives the nearly cubic lower bound.

The KRW Conjecture.

Despite much effort, proving $\mathbf{P} \not\subseteq \mathbf{NC}^1$, and even just breaking the cubic barrier in formula lower bounds, have remained a challenge for more than two decades. An approach to solve the $\mathbf{P}$ versus $\mathbf{NC}^1$ problem was suggested by Karchmer, Raz and Wigderson [KRW95]. They conjectured that when composing two Boolean functions, $f$ and $g$, the formula size of the resulting function, $f \diamond g$, is (roughly) the product of the formula sizes of $f$ and $g$. (Footnote: More precisely, the original KRW conjecture [KRW95] concerns depth complexity rather than formula complexity. The variant of the conjecture for formula complexity, which is discussed above, was posed in [GMWW17].) We will refer to this conjecture as the “KRW conjecture”. Under the KRW conjecture (and even under weaker variants of it), [KRW95] constructed a function in $\mathbf{P}$ with no polynomial-size formulas. It remains a major open challenge to settle the KRW conjecture.

A few special cases of the KRW conjecture are known to be true. The conjecture holds when either $f$ or $g$ is the AND or the OR function. Håstad's result [Hås98] and its improvement [Tal14] show that the conjecture holds when the inner function $g$ is the Parity function and the outer function $f$ is any function. This gives an alternative explanation to the $n^{3-o(1)}$ lower bound for the Andreev function. Indeed, the Andreev function is at least as hard as the composition of a maximally-hard function $f$ on $\log n$ bits and $g = \mathsf{Parity}_{n/\log n}$, where the formula size of $f$ is $\tilde{\Omega}(n)$ and the formula size of $g$ is $\Theta(n^2/\log^2 n)$. Since the KRW conjecture holds for this special case, the formula size of the Andreev function is at least $\tilde{\Omega}(n^3)$. In other words, the state-of-the-art formula size lower bounds for explicit functions follow from a special case of the KRW conjecture: the case in which $g$ is the Parity function. Moreover, this special case follows from the shrinkage of De Morgan formulas under $p$-random restrictions.

Bottom-Up versus Top-Down Techniques.

Whereas random restrictions are a “bottom-up” proof technique [HJP95], a different line of work suggested a “top-down” approach using the language of communication complexity. The connection between formula size and communication complexity was introduced in the seminal work of Karchmer and Wigderson [KW90]. They defined for any Boolean function $f$ a two-party communication problem $KW_f$: Alice gets an input $x$ such that $f(x) = 1$, and Bob gets an input $y$ such that $f(y) = 0$. Their goal is to identify a coordinate $i$ on which $x_i \neq y_i$, while minimizing their communication. It turns out that there is a one-to-one correspondence between any protocol tree solving $KW_f$ and any formula computing the function $f$. Since protocols naturally traverse the tree from root to leaf, proving lower bounds on their size or depth is usually done in a top-down fashion. This framework has proven to be very useful in proving formula lower bounds in the monotone setting (see, e.g., [KW90, GH92, RW92, KRW95, RM99, GP18, PR17]) and in studying the KRW conjecture (see, e.g., [KRW95, EIRS01, HW93, GMWW17, DM18, KM18, Mei20, dRMN20, MS20]). Moreover, a recent work by Dinur and Meir [DM18] was able to reprove Håstad's cubic lower bound using the framework of Karchmer and Wigderson. As Dinur and Meir's proof showed that top-down techniques can replicate Håstad's cubic lower bound, a natural question (which motivated this project) arose:

Are top-down techniques superior to bottom-up techniques?

Towards that, we focused on a candidate problem: prove a cubic lower bound for an explicit function in $\mathbf{AC}^0$. (Footnote: Recall that $\mathbf{AC}^0$ is the class of functions computed by constant depth polynomial size circuits composed of AND and OR gates of unbounded fan-in, with variables or their negation at the leaves.) Based on the work of Dinur and Meir [DM18], we suspected that such a lower bound could be achieved using top-down techniques. We were also certain that the problem cannot be solved using the random restriction technique. Indeed, in order to prove a lower bound on a function $f$ using random restrictions, one should argue that $f$ remains hard under a random restriction. However, it is well-known that functions in $\mathbf{AC}^0$ trivialize under $p$-random restrictions [Ajt83, FSS84, Yao85, Hås86]. Based on this intuition, surely random restrictions cannot show that a function in $\mathbf{AC}^0$ requires cubic size. Our intuition turned out to be false.

1.2 Our results

In this work, we construct an explicit function in $\mathbf{AC}^0$ which requires De Morgan formulas of size $n^{3-o(1)}$. Surprisingly, our proof is conducted via the bottom-up technique of random projections, which is a generalization of random restrictions (more details below).

Theorem 1.1.

There exists a family of Boolean functions $h_n\colon \{0,1\}^n \to \{0,1\}$ for $n \in \mathbb{N}$ such that

  1. $h_n$ can be computed by uniform depth-$4$ unbounded fan-in formulas of size $O(n^3)$.

  2. The formula size of $h_n$ is at least $n^{3-o(1)}$.

Prior to our work, the best formula size lower bounds on an explicit function in $\mathbf{AC}^0$ were only quadratic [Nec66, CKK12, Juk12, BM12].

Our hard function is a variant of the Andreev function. More specifically, recall that the Andreev function is based on the composition $f \diamond g$, where $f$ is a maximally-hard function and $g$ is the Parity function. Since Parity is not in $\mathbf{AC}^0$, we cannot take $g$ to be the Parity function in our construction. Instead, our hard function is obtained by replacing the Parity function with the Surjectivity function of [BM12].

As in the case of the Andreev function, we establish the hardness of our function by proving an appropriate special case of the KRW conjecture. To this end, we introduce a generalization of the complexity measure of Khrapchenko [Khr72], called the min-entropy Khrapchenko bound. We prove the KRW conjecture for the special case in which the outer function $f$ is any function, and $g$ is a function whose formula complexity is bounded tightly by the min-entropy Khrapchenko bound. We then obtain Theorem 1.1 by applying this version of the KRW conjecture to the case where $g$ is the Surjectivity function. We note that our KRW result also implies the known lower bounds in the cases where $g$ is the Parity function [Hås98] and the Majority function [GTN19].

Our KRW result in fact applies more generally, to functions whose formula complexity is bounded tightly by the “soft-adversary method”, which is a generalization of Ambainis' unweighted adversary method [Amb02] (see Section 6.2).

Our proof of the special case of the KRW conjecture follows the methodology of Håstad [Hås98], who proved the special case in which $g$ is Parity on $m$ variables. Håstad proved that De Morgan formulas shrink by a factor of (roughly) $p^2$ under $p$-random restrictions. Choosing $p = 1/m$ shrinks a formula for $f \diamond g$ by a factor of roughly $m^2$, which coincides with the formula complexity of $g$. On the other hand, on average each copy of $g$ simplifies to a single input variable, and so $f \diamond g$ simplifies to $f$. This shows that $L(f \diamond g) \gtrsim L(f) \cdot L(g)$.

Our main technical contribution is a new shrinkage theorem that works in a far wider range of scenarios than just $p$-random restrictions. Given a function $g$, we construct a random projection which, on the one hand, shrinks De Morgan formulas by a factor determined by the soft-adversary bound of $g$, and on the other hand, simplifies $f \diamond g$ to $f$. (Footnote: A projection is a mapping from the set of the variables $\{x_1,\ldots,x_n\}$ to the set $\{y_1,\ldots,y_m,\bar{y}_1,\ldots,\bar{y}_m,0,1\}$, where $y_1,\ldots,y_m$ are formal variables.) We thus show that $L(f \diamond g)$ is at least (roughly) $L(f)$ times the soft-adversary bound of $g$, and in particular, if that bound is close to $L(g)$, then $L(f \diamond g) \gtrsim L(f) \cdot L(g)$, just as in Håstad's proof. Our random projections are tailored specifically to the structure of the function $f \diamond g$, ensuring that $f \diamond g$ simplifies to $f$ under projection. This enables us to overcome the aforementioned difficulty. In contrast, $p$-random restrictions that do not respect the structure of $f \diamond g$ would likely result in a restricted function that is much simpler than $f$ and in fact would be a constant function with high probability.

Our shrinkage theorem applies more generally to two types of random projections, which we call fixing projections and hiding projections. Fixing projections are random projections in which fixing the value of a variable results in a projection which is much more probable. Hiding projections are random projections in which fixing the value of a variable hides which coordinates it appeared on. We note that our shrinkage theorem for fixing projections captures Håstad's result for $p$-random restrictions as a special case.

The proof of our shrinkage theorem is based on Håstad's proof [Hås98], but also simplifies it. In particular, we take the simpler argument that Håstad uses for the special case of completely balanced trees, and adapt it to the general case. As such, our proof avoids a complicated case analysis, at the cost of slightly worse bounds. Using our bounds, it is nevertheless easy to obtain the $n^{3-o(1)}$ lower bound for the Andreev function. Therefore, one can see the specialization of our shrinkage result to $p$-random restrictions as an exposition of Håstad's cubic lower bound.

An example: our techniques when specialized to $f \diamond \mathsf{Maj}_m$.

To illustrate our choice of random projections, we present its instantiation to the special case of $f \diamond \mathsf{Maj}_m$, where $f\colon \{0,1\}^k \to \{0,1\}$ is non-constant and $\mathsf{Maj}_m\colon \{0,1\}^m \to \{0,1\}$ is the Majority function for some odd integer $m$. In this case, the input variables to $f \diamond \mathsf{Maj}_m$ are composed of $k$ disjoint blocks, $B_1,\ldots,B_k$, each containing $m$ variables. We use the random projection that for each block $B_i$, picks one variable in the block uniformly at random, projects this variable to the new variable $y_i$, and fixes the rest of the variables in the block in a balanced way so that the number of zeros and ones in the block is equal (i.e., we have exactly $(m-1)/2$ zeros and $(m-1)/2$ ones). It is not hard to see that under this choice, $f \diamond \mathsf{Maj}_m$ simplifies to $f$. On the other hand, we show that this choice of random projections shrinks the formula complexity by a factor of $\approx 1/m^2$. Combining the two together, we get that $L(f \diamond \mathsf{Maj}_m) \gtrsim L(f) \cdot m^2$. Note that in this distribution of random projections, the different coordinates are not independent of one another, and this feature allows us to maintain structure.
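The claim that the composition with Majority simplifies to the outer function under this projection can be checked by brute force on small instances. The following Python sketch does so; the helper names (`sample_block_projection`, `compose_under_projection`) and the string marker `'y'` for the surviving variable are illustrative choices, not notation from the paper.

```python
import itertools
import random


def maj(bits):
    """Majority of an odd number of bits."""
    return int(sum(bits) > len(bits) // 2)


def sample_block_projection(m, rng):
    """Project one block of odd size m.

    One uniformly random position keeps a live variable (marked 'y');
    the other m-1 positions receive (m-1)/2 zeros and (m-1)/2 ones.
    """
    assert m % 2 == 1
    consts = [0] * ((m - 1) // 2) + [1] * ((m - 1) // 2)
    rng.shuffle(consts)
    live = rng.randrange(m)
    return consts[:live] + ['y'] + consts[live:]


def compose_under_projection(f, k, m, rng):
    """Return the function y -> (f composed with Maj_m) after projecting each block."""
    blocks = [sample_block_projection(m, rng) for _ in range(k)]

    def h(ys):
        inner = [maj([y if v == 'y' else v for v in block])
                 for block, y in zip(blocks, ys)]
        return f(inner)

    return h


# A small, arbitrary non-constant outer function on k = 3 bits.
f = lambda z: (z[0] & z[1]) ^ z[2]
h = compose_under_projection(f, k=3, m=5, rng=random.Random(1))
```

Because each block contains equally many fixed zeros and ones, the majority of a block equals its single live variable, so `h` agrees with `f` on every input regardless of the random choices.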

1.3 Related work

Our technique of using tailor-made random projections was inspired by the celebrated result of Rossman, Servedio, and Tan [RST15, HRST17] that proved an average-case depth hierarchy. In fact, the idea to use tailor-made random restrictions goes back to Håstad's thesis [Hås87, Chapter 6.2]. Similar to our case, in [Hås87, RST15, HRST17], $p$-random restrictions are too crude to separate depth $d$ from depth $d+1$ circuits. Given a circuit $C$ of depth $d+1$, the main challenge is to construct a distribution of random restrictions or projections (tailored to the circuit $C$) that on the one hand maintains structure for $C$, but on the other hand simplifies any depth $d$ circuit $C'$.

Paper outline

The paper starts with brief preliminaries in Section 2. We prove our shrinkage theorem for fixing projections in Section 3, and our shrinkage theorem for hiding projections in Section 4. In Section 5 we provide a brief interlude on concatenation of projections. Khrapchenko’s method, the quantum adversary bound and their relation to hiding projections are discussed in Section 6. Finally, Section 7 contains a proof of Theorem 1.1, as a corollary of a more general result which is a special case of the KRW conjecture. In the same section we also rederive the cubic lower bound on Andreev’s function, and the cubic lower bound on the Majority-based variant considered in [GTN19].

2 Preliminaries

Throughout the paper, we use bold letters to denote random variables. For any $n \in \mathbb{N}$, we denote by $[n]$ the set $\{1,\ldots,n\}$. Given a bit $\sigma \in \{0,1\}$, we denote its negation by $\bar{\sigma}$. We assume familiarity with the basic definitions of communication complexity (see, e.g., [KN97]). All logarithms in this paper are base $2$.

Definition .

A (De Morgan) formula (with bounded fan-in) is a binary tree, whose leaves are labeled with literals from the set $\{x_1,\ldots,x_n,\bar{x}_1,\ldots,\bar{x}_n\}$, and whose internal vertices are labeled as AND ($\wedge$) or OR ($\vee$) gates. The size of a formula $\phi$, denoted $\mathrm{size}(\phi)$, is the number of leaves in the tree. The depth of the formula is the depth of the tree. A formula with unbounded fan-in is defined similarly, but every internal vertex in the tree can have any number of children. Unless stated explicitly otherwise, whenever we say “formula” we refer to a formula with bounded fan-in.

Definition .

A formula $\phi$ computes a Boolean function $f\colon \{0,1\}^n \to \{0,1\}$ in the natural way. The formula complexity of a Boolean function $f$, denoted $L(f)$, is the size of the smallest formula that computes $f$. The depth complexity of $f$, denoted $D(f)$, is the smallest depth of a formula that computes $f$. For convenience, we define the size and depth of the constant function to be zero.
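To make the definitions of size and depth concrete, here is a minimal Python sketch of a bounded fan-in De Morgan formula as a binary tree. The class names `Lit` and `Gate` and the 0-based variable indices are conventions of this sketch only.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Lit:
    var: int               # index of the underlying variable (0-based)
    positive: bool = True  # False means the negated literal


@dataclass(frozen=True)
class Gate:
    op: str                # "and" or "or"
    left: "Formula"
    right: "Formula"


Formula = Union[Lit, Gate]


def size(phi):
    """Number of leaves of the formula."""
    if isinstance(phi, Lit):
        return 1
    return size(phi.left) + size(phi.right)


def depth(phi):
    """Depth of the tree; a single leaf has depth 0."""
    if isinstance(phi, Lit):
        return 0
    return 1 + max(depth(phi.left), depth(phi.right))


def evaluate(phi, x):
    """Value of the formula on the input assignment x (a sequence of bits)."""
    if isinstance(phi, Lit):
        return x[phi.var] if phi.positive else 1 - x[phi.var]
    a, b = evaluate(phi.left, x), evaluate(phi.right, x)
    return a & b if phi.op == "and" else a | b


# Parity of two bits: (x0 AND NOT x1) OR (NOT x0 AND x1).
xor2 = Gate("or",
            Gate("and", Lit(0), Lit(1, False)),
            Gate("and", Lit(0, False), Lit(1)))
```

The formula `xor2` has four leaves and depth two, matching the quadratic formula size of Parity on two variables.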

A basic property of formula complexity is that it is subadditive:

Fact .

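For every two functions $f_1, f_2\colon \{0,1\}^n \to \{0,1\}$ it holds that $L(f_1 \wedge f_2) \le L(f_1) + L(f_2)$ and $L(f_1 \vee f_2) \le L(f_1) + L(f_2)$.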

The following theorem shows that every small formula can be “balanced” to obtain a shallow formula.

Theorem 2.1 (Formula balancing, [BB94], following [Spi71, Bre74]).

For every $\alpha > 0$, the following holds: For every formula $\phi$ of size $s$, there exists an equivalent formula $\phi'$ of depth at most $O(2^{1/\alpha} \cdot \log s)$ and size at most $s^{1+\alpha}$.

Notation .

With a slight abuse of notation, we will often identify a formula $\phi$ with the function it computes. In particular, the notation $L(\phi)$ denotes the formula complexity of the function computed by $\phi$, and not the size of $\phi$ (which is denoted by $\mathrm{size}(\phi)$).

Notation .

Given a Boolean variable $z$, we denote by $z^0$ and $z^1$ the literals $\bar{z}$ and $z$, respectively. In other words, $z^\sigma = z \oplus \bar{\sigma}$.

Notation .

Given a literal $\ell$, we define $\mathrm{var}(\ell)$ to be the underlying variable, that is, $\mathrm{var}(z^0) = \mathrm{var}(z^1) = z$.

Notation .

Let $\Pi$ be a deterministic communication protocol that takes inputs from $A \times B$, and recall that the leaves of the protocol induce a partition of $A \times B$ to combinatorial rectangles. For every leaf $\ell$ of $\Pi$, we denote by $A_\ell \times B_\ell$ the combinatorial rectangle that is associated with $\ell$.

We use the framework of Karchmer–Wigderson relations [KW90], which relates the complexity of $f$ to the complexity of a related communication problem $KW_f$.

Definition ([KW90]).

Let $f\colon \{0,1\}^n \to \{0,1\}$ be a Boolean function. The Karchmer–Wigderson relation of $f$, denoted $KW_f$, is the following communication problem: The inputs of Alice and Bob are strings $a \in f^{-1}(1)$ and $b \in f^{-1}(0)$, respectively, and their goal is to find a coordinate $i \in [n]$ such that $a_i \neq b_i$. Note that such a coordinate must exist since $f(a) \neq f(b)$ and hence $a \neq b$.

Theorem 2.2 ([KW90], see also [Raz90]).

Let $f\colon \{0,1\}^n \to \{0,1\}$. The communication complexity of $KW_f$ is equal to $D(f)$, and the minimal number of leaves in a protocol that solves $KW_f$ is $L(f)$.
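One direction of this correspondence is easy to see in code: any formula for a function yields a protocol for its Karchmer–Wigderson relation by walking from the root to a leaf. The Python sketch below is an illustration under the following assumptions: formulas are encoded as nested tuples `("lit", i, positive)`, `("and", l, r)` and `("or", l, r)`, and the function names are invented for this example.

```python
from itertools import product


def evaluate(phi, x):
    """Evaluate a tuple-encoded De Morgan formula on the bit sequence x."""
    kind = phi[0]
    if kind == "lit":
        _, i, positive = phi
        return x[i] if positive else 1 - x[i]
    _, left, right = phi
    a, b = evaluate(left, x), evaluate(right, x)
    return a & b if kind == "and" else a | b


def kw_protocol(phi, a, b):
    """Solve the KW relation of phi on inputs a (value 1) and b (value 0).

    Invariant: the current sub-formula is 1 on a and 0 on b.
    Returns (i, bits) where a[i] != b[i] and bits is the number of bits sent.
    """
    assert evaluate(phi, a) == 1 and evaluate(phi, b) == 0
    bits = 0
    while phi[0] != "lit":
        kind, left, right = phi
        if kind == "or":
            # Alice speaks: some child is 1 on a, while both are 0 on b.
            phi = left if evaluate(left, a) == 1 else right
        else:
            # Bob speaks: some child is 0 on b, while both are 1 on a.
            phi = left if evaluate(left, b) == 0 else right
        bits += 1
    return phi[1], bits


# Parity of two bits as a formula with 4 leaves and depth 2.
XOR2 = ("or",
        ("and", ("lit", 0, True), ("lit", 1, False)),
        ("and", ("lit", 0, False), ("lit", 1, True)))
```

The number of bits exchanged is at most the depth of the formula, and the leaves of the resulting protocol tree are exactly the leaves of the formula, which is the easy half of the theorem above.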

We use the following two standard inequalities.

Fact (the AM-GM inequality).

For every two non-negative real numbers $x, y$ it holds that $\sqrt{x \cdot y} \le \frac{x+y}{2}$.

Fact (special case of Cauchy-Schwarz inequality).

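For every $t$ non-negative real numbers $x_1,\ldots,x_t$ it holds that $\sqrt{x_1} + \cdots + \sqrt{x_t} \le \sqrt{t} \cdot \sqrt{x_1 + \cdots + x_t}$.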

Proof.

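It holds that $\sum_{i=1}^t \sqrt{x_i} = \sum_{i=1}^t 1 \cdot \sqrt{x_i} \le \sqrt{\sum_{i=1}^t 1^2} \cdot \sqrt{\sum_{i=1}^t x_i} = \sqrt{t} \cdot \sqrt{x_1 + \cdots + x_t}$, as required. ∎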

3 Shrinkage theorem for fixing projections

In this section we prove our main result on the shrinkage of De Morgan formulas under fixing projections, which we define below. We start by defining projections and the relevant notation.

Definition .

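Let $x_1,\ldots,x_n$ and $y_1,\ldots,y_m$ be Boolean variables. A projection $\pi$ from $x_1,\ldots,x_n$ to $y_1,\ldots,y_m$ is a function from the set $\{x_1,\ldots,x_n\}$ to the set $\{0,1,y_1,\bar{y}_1,\ldots,y_m,\bar{y}_m\}$. Given such a projection $\pi$ and a Boolean function $f\colon \{0,1\}^n \to \{0,1\}$ over the variables $x_1,\ldots,x_n$, we denote by $f|_\pi$ the function obtained from $f$ by substituting each input variable $x_i$ with $\pi(x_i)$ in the natural way. Unless stated explicitly otherwise, all projections in this section are from $x_1,\ldots,x_n$ to $y_1,\ldots,y_m$, and all functions from $\{0,1\}^n$ to $\{0,1\}$ are over the variables $x_1,\ldots,x_n$. A random projection is a distribution over projections.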
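The substitution in this definition can be illustrated with a short Python sketch. The encoding is an assumption of the sketch: a projection is a list whose $i$-th entry is either a constant `0`/`1` or a pair `(j, positive)` standing for the $j$-th target variable or its negation (0-based indices).

```python
def apply_projection(f, pi, m):
    """Return the function over m target variables obtained from f via pi."""
    def g(y):
        assert len(y) == m
        x = []
        for target in pi:
            if target in (0, 1):
                x.append(target)              # variable fixed to a constant
            else:
                j, positive = target          # variable mapped to a literal
                x.append(y[j] if positive else 1 - y[j])
        return f(x)
    return g


# f(x) = (x0 AND x1) OR x2 over three source variables.
f = lambda x: (x[0] & x[1]) | x[2]

# Two source variables may share one target variable, which a restriction cannot do.
g_same = apply_projection(f, [(0, True), (0, True), 0], m=1)    # becomes y0
g_clash = apply_projection(f, [(0, True), (0, False), 0], m=1)  # becomes constant 0
```

The second example shows how a projection can collapse a sub-formula by mapping two inputs to opposite literals of the same variable.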

Notation .

Let $\pi$ be a projection. For every $j \in [m]$ and bit $\sigma \in \{0,1\}$, we denote by $\pi_{y_j \leftarrow \sigma}$ the projection that is obtained from $\pi$ by substituting $y_j$ with $\sigma$.

Notation .

With a slight abuse of notation, if a projection $\pi$ maps all the variables $x_1,\ldots,x_n$ to constants in $\{0,1\}$, we will sometimes treat it as a binary string in $\{0,1\}^n$.

We use a new notion of random projections, which we call $q$-fixing projections. Intuitively, a $q$-fixing projection is a random projection $\boldsymbol{\rho}$ in which, for every variable $x_i$, the probability that $\boldsymbol{\rho}$ maps $x_i$ to a literal is not much larger than the probability that $\boldsymbol{\rho}$ fixes that literal to a constant, regardless of the values that $\boldsymbol{\rho}$ assigns to the other variables. This property is essentially the minimal property that is required in order to carry out the argument of Håstad [Hås98]. Formally, we define $q$-fixing projections as follows.

Definition .

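Let $0 \le q_0, q_1 \le 1$. We say that a random projection $\boldsymbol{\rho}$ is a $(q_0,q_1)$-fixing projection if for every projection $\pi$, every bit $\sigma \in \{0,1\}$, and every variable $x_i$, it holds that

$$\Pr\left[\boldsymbol{\rho}(x_i) \notin \{0,1\} \text{ and } \boldsymbol{\rho}_{\mathrm{var}(\boldsymbol{\rho}(x_i)) \leftarrow \sigma} = \pi\right] \le q_\sigma \cdot \Pr[\boldsymbol{\rho} = \pi]. \qquad (1)$$

For shorthand, we say that $\boldsymbol{\rho}$ is a $q$-fixing projection, for $q = \sqrt{q_0 q_1}$.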

If needed, one can consider without loss of generality only variables $x_i$ such that $\pi(x_i) \in \{0,1\}$, as otherwise Equation 1 holds trivially with the left-hand side equaling zero.

Example .

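In order to get intuition for the definition of fixing projections, let us examine how this definition applies to random restrictions. In our terms, a restriction is a projection from $x_1,\ldots,x_n$ to $x_1,\ldots,x_n$ that maps every variable $x_i$ either to itself or to $\{0,1\}$. Suppose that $\boldsymbol{\rho}$ is any distribution over restrictions, and that $\pi$ is some fixed restriction. In this case, the condition of being $(q_0,q_1)$-fixing can be rewritten as follows:

$$\Pr\left[\boldsymbol{\rho}(x_i) = x_i \text{ and } \boldsymbol{\rho}_{x_i \leftarrow \sigma} = \pi\right] \le q_\sigma \cdot \Pr[\boldsymbol{\rho} = \pi].$$

Denote by $\boldsymbol{\rho}', \pi'$ the restrictions obtained from $\boldsymbol{\rho}, \pi$ by truncating $x_i$ (i.e., $\boldsymbol{\rho}'$ is $\boldsymbol{\rho}$ restricted to the variables other than $x_i$). Using this notation, we can rewrite the foregoing equation as

$$\Pr\left[\boldsymbol{\rho}(x_i) = x_i \text{ and } \boldsymbol{\rho}_{x_i \leftarrow \sigma}(x_i) = \pi(x_i) \text{ and } \boldsymbol{\rho}' = \pi'\right] \le q_\sigma \cdot \Pr\left[\boldsymbol{\rho}(x_i) = \pi(x_i) \text{ and } \boldsymbol{\rho}' = \pi'\right].$$

Now, observe that it is always the case that $\boldsymbol{\rho}_{x_i \leftarrow \sigma}(x_i) = \sigma$, and therefore the probability on the left-hand side is non-zero only if $\pi(x_i) = \sigma$. Hence, we can restrict ourselves to the latter case, and the foregoing equation can be rewritten again as

$$\Pr\left[\boldsymbol{\rho}(x_i) = x_i \text{ and } \boldsymbol{\rho}' = \pi'\right] \le q_\sigma \cdot \Pr\left[\boldsymbol{\rho}(x_i) = \sigma \text{ and } \boldsymbol{\rho}' = \pi'\right].$$

Finally, if we divide both sides by $\Pr[\boldsymbol{\rho}' = \pi']$, we obtain the following intuitive condition:

$$\Pr\left[\boldsymbol{\rho}(x_i) = x_i \mid \boldsymbol{\rho}' = \pi'\right] \le q_\sigma \cdot \Pr\left[\boldsymbol{\rho}(x_i) = \sigma \mid \boldsymbol{\rho}' = \pi'\right].$$

This condition informally says the following: $\boldsymbol{\rho}$ is a fixing projection if the probability of leaving $x_i$ unfixed is at most $q_\sigma$ times the probability of fixing it to $\sigma$, and this holds regardless of what the restriction assigns to the other variables.

In particular, it is now easy to see that the classic random restrictions are fixing projections. Recall that a $p$-random restriction fixes each variable independently with probability $1-p$ to a random bit. Due to the independence of the different variables, the foregoing condition simplifies to

$$\Pr[\boldsymbol{\rho}(x_i) = x_i] \le q_\sigma \cdot \Pr[\boldsymbol{\rho}(x_i) = \sigma],$$

and it is easy to see that this condition is satisfied for $q_0 = q_1 = \frac{2p}{1-p}$, since the left-hand side equals $p$ and the right-hand side equals $q_\sigma \cdot \frac{1-p}{2}$.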
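For small $n$ the fixing condition for $p$-random restrictions can be verified exhaustively. The Python sketch below enumerates all restrictions on $n$ variables and checks, for every target restriction $\pi$, variable $x_i$ and bit $\sigma$, that the probability of "$x_i$ is left alive and setting it to $\sigma$ yields $\pi$" is at most $q$ times the probability of $\pi$ itself. The threshold $q = 2p/(1-p)$ used in the example is what the independence calculation suggests (alive with probability $p$, fixed to a given bit with probability $(1-p)/2$); treat it, and the function names, as assumptions of this sketch.

```python
from fractions import Fraction
from itertools import product

STAR = "*"  # marks a live (unfixed) variable


def prob(rho, p):
    """Probability of the restriction rho under the p-random distribution."""
    result = Fraction(1)
    for v in rho:
        result *= p if v == STAR else (1 - p) / 2
    return result


def is_fixing(n, p, q):
    """Brute-force check of the fixing condition with q_0 = q_1 = q."""
    all_rho = list(product([STAR, 0, 1], repeat=n))
    for pi in all_rho:
        for i in range(n):
            for sigma in (0, 1):
                # Mass of restrictions that leave x_i alive and become pi
                # once x_i is set to sigma.
                lhs = sum(
                    prob(rho, p)
                    for rho in all_rho
                    if rho[i] == STAR
                    and rho[:i] + (sigma,) + rho[i + 1:] == pi
                )
                if lhs > q * prob(pi, p):
                    return False
    return True


p = Fraction(1, 4)
q = 2 * p / (1 - p)  # equals 2/3 for p = 1/4
```

Exact rational arithmetic is used because the condition holds with equality at this value of $q$, so floating-point rounding could flip the comparison.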

We prove the following shrinkage theorem for $q$-fixing projections, which is analogous to the shrinkage theorem of [Hås98] for random restrictions in the case of balanced formulas.

Theorem 3.1 (Shrinkage under fixing projections).

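Let $\phi$ be a formula of size $s$ and depth $d$, and let $\boldsymbol{\rho}$ be a $q$-fixing projection. Then

$$\mathbb{E}\left[L(\phi|_{\boldsymbol{\rho}})\right] = O\left(q^2 \cdot d^2 \cdot s + q \cdot \sqrt{s}\right).$$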

Our shrinkage theorem has somewhat worse parameters compared to the theorem of [Hås98]: specifically, the factor of $d^2$ does not appear in [Hås98]. The reason is that the proof of [Hås98] uses a fairly complicated case analysis in order to avoid losing that factor, and we chose to skip this analysis in order to obtain a simpler proof. We did not check if the factor of $d^2$ in our result can be avoided by using a similar case analysis. By applying formula balancing (Theorem 2.1) to our shrinkage theorem, we can obtain the following result, which is independent of the depth of the formula.

Corollary .

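Let $f\colon \{0,1\}^n \to \{0,1\}$ be a function with formula complexity $s$, and let $\boldsymbol{\rho}$ be a $q$-fixing projection. Then

$$\mathbb{E}\left[L(f|_{\boldsymbol{\rho}})\right] = q^2 \cdot s^{1 + O(1/\sqrt{\log s})} + q \cdot s^{1/2 + O(1/\sqrt{\log s})}.$$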

Proof.

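By assumption, there exists a formula $\phi$ of size $s$ that computes $f$. We balance the formula $\phi$ by applying Theorem 2.1 with $\alpha = 1/\sqrt{\log s}$, and obtain a new formula $\phi'$ that computes $f$ and has size $s^{1 + 1/\sqrt{\log s}}$ and depth $O(2^{\sqrt{\log s}} \cdot \log s) = s^{O(1/\sqrt{\log s})}$. The required result now follows by applying Theorem 3.1 to $\phi'$. ∎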

3.1 Proof of Theorem 3.1

In this section, we prove our main shrinkage theorem, Theorem 3.1. Our proof is based on the ideas of [Hås98], but the presentation is different. Fix a formula $\phi$ of size $s$ and depth $d$, and let $\boldsymbol{\rho}$ be a $q$-fixing projection. We would like to upper-bound the expectation of $L(\phi|_{\boldsymbol{\rho}})$. As in [Hås98], we start by upper-bounding the probability that the projection $\boldsymbol{\rho}$ shrinks a formula to size $1$. Specifically, we prove the following lemma in Section 3.2.

Lemma .

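Let $f\colon \{0,1\}^n \to \{0,1\}$ be a Boolean function, and let $\boldsymbol{\rho}$ be a $q$-fixing projection. Then,

$$\Pr\left[L(f|_{\boldsymbol{\rho}}) = 1\right] \le q \cdot \sqrt{L(f)}.$$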

Next, we show that to upper-bound the expectation of $L(\phi|_{\boldsymbol{\rho}})$, it suffices to upper-bound the probability that the projection $\boldsymbol{\rho}$ shrinks two formulas to size $1$ simultaneously. In order to state this claim formally, we introduce some notation.

Notation .

Let $g$ be a gate of $\phi$. We denote the depth of $g$ in $\phi$ by $\mathrm{depth}_\phi(g)$ (the root has depth $0$), and omit $\phi$ if it is clear from context. If $g$ is an internal node, we denote the sub-formulas that are rooted in its left and right children by $\mathrm{left}(g)$ and $\mathrm{right}(g)$, respectively.

We prove the following lemma, which says that in order to upper-bound $L(\phi|_\pi)$ it suffices to upper-bound, for every internal gate $g$, the probability that $\mathrm{left}(g)$ and $\mathrm{right}(g)$ shrink to size $1$ under $\pi$.

Lemma .

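For every projection $\pi$ it holds that

$$L(\phi|_\pi) \le \sum_{\text{internal gate } g \text{ of } \phi} \left(\mathrm{depth}(g) + 2\right) \cdot 1_{\{L(\mathrm{left}(g)|_\pi) = 1 \text{ and } L(\mathrm{right}(g)|_\pi) = 1\}} + 1_{\{L(\phi|_\pi) = 1\}}.$$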

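We would like to use the two lemmas above to prove the shrinkage theorem. As a warm-up, let us make the simplifying assumption that for every two functions $f_1, f_2\colon \{0,1\}^n \to \{0,1\}$, the events $L(f_1|_{\boldsymbol{\rho}}) = 1$ and $L(f_2|_{\boldsymbol{\rho}}) = 1$ are independent. If this were true, we could have upper-bounded $\mathbb{E}[L(\phi|_{\boldsymbol{\rho}})]$ as follows:

$$\begin{aligned}
\mathbb{E}\left[L(\phi|_{\boldsymbol{\rho}})\right]
&\le \sum_{g} \left(\mathrm{depth}(g) + 2\right) \cdot \Pr\left[L(\mathrm{left}(g)|_{\boldsymbol{\rho}}) = 1 \text{ and } L(\mathrm{right}(g)|_{\boldsymbol{\rho}}) = 1\right] + \Pr\left[L(\phi|_{\boldsymbol{\rho}}) = 1\right] && \text{(the lemma bounding } L(\phi|_\pi)\text{)} \\
&\le (d+2) \cdot \sum_{g} \Pr\left[L(\mathrm{left}(g)|_{\boldsymbol{\rho}}) = 1 \text{ and } L(\mathrm{right}(g)|_{\boldsymbol{\rho}}) = 1\right] + \Pr\left[L(\phi|_{\boldsymbol{\rho}}) = 1\right] && \text{(}\phi\text{ is of depth } d\text{)} \\
&= (d+2) \cdot \sum_{g} \Pr\left[L(\mathrm{left}(g)|_{\boldsymbol{\rho}}) = 1\right] \cdot \Pr\left[L(\mathrm{right}(g)|_{\boldsymbol{\rho}}) = 1\right] + \Pr\left[L(\phi|_{\boldsymbol{\rho}}) = 1\right] && \text{(simplifying assumption)} \\
&\le (d+2) \cdot q^2 \cdot \sum_{g} \sqrt{L(\mathrm{left}(g)) \cdot L(\mathrm{right}(g))} + q \cdot \sqrt{s} && \text{(the bound on } \Pr[L(f|_{\boldsymbol{\rho}}) = 1]\text{)} \\
&\le \frac{(d+2) \cdot q^2}{2} \cdot \sum_{g} \left(\mathrm{size}(\mathrm{left}(g)) + \mathrm{size}(\mathrm{right}(g))\right) + q \cdot \sqrt{s} && \text{(AM–GM inequality)}
\end{aligned}$$

where the sums range over the internal gates $g$ of $\phi$, and the last step also uses the fact that the formula complexity of a sub-formula is at most its size. The last sum counts every leaf $\ell$ of $\phi$ once for each internal ancestor of $\ell$, so the last expression is equal to

$$\frac{(d+2) \cdot q^2}{2} \cdot \sum_{\text{leaf } \ell \text{ of } \phi} \mathrm{depth}(\ell) + q \cdot \sqrt{s} \le \frac{(d+2) \cdot q^2}{2} \cdot d \cdot s + q \cdot \sqrt{s} = O\left(q^2 \cdot d^2 \cdot s + q \cdot \sqrt{s}\right),$$

which is the bound we wanted. However, the above calculation only works under our simplifying assumption, which is false: the events $L(f_1|_{\boldsymbol{\rho}}) = 1$ and $L(f_2|_{\boldsymbol{\rho}}) = 1$ will often be dependent. In particular, in order for the foregoing calculation to work, we need the following inequality to hold:

$$\Pr\left[L(\mathrm{left}(g)|_{\boldsymbol{\rho}}) = 1 \text{ and } L(\mathrm{right}(g)|_{\boldsymbol{\rho}}) = 1\right] \le q^2 \cdot \sqrt{L(\mathrm{left}(g)) \cdot L(\mathrm{right}(g))}.$$

This inequality holds under our simplifying assumption by the bound on $\Pr[L(f|_{\boldsymbol{\rho}}) = 1]$, but may not hold in general. Nevertheless, we prove the following similar statement in Section 3.3.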

Lemma .

Let be a -fixing projection. Let , let , and let be a variable. Then,

Intuitively, this lemma breaks the dependency between the events $L(f_1|_{\boldsymbol{\rho}}) = 1$ and $L(f_2|_{\boldsymbol{\rho}}) = 1$ by fixing in $\boldsymbol{\rho}$ the single literal to which $f_1$ has shrunk. We would now like to use it to prove the theorem. To this end, we prove an appropriate variant of the lemma bounding $L(\phi|_\pi)$, which allows using the projection $\pi_{y_j \leftarrow \sigma}$ rather than $\pi$ in the second function. This variant is motivated by the following “one-variable simplification rules” of [Hås98], which are easy to verify.

Fact (one-variable simplification rules).

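Let $h$ be a function over the variables $y_1,\ldots,y_m$, and let $\sigma \in \{0,1\}$. We denote by $h_{y_j \leftarrow \sigma}$ the function obtained from $h$ by setting $y_j$ to the bit $\sigma$. Then:

  • The function $y_j^\sigma \vee h$ is equal to the function $y_j^\sigma \vee h_{y_j \leftarrow \bar{\sigma}}$.

  • The function $y_j^\sigma \wedge h$ is equal to the function $y_j^\sigma \wedge h_{y_j \leftarrow \sigma}$.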
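The two simplification rules can be confirmed exhaustively for small numbers of variables. The Python sketch below assumes the rules read as follows: OR-ing a function with a literal lets one fix the literal's variable to the value that makes the literal false, and AND-ing lets one fix it to the value that makes the literal true. The literal convention (the literal with exponent `sigma` equals 1 exactly when the variable equals `sigma`) and all function names are assumptions of this sketch.

```python
from itertools import product


def literal(y, j, sigma):
    """The literal on variable j with exponent sigma: 1 exactly when y[j] == sigma."""
    return int(y[j] == sigma)


def fix(h, j, sigma):
    """Return h with variable j fixed to sigma (other inputs unchanged)."""
    return lambda y: h(y[:j] + (sigma,) + y[j + 1:])


def rules_hold(m):
    """Check both rules for every Boolean function on m variables."""
    points = list(product((0, 1), repeat=m))
    for table in product((0, 1), repeat=len(points)):
        h = dict(zip(points, table)).__getitem__  # truth table as a function
        for j in range(m):
            for sigma in (0, 1):
                h_or = fix(h, j, 1 - sigma)   # used under an OR with the literal
                h_and = fix(h, j, sigma)      # used under an AND with the literal
                for y in points:
                    lit = literal(y, j, sigma)
                    if (lit | h(y)) != (lit | h_or(y)):
                        return False
                    if (lit & h(y)) != (lit & h_and(y)):
                        return False
    return True
```

Calling `rules_hold(3)` runs over all 256 functions on three variables; the check passes because the fixed value only matters on inputs where the literal does not already determine the output of the gate.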

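In order to use the simplification rules, we define, for every internal gate $g$ of $\phi$ and projection $\pi$, an event $E_{g,\pi}$ as follows: if $g$ is an OR gate, then $E_{g,\pi}$ is the event that there exists some literal $y_j^\sigma$ (for $\sigma \in \{0,1\}$) such that $\mathrm{left}(g)|_\pi = y_j^\sigma$ and $L(\mathrm{right}(g)|_{\pi_{y_j \leftarrow \bar{\sigma}}}) = 1$. If $g$ is an AND gate, then $E_{g,\pi}$ is defined similarly, except that we replace $\pi_{y_j \leftarrow \bar{\sigma}}$ with $\pi_{y_j \leftarrow \sigma}$. We have the following lemma, which is proved in Section 3.4.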

Lemma .

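For every projection $\pi$ it holds that

$$L(\phi|_\pi) \le \sum_{\text{internal gate } g \text{ of } \phi} \left(\mathrm{depth}(g) + 2\right) \cdot 1_{E_{g,\pi}} + 1_{\{L(\phi|_\pi) = 1\}}.$$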

We can now use the following corollary of the dependency-breaking lemma above to replace our simplifying assumption.

Corollary .

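For every internal gate $g$ of $\phi$ it holds that

$$\Pr\left[E_{g,\boldsymbol{\rho}}\right] \le q^2 \cdot \sqrt{L(\mathrm{left}(g)) \cdot L(\mathrm{right}(g))}.$$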

Proof.

Let be an internal gate of . We prove the corollary for the case where is an OR gate, and the proof for the case that is an AND gate is similar. It holds that

(Section 3.1)
(Section 3.1)

as required. ∎

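The shrinkage theorem now follows using the same calculation as above, replacing the lemma bounding $L(\phi|_\pi)$ with its variant for the events $E_{g,\pi}$, and the simplifying assumption with the corollary above:

$$\begin{aligned}
\mathbb{E}\left[L(\phi|_{\boldsymbol{\rho}})\right]
&\le (d+2) \cdot \sum_{g} \Pr\left[E_{g,\boldsymbol{\rho}}\right] + \Pr\left[L(\phi|_{\boldsymbol{\rho}}) = 1\right] && \text{(the variant for the events } E_{g,\pi}\text{)} \\
&\le (d+2) \cdot q^2 \cdot \sum_{g} \sqrt{L(\mathrm{left}(g)) \cdot L(\mathrm{right}(g))} + q \cdot \sqrt{s} && \text{(the corollary above)} \\
&\le \frac{(d+2) \cdot q^2}{2} \cdot \sum_{g} \left(\mathrm{size}(\mathrm{left}(g)) + \mathrm{size}(\mathrm{right}(g))\right) + q \cdot \sqrt{s} && \text{(AM–GM inequality)} \\
&= O\left(q^2 \cdot d^2 \cdot s + q \cdot \sqrt{s}\right).
\end{aligned}$$

In the remainder of this section, we prove the three lemmas used above: the bound on $\Pr[L(f|_{\boldsymbol{\rho}}) = 1]$ (Section 3.2), the dependency-breaking lemma (Section 3.3), and the variant for the events $E_{g,\pi}$ (Section 3.4).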

Remark .

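In this paper, we do not prove the lemma that bounds $L(\phi|_\pi)$ in terms of both children of a gate shrinking to size $1$, since we do not actually need it for our proof. However, this lemma can be established using the proof of its variant for the events $E_{g,\pi}$, with some minor changes.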

3.2 Proof of the bound on $\Pr[L(f|_{\boldsymbol{\rho}}) = 1]$

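Let $f\colon \{0,1\}^n \to \{0,1\}$, and let $E$ be the set of projections $\pi$ such that $L(f|_\pi) = 1$. We prove that the probability that $\boldsymbol{\rho} \in E$ is at most $q \cdot \sqrt{L(f)}$. Our proof follows closely the proof of [Hås98, Lemma 4.1].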

Let $\Pi$ be a protocol that solves $KW_f$ and has $L(f)$ leaves (such a protocol exists by Theorem 2.2). Let $A$ and $B$ be the sets of projections $\pi$ for which $f|_\pi$ is the constants $1$ and $0$, respectively. We extend the protocol $\Pi$ to take inputs from $A \times B$ as follows: when Alice and Bob are given as inputs the projections $\pi^A \in A$ and $\pi^B \in B$, respectively, they construct strings $a, b \in \{0,1\}^n$ from $\pi^A, \pi^B$ by substituting $0$ in all the variables $y_1,\ldots,y_m$, and invoke $\Pi$ on the inputs $a$ and $b$. Observe that $a$ and $b$ are indeed legal inputs for $\Pi$ (since $f(a) = 1$ and $f(b) = 0$). Moreover, recall that the protocol $\Pi$ induces a partition of $A \times B$ to combinatorial rectangles, and that we denote the rectangle of the leaf $\ell$ by $A_\ell \times B_\ell$ (see Section 2).

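Our proof strategy is the following: We associate with every projection $\pi \in E$ a leaf of $\Pi$, denoted $\mathrm{leaf}(\pi)$. We consider the two disjoint events $\boldsymbol{\rho} \in E^+$ and $\boldsymbol{\rho} \in E^-$ that correspond to the event that $f|_{\boldsymbol{\rho}}$ is a single positive literal or a single negative literal, respectively, and show that for every leaf $\ell$ it holds that

$$\Pr\left[\boldsymbol{\rho} \in E^+ \text{ and } \mathrm{leaf}(\boldsymbol{\rho}) = \ell\right] \le q \cdot \sqrt{\Pr[\boldsymbol{\rho} \in A_\ell] \cdot \Pr[\boldsymbol{\rho} \in B_\ell]} \qquad (2)$$

$$\Pr\left[\boldsymbol{\rho} \in E^- \text{ and } \mathrm{leaf}(\boldsymbol{\rho}) = \ell\right] \le q \cdot \sqrt{\Pr[\boldsymbol{\rho} \in A_\ell] \cdot \Pr[\boldsymbol{\rho} \in B_\ell]} \qquad (3)$$

Together, the two inequalities imply that

$$\Pr\left[\boldsymbol{\rho} \in E \text{ and } \mathrm{leaf}(\boldsymbol{\rho}) = \ell\right] \le 2q \cdot \sqrt{\Pr[\boldsymbol{\rho} \in A_\ell] \cdot \Pr[\boldsymbol{\rho} \in B_\ell]}.$$

The desired bound on $\Pr[\boldsymbol{\rho} \in E]$ will follow by summing the latter bound over all the leaves $\ell$ of $\Pi$.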

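We start by explaining how to associate a leaf with every projection $\pi \in E^+$. Let $\pi \in E^+$. Then, it must be the case that $f|_\pi = y_j$ for some $j \in [m]$. We define the projections $\pi^1 = \pi_{y_j \leftarrow 1}$ and $\pi^0 = \pi_{y_j \leftarrow 0}$, and observe that $\pi^1 \in A$ and $\pi^0 \in B$. We now define $\mathrm{leaf}(\pi)$ to be the leaf to which $\Pi$ arrives when invoked on inputs $\pi^1$ and $\pi^0$. Observe that the output of $\Pi$ at $\mathrm{leaf}(\pi)$ must be a variable $x_i$ that satisfies $\pi(x_i) \in \{y_j, \bar{y}_j\}$, and thus $\mathrm{var}(\pi(x_i)) = y_j$.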

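Next, fix a leaf $\ell$. We prove that $\Pr[\boldsymbol{\rho} \in E^+ \text{ and } \mathrm{leaf}(\boldsymbol{\rho}) = \ell] \le q_1 \cdot \Pr[\boldsymbol{\rho} \in A_\ell]$. Let $x_i$ be the output of the protocol $\Pi$ at $\ell$. Then,

$$\begin{aligned}
\Pr\left[\boldsymbol{\rho} \in E^+ \text{ and } \mathrm{leaf}(\boldsymbol{\rho}) = \ell\right]
&\le \Pr\left[\boldsymbol{\rho}(x_i) \notin \{0,1\} \text{ and } \boldsymbol{\rho}_{\mathrm{var}(\boldsymbol{\rho}(x_i)) \leftarrow 1} \in A_\ell\right] \\
&= \sum_{\pi \in A_\ell} \Pr\left[\boldsymbol{\rho}(x_i) \notin \{0,1\} \text{ and } \boldsymbol{\rho}_{\mathrm{var}(\boldsymbol{\rho}(x_i)) \leftarrow 1} = \pi\right] \\
&\le \sum_{\pi \in A_\ell} q_1 \cdot \Pr[\boldsymbol{\rho} = \pi] = q_1 \cdot \Pr[\boldsymbol{\rho} \in A_\ell].
\end{aligned}$$

Similarly, it can be proved that $\Pr[\boldsymbol{\rho} \in E^+ \text{ and } \mathrm{leaf}(\boldsymbol{\rho}) = \ell] \le q_0 \cdot \Pr[\boldsymbol{\rho} \in B_\ell]$. Together, the two bounds imply that

$$\Pr\left[\boldsymbol{\rho} \in E^+ \text{ and } \mathrm{leaf}(\boldsymbol{\rho}) = \ell\right] \le \sqrt{q_1 \cdot \Pr[\boldsymbol{\rho} \in A_\ell] \cdot q_0 \cdot \Pr[\boldsymbol{\rho} \in B_\ell]} = q \cdot \sqrt{\Pr[\boldsymbol{\rho} \in A_\ell] \cdot \Pr[\boldsymbol{\rho} \in B_\ell]}$$

for every leaf $\ell$ of $\Pi$. We define $\mathrm{leaf}(\pi)$ for projections $\pi \in E^-$ in an analogous way, and then a similar argument shows that

$$\Pr\left[\boldsymbol{\rho} \in E^- \text{ and } \mathrm{leaf}(\boldsymbol{\rho}) = \ell\right] \le q \cdot \sqrt{\Pr[\boldsymbol{\rho} \in A_\ell] \cdot \Pr[\boldsymbol{\rho} \in B_\ell]}.$$

It follows that

$$\Pr\left[\boldsymbol{\rho} \in E \text{ and } \mathrm{leaf}(\boldsymbol{\rho}) = \ell\right] \le 2q \cdot \sqrt{\Pr[\boldsymbol{\rho} \in A_\ell] \cdot \Pr[\boldsymbol{\rho} \in B_\ell]}.$$

Finally, let $\mathcal{L}$ denote the set of leaves of $\Pi$. It holds that

$$\begin{aligned}
\Pr[\boldsymbol{\rho} \in E]
&= \sum_{\ell \in \mathcal{L}} \Pr\left[\boldsymbol{\rho} \in E \text{ and } \mathrm{leaf}(\boldsymbol{\rho}) = \ell\right] \\
&\le 2q \cdot \sum_{\ell \in \mathcal{L}} \sqrt{\Pr[\boldsymbol{\rho} \in A_\ell] \cdot \Pr[\boldsymbol{\rho} \in B_\ell]} \\
&\le 2q \cdot \sqrt{|\mathcal{L}|} \cdot \sqrt{\sum_{\ell \in \mathcal{L}} \Pr[\boldsymbol{\rho} \in A_\ell] \cdot \Pr[\boldsymbol{\rho} \in B_\ell]} && \text{(Cauchy-Schwarz – see Section 2)} \\
&= 2q \cdot \sqrt{L(f)} \cdot \sqrt{\sum_{\ell \in \mathcal{L}} \Pr[\boldsymbol{\rho} \in A_\ell] \cdot \Pr[\boldsymbol{\rho} \in B_\ell]}.
\end{aligned}$$

We conclude the proof by showing that $\sum_{\ell \in \mathcal{L}} \Pr[\boldsymbol{\rho} \in A_\ell] \cdot \Pr[\boldsymbol{\rho} \in B_\ell] \le \frac{1}{4}$. To this end, let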