The query model is arguably the simplest model for computation of Boolean functions. Its simplicity is convenient for showing lower bounds for the amount of time required to accomplish a computational task. In this model, an algorithm computing a function $f:\{0,1\}^n \to \{0,1\}$ on $n$ bits is given query access to the input $x \in \{0,1\}^n$. The algorithm can query different bits of $x$, possibly in an adaptive fashion, and finally produces an output. The complexity of the algorithm is the number of queries made; in particular, the algorithm does not incur additional cost for any computation other than the queries.
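As a minimal illustration of the cost model described above, the following sketch wraps an input in an object that counts queries and runs a simple adaptive algorithm for OR against it. The names `QueryOracle` and `adaptive_or` are hypothetical and chosen only for this example.

```python
from typing import Sequence


class QueryOracle:
    """Hides an input x and counts how many bit queries have been made."""

    def __init__(self, x: Sequence[int]) -> None:
        self._x = tuple(x)
        self.queries = 0

    def query(self, i: int) -> int:
        # Each call costs exactly one query; all other computation is free.
        self.queries += 1
        return self._x[i]


def adaptive_or(oracle: QueryOracle, n: int) -> int:
    """Compute OR of n bits, stopping as soon as a 1 is seen (adaptive)."""
    for i in range(n):
        if oracle.query(i) == 1:
            return 1
    return 0


oracle = QueryOracle([0, 0, 1, 0])
result = adaptive_or(oracle, 4)
print(result, oracle.queries)  # prints "1 3"
```

On the all-zero input the same algorithm makes all $n$ queries, which is why the worst-case cost of this particular algorithm is $n$.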
Unlike the more general models of computation (e.g. Boolean circuits, Turing machines), it is often possible to completely determine the query complexity of explicit functions using existing tools and techniques. The study of query algorithms can thus be a natural first step towards understanding the computational power and limitations of more general and complex models. Query complexity has seen a long line of research by computational complexity theorists. We refer the reader to the survey by Buhrman and de Wolf [BdW02] for a comprehensive introduction to this line of work.
To understand query algorithms, researchers have defined many complexity measures of Boolean functions and investigated their relationship to query complexity, and to one another. For a summary of the current state of knowledge about these measures, see [ABDK16]. In this work, we focus on characterizing the bounded-error query complexity $\mathrm{R}(f)$ and the zero-error query complexity $\mathrm{R}_0(f)$.
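The following measures are known to lower bound $\mathrm{R}_0(f)$: block sensitivity $\mathrm{bs}(f)$, fractional certificate complexity $\mathrm{FC}(f)$ (also known as fractional block sensitivity $\mathrm{fbs}(f)$, [Tal13]), and certificate complexity $\mathrm{C}(f)$. They are related as follows:
$$\mathrm{bs}(f) \le \mathrm{FC}(f) \le \mathrm{C}(f).$$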
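It is known that $\mathrm{R}_0(f) \le \mathrm{C}(f)^2$, and the Tribes function (an And of $\sqrt{n}$ Ors on $\sqrt{n}$ bits) demonstrates that this relation is tight [JK10]. It is also known that $\mathrm{R}_0(f) = O(\mathrm{bs}(f)^3)$ [Nis89, BBC01]. A quadratic separation between $\mathrm{R}_0(f)$ and $\mathrm{bs}(f)$ is also achieved by Tribes. Aaronson posed a question whether $\mathrm{R}_0(f) = O(\mathrm{FC}(f)^2)$ holds [Aar08] (stated in terms of the randomized certificate complexity $\mathrm{RC}(f)$, which was later shown to be equivalent to $\mathrm{FC}(f)$ [GSS16]). A positive answer to this question would imply that $\mathrm{R}_0(f) = O(\widetilde{\deg}(f)^4) = O(\mathrm{Q}(f)^4)$ [ABDK16], where $\widetilde{\deg}$ and $\mathrm{Q}$ stand for approximate polynomial degree and quantum query complexity respectively.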
One approach to showing $\mathrm{R}_0(f) = O(\mathrm{FC}(f)^2)$ is to consider the natural generalization of the proof of the quadratic upper bound in terms of $\mathrm{C}(f)$ to the randomized case; the analysis of this algorithm, however, has met some unresolved obstacles [KT16]. We define a new complexity measure, the expectational certificate complexity $\mathrm{EC}(f)$, that is specifically designed to avert these problems and is of a similar form to $\mathrm{FC}(f)$. We show that $\mathrm{EC}(f)$ gives a quadratically tight bound for $\mathrm{R}_0(f)$:
Theorem 1.
For all total Boolean functions $f$,
$$\mathrm{EC}(f) \le \mathrm{R}_0(f) \le O(\mathrm{EC}(f)^2).$$
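In fact, $\mathrm{EC}(f)$ is a relaxation of $\mathrm{FC}(f)$, and we show that $\mathrm{FC}(f) \le \mathrm{EC}(f) \le \mathrm{C}(f)$. Moreover, we show that $\mathrm{EC}(f)$ lies closer to $\mathrm{FC}(f)$ than $\mathrm{C}(f)$ does: $\mathrm{EC}(f) = O(\mathrm{FC}(f)^{3/2})$. While we don’t know whether $\mathrm{EC}(f)$ is a lower bound on $\mathrm{R}(f)$, the last property gives $\mathrm{R}(f) = \Omega(\mathrm{EC}(f)^{2/3})$.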
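As mentioned earlier, $\mathrm{C}(f)^2$ bounds $\mathrm{R}_0(f)$ from above. But for specific functions, $\mathrm{EC}(f)^2$ can be an asymptotically tighter upper bound than $\mathrm{C}(f)^2$. We demonstrate that by showing that the same example that provides a quadratic separation between $\mathrm{FC}(f)$ and $\mathrm{C}(f)$ [GSS16] also gives $\mathrm{C}(f) = \Omega(\mathrm{EC}(f)^2)$. This is the widest separation possible between $\mathrm{EC}(f)$ and $\mathrm{C}(f)$, because $\mathrm{C}(f) \le \mathrm{EC}(f)^2$.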
In the second part of the paper, we upper bound the distributional query complexity $\mathrm{D}^{\mu}_{\epsilon}(f)$ for product distributions $\mu$ in terms of the minimum product query corruption bound and the block sensitivity $\mathrm{bs}(f)$ (see Definition 9 and Section 2).
Let and a product distribution over the inputs. Then
We contrast Theorem 2 with the past work by Harsha, Jain and Radhakrishnan [HJR15], who showed that for product distributions, the distributional query complexity is bounded above by the square of the smooth corruption bound corresponding to inverse polynomial error. Theorem 2 improves upon their result, firstly by upper bounding the distributional complexity by the minimum query corruption bound, which is an asymptotically smaller measure than the smooth corruption bound, and secondly by losing only a constant factor in the error as opposed to a polynomial worsening in their work. Theorem 17, a consequence of Theorem 2, shows that for a product distribution over the inputs, the distributional query complexity is asymptotically bounded above by the square of the query corruption bound. Thus Theorem 17 resolves a question that was open after the work of Harsha et al. The analogous question in communication complexity is still open.
Jain and Klauck showed that is a powerful lower bound on . In the same work, was used to give a tight lower bound on for the Tribes function on bits. The authors proved that is asymptotically larger than . This implies that , since . While a quadratic separation between and is known [AKK16], it is open whether . Theorem 4 proves a distributional version of this quadratic relation, for the special case in which the input is sampled from a product distribution. We remark here that Jain, Harsha and Radhakrishnan proved in their work that ; Theorem 4 achieves polylogarithmic improvement over this bound. We note here that an analogous statement for an arbitrary distribution together with the Minimax Principle (see Fact 4) will imply that .
The paper is organized as follows. In Section 2, we give the definitions for some of the complexity measures. In Section 3, we define the expectational certificate complexity and prove the results concerning this measure, starting with Theorem 1. In Section 4, we define the minimum query corruption bound and prove Theorems 2 and 3. In Section 5, we list some open problems concerning our measures.
In this section we recall the definitions of some known complexity measures. For a detailed introduction to the query model, see the survey [BdW02]. For the rest of this paper, $f$ is any total Boolean function on $n$ bits, $f:\{0,1\}^n \to \{0,1\}$.
Definition 1 (Randomized Query Complexity).
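Let $\mathcal{A}$ be a randomized algorithm that takes as input $x \in \{0,1\}^n$ and returns a Boolean value $\mathcal{A}(x, r)$, where $r$ is any random string used by $\mathcal{A}$. With one query, $\mathcal{A}$ can ask the value of any input variable $x_i$, for $i \in [n]$. The complexity $\mathrm{cost}(\mathcal{A}, x, r)$ of $\mathcal{A}$ on $x$ is the number of queries the algorithm performs under randomness $r$, given $x$. The worst-case complexity of $\mathcal{A}$ is $\mathrm{cost}(\mathcal{A}) = \max_{x, r} \mathrm{cost}(\mathcal{A}, x, r)$.
The zero-error randomized query complexity $\mathrm{R}_0(f)$ is defined as $\min_{\mathcal{A}} \max_{x} \mathbb{E}_r[\mathrm{cost}(\mathcal{A}, x, r)]$, where $\mathcal{A}$ is any randomized algorithm such that for all $x$, we have $\Pr_r[\mathcal{A}(x, r) = f(x)] = 1$.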
The one-sided error randomized query complexity is defined as , where is any randomized algorithm such that for every such that , we have , and for all such that , we have . Similarly we define .
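The two-sided error randomized query complexity $\mathrm{R}_{\epsilon}(f)$ is defined as the minimum worst-case number of queries made by $\mathcal{A}$, where $\mathcal{A}$ is any randomized algorithm such that for every $x$, we have $\Pr_r[\mathcal{A}(x, r) = f(x)] \ge 1 - \epsilon$. We denote $\mathrm{R}_{1/3}(f)$ simply by $\mathrm{R}(f)$.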
Definition 2 (Distributional Query Complexity).
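Let $\mu$ be a probability distribution over $\{0,1\}^n$, and $\epsilon \in [0,1]$. The distributional query complexity $\mathrm{D}^{\mu}_{\epsilon}(f)$ is the minimum number of queries made in the worst case (over inputs) by a deterministic query algorithm $\mathcal{A}$ for which $\Pr_{x \sim \mu}[\mathcal{A}(x) = f(x)] \ge 1 - \epsilon$.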
The Minimax Principle relates the randomized query complexity and distributional query complexity measures of Boolean functions.
Fact 4 (Minimax Principle).
For any Boolean function $f$,
$$\mathrm{R}_{\epsilon}(f) = \max_{\mu} \mathrm{D}^{\mu}_{\epsilon}(f).$$
Definition 3 (Product Distribution).
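A probability distribution $\mu$ over $\{0,1\}^n$ is a product distribution if there exist $n$ functions $\mu_1, \ldots, \mu_n : \{0,1\} \to [0,1]$ such that $\mu_i(0) + \mu_i(1) = 1$ for all $i \in [n]$ and for all $x \in \{0,1\}^n$,
$$\mu(x) = \prod_{i=1}^{n} \mu_i(x_i).$$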
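The following sketch builds a product distribution from its marginals and checks numerically that conditioning on the values of some bits again yields a product distribution, a fact the later analysis relies on. The helper names `product_distribution` and `condition`, and the example marginals, are assumptions made for illustration.

```python
from itertools import product
from math import isclose, prod


def product_distribution(p):
    """Return mu as a dict over {0,1}^n, where p[i] = Pr[x_i = 1] and the
    bits are independent."""
    n = len(p)
    return {
        x: prod(p[i] if x[i] == 1 else 1 - p[i] for i in range(n))
        for x in product((0, 1), repeat=n)
    }


def condition(mu, fixed):
    """Condition mu on the event x_i = b for every (i, b) in `fixed`.
    Assumes the event has positive probability."""
    def agrees(x):
        return all(x[i] == b for i, b in fixed.items())

    mass = sum(pr for x, pr in mu.items() if agrees(x))
    return {x: (pr / mass if agrees(x) else 0.0) for x, pr in mu.items()}


mu = product_distribution([0.5, 0.25, 0.9])
total = sum(mu.values())

# Conditioning on x_1 = 1 gives the product distribution in which the
# marginal of bit 1 is replaced by a point mass on 1.
conditioned = condition(mu, {1: 1})
expected = product_distribution([0.5, 1.0, 0.9])
```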
Definition 4 (Certificate Complexity).
An assignment is a map . All inputs consistent with form a subcube . The length or size of an assignment, denoted by , is defined to be the co-dimension of the subcube it corresponds to. Let be the set of variables fixed by .
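For $b \in \{0,1\}$, a $b$-certificate for $f$ is an assignment $A$ such that $f(x) = b$ for every input $x$ consistent with $A$. The certificate complexity $\mathrm{C}(f, x)$ of $f$ on $x$ is the size of the shortest $f(x)$-certificate that is consistent with $x$. The certificate complexity of $f$ is defined as $\mathrm{C}(f) = \max_{x} \mathrm{C}(f, x)$. The $b$-certificate complexity of $f$ is defined as $\mathrm{C}_b(f) = \max_{x : f(x) = b} \mathrm{C}(f, x)$.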
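For small $n$, the certificate complexity can be computed by brute force directly from the definition. The sketch below is exponential in $n$ and intended only to make the definition concrete; the function names and the example functions `OR3` and `MAJ3` are illustrative assumptions.

```python
from itertools import combinations, product


def certificate_complexity_at(f, n, x):
    """Size of the smallest set S of positions such that every input that
    agrees with x on S has the same function value as x."""
    inputs = list(product((0, 1), repeat=n))
    for size in range(n + 1):
        for S in combinations(range(n), size):
            if all(f(y) == f(x) for y in inputs
                   if all(y[i] == x[i] for i in S)):
                return size
    raise AssertionError("unreachable: fixing all n bits certifies f(x)")


def certificate_complexity(f, n):
    """C(f): the maximum of C(f, x) over all inputs x."""
    return max(certificate_complexity_at(f, n, x)
               for x in product((0, 1), repeat=n))


def OR3(x):
    return int(any(x))


def MAJ3(x):
    return int(sum(x) >= 2)


c_or = certificate_complexity(OR3, 3)    # the all-zero input needs all 3 bits
c_maj = certificate_complexity(MAJ3, 3)  # any two equal bits decide majority
```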
Definition 5 (Sensitivity and Block Sensitivity).
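For $x \in \{0,1\}^n$ and $S \subseteq [n]$, let $x^S$ be $x$ flipped on locations in $S$. The sensitivity $\mathrm{s}(f, x)$ of $f$ on $x$ is the number of different $i \in [n]$ such that $f(x) \ne f(x^{\{i\}})$. The sensitivity of $f$ is defined as $\mathrm{s}(f) = \max_{x} \mathrm{s}(f, x)$.
The block sensitivity $\mathrm{bs}(f, x)$ of $f$ on $x$ is the maximum number $k$ of disjoint subsets $B_1, \ldots, B_k \subseteq [n]$ such that $f(x) \ne f(x^{B_j})$ for each $j \in [k]$. The block sensitivity of $f$ is defined as $\mathrm{bs}(f) = \max_{x} \mathrm{bs}(f, x)$.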
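Sensitivity and block sensitivity can likewise be computed by exhaustive search for small $n$. The sketch below enumerates all sensitive blocks as bitmasks and searches for the largest disjoint family; it is a brute-force illustration under hypothetical helper names, not an efficient method.

```python
from itertools import product


def flip(x, block):
    """Return x with the bits at the positions in `block` flipped."""
    return tuple(1 - b if i in block else b for i, b in enumerate(x))


def sensitivity_at(f, n, x):
    return sum(f(flip(x, {i})) != f(x) for i in range(n))


def block_sensitivity_at(f, n, x):
    # All sensitive blocks of x, encoded as bitmasks over the n positions.
    blocks = [m for m in range(1, 1 << n)
              if f(flip(x, {i for i in range(n) if (m >> i) & 1})) != f(x)]

    def best(used, start):
        # Largest number of pairwise disjoint blocks among blocks[start:]
        # that avoid the positions already in `used`.
        result = 0
        for k in range(start, len(blocks)):
            if (blocks[k] & used) == 0:
                result = max(result, 1 + best(used | blocks[k], k + 1))
        return result

    return best(0, 0)


def OR3(x):
    return int(any(x))


def MAJ3(x):
    return int(sum(x) >= 2)


bs_maj = max(block_sensitivity_at(MAJ3, 3, x)
             for x in product((0, 1), repeat=3))
```

For majority on three bits, the all-zero input has no sensitive single bit but does have sensitive blocks of size two, so its sensitivity is 0 while its block sensitivity is 1.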
Definition 6 (Fractional Certificate Complexity).
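The fractional certificate complexity $\mathrm{FC}(f, x)$ of $f$ on $x$ is defined as the optimal value of the following linear program:
$$\text{minimize } \sum_{i \in [n]} v_x(i) \quad \text{subject to } \sum_{i : x_i \ne y_i} v_x(i) \ge 1 \text{ for all } y \text{ such that } f(x) \ne f(y).$$
Here $v_x \in \mathbb{R}^n$ and $v_x(i) \ge 0$ for each $x \in \{0,1\}^n$ and $i \in [n]$. The fractional certificate complexity of $f$ is defined as $\mathrm{FC}(f) = \max_{x} \mathrm{FC}(f, x)$.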
Definition 7 (Fractional Block Sensitivity).
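Let $\mathcal{B} = \{B \subseteq [n] \mid f(x) \ne f(x^B)\}$ be the set of sensitive blocks of $x$. The fractional block sensitivity $\mathrm{fbs}(f, x)$ of $f$ on $x$ is defined as the optimal value of the following linear program:
$$\text{maximize } \sum_{B \in \mathcal{B}} u_x(B) \quad \text{subject to } \sum_{B \in \mathcal{B} : i \in B} u_x(B) \le 1 \text{ for all } i \in [n].$$
Here $u_x \in \mathbb{R}^{|\mathcal{B}|}$ and $u_x(B) \ge 0$ for each $x \in \{0,1\}^n$ and $B \in \mathcal{B}$. The fractional block sensitivity of $f$ is defined as $\mathrm{fbs}(f) = \max_{x} \mathrm{fbs}(f, x)$.
The linear programs for $\mathrm{FC}(f, x)$ and $\mathrm{fbs}(f, x)$ are duals of each other, hence their optimal values are equal and $\mathrm{FC}(f) = \mathrm{fbs}(f)$ [GSS16].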
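To make the two programs concrete, the sketch below checks feasibility of a candidate fractional certificate and a candidate fractional block packing for majority on three bits at the all-zero input. Both candidates have value $3/2$, so by weak duality both are optimal. The checker names and the chosen example are assumptions for illustration; exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction
from itertools import product


def flip(x, block):
    return tuple(1 - b if i in block else b for i, b in enumerate(x))


def is_fractional_certificate(f, n, x, v):
    """Feasibility for the FC program at x: v >= 0 and, for every y with
    f(y) != f(x), the weights on positions where x and y differ sum to >= 1."""
    if any(w < 0 for w in v):
        return False
    return all(sum(v[i] for i in range(n) if x[i] != y[i]) >= 1
               for y in product((0, 1), repeat=n) if f(y) != f(x))


def is_fractional_block_packing(f, n, x, u):
    """Feasibility for the fbs program at x: u maps sensitive blocks
    (frozensets of positions) to weights >= 0, and the blocks containing
    any fixed position have total weight <= 1."""
    if any(w < 0 or f(flip(x, B)) == f(x) for B, w in u.items()):
        return False
    return all(sum(w for B, w in u.items() if i in B) <= 1 for i in range(n))


def MAJ3(x):
    return int(sum(x) >= 2)


x = (0, 0, 0)
half = Fraction(1, 2)
v = [half, half, half]
u = {frozenset({0, 1}): half, frozenset({0, 2}): half, frozenset({1, 2}): half}
fc_value = sum(v)
fbs_value = sum(u.values())
```

In this example the block sensitivity at the all-zero input is only 1, while the fractional value is $3/2$, which shows that the fractional relaxation can be strictly larger on a single input.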
3 Expectational Certificate Complexity
In this section, we give the results for the expectational certificate complexity. The measure is motivated by the well-known deterministic query algorithm which was independently discovered several times [BI87, HH87, Tar90]. In each iteration, the algorithm queries the set of variables fixed by some consistent 1-certificate. Either the query answers agree with the fixed values of the 1-certificate, in which case the input must evaluate to 1, or the algorithm makes progress as the 0-certificate complexity of all 0-inputs still consistent with the query answers is decreased by at least 1. The latter property is due to the crucial fact that the set of fixed values of any 0-certificate and 1-certificate must intersect.
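The deterministic algorithm described above can be sketched for small functions as follows. This is a brute-force illustration under assumed helper names (`min_certificate`, `certificate_algorithm`); it picks the first consistent 1-input in lexicographic order, whereas the argument works for any choice.

```python
from itertools import combinations, product


def min_certificate(f, n, x):
    """Smallest set S of positions such that fixing x on S forces f(x)."""
    inputs = list(product((0, 1), repeat=n))
    for size in range(n + 1):
        for S in combinations(range(n), size):
            if all(f(y) == f(x) for y in inputs
                   if all(y[i] == x[i] for i in S)):
                return S
    raise AssertionError("unreachable")


def certificate_algorithm(f, n, x):
    """Evaluate f(x) by repeatedly querying the variables of a 1-certificate
    of some 1-input consistent with the answers so far.
    Returns (value, number of distinct positions queried)."""
    known = {}  # position -> value of x at that position
    while True:
        consistent = [y for y in product((0, 1), repeat=n)
                      if all(y[i] == b for i, b in known.items())]
        ones = [y for y in consistent if f(y) == 1]
        if not ones:
            return 0, len(known)
        if len(ones) == len(consistent):
            return 1, len(known)
        # Query every variable fixed by a shortest certificate of a 1-input.
        for i in min_certificate(f, n, ones[0]):
            known[i] = x[i]


def tribes_2x2(x):
    return (x[0] | x[1]) & (x[2] | x[3])


value, queries = certificate_algorithm(tribes_2x2, 4, (0, 1, 0, 0))
```

Each iteration queries at most $\mathrm{C}_1(f)$ new variables, and the intersection property bounds the number of iterations by $\mathrm{C}_0(f)$, which gives at most $\mathrm{C}_0(f) \cdot \mathrm{C}_1(f)$ queries in total.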
In hopes of proving $\mathrm{R}_0(f) = O(\mathrm{FC}(f)^2)$, a straightforward generalization to a randomized algorithm would be to pick a consistent 1-input $x$ and query each variable $i$ independently with probability $v_x(i)$, where $v_x$ is a fractional certificate for $x$. To show that such an algorithm makes progress, one needs a property analogous to the fact that 0-certificates and 1-certificates overlap. Kulkarni and Tal give a similar intersection property for the fractional certificates:
Lemma 5 ([KT16], Lemma 6.2).
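Let $f$ be a total Boolean function and $\{v_x\}_x$ be an optimal solution for the $\mathrm{FC}$ linear program. Then for any two inputs $x, y$ such that $f(x) \ne f(y)$, we have
$$\sum_{i : x_i \ne y_i} \min\{v_x(i), v_y(i)\} \ge 1.$$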
However, it is not clear whether the algorithm makes progress in terms of reducing the fractional certificates of the 0-inputs. We get around this problem by replacing the minimum $\min\{v_x(i), v_y(i)\}$ with the product $v_x(i)\, v_y(i)$ and requiring, as a constraint, that the sum of these terms over all $i$ where $x_i \ne y_i$ is at least 1:
Definition 8 (Expectational Certificate Complexity).
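The expectational certificate complexity $\mathrm{EC}(f)$ of $f$ is defined as the optimal value of the following program:
$$\text{minimize } \max_{x} \sum_{i \in [n]} w_x(i) \quad \text{subject to } \sum_{i : x_i \ne y_i} w_x(i)\, w_y(i) \ge 1 \text{ for all } x, y \text{ such that } f(x) \ne f(y),$$
$$0 \le w_x(i) \le 1 \text{ for all } x \in \{0,1\}^n,\ i \in [n].$$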
We use the term “expectational” because the described algorithm, in expectation, queries at least weight 1 in total from input $y$, when querying the variables with probabilities being the weights of $x$. While the informally described algorithm shows a quadratic upper bound on the worst-case expected complexity, in the next section we show a slight modification that directly makes a quadratic number of queries in the worst case.
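As a small sanity check of the program defining the expectational certificate complexity, the sketch below verifies a hand-built weight assignment for And on two bits. It assumes the program asks for weights in $[0, 1]$ with $\sum_{i : x_i \ne y_i} w_x(i)\, w_y(i) \ge 1$ whenever $f(x) \ne f(y)$; the function names and the example weights are hypothetical.

```python
from fractions import Fraction
from itertools import product


def ec_objective(weights):
    """Objective of the program: the largest total weight of any input."""
    return max(sum(w) for w in weights.values())


def is_ec_feasible(f, n, weights):
    """`weights` maps each input x (a tuple) to a list w_x of n numbers.
    Checks 0 <= w_x(i) <= 1 and, whenever f(x) != f(y), that the products
    w_x(i) * w_y(i) over positions where x and y differ sum to >= 1."""
    inputs = list(product((0, 1), repeat=n))
    if any(not 0 <= w <= 1 for x in inputs for w in weights[x]):
        return False
    return all(
        sum(weights[x][i] * weights[y][i] for i in range(n) if x[i] != y[i]) >= 1
        for x in inputs for y in inputs if f(x) != f(y)
    )


def AND2(x):
    return x[0] & x[1]


half = Fraction(1, 2)
weights = {
    (1, 1): [1, 1],        # the only 1-input must put full weight on both bits
    (0, 1): [1, 0],
    (1, 0): [0, 1],
    (0, 0): [half, half],  # any split with total weight 1 works here
}
feasible = is_ec_feasible(AND2, 2, weights)
value = ec_objective(weights)
```

The inputs $01$ and $10$ each differ from $11$ in a single position, which forces both weights of $11$ to equal 1, so no feasible solution for this function has value below 2.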
3.1 Quadratic Upper Bound on Randomized Query Complexity
In this section we prove Theorem 1 (restated below).
The first inequality follows from Lemma 10 and $\mathrm{C}(f) \le \mathrm{R}_0(f)$.
To prove the second inequality, we give randomized query algorithms for with 1-sided error .
For any , we have .
Proof of Claim 6.
We prove the claim for . The case is similar.
Let be an optimal solution to the program. We say that an input is consistent with the queries made by on if for all queries that have been made. Also define a probability distribution for each input .
The complexity bound is clear as always performs at most queries.
For correctness, note that the algorithm outputs 1 on all 1-inputs. Thus assume is a 0-input from here on in the analysis. Then we have to prove that outputs 0 with probability at least . This amounts to showing that the function reduces to a constant 0 function and the algorithm terminates within iterations with probability at least . (For notational convenience, in what follows we will drop the ceilings and assume is an integer.)
Define a random variable as
Let . As by definition, implies that has terminated before point 2. Then it has returned 0, and the answer is correct. Let . We will prove that , in which case we would be done.
We continue by showing an upper and a lower bound on .
The maximum possible value of is at most
Let be the event that has terminated before the -th iteration. In case performs the -th iteration, let be consistent 1-input chosen and the random variable be the position that queries.
The first inequality here follows from the fact that any such that has not been queried yet, because and are both consistent with the queries made so far. Thus, the inequality holds regardless of the randomness chosen by . The second inequality follows from the expectational certificate properties and . By the linearity of expectation, we have that
Combining the two bounds together, we get Thus, . ∎
3.2 Relation with the Fractional Certificate Complexity
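We show that a feasible solution $\{w_x\}_x$ for $\mathrm{EC}(f)$ is also feasible for $\mathrm{FC}(f)$. Since $w_y(i) \le 1$ for any $y$ and $i$, for all $x, y$ such that $f(x) \ne f(y)$ we have
$$\sum_{i : x_i \ne y_i} w_x(i) \ge \sum_{i : x_i \ne y_i} w_x(i)\, w_y(i) \ge 1,$$
and we are done. ∎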
Let be an optimal solution to the fractional certificate linear program for . We first modify each to a new feasible solution by eliminating the entries that are very small, and boosting the large entries by a constant factor. Namely, let
We first claim that is still a feasible solution. Fix any , and let be a minimal sensitive block for . As is part of a feasible solution, we have
The second line follows because , as is a minimal sensitive block and therefore every index in is sensitive. Rearranging the last inequality, we have and therefore,
Next, is a feasible solution to the expectational certificate program, as
The second inequality holds by Lemma 5.
Now that we have shown that forms a feasible solution to the expectational certificate program, it remains to bound its objective value:
where the first inequality follows from for .
Since and , we immediately get
3.3 Relation with the Certificate Complexity
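We construct a feasible solution $\{w_x\}_x$ for $\mathrm{EC}(f)$ from $\mathrm{C}(f)$. Let $A_x$ be the shortest certificate for $x$. Assign $w_x(i) = 1$ iff $A_x$ fixes the variable $i$, otherwise let $w_x(i) = 0$. Let $x, y$ be any two inputs such that $f(x) \ne f(y)$. There is a position $i$ fixed by both certificates where $A_x(i) \ne A_y(i)$, otherwise there would be an input consistent with both $A_x$ and $A_y$, which would give a contradiction. Therefore, $\sum_{i : x_i \ne y_i} w_x(i)\, w_y(i) \ge 1$. The value of this solution is $\max_{x} \sum_{i \in [n]} w_x(i) = \max_{x} \mathrm{C}(f, x) = \mathrm{C}(f)$. ∎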
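The construction in this proof can be checked mechanically on small functions. The sketch below derives 0/1 weights from shortest certificates found by brute force and verifies the product constraint for every pair of inputs with different values. The helper names are assumptions, and the product constraint is the one from the definition of the expectational certificate complexity.

```python
from itertools import combinations, product


def min_certificate(f, n, x):
    """Smallest set S of positions such that fixing x on S forces f(x)."""
    inputs = list(product((0, 1), repeat=n))
    for size in range(n + 1):
        for S in combinations(range(n), size):
            if all(f(y) == f(x) for y in inputs
                   if all(y[i] == x[i] for i in S)):
                return S
    raise AssertionError("unreachable")


def weights_from_certificates(f, n):
    """w_x(i) = 1 if a shortest certificate of x fixes position i, else 0."""
    weights = {}
    for x in product((0, 1), repeat=n):
        S = set(min_certificate(f, n, x))
        weights[x] = [1 if i in S else 0 for i in range(n)]
    return weights


def is_ec_feasible(f, n, weights):
    """Check the product constraint for all pairs x, y with f(x) != f(y)."""
    inputs = list(product((0, 1), repeat=n))
    return all(
        sum(weights[x][i] * weights[y][i] for i in range(n) if x[i] != y[i]) >= 1
        for x in inputs for y in inputs if f(x) != f(y)
    )


def MAJ3(x):
    return int(sum(x) >= 2)


w = weights_from_certificates(MAJ3, 3)
feasible = is_ec_feasible(MAJ3, 3, w)
value = max(sum(wx) for wx in w.values())  # equals C(MAJ3) = 2
```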
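As $\mathrm{C}(f) \le \mathrm{EC}(f)^2$, there can be at most a quadratic separation between $\mathrm{EC}(f)$ and $\mathrm{C}(f)$. We show that this is achieved by the example of Gilmer et al. that separates $\mathrm{FC}(f)$ and $\mathrm{C}(f)$ quadratically: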
Theorem 11 ([GSS16], Theorem 32).
For every sufficiently large, there is a function such that and .
Their construction for is as follows. First a function is exhibited such that , and . The function is defined as a composition . This gives and (both properties follow by Proposition 31 in their paper).
Let us construct a feasible solution for . For any such that , let be the first index such that . Let be the set of positions that correspond to . Let for each position in , and for all other positions. Then .
On the other hand, let be an optimal solution to . For any such that , let for all . Then .
Now, for any two inputs such that and , let be the smallest index such that , then we have . By construction,
Hence is a feasible solution to the expectational certificate and .
4 Minimum Query Corruption Bound and Partition Bound
In this section we prove Theorem 2. We first consider the query corruption bound and minimum query corruption bound.
Definition 9 (Query Corruption Bound and Minimum Query Corruption Bound for product distributions).
Let and be a probability distribution over the inputs. For a , let an assignment be an -error -certificate under , if
Define the query corruption bound for , distribution and error as
The query corruption bound of is defined as , where ranges over all distributions on . The minimum query corruption bound of for product distributions is defined as , where ranges over all product distributions on .
We now proceed to the proof of Theorem 2 (restated below).
Let and a product distribution over the inputs. Then
In the proof we will have restrictions of probability distributions. Let be a probability distribution over , be a -bit string, and be a set of indices. The restriction of to the indices of , , will be denoted by . Then the distribution is the distribution obtained by conditioning on the event that the bits in the locations in agree with . Formally, for each
Proof of Theorem 2.
We present a deterministic query algorithm, and analyse its performance for inputs sampled according to . Examine the following algorithm:
For each , define to be the event that completes at least iterations and define to be the true event. Let be arbitrary, and assume that occurs. Then denotes the -error certificate (under ) picked in the -th iteration in step 2a. Let be the value approximately certified by under . Let denote the set of inputs such that . Recall from Section 2 is the set of variables set by . For each assignment to the variables fixed by and subset , let denote the shift of
by the vector. Formally ($\oplus$ stands for bitwise exclusive or),
For , define to be the set of variables queried in first iterations and define . Note that , and is a product distribution.
Define all the above random variables to be if does not take place. Now define
First we bound the number of queries made by . Since terminates when either or , it performs at most many iterations. On the other hand since is a product distribution for each , therefore . Therefore, the algorithm makes many queries.
Now we prove that it errs on at most fraction of the inputs according to .
For every and .
Condition on the events . Furthermore, condition on . Notice that under this conditioning, the distribution of the input is .
If occurs, is an -error -certificate under . So . Since is a product distribution as observed before, we have that for each . The claim follows. ∎
In particular, Claim 12 implies that for all ,
Since runs for at most steps, by Equation (1), linearity of expectation and Markov’s inequality we have that
For such that occurs, define . The following claim will play a central role in our analysis.
Let . For each , let happen and . Then . In particular, if and then and are disjoint sensitive blocks for .
Clearly, . Also, since