Parity Decision Tree Complexity is Greater Than Granularity

10/19/2018 ∙ by Anastasiya Chistopolskaya, et al. ∙ 0

We prove a new lower bound on the parity decision tree complexity D_⊕(f) of a Boolean function f. Namely, the granularity of the Boolean function f is the smallest k such that all Fourier coefficients of f are integer multiples of 1/2^k. We show that D_⊕(f) ≥ k+1. This lower bound is an improvement of the known lower bound through the sparsity of f. Using our lower bound we determine the exact parity decision tree complexity of several important Boolean functions including majority, recursive majority and the MOD^3 function. For majority the complexity is n - B(n) + 1, where B(n) is the number of ones in the binary representation of n. For recursive majority the complexity is (n+1)/2. For MOD^3 the complexity is n-1 for n divisible by 3 and is n otherwise. Finally, we provide an example of a function for which our lower bound is not tight.

1 Introduction

Parity decision trees are a computational model in which we compute a known Boolean function on an unknown input, and in one query we can check the parity of an arbitrary subset of inputs. The computational cost in this model is the number of queries made. The model is a natural generalization of the well-known decision tree model (in which only the value of a single variable can be asked in one query) [4, 7].

Apart from being natural and interesting in its own right, the parity decision tree model has been studied mainly in connection with Communication Complexity and, more specifically, with the Log-rank Conjecture. In the most standard model of Communication Complexity there are two players, Alice and Bob. Alice is given $x \in \{0,1\}^n$, Bob is given $y \in \{0,1\}^n$, and they are trying to compute some fixed function $F$ on input $(x, y)$. The question is how much communication is needed to compute $F(x, y)$ in the worst case. It is known that the deterministic communication complexity $D^{cc}(F)$ of the function $F$ is lower bounded by $\log \operatorname{rank}(M_F)$, where $M_F$ is the communication matrix of $F$ [11]. It is a long-standing conjecture and one of the key open problems in Communication Complexity, called the Log-rank Conjecture [12], to prove that $D^{cc}(F)$ is upper bounded by a polynomial of $\log \operatorname{rank}(M_F)$.

An important special case of the Log-rank Conjecture addresses the case of XOR-functions $F(x, y) = f(x \oplus y)$ for some $f\colon \{0,1\}^n \to \{-1,1\}$, where $x \oplus y$ is the bit-wise XOR of Boolean vectors $x$ and $y$. On the one hand, this class of functions is wide and captures many important functions (including equality, inner product, Hamming distance), and on the other hand the structure of XOR-functions allows the use of analytic tools. For such functions $\operatorname{rank}(M_F)$ is equal to the Fourier sparsity $\operatorname{spar}(f)$, the number of non-zero Fourier coefficients of $f$. Thus, the Log-rank Conjecture for XOR-functions can be restated: is it true that $D^{cc}(F)$ is bounded by a polynomial of $\log \operatorname{spar}(f)$?

Given an XOR-function $F(x, y) = f(x \oplus y)$, a natural way for Alice and Bob to compute the value of the function is to use a parity decision tree for $f$. They can simulate each query in the tree by computing the parity of bits in their parts of the input separately and sending the results to each other. One query requires two bits of communication and thus $D^{cc}(F) \le 2 D_\oplus(f)$. This leads to an approach to establishing the Log-rank Conjecture for XOR-functions [22]: show that $D_\oplus(f)$ is bounded by a polynomial of $\log \operatorname{spar}(f)$.

This approach received a lot of attention in recent years and drew attention to parity decision trees themselves [22, 18, 17, 21, 19, 6]. In a recent paper [6] it was shown that actually $D^{cc}(F)$ and $D_\oplus(f)$ are polynomially related. This means that the simple protocol described above is not far from being optimal and that the parity decision tree version of the Log-rank Conjecture stated above is actually equivalent to the original Log-rank Conjecture for XOR-functions.

All this motivates further research on parity decision trees. As for lower bounds on parity decision tree complexity, one technique follows from the discussion above: $D_\oplus(f) \ge \frac{1}{2} D^{cc}(F) \ge \frac{1}{2} \log \operatorname{spar}(f)$. If the Log-rank Conjecture for XOR-functions is true, this approach gives optimal bounds up to a polynomial, but in many cases it does not help to determine the precise parity decision tree complexity of Boolean functions. For example, this approach always gives bounds of at most $n/2$ for functions of $n$ variables.

Another known approach is of a more combinatorial flavor. For standard decision trees there are several combinatorial measures known that lower bound decision tree complexity. Among them the most common are certificate complexity and block sensitivity. In [22] these measures were generalized to the setting of parity decision tree complexity. Parity decision tree complexity versions of these measures are actually known to be polynomially related to parity decision tree complexity [22]. However, they also do not give tight lower bounds for some interesting functions.

Yet another standard approach is through the degree of polynomials. It is well known that the complexity of a function in the standard decision tree model is lower bounded by the degree of the function over $\mathbb{R}$ (see, e.g. [4]). Completely analogously it can be shown that the parity decision tree complexity of a function is lower bounded by the degree of the function over $\mathbb{F}_2$ (although the adaptation to parity decision trees is straightforward, we have not seen it mentioned in the literature; we provide a proof in Section 2 for the sake of completeness). This approach also does not give tight lower bounds for some interesting functions.

Examples of well-known functions for which the precise parity decision trees complexity is unknown include the majority function (playing a crucial role in many areas of Theoretical Computer Science, including Fourier analysis of Boolean functions) and recursive majority (interesting, in particular, from decision tree complexity point of view as it provides a gap for deterministic and randomized decision tree complexity [16, 13]).

Our Results

In this paper we address the problem of improving known lower bounds for parity decision tree complexity. Our main result is a new lower bound in terms of the granularity of a Boolean function.

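The granularity $\operatorname{gran}(f)$ of $f\colon \{0,1\}^n \to \{-1,1\}$ is the smallest $k$ such that all Fourier coefficients of $f$ are integer multiples of $1/2^k$. We show that

$$D_\oplus(f) \ge \operatorname{gran}(f) + 1.$$

It is a simple corollary of Parseval’s Identity that $\operatorname{gran}(f) \ge \frac{1}{2}\log \operatorname{spar}(f)$. Thus our lower bound is an improvement over the bound through sparsity. On the other hand, it was shown in [5] (see also [19]) that $\operatorname{gran}(f) \le \log \operatorname{spar}(f) - 1$. Thus, this is an improvement by at most a factor of 2.

We also observe that $\operatorname{gran}(f) \ge \deg_2(f) - 1$, where by $\deg_2(f)$ we denote the degree of $f$ over $\mathbb{F}_2$. Thus, our lower bound is also not weaker than the lower bound through the degree of the function.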

Although our lower bound is close to the lower bound through sparsity, it allows us to prove tight lower bounds for several important functions. Also, unlike the lower bound through sparsity, the new approach allows us to prove lower bounds up to $n$ (the largest possible parity decision tree complexity of a function).

We hope that the connection between parity decision tree complexity and granularity will help to shed more light on the parity decision tree complexity.

We apply our lower bound to study the parity decision tree complexity of several well-known Boolean functions. We start with the majority function $\mathrm{MAJ}_n$. We show that $D_\oplus(\mathrm{MAJ}_n) = n - B(n) + 1$, where $n$ is the number of variables and $B(n)$ is the number of ones in the binary representation of $n$. The upper bound in this result is a simple adaptation of a folklore algorithm for the following problem (see, e.g. [15]). Suppose that for odd $n$ we are given $n$ balls of red and blue colors and we do not see the colors of the balls. In one query for any pair of balls we can check whether their colors are the same. Our goal is to find a ball of the same color as the majority of balls. We want to minimize the number of queries asked in the worst case. There is a folklore algorithm to solve this task in $n - B(n)$ queries. It was shown in [15] that this is in fact optimal. At the level of ideas, our lower bound for parity decision tree complexity is inspired by the proof of [15].

Due to the connection between parity decision tree complexity and multiplicative complexity communicated to us by Alexander Kulikov [10], it follows from our results that the multiplicative complexity of $\mathrm{MAJ}_n$ is at least $n - B(n)$. This is an improvement of the lower bound of [3]. Previously our lower bound was known only in the case when $n$ is a power of 2 [3].

Next we proceed to recursive majority $\mathrm{MAJ}_3^{\otimes k}$, which computes an iteration of the majority of three variables. We show that the parity decision tree complexity of this function is $\frac{n+1}{2}$, where $n = 3^k$ is the number of variables.

Finally, we show a series of examples of functions, for which our lower bound is not optimal. Namely, we consider threshold functions that check whether there are at least ones in the input. We show that for for and our lower bound implies that at least queries are needed to compute the function, whereas the actual parity decision tree complexity is . To prove this gap we combine our lower bound with an additional inductive argument allowing for a weak form of hardness amplification for the parity decision tree complexity of functions.

The rest of the paper is organized as follows. In Section 2 we provide the necessary definitions and preliminary information. In Section 3 we prove the lower bound on parity decision tree complexity. In Sections 4 and 5 we study the parity decision tree complexity of majority and recursive majority respectively. Finally, in Section 6 we provide an example of a function for which our lower bound is not tight. Some of the technical proofs are moved to the Appendix.

2 Preliminaries

2.1 Fourier Analysis

Throughout the paper we assume that Boolean functions are functions of the form $f\colon \{0,1\}^n \to \{-1,1\}$. That is, input bits are treated as 0 and 1 and to them we will usually apply operations over $\mathbb{F}_2$. Output bits are treated as $-1$ and $1$ and the arithmetic will be over $\mathbb{R}$. The value $-1$ corresponds to ‘true’ and $1$ corresponds to ‘false’.

We denote the variables of functions by $x = (x_1, \ldots, x_n)$. We use the notation $[n] = \{1, \ldots, n\}$.

We briefly review the notation and needed facts from Boolean Fourier analysis. For extensive introduction see [14].

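For functions $f, g\colon \{0,1\}^n \to \mathbb{R}$ consider an inner product

$$\langle f, g \rangle = \mathbb{E}_x\, f(x) g(x),$$

where the expectation is taken over the uniform distribution of $x$ on $\{0,1\}^n$.

For a subset $S \subseteq [n]$ we denote by $\chi_S(x) = \prod_{i \in S} (-1)^{x_i}$ the Fourier character corresponding to $S$. We denote by $\hat f(S) = \langle f, \chi_S \rangle$ the corresponding Fourier coefficient of $f$.

It is well-known that for any $x \in \{0,1\}^n$ we have $f(x) = \sum_{S \subseteq [n]} \hat f(S) \chi_S(x)$.

If $f\colon \{0,1\}^n \to \{-1,1\}$ (that is, if $f$ is Boolean) then the well-known Parseval’s Identity holds:

$$\sum_{S \subseteq [n]} \hat f(S)^2 = 1.$$

By the support of the Boolean function $f$ we denote

$$\operatorname{supp}(f) = \{ S \subseteq [n] : \hat f(S) \ne 0 \}.$$

The sparsity of $f$ is $\operatorname{spar}(f) = |\operatorname{supp}(f)|$. Basically, the sparsity of $f$ is the $\ell_0$-norm of the vector of its Fourier coefficients.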

Consider a binary fraction $\alpha$, that is, a rational number that can be written with a denominator that is a power of 2. By the granularity $\operatorname{gran}(\alpha)$ of $\alpha$ we denote the minimal integer $k \ge 0$ such that $2^k \alpha$ is an integer.

We will also frequently use the following closely related notation. For an integer $m$ denote by $P(m)$ the maximal $p$ such that $2^p$ divides $m$. It is convenient to set $P(0) = \infty$.

Note that for Boolean $f$ the Fourier coefficients of $f$ are binary fractions. By the granularity of $f$ we call the following value

$$\operatorname{gran}(f) = \max_{S \subseteq [n]} \operatorname{gran}(\hat f(S)).$$

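It is easy to see that for any $f\colon \{0,1\}^n \to \{-1,1\}$ it is true that

$$0 \le \operatorname{gran}(f) \le n - 1,$$

and both of these bounds are achievable (for example, for a parity function and for the conjunction $\mathrm{AND}_n$ respectively).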

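It is known that $\operatorname{gran}(f)$ is never far from the logarithm of $\operatorname{spar}(f)$: for every $f$ with $\operatorname{spar}(f) > 1$,

$$\frac{\log \operatorname{spar}(f)}{2} \le \operatorname{gran}(f) \le \log \operatorname{spar}(f) - 1.$$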

The first inequality can be easily obtained from Parseval’s identity. The second is a non-trivial result implicit in [5, Theorem 3.3 for ] (see also [19]). Again, both inequalities are tight (the first one is tight for inner product or any other bent function [14]; the second one is tight for example for ).
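The definitions of this subsection can be checked by brute force on small functions. The following Python sketch is an illustration only: the function names and the encoding of inputs and subsets as bit masks are assumptions of this example, not notation from the paper. It computes all Fourier coefficients exactly with rational arithmetic and derives sparsity and granularity from them.

```python
from fractions import Fraction


def parity(v):
    """Parity (0 or 1) of the number of set bits of the integer v."""
    return bin(v).count("1") % 2


def fourier_coefficients(f, n):
    """Map each subset S (as a bit mask) to the coefficient f_hat(S).

    f takes an input x in {0,1}^n encoded as an integer and returns -1 or 1.
    """
    coeffs = {}
    for s in range(2 ** n):
        total = sum(f(x) * (-1 if parity(s & x) else 1) for x in range(2 ** n))
        coeffs[s] = Fraction(total, 2 ** n)
    return coeffs


def gran_of_fraction(alpha):
    """Minimal k >= 0 such that 2^k * alpha is an integer."""
    k = 0
    while (alpha * 2 ** k).denominator != 1:
        k += 1
    return k


def sparsity(f, n):
    """Number of non-zero Fourier coefficients of f."""
    return sum(1 for c in fourier_coefficients(f, n).values() if c != 0)


def granularity(f, n):
    """Maximum granularity over all Fourier coefficients of f."""
    return max(gran_of_fraction(c) for c in fourier_coefficients(f, n).values())


# Example functions on 3 variables (-1 encodes 'true').
maj3 = lambda x: -1 if bin(x).count("1") >= 2 else 1
and3 = lambda x: -1 if x == 0b111 else 1
xor3 = lambda x: -1 if parity(x) else 1

for name, g in [("MAJ_3", maj3), ("AND_3", and3), ("XOR_3", xor3)]:
    print(name, "spar =", sparsity(g, 3), "gran =", granularity(g, 3))
```

The running time is exponential in the number of variables, so this is only meant for sanity checks on a handful of variables.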

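For a Boolean function $f$ denote by $\deg_2(f)$ the degree of the multilinear polynomial $p$ over $\mathbb{F}_2$ computing $f$ as a Boolean function, that is, for all $x \in \{0,1\}^n$ we have $p(x) = 1$ if $f(x) = -1$ and $p(x) = 0$ otherwise. It is well known that such a multilinear polynomial is unique for any $f$ and thus $\deg_2(f)$ is well defined.

It is known that $\deg_2(f) \le \log \operatorname{spar}(f)$ for any $f$ [1]. We observe that the granularity is also lower bounded by the degree of the function.

Lemma 1.

For any $f\colon \{0,1\}^n \to \{-1,1\}$ we have $\operatorname{gran}(f) \ge \deg_2(f) - 1$.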

Proof.

The proof strategy is similar to the one of [1].

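For a function $f$ consider two subfunctions $f_0$ and $f_1$ on $n - 1$ variables obtained from $f$ by setting the variable $x_n$ to 0 and to 1 respectively. Note that for any $S \subseteq [n-1]$ we have

$$\hat f(S) = \frac{\hat f_0(S) + \hat f_1(S)}{2}$$

and

$$\hat f(S \cup \{n\}) = \frac{\hat f_0(S) - \hat f_1(S)}{2}.$$

Thus,

$$\hat f_0(S) = \hat f(S) + \hat f(S \cup \{n\})$$

and

$$\hat f_1(S) = \hat f(S) - \hat f(S \cup \{n\}).$$

In particular, the granularity of both $f_0$ and $f_1$ is not larger than the granularity of $f$. From this we conclude that the granularity of a subfunction of $f$ is at most the granularity of $f$.

Denote $d = \deg_2(f)$ and consider a monomial of degree $d$ in the polynomial for $f$. For simplicity of notation assume that this is the monomial $x_1 x_2 \cdots x_d$. Fix all variables $x_i$ for $i > d$ to 0. We get a subfunction $f'$ of $f$ of $d$ variables and degree $d$. As discussed above $\operatorname{gran}(f) \ge \operatorname{gran}(f')$, so it is enough to show that $\operatorname{gran}(f') \ge d - 1$. For this note that since the function $f'$ is of maximal degree we have that $|f'^{-1}(-1)|$ is odd (see, e.g. [7, Section 2.1]). Thus,

$$\hat{f'}(\emptyset) = \frac{2^d - 2\,|f'^{-1}(-1)|}{2^d} = \frac{2^{d-1} - |f'^{-1}(-1)|}{2^{d-1}},$$

and the granularity of $\hat{f'}(\emptyset)$ is $d - 1$. ∎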

2.2 Parity Decision Trees

A parity decision tree $T$ is a rooted directed binary tree. Each of its leaves is labeled by $-1$ or $1$, and each internal vertex is labeled by a parity function $\bigoplus_{i \in S} x_i$ for some subset $S \subseteq [n]$. Each internal vertex has two outgoing edges, one labeled by 0 and the other by 1. A computation of $T$ on input $x \in \{0,1\}^n$ is the path from the root to one of the leaves that in each internal vertex follows the edge whose label equals the value of the parity in that vertex on $x$. The label of the leaf reached by the path is the output of the computation. The tree $T$ computes the function $f$ iff on each input $x$ the output of $T$ is equal to $f(x)$. The parity decision tree complexity of $f$ is the minimal depth of a tree computing $f$. We denote this value by $D_\oplus(f)$.
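To make the model concrete, here is a minimal Python sketch of a parity decision tree and its evaluation. The class names `Leaf` and `Query`, the bit-mask encoding of the queried subset, and the example tree are assumptions made for this illustration. The example tree computes the majority of three bits in depth 2 by first asking whether the first two bits are equal.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: int  # output value, -1 ('true') or 1 ('false')


@dataclass
class Query:
    mask: int      # bit i of mask is set iff variable x_{i+1} is in the queried subset
    zero: "Node"   # subtree followed when the parity is 0
    one: "Node"    # subtree followed when the parity is 1


Node = Union[Leaf, Query]


def evaluate(tree, x):
    """Follow the computation path of the tree on input x (an integer bit mask)."""
    node = tree
    while isinstance(node, Query):
        answer = bin(node.mask & x).count("1") % 2
        node = node.one if answer else node.zero
    return node.label


def depth(tree):
    """Maximal number of queries on a root-to-leaf path."""
    if isinstance(tree, Leaf):
        return 0
    return 1 + max(depth(tree.zero), depth(tree.one))


# Majority of three bits: query x1 XOR x2, then read x1 (if equal) or x3 (if not).
read_x1 = Query(0b001, Leaf(1), Leaf(-1))
read_x3 = Query(0b100, Leaf(1), Leaf(-1))
maj3_tree = Query(0b011, read_x1, read_x3)

for x in range(8):
    print(format(x, "03b"), evaluate(maj3_tree, x))
```

A standard decision tree is the special case in which every `mask` has exactly one bit set.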

One known way to lower bound parity decision tree complexity goes through communication complexity of XOR functions. We state the bound in the following lemma (see, e.g. [6]).

Lemma 2.

For any function $f\colon \{0,1\}^n \to \{-1,1\}$ we have

$$D_\oplus(f) \ge \frac{\log \operatorname{spar}(f)}{2}.$$

This lower bound turns out to be useful in many cases, especially when we are interested in the complexity up to a multiplicative constant or up to a polynomial factor. However, it does not always help to find the exact value of the complexity of the function and in principle cannot give lower bounds greater than $n/2$.

Another, more combinatorial approach goes through analogs of certificate complexity and block sensitivity for parity decision trees [22]. Since parity block sensitivity is always less than or equal to parity certificate complexity and we are interested in lower bounds, we will introduce only certificate complexity here.

For a function $f\colon \{0,1\}^n \to \{-1,1\}$ and $x \in \{0,1\}^n$ denote by $C_\oplus(f, x)$ the minimal co-dimension of an affine subspace in $\{0,1\}^n$ that contains $x$ and on which $f$ is constant. The parity certificate complexity of $f$ is $C_\oplus(f) = \max_x C_\oplus(f, x)$.

Lemma 3 ([22]).

For any function $f\colon \{0,1\}^n \to \{-1,1\}$ we have

$$D_\oplus(f) \ge C_\oplus(f).$$

This approach allows us to show strong lower bounds for some functions. For example, it can be used to show that $D_\oplus(\mathrm{AND}_n) = n$. However, for more complicated functions like majority or recursive majority this lemma does not give tight lower bounds.

Yet another approach to lower bounds for parity decision tree complexity is through polynomials. Although it is very similar to analogous connection for standard decision trees, we have not observed it in the literature.

Lemma 4.

For any $f\colon \{0,1\}^n \to \{-1,1\}$ we have $D_\oplus(f) \ge \deg_2(f)$.

Proof.

The proof of this lemma closely follows the proof connecting the standard decision tree complexity of a function with its degree over $\mathbb{R}$ (see, e.g. [4]).

Consider a parity decision tree $T$ computing $f$ with depth equal to $D_\oplus(f)$. Consider an arbitrary leaf $l$ of this tree and consider the path in $T$ leading from the root to $l$. For the computation to follow this path on input $x$, in each internal vertex $v$ on the path the input must satisfy some linear restriction $\ell_v(x) = 1$ over $\mathbb{F}_2$ ($\ell_v$ is the parity labeling $v$ if the path follows the edge labeled by 1 out of $v$, and the negation of this parity if the path follows the edge labeled by 0). Denote all the linear forms in these restrictions along the path by $\ell_1, \ldots, \ell_t$, where $t \le D_\oplus(f)$. Thus, on input $x$ we follow the path to $l$ iff $\prod_{i=1}^{t} \ell_i(x) = 1$ is satisfied. Denote the product on the left-hand side by $p_l(x)$.

Denote by $L$ the set of all leaves of $T$ that are labeled by $-1$. For any input $x$ we have that $f(x) = -1$ iff the computation path in $T$ reaches a leaf labeled with $-1$ iff

$$\sum_{l \in L} p_l(x) = 1.$$

It is left to observe that the latter expression, after reduction to multilinear form, is a polynomial over $\mathbb{F}_2$ of degree at most $D_\oplus(f)$. ∎

2.3 Multiplicative Complexity

The multiplicative complexity $\operatorname{mc}(f)$ of a Boolean function $f$ is the minimal number of $\wedge$-gates in a circuit computing $f$ and consisting of $\wedge$, $\oplus$ and $\neg$ gates, each gate of fan-in at most 2 (for formal definitions from circuit complexity see, e.g. [7]). This measure was studied in Circuit Complexity [3, 8, 2] as well as in connection with Cryptography [9, 20], and providing an explicit function $f$ on $n$ variables with $\operatorname{mc}(f) > n$ is an important open problem.

The following lemma was communicated to us by Alexander Kulikov [10] and with his permission we include it with a proof.

Lemma 5.

For any $f$ on $n$ variables

$$D_\oplus(f) \le \operatorname{mc}(f) + 1.$$

Proof.

The proof is by induction on $\operatorname{mc}(f)$.

If $\operatorname{mc}(f) = 0$, then $f$ is computed by a circuit consisting of $\oplus$ and $\neg$ gates and thus is a linear form of its variables, possibly negated. We can compute it by one query in the parity decision tree model.

For the step of induction, consider an arbitrary $f$ and consider a circuit $C$ computing $f$ with the number of $\wedge$-gates equal to $\operatorname{mc}(f)$. Consider the first $\wedge$-gate $g$ in $C$. Both of its inputs compute linear forms over $\mathbb{F}_2$. Our decision tree algorithm queries one of the inputs of $g$. Depending on the answer to the query, $g$ computes either the constant 0 or its second input. In both cases the gate $g$ computes a linear form over $\mathbb{F}_2$, so we can simplify the circuit and obtain a new circuit $C'$ computing the same function on inputs consistent with the answer to the first query and with at most $\operatorname{mc}(f) - 1$ $\wedge$-gates. By the induction hypothesis, in both cases the function computed by $C'$ is computable in the parity decision tree model with at most $\operatorname{mc}(f)$ queries. Overall, we make at most $\operatorname{mc}(f) + 1$ queries. ∎

3 Lower Bound on Parity Decision Trees

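Through the connection to communication complexity it is known that $D_\oplus(f) \ge \frac{\log \operatorname{spar}(f)}{2}$ for any $f$. In our main result we improve this bound.

Theorem 6.

For any non-constant $f\colon \{0,1\}^n \to \{-1,1\}$ we have

$$D_\oplus(f) \ge \operatorname{gran}(f) + 1.$$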

Proof.

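We prove the theorem by an adversary argument. That is, we describe a strategy for the adversary to answer the queries of a parity decision tree that forces the tree to make many queries before it can compute the output.

Denote $k = \operatorname{gran}(f)$ and denote by $S \subseteq [n]$ the subset on which the granularity is achieved, that is $\operatorname{gran}(\hat f(S)) = k$. We have that

$$\hat f(S) = \frac{1}{2^n} \sum_{x} f(x) \chi_S(x) = \frac{1}{2^n} \Big( \sum_{x} \chi_S(x) - 2 \sum_{x \in f^{-1}(-1)} \chi_S(x) \Big).$$

Note that the first sum in the last expression is equal to $2^n$ if $S = \emptyset$ and is equal to 0 otherwise. Thus for the granularity of $\hat f(S)$ to be equal to $k$ the sum $\sum_{x \in f^{-1}(-1)} \chi_S(x)$ should be divisible by $2^{n-k-1}$ and should not be divisible by $2^{n-k}$. In other words (recall that $P(m)$ is the maximal $p$ such that $2^p$ divides $m$),

$$P\Big( \sum_{x \in f^{-1}(-1)} \chi_S(x) \Big) = n - k - 1. \qquad (1)$$

After each step of the computation the query fixes some parity of inputs to be equal to some fixed value. Denote by $X_i$ the set of inputs that are still consistent with the current node of the tree after step $i$, and on which the function is equal to $-1$. We have that $X_0 = f^{-1}(-1)$.

We will show that we can answer the queries in such a way that

$$P\Big( \sum_{x \in X_{i+1}} \chi_S(x) \Big) \le P\Big( \sum_{x \in X_i} \chi_S(x) \Big). \qquad (2)$$

To see this observe that the $(i+1)$-st query splits the current set $X_i$ into two disjoint subsets $X_i^0$ and $X_i^1$. In particular,

$$\sum_{x \in X_i} \chi_S(x) = \sum_{x \in X_i^0} \chi_S(x) + \sum_{x \in X_i^1} \chi_S(x).$$

If both sums in the right-hand side are divisible by some power of 2, then the left-hand side also is. Thus,

$$\min\Big\{ P\Big( \sum_{x \in X_i^0} \chi_S(x) \Big),\ P\Big( \sum_{x \in X_i^1} \chi_S(x) \Big) \Big\} \le P\Big( \sum_{x \in X_i} \chi_S(x) \Big).$$

Pick for $X_{i+1}$ the set on which the minimum in the left-hand side is achieved.

Suppose the tree makes $t$ queries. The set of inputs that reach the leaf forms an affine subspace of the Boolean cube of dimension at least $n - t$, on which the function must be constant. Thus the sum

$$\sum_{x \in X_t} \chi_S(x)$$

is either empty or the sum of a character over an affine subspace, and thus is equal to either 0 or $\pm 2^{d}$, where $d \ge n - t$ is the dimension of the subspace. In both cases

$$P\Big( \sum_{x \in X_t} \chi_S(x) \Big) \ge n - t. \qquad (3)$$

Combining (1)-(3) we get

$$n - t \le P\Big( \sum_{x \in X_t} \chi_S(x) \Big) \le P\Big( \sum_{x \in X_0} \chi_S(x) \Big) = n - k - 1,$$

so $t \ge k + 1$ and the theorem follows. ∎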
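For very small $n$ the bound of Theorem 6 can be compared with the exact parity decision tree complexity found by exhaustive search. The sketch below is only an illustration under stated assumptions: the names `pdt_complexity` and `granularity` and the bit-mask encoding are choices of this example, and the search is exponential, so it is practical only for about five variables.

```python
from fractions import Fraction
from functools import lru_cache


def parity(v):
    return bin(v).count("1") % 2


def granularity(f, n):
    """Smallest k such that every Fourier coefficient of f is a multiple of 1/2^k."""
    best = 0
    for s in range(2 ** n):
        total = sum(f(x) * (-1) ** parity(s & x) for x in range(2 ** n))
        coeff = Fraction(total, 2 ** n)
        k = 0
        while (coeff * 2 ** k).denominator != 1:
            k += 1
        best = max(best, k)
    return best


def pdt_complexity(f, n):
    """Exact parity decision tree complexity of f by exhaustive search."""

    @lru_cache(maxsize=None)
    def solve(inputs):
        # inputs: frozenset of inputs consistent with the answers so far
        if len({f(x) for x in inputs}) <= 1:
            return 0  # f is constant here, no more queries needed
        best = n
        for s in range(1, 2 ** n):
            zero = frozenset(x for x in inputs if parity(s & x) == 0)
            one = inputs - zero
            if not zero or not one:
                continue  # this query gives no information on the current set
            best = min(best, 1 + max(solve(zero), solve(one)))
        return best

    return solve(frozenset(range(2 ** n)))


def maj(n):
    """Majority for odd n; -1 encodes 'true'."""
    return lambda x: -1 if 2 * bin(x).count("1") > n else 1


for n in (3, 5):
    f = maj(n)
    print(n, "D =", pdt_complexity(f, n), "gran + 1 =", granularity(f, n) + 1)
```

For majority on 3 and 5 variables both printed quantities coincide, in agreement with the tightness claimed for majority in the abstract.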

4 Majority Function

In this section we analyze parity decision tree complexity of the majority function . The function is defined as follows:

To state our results we will need the following notation: let be the number of ones in a binary representation of .

We start with an upper bound. The following lemma is a simple adaptation of the folklore algorithm (see, e.g. [15]).

Lemma 7.

$D_\oplus(\mathrm{MAJ}_n) \le n - B(n) + 1$.

Proof.

Our parity decision tree will mostly make queries of the form for a pair of variables. Note that such a query basically checks whether and are equal.

Our algorithm will maintain splitting of input variables into blocks of two types. We will maintain the following properties:

  • the size of each block is a power of 2;

  • all variables in each block of type 1 are equal;

  • blocks of type 2 are balanced, that is they have equal number of ones and zeros.

In the beginning of the computation each variable forms a separate block of size one. During each step the algorithm will merge two blocks into a new one. Thus, after $t$ steps the number of blocks is $n - t$.

The algorithm works as follows. On each step we pick two blocks of type 1 of equal size. We pick one variable from each block and query the parity of these two variables. If the variables are equal, we merge the blocks into a new block of type 1. If the variables are not equal, the new block is of type 2. The process stops when there are no two blocks of type 1 of equal size.

It is easy to see that all of the properties listed above are maintained. At the end of the process we have some blocks of the second type (possibly none of them) and some blocks of the first type (possibly none of them) of pairwise non-equal size. Note that the value of the majority function is determined by the value of the variables in the largest block of type 1. Indeed, all blocks of type 2 are balanced and the largest block of type 1 has more variables than all other blocks of type 1 in total. Thus, to find the value of $\mathrm{MAJ}_n$ it remains to query one variable from the largest block of type 1. Note that the case when there are no blocks of type 1 at the end of the process corresponds to a balanced input (and even $n$). In this case the output is fixed by the definition of $\mathrm{MAJ}_n$ on balanced inputs, so we can give it without any additional queries.

Note that the sum of sizes of all blocks is equal to $n$. Since the size of each block is a power of 2, there are at least $B(n)$ blocks at the end of the computation (one cannot break $n$ into a sum of fewer than $B(n)$ powers of 2). Thus, overall we make at most $n - B(n) + 1$ queries and the lemma follows. ∎
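The block-merging procedure from the proof can be sketched in Python as follows. This is an illustration under assumptions: the function names are invented for the example, blocks of type 1 are merged in a binary-counter order (the proof allows any order), and a balanced input is reported as `None` rather than as a fixed output value.

```python
from itertools import product


def majority_bit(x):
    """Return (majority bit or None for a balanced input, number of parity queries)."""
    queries = 0
    pending = {}  # size -> index of a representative of the type-1 block of that size
    for i in range(len(x)):
        size, rep = 1, i  # a fresh variable is a type-1 block of size 1
        while rep is not None and size in pending:
            other = pending.pop(size)
            queries += 1  # one query: the parity x[rep] XOR x[other]
            if x[rep] ^ x[other] == 0:
                size *= 2   # representatives agree: merged block is of type 1
            else:
                rep = None  # they differ: merged block is balanced (type 2)
        if rep is not None:
            pending[size] = rep
    if not pending:
        return None, queries  # only balanced blocks remain
    largest = pending[max(pending)]
    return x[largest], queries + 1  # final query reads one variable


def worst_case_queries(n):
    """Maximum number of queries over all inputs of length n."""
    return max(majority_bit(list(x))[1] for x in product([0, 1], repeat=n))


for n in range(1, 9):
    print(n, worst_case_queries(n), n - bin(n).count("1") + 1)
```

The last loop prints, for each $n$, the observed worst case next to $n - B(n) + 1$ so the two can be compared directly.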

Before proceeding with the lower bound we briefly discuss lower bounds that can be obtained by other approaches. It is known that  [14]. Thus from the sparsity lower bound we can only get .

Note also that each input to lies in the subcube of dimension at least . Indeed, if just pick a subcube on some subset of variables of size containing all ones of the input. The case is symmetrical. Thus, in the approach through certificate complexity we get .

Finally, we observe that the degree approach also does not give a matching lower bound.

Lemma 8.

For any we have where is the largest integer such that .

The proof of this lemma is provided in Appendix.

It is not hard to see that this lower bound matches the upper bound of Lemma 7 only for and . On the other hand, for example it is far from optimal by approximately a factor of 2 for for some .

We next show that Theorem 6 gives a tight lower bound for parity decision tree complexity of .

Lemma 9.

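$\operatorname{gran}(\mathrm{MAJ}_n) = n - B(n)$.

Proof.

We will show that $\operatorname{gran}(\mathrm{MAJ}_n) \ge n - B(n)$. The inequality in the other direction follows from Lemma 7 together with Theorem 6.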

We consider the Fourier coefficient and show that its granularity is at least . Let . Note that is the smallest number such that is on inputs with ones.

Then we have

From this we can see that

We proceed to simplify the sum of binomials (a very similar analysis is presented in [15]):

Thus it remains to compute . For even we have and . For odd we have and .

By [15, Proposition 3.4] we have (alternatively this can be seen from Kummer’s theorem). Finally, notice that and . It follows that

and

Overall, we have the following theorem.

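Theorem 10.

$D_\oplus(\mathrm{MAJ}_n) = n - B(n) + 1$.

As a corollary of this result and Lemma 5 we get the following lower bound on the multiplicative complexity of majority.

Corollary 11.

The multiplicative complexity of $\mathrm{MAJ}_n$ is at least $n - B(n)$.

This improves a lower bound of [3]. Previously our lower bound was known only for $n = 2^k$ for some $k$ [3].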

5 Recursive Majority

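Next we study the parity decision tree complexity of recursive majority $\mathrm{MAJ}_3^{\otimes k}$. This is a function on $n = 3^k$ variables and it can be defined recursively. For $k = 1$ we just let $\mathrm{MAJ}_3^{\otimes 1} = \mathrm{MAJ}_3$. For $k > 1$ we let

$$\mathrm{MAJ}_3^{\otimes k} = \mathrm{MAJ}_3\big( \mathrm{MAJ}_3^{\otimes (k-1)}, \mathrm{MAJ}_3^{\otimes (k-1)}, \mathrm{MAJ}_3^{\otimes (k-1)} \big),$$

where each $\mathrm{MAJ}_3^{\otimes (k-1)}$ is applied to a separate block of variables.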

We start with an upper bound.

Lemma 12.

$D_\oplus(\mathrm{MAJ}_3^{\otimes k}) \le \frac{n+1}{2}$, where $n = 3^k$.

Proof.

Basically, recursive majority is a function computed by a Boolean circuit whose graph is a complete ternary tree of depth $k$, where each internal vertex is labeled by the function $\mathrm{MAJ}_3$ and each leaf is labeled by a (fresh) variable.

To construct an algorithm we first generalize the problem. We consider functions computed by Boolean circuits whose graphs are ternary trees, where each non-leaf has fan-in 3 and is labeled by $\mathrm{MAJ}_3$, and each leaf is labeled by a fresh variable. We will show that if the number of non-leaf vertices in the circuit is $m$, then the function can be computed by a parity decision tree of depth $m + 1$.

The proof is by induction on $m$. If $m = 1$, then the function in question is just $\mathrm{MAJ}_3$ and by the results of Section 4 it can be computed by a parity decision tree of depth 2.

For the step of induction consider a tree with $m$ non-leaf vertices. Consider a non-leaf vertex of the largest depth. All of its three inputs must be variables; let us denote them by $x_1$, $x_2$ and $x_3$, so that in this vertex the function $\mathrm{MAJ}_3(x_1, x_2, x_3)$ is computed. Our first query will be $x_1 \oplus x_2$. It will tell us whether $x_1$ and $x_2$ are equal. If $x_1$ and $x_2$ are equal, then $\mathrm{MAJ}_3(x_1, x_2, x_3) = x_1$, and if $x_1 \ne x_2$, then $\mathrm{MAJ}_3(x_1, x_2, x_3) = x_3$. Thus, we can substitute the gate in our vertex by the corresponding variable and reduce the problem to a circuit with $m - 1$ non-leaf vertices. By the induction hypothesis, the function computed by this circuit can be computed by at most $m$ queries. Thus, our original function is computable by $m + 1$ queries.

It is left to observe that a complete ternary tree of depth $k$ has $\frac{3^k - 1}{2}$ non-leaf vertices and for this tree our algorithm makes $\frac{3^k + 1}{2} = \frac{n+1}{2}$ queries. ∎
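The inductive algorithm can be illustrated by the following Python sketch. The representation is an assumption of this example: a leaf is an integer index of a variable and an internal vertex is a tuple of three subtrees. The sketch resolves gates in post-order rather than always picking a deepest gate; each gate still costs exactly one parity query, so the query count is the same.

```python
from itertools import product


def complete_tree(k, start=0):
    """Complete ternary tree of depth k over the variables start, start+1, ..."""
    if k == 0:
        return start
    third = 3 ** (k - 1)
    return tuple(complete_tree(k - 1, start + i * third) for i in range(3))


def evaluate(node, x):
    """Reference evaluation of the circuit of MAJ_3 gates, returning 0 or 1."""
    if isinstance(node, int):
        return x[node]
    return int(sum(evaluate(child, x) for child in node) >= 2)


def recursive_majority(tree, x):
    """Return (value of the circuit on x, number of parity queries used)."""
    queries = 0

    def resolve(node):
        # Returns the index of a variable whose value equals the value of node.
        nonlocal queries
        if isinstance(node, int):
            return node
        a, b, c = (resolve(child) for child in node)
        queries += 1  # one query: x[a] XOR x[b]
        return a if x[a] ^ x[b] == 0 else c

    var = resolve(tree)
    queries += 1  # final query reads the single remaining variable
    return x[var], queries


tree = complete_tree(2)  # recursive majority on n = 9 variables
worst = max(recursive_majority(tree, x)[1] for x in product([0, 1], repeat=9))
print(worst)
```

Here the number of queries does not depend on the input: it is the number of gates plus one.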

Before proceeding to the lower bound we again discuss lower bounds that can be obtained by other techniques.

First note that each input lies in the subspace of co-dimension at most on which the function is constant. For this it is enough to show that in each we can flip variables without changing the value of the function. This is easy to check by induction on . For there are two variables that are equal to each other and we can flip the third variable without changing the value of the function. For consider inputs to the at the top of the circuit. Two of them are equal and by induction hypothesis we can flip variables in each of them without changing the value of the function. The last input to the top gate does not affect the value of the function and we can flip all variables in it. Overall this gives us variables. This gives us which does not give a matching lower bound.

Also note that the polynomial computing is . The polynomial for can be computed by a simple composition of with itself. It is easy to see that its degree is . Thus, an approach through polynomials over does not give strong lower bounds.

For Fourier analytic considerations it is convenient to switch to Boolean inputs. For a variable let us denote by the variable . For now we will use new variables as inputs to Boolean functions.

The Fourier decomposition of is

(4)

From this the Fourier decomposition of can be obtain by recursion:

(5)

where are blocks of variables.

Lemma 2 can give lower bounds up to and thus in principle might give at least almost matching lower bound. However, this is not the case as we discuss below.

Note that since there is no free coefficient in the polynomial (4), Fourier coefficients arising from all three summands in the right-hand side of (5) will not cancel out with each other: none two of them have equal set of variables. Thus, if we denote we have that and

(6)

for . On one hand, this means that . This gives . Thus and .

On the other hand if we let , it is easy to check that (6) implies

Since this gives . Thus,

Thus Lemma 2 can give us a lower bound of at most . We note that this upper bound on the sparsity can be further improved by letting for smaller .

Now we proceed to the tight lower bound. Again we will estimate

. Observe that this Fourier coefficient can be easily computed from (4) and (5). Indeed, from (4) we have that . From (5) we have that

The numerator of this Fourier coefficient equals to for any . Thus, denoting for we have and

It is straightforward to check that . From this, Theorem 6 and Lemma 12 the following theorem follows.

Theorem 13.

$D_\oplus(\mathrm{MAJ}_3^{\otimes k}) = \frac{n+1}{2}$, where $n = 3^k$ is the number of variables.
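As a sanity check, the granularity of recursive majority can be computed by brute force for $k = 1, 2$ and compared with $\frac{n-1}{2}$, the value for which the granularity lower bound meets the complexity $\frac{n+1}{2}$ stated in the abstract. The Python sketch below is an illustration only; its function names and the bit-mask encoding are assumptions of this example.

```python
from fractions import Fraction


def rec_maj(k, bits):
    """Recursive majority of a tuple of 3^k bits, returned as 0 or 1."""
    if k == 0:
        return bits[0]
    third = len(bits) // 3
    votes = sum(rec_maj(k - 1, bits[i * third:(i + 1) * third]) for i in range(3))
    return int(votes >= 2)


def granularity_of_rec_maj(k):
    """Brute-force granularity of recursive majority on n = 3^k variables."""
    n = 3 ** k
    # Truth table with outputs in {-1, 1}; -1 encodes 'true'.
    values = []
    for x in range(2 ** n):
        bits = tuple((x >> i) & 1 for i in range(n))
        values.append(-1 if rec_maj(k, bits) else 1)
    best = 0
    for s in range(2 ** n):
        total = sum(v if bin(s & x).count("1") % 2 == 0 else -v
                    for x, v in enumerate(values))
        coeff = Fraction(total, 2 ** n)
        g = 0
        while (coeff * 2 ** g).denominator != 1:
            g += 1
        best = max(best, g)
    return best


for k in (1, 2):
    n = 3 ** k
    print(n, granularity_of_rec_maj(k), (n - 1) // 2)
```

The case $k = 3$ already has $2^{27}$ inputs, so this direct approach does not scale beyond $k = 2$.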

6 A Function with $D_\oplus(f) > \operatorname{gran}(f) + 1$

In this section we provide an example of a function for which our lower bound is not tight. For this we study the family of threshold functions.

For arbitrary and we let

where . Note that .

Our examples will form a subfamily of this family of functions.

To show that our lower bound is not tight we need an approach to prove even better lower bounds. We will do it via the following theorem.

Theorem 14.

For any if , then .

Proof.

We will argue by a contradiction. Assume that . We will construct a parity decision tree for making no more than queries.

Denote the input variables to by . We introduce one more variable (which we will fix later) and consider the sequence as inputs to the algorithm for . Note that . Our plan is to simulate the algorithm for on and save one query on our way.

Consider the first query that the algorithm makes to . Suppose first that the query does not ask the parity of all variables (we will deal with this case later). Since the function is symmetric we can rename the input bits in such a way that the query contains input and does not contain , that is the query asks the parity for some . Now it is time for us to fix the value of . We let . Then the answer to the first query is , we can skip it and proceed to the second query. For each next query of the algorithm for if it contains or (or both) we substitute them by and respectively. The result is the parity of some variables among and we make this query to our original input . Clearly the answer to the query to is the same as the answer to the original query to . Thus, making at most queries we reach the leaf of the tree for and thus compute .

It remains to consider the case when the first query to is . This parity is equal to and we make this query to . Now we proceed to the second query in the computation of and this query is not equal to