# Quantum algorithms and approximating polynomials for composed functions with shared inputs

We give new quantum algorithms for evaluating composed functions whose inputs may be shared between bottom-level gates. Let $f$ be a Boolean function and consider a function $F$ obtained by applying $f$ to conjunctions of possibly overlapping subsets of $n$ variables. If $f$ has quantum query complexity $Q(f)$, we give an algorithm for evaluating $F$ using $\tilde{O}(\sqrt{Q(f) \cdot n})$ quantum queries. This improves on the bound of $O(Q(f) \cdot \sqrt{n})$ that follows by treating each conjunction independently, and is tight for worst-case choices of $f$. Using completely different techniques, we prove a similar tight composition theorem for the approximate degree of $f$. By recursively applying our composition theorems, we obtain a nearly optimal $\tilde{O}(n^{1-2^{-d}})$ upper bound on the quantum query complexity and approximate degree of linear-size depth-$d$ $\mathrm{AC}^0$ circuits. As a consequence, such circuits can be PAC learned in subexponential time, even in the challenging agnostic setting. Prior to our work, a subexponential-time algorithm was not known even for linear-size depth-3 $\mathrm{AC}^0$ circuits. We also show that any substantially faster learning algorithm will require fundamentally new techniques.

## 1 Introduction

In the query, or black-box, model of computation, an algorithm aims to evaluate a known Boolean function $f$ on an unknown input $x$ by reading as few bits of $x$ as possible. One of the most basic questions one can ask about query complexity, or indeed any complexity measure of Boolean functions, is how it behaves under composition. Namely, given functions $f$ and $g$, and a method of combining these functions to produce a new function $h$, how does the query complexity of $h$ depend on the complexities of the constituent functions $f$ and $g$?

The simplest method for combining functions is block composition, where the inputs to $f$ are obtained by applying the function $g$ to independent sets of variables. That is, if $f: \{0,1\}^m \to \{0,1\}$ and $g: \{0,1\}^k \to \{0,1\}$, then the block composition $f \circ g: \{0,1\}^{mk} \to \{0,1\}$ is defined by $(f \circ g)(x_1, \dots, x_m) = f(g(x_1), \dots, g(x_m))$, where each $x_i$ is a $k$-bit string. In most reasonable models of computation, one can evaluate $f \circ g$ by running an algorithm for $f$, and using an algorithm for $g$ to compute the inputs to $f$ as needed. Thus, the query complexity of $f \circ g$ is at most the product of the complexities of $f$ and $g$. (In some “reasonable models,” such as those with bounded error, one must take care to ensure that errors in computing each copy of $g$ do not propagate, but we elide these issues for this introduction. Addressing this concern typically adds at most a logarithmic overhead.)
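As an illustration of the definition above, the following is a minimal classical sketch of block composition. The names `block_compose`, `parity`, and `and_gate` are hypothetical helpers introduced only for this example; the sketch evaluates the composed function directly and says nothing about query complexity.

```python
from typing import Callable, Sequence

Bits = Sequence[int]


def block_compose(f: Callable[[Bits], int],
                  g: Callable[[Bits], int],
                  m: int, k: int) -> Callable[[Bits], int]:
    """Return the block composition of f and g on m*k bits.

    g is applied to m disjoint k-bit blocks, and f is applied
    to the m resulting bits.
    """
    def h(x: Bits) -> int:
        if len(x) != m * k:
            raise ValueError("expected m*k input bits")
        blocks = [x[i * k:(i + 1) * k] for i in range(m)]
        return f([g(block) for block in blocks])
    return h


def parity(bits: Bits) -> int:
    return sum(bits) % 2


def and_gate(bits: Bits) -> int:
    return int(all(bits))


# PARITY of two ANDs, each on its own 3-bit block.
h = block_compose(parity, and_gate, m=2, k=3)
print(h([1, 1, 1, 1, 1, 0]))  # prints 1: the block values are (1, 0)
```

Because the blocks are disjoint, each copy of `g` can be evaluated independently, which is exactly the situation the shared-input setting below departs from.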

For many query models, including those capturing deterministic and quantum computation, this is known to be tight. In particular, letting $Q(f)$ denote the bounded-error quantum query complexity of a function $f$, it is known that $Q(f \circ g) = \Theta(Q(f) \cdot Q(g))$ for all Boolean functions $f$ and $g$ [HLŠ07, Rei11]. This result has the flavor of a direct sum theorem: When computing many copies of the function $g$ (in this case, as many as are needed to generate the necessary inputs to $f$), one cannot do better than just computing each copy independently.

### 1.1 Quantum algorithms for shared-input compositions

While we have a complete understanding of the behavior of quantum query complexity under block composition, little is known for more general compositions. What is the quantum query complexity of a composed function where inputs to $f$ are generated by applying $g$ to overlapping sets of variables? We call these more general compositions shared-input compositions. Not only does answering this question serve as a natural next step for improving our understanding of quantum query complexity, but it may lead to more unified algorithms and lower bounds for specific functions of interest in quantum computing. Many of the functions that have played an influential role in the study of quantum query complexity can be naturally expressed as compositions of simple functions with shared inputs, including $k$-distinctness, $k$-sum, surjectivity, triangle finding, and graph collision.

In this work, we study shared-input compositions between an arbitrary function $f$ and the function $g = \mathrm{AND}$. If $f: \{0,1\}^m \to \{0,1\}$, then we let $h: \{0,1\}^n \to \{0,1\}$ be any function obtained by generating each input to $f$ as an $\mathrm{AND}$ over some subset of (possibly negated) variables from $x_1, \dots, x_n$, as depicted in Figure 1.

Of course, one can compute the function $h$ by ignoring the fact that the $\mathrm{AND}$ gates depend on shared inputs, and instead regard each gate as depending on its own set of copies of the input variables. Using the quantum query upper bound for block compositions, together with the fact that $Q(\mathrm{AND}_n) = \Theta(\sqrt{n})$ [Gro96, BBBV97], one obtains

$$Q(h) = O(Q(f) \cdot Q(\mathrm{AND}_n)) = O(Q(f) \cdot \sqrt{n}). \tag{1}$$

Observe that this bound on $Q(h)$ is non-trivial only if $Q(f) \ll \sqrt{n}$, since $n$ queries always suffice. A priori, one may conjecture that this bound is tight in the worst case for shared-input compositions. After all, if the variables overlap in some completely arbitrary way with no structure, it is unclear from the perspective of an algorithm designer how to use the values of already-computed $\mathrm{AND}$ gates to reduce the number of queries needed to compute further $\mathrm{AND}$ gates. It might even be the case that every pair of $\mathrm{AND}$ gates shares very few common input bits, suggesting that evaluating one gate yields almost no information about the output of any other gate. This intuition even suggests a path for proving a matching lower bound: Using a random wiring pattern, combinatorial designs, etc., construct the set of inputs to each $\mathrm{AND}$ gate so that evaluating any particular gate leaks almost no useful information that could be helpful in evaluating the other gates.

In this work, we show that this intuition is wrong: the overlapping structure of the $\mathrm{AND}$ gates can always be exploited algorithmically (so long as $Q(f) \ll n$).

#### Results.

Our main result shows that a shared-input composition between a function $f$ and the function $\mathrm{AND}$ always has substantially lower quantum query complexity than the block composition $f \circ \mathrm{AND}_n$. Specifically, instead of having quantum query complexity which is the product $Q(f) \cdot \sqrt{n}$, a shared-input composition has quantum query complexity which is, up to logarithmic factors, the geometric mean $\sqrt{Q(f) \cdot n}$ between $Q(f)$ and the number of input variables $n$. This bound is nontrivial whenever $Q(f)$ is significantly smaller than $n$.

###### Theorem 1.

Let $h: \{0,1\}^n \to \{0,1\}$ be computed by a depth-2 circuit where the top gate is a function $f: \{0,1\}^m \to \{0,1\}$ and the bottom level gates are $\mathrm{AND}$ gates on a subset of the input bits and their negations (as depicted in Figure 1). Then we have

$$Q(h) = O\left(\sqrt{Q(f) \cdot n} \cdot \log^2(mn)\right). \tag{2}$$

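Note that Theorem 1 is nearly tight for every possible value of $Q(f)$. For a parameter $t \le n$, consider the block composition (i.e., the composition with disjoint inputs) $\mathrm{PARITY}_t \circ \mathrm{AND}_{n/t}$. Since $Q(\mathrm{PARITY}_t) = \Theta(t)$ [BBC01], this function has quantum query complexity

$$Q(\mathrm{PARITY}_t \circ \mathrm{AND}_{n/t}) = \Theta\left(t \cdot \sqrt{n/t}\right) = \Theta\left(\sqrt{Q(\mathrm{PARITY}_t) \cdot n}\right), \tag{3}$$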

matching the upper bound provided by Theorem 1 up to log factors. This shows that Theorem 1 cannot be significantly improved in general.
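The arithmetic behind this tightness example can be spelled out as follows. This is only a worked restatement of the calculation above, using the block composition theorem and $Q(\mathrm{AND}_{n/t}) = \Theta(\sqrt{n/t})$; the choice $t = \sqrt{n}$ is an illustrative instance, not one singled out by the source.

```latex
\begin{align*}
Q(\mathrm{PARITY}_t \circ \mathrm{AND}_{n/t})
  &= \Theta\bigl(Q(\mathrm{PARITY}_t) \cdot Q(\mathrm{AND}_{n/t})\bigr)
   = \Theta\bigl(t \cdot \sqrt{n/t}\bigr) \\
  &= \Theta\bigl(\sqrt{t^2 \cdot n/t}\bigr)
   = \Theta\bigl(\sqrt{t \cdot n}\bigr)
   = \Theta\bigl(\sqrt{Q(\mathrm{PARITY}_t) \cdot n}\bigr). \\[4pt]
\text{For } t = \sqrt{n}:\quad
  &\sqrt{t \cdot n} = n^{3/4},
  \qquad\text{whereas bound (1) gives only } O(t \cdot \sqrt{n}) = O(n).
\end{align*}
```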

The proof of Theorem 1 makes use of an optimal quantum algorithm for computing $f$ and Grover’s search algorithm for evaluating $\mathrm{AND}$ gates. Surprisingly, it uses no other tools from quantum computing. The core of the argument is entirely classical, relying on a recursive gate and wire-elimination argument for evaluating $\mathrm{AND}$ gates with overlapping inputs.

At a high level, the algorithm in Theorem 1 works as follows. The overall goal is to query enough input bits such that the resulting circuit is simple enough to apply the composition upper bound $Q(f \circ g) = O(Q(f) \cdot Q(g))$. To apply this upper bound and obtain the claimed upper bound in Theorem 1, we require $Q(g)$ to be $O(\sqrt{n/Q(f)})$. Since $g$ is just an $\mathrm{AND}$ gate on some subset of inputs, this means we want the fan-in of each $\mathrm{AND}$ gate in our circuit to be $O(n/Q(f))$. If we call $\mathrm{AND}$ gates with fan-in at least $n/Q(f)$ “high fan-in” gates, then the goal is to eliminate all high fan-in gates. Our algorithm achieves this by judiciously querying input bits that would eliminate a large number of high fan-in gates if they were set to 0.

Besides the line of work on the quantum query complexity of block compositions, our result is also closely related to work of Childs, Kimmel, and Kothari [CKK12] on read-many formulas. Childs et al. showed that any formula on $n$ inputs consisting of $G$ gates from the de Morgan basis can be evaluated using $O(G^{1/4} \cdot \sqrt{n})$ quantum queries. In the special case of DNF formulas, our result coincides with theirs by taking the top function $f$ to be the $\mathrm{OR}$ function. However, even in this special case, the result of Childs et al. makes critical use of the top function being $\mathrm{OR}$. Specifically, their result uses the fact that the quantum query complexity of the $\mathrm{OR}$ function is the square root of its formula size. Our result, on the other hand, applies without making any assumptions on the top function $f$. This level of generality is needed when using Theorem 1 to understand circuits (rather than just formulas) of depth 3 and higher, as discussed in Section 1.3.

### 1.2 Approximate degree of shared-input compositions

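We also study shared-input compositions under the related notion of approximate degree. For a Boolean function $f: \{0,1\}^n \to \{0,1\}$, an $\varepsilon$-approximating polynomial for $f$ is a real polynomial $p$ such that $|p(x) - f(x)| \le \varepsilon$ for all $x \in \{0,1\}^n$. The $\varepsilon$-approximate degree of $f$, denoted $\widetilde{\deg}_\varepsilon(f)$, is the least degree among all $\varepsilon$-approximating polynomials for $f$. We use the term approximate degree without qualification to refer to the choice $\varepsilon = 1/3$, and denote it $\widetilde{\deg}(f)$.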
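The definition of an $\varepsilon$-approximating polynomial can be checked by brute force on small examples. The sketch below is illustrative only: `approx_error`, `and2`, `p_exact`, and `p_linear` are hypothetical names, and the particular degree-1 polynomial is chosen by hand for this example rather than taken from the source.

```python
from itertools import product


def approx_error(p, f, n):
    """Largest pointwise gap |p(x) - f(x)| over all x in {0,1}^n."""
    return max(abs(p(x) - f(x)) for x in product((0, 1), repeat=n))


def and2(x):
    """AND on two bits."""
    return x[0] & x[1]


def p_exact(x):
    """Degree-2 polynomial that represents AND on two bits exactly."""
    return x[0] * x[1]


def p_linear(x):
    """Hand-picked degree-1 polynomial; its error on AND_2 is 1/4 at every point."""
    return (x[0] + x[1]) / 2 - 0.25


print(approx_error(p_exact, and2, 2))   # prints 0
print(approx_error(p_linear, and2, 2))  # prints 0.25
```

Since $0.25 \le 1/3$, the linear polynomial witnesses that the approximate degree of $\mathrm{AND}$ on two bits is at most 1, even though its exact degree is 2.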

A fundamental observation due to Beals et al. [BBC01] is that any $T$-query quantum algorithm for computing a function $f$ implicitly defines a degree-$2T$ approximating polynomial for $f$. Thus, $\widetilde{\deg}(f) \le 2 \cdot Q(f)$. This relationship has led to a number of successes in proving quantum query complexity lower bounds via approximate degree lower bounds, constituting a technique known as the polynomial method in quantum computing. Conversely, quantum algorithms are powerful tools for establishing the existence of low-degree approximating polynomials that are needed in other applications to theoretical computer science. For example, the deep result that every de Morgan formula of size $s$ has quantum query complexity, and hence approximate degree, $O(\sqrt{s})$ [FGG08, CCJYM09, ACR10, Rei11] underlies the fastest known algorithm for agnostically learning formulas [KKMS08, Rei11] (see Section 1.4 and Section 5 for details on this application). It has also played a major role in the proofs of the strongest formula and graph complexity lower bounds for explicit functions [Tal17].

#### Results.

We complement our result on the quantum query complexity of shared-input compositions with an analogous result for approximate degree.

###### Theorem 2.

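Let $h: \{0,1\}^n \to \{0,1\}$ be computed by a depth-2 circuit where the top gate is a function $f: \{0,1\}^m \to \{0,1\}$ and the bottom level gates are $\mathrm{AND}$ gates on a subset of the input bits and their negations (as depicted in Figure 1). Then

$$\widetilde{\deg}_\varepsilon(h) = O\left(\sqrt{\widetilde{\deg}_\varepsilon(f) \cdot n \cdot \log m} + \sqrt{n \log(1/\varepsilon)}\right). \tag{4}$$

In particular, $\widetilde{\deg}(h) = O\left(\sqrt{\widetilde{\deg}(f) \cdot n \cdot \log m}\right)$.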

Note that our result for approximate degree is incomparable with Theorem 1, even for bounded error, since both sides of the equation include the complexity measure under consideration.

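Like Theorem 1, Theorem 2 can be shown to be tight by considering the block composition of $\mathrm{PARITY}_t$ with $\mathrm{AND}_{n/t}$, since $\widetilde{\deg}(\mathrm{PARITY}_t \circ \mathrm{AND}_{n/t}) = \Theta(\sqrt{t \cdot n})$ [She13b, She11b].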

Our proof of Theorem 2 abstracts and generalizes a technique introduced by Sherstov [She18], who very recently proved an upper bound on the approximate degree of an important depth-3 circuit of nearly quadratic size called Surjectivity. Despite the similarity between Theorem 2 and Theorem 1, and the close connection between approximating polynomials and quantum algorithms, the proof of Theorem 2 is completely different from that of Theorem 1, making crucial use of properties of polynomials that do not hold for quantum algorithms. (Any analysis capable of yielding a sublinear upper bound on the approximate degree of Surjectivity requires moving beyond quantum algorithms, as its quantum query complexity is known to be $\Omega(n)$ [BM12, She15].) In our opinion, this feature of the proof of Theorem 2 makes Theorem 1 for quantum algorithms even more surprising.

We remark that a different proof of the upper bound for the approximate degree of Surjectivity was discovered in [BKT18], who also showed a matching lower bound. It is also possible to prove Theorem 2 by generalizing the techniques developed in that work, but the techniques of [She18] lead to a shorter and cleaner analysis.

### 1.3 Application: Evaluating and approximating linear-size AC0 circuits

The circuit class $\mathrm{AC}^0$ consists of constant-depth, polynomial-size circuits over the de Morgan basis with unbounded fan-in gates. The full class $\mathrm{AC}^0$ is known to contain very hard functions from the standpoint of both quantum query complexity and approximate degree. The aforementioned Surjectivity function is in depth-3 $\mathrm{AC}^0$ and has quantum query complexity $\Omega(n)$ [BM12, She15], while for every positive constant $\delta$, there exists a depth-$O(\log(1/\delta))$ $\mathrm{AC}^0$ circuit with approximate degree $\Omega(n^{1-\delta})$ [BT17].

Nevertheless, $\mathrm{AC}^0$ contains a number of interesting subclasses for which nontrivial quantum query and approximate degree upper bounds might still hold. Here, we discuss applications of our composition theorem to understanding the subclass $\mathrm{LC}^0$, consisting of $\mathrm{AC}^0$ circuits of linear size.

The class $\mathrm{LC}^0$ is one of the most interesting subclasses of $\mathrm{AC}^0$. It has been studied by many authors in various complexity-theoretic contexts, ranging from logical characterizations [KLPT06] to faster-than-brute-force satisfiability algorithms [CIP09, SS12]. $\mathrm{LC}^0$ turns out to be a surprisingly powerful class. For example, the $k$-threshold function that asks if the input has Hamming weight greater than $k$ is clearly in $\mathrm{AC}^0$ for constant $k$, by computing the $\mathrm{OR}$ of all possible certificates. But this yields a circuit of size $n^{O(k)}$, which one might conjecture is optimal. However, it turns out that $k$-threshold is in $\mathrm{LC}^0$ even when $k$ is as large as $\mathrm{polylog}(n)$ [RW91]. Another surprising fact is that every regular language in $\mathrm{AC}^0$ can be computed by an $\mathrm{AC}^0$ circuit of almost linear size [Kou09].

By recursively applying Theorem 1, we obtain the following sublinear upper bound on the quantum query complexity of depth-$d$ $\mathrm{LC}^0$ circuits, denoted by $\mathrm{LC}^0_d$:

###### Theorem 3.

For all $d$ and all functions $h$ in $\mathrm{LC}^0_d$, we have $Q(h) = \tilde{O}(n^{1-2^{-d}})$.

Our upper bound is nearly tight for every depth $d$, as shown in [CKK12].

###### Theorem 4 (Childs, Kimmel, and Kothari).

For all , there exists a function in with .

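By recursively applying Theorem 2, we obtain a similar sublinear upper bound for the $\varepsilon$-approximate degree of $\mathrm{LC}^0_d$, even for subconstant values of $\varepsilon$.

###### Theorem 5.

For all $d$, $\varepsilon > 0$, and all functions $h$ in $\mathrm{LC}^0_d$, we have

$$\widetilde{\deg}_\varepsilon(h) = \tilde{O}\left(n^{1-2^{-d}} \log(1/\varepsilon)^{2^{-d}}\right). \tag{5}$$

For constant $\varepsilon$, we prove a lower bound of the same form with quadratically worse dependence on the depth $d$.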

###### Theorem 6.

For all , there exists a function in with .

A lower bound of was already known for general functions [BT17, BKT18], but the circuits constructed in these prior works are not of linear size. Previously, for any , [BKT18] exhibited a circuit of depth at most , size at most , and approximate degree . We show how to transform this quadratic-size circuit into a linear-size circuit of depth roughly , whose approximate degree is close to that of . Our transformation adapts that of [CKK12], but requires a more intricate construction and analysis. This is because, unlike quantum query complexity, approximate degree is not known to increase multiplicatively under block composition.

### 1.4 Application: Agnostically learning linear-size AC0 circuits

The challenging agnostic model [KSS94] of computational learning theory captures the task of binary classification in the presence of adversarial noise. In this model, a learning algorithm is given a sequence of labeled examples of the form $(x, b) \in \{0,1\}^n \times \{0,1\}$ drawn from an unknown distribution $\mathcal{D}$. The goal of the algorithm is to learn a hypothesis $h$ which does “almost as well” at predicting the labels of new examples drawn from $\mathcal{D}$ as does the best classifier from a known concept class $\mathcal{C}$. Specifically, let the Boolean loss of a hypothesis $h$ be $\mathrm{err}_{\mathcal{D}}(h) = \Pr_{(x,b) \sim \mathcal{D}}[h(x) \ne b]$. For a given accuracy parameter $\varepsilon$, the goal of the learner is to produce a hypothesis $h$ such that $\mathrm{err}_{\mathcal{D}}(h) \le \min_{c \in \mathcal{C}} \mathrm{err}_{\mathcal{D}}(c) + \varepsilon$.

Very few concept classes are known to be agnostically learnable, even in subexponential time. For example, the best known algorithm for agnostically learning disjunctions runs in time $2^{\tilde{O}(\sqrt{n})}$ [KKMS08]. Moreover, several hardness results are known. Proper agnostic learning of disjunctions (where the output hypothesis itself must be a disjunction) is NP-hard [KSS94]. Even improper agnostic learning of disjunctions is at least as hard as PAC learning DNF [LBW95], which is a longstanding open question in learning theory.

The best known general result for more expressive classes of circuits is that all de Morgan formulas of size $s$ can be learned in time $2^{\tilde{O}(\sqrt{s})}$ [KKMS08, Rei11] (Section 5.1 contains a detailed overview of prior work on agnostic and PAC learning). Both of the aforementioned results make use of the well-known linear regression framework of [KKMS08] for agnostic learning. This algorithm works whenever there is a “small” set of “features” $\mathcal{F}$ (where each feature is a function mapping $\{0,1\}^n$ to $\mathbb{R}$) such that each concept in the concept class can be approximated to error $\varepsilon$ in the $\ell_\infty$ norm by a linear combination of features in $\mathcal{F}$. (See Section 5 for details.) If every function in a concept class $\mathcal{C}$ has approximate degree at most $d$, then one obtains an agnostic learning algorithm for $\mathcal{C}$ with running time $2^{\tilde{O}(d)}$ by taking $\mathcal{F}$ to be the set of all monomials of degree at most $d$. Applying this algorithm using the approximate degree upper bound of Theorem 5 yields a subexponential time algorithm for agnostically learning $\mathrm{LC}^0$.
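To make the feature set concrete, here is a minimal sketch of the map from an input to all monomials of degree at most $d$. It covers only the feature construction; the regression step of [KKMS08] is deliberately omitted, and the names `monomial_features` and `num_features` are hypothetical helpers for this illustration.

```python
from itertools import combinations
from math import comb, prod


def monomial_features(x, d):
    """Values of all multilinear monomials of degree <= d at the point x.

    Monomials are ordered by degree, then lexicographically by the
    index set; the empty monomial (degree 0) evaluates to 1.
    """
    n = len(x)
    return [prod(x[i] for i in subset)
            for size in range(d + 1)
            for subset in combinations(range(n), size)]


def num_features(n, d):
    """Number of monomials of degree <= d on n variables."""
    return sum(comb(n, j) for j in range(d + 1))


print(monomial_features([1, 0, 1], 2))  # prints [1, 1, 0, 1, 0, 1, 0]
print(num_features(3, 2))               # prints 7
```

The feature count is at most $n^{O(d)}$, which is why an approximate degree bound of $d$ controls the running time of a regression over these features.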

###### Theorem 7.

The concept class of -bit functions computed by circuits of depth can be learned in the distribution-free agnostic PAC model in time . More generally, size- circuits can be learned in time .

Prior to our work, no subexponential time algorithm was known even for agnostically learning $\mathrm{LC}^0_3$. Moreover, since our upper bound on the approximate degree of $\mathrm{LC}^0$ circuits is nearly tight, new techniques will be needed to significantly surpass our results, and in particular, learn all of $\mathrm{AC}^0$ in subexponential time. (Note that standard techniques [She11a] automatically generalize the lower bound of Theorem 6 from the feature set of low-degree monomials to arbitrary feature sets. See Section 5.2 for details.)

### 1.5 Discussion and future directions

Summarizing our results, we established shared-input composition theorems for quantum query complexity (Theorem 1) and approximate degree (Theorem 2), roughly showing that for compositions between an arbitrary function $f$ and the function $\mathrm{AND}$, it is always possible to leverage sharing of inputs to obtain algorithmic speedups. We applied these results to obtain the first sublinear upper bounds on the quantum query complexity and approximate degree of $\mathrm{LC}^0$.

#### Generalizing our composition theorems.

Although considering the inner function is sufficient for our applications to , an important open question is to generalize our results to larger classes of inner functions. The proof of our composition theorem for approximate degree actually applies to any inner function that can be exactly represented as a low-weight sum of s (for example, it applies to any strongly unbalanced function , meaning that ). Extending this further would be a major step forward in our understanding of how quantum query complexity and approximate degree behave under composition with shared inputs.

While our paper considers the composition scenario where the top function is arbitrary and the bottom function is , the opposite scenario is also interesting. Here the top function is and the bottom functions are , each acting on the same set of input variables. Now the question is whether we can do better than the upper bound obtained using results on block composition that treat all the input variables as being independent. More concretely, for such a function , the upper bound that follows from block composition is . However, this upper bound cannot be improved in general, because the Surjectivity function is an example of such a function. Here the bottom functions check if the input contains a particular range element , and the upper bound obtained from this argument is , which matches the lower bound [BM12, She15]. Surprisingly, this lower bound only holds for quantum query complexity, as we know that the approximate degree of Surjectivity is . We do not know if the upper bound obtained from block composition can be improved for approximate degree.

#### Quantum query complexity of LC0 and DNFs.

For quantum query complexity, we obtain the upper bound $Q(\mathrm{LC}^0_d) = \tilde{O}(n^{1-2^{-d}})$, nearly matching the lower bound from [CKK12]. However, the bounds do not match for any fixed value of $d$. The lack of matching lower bounds can be attributed to the fact that the Surjectivity function, which is known to have linear quantum query complexity, is computed by a quadratic-size depth-3 circuit, rather than a quadratic-size depth-2 circuit (i.e., a DNF). If one could prove a linear lower bound on the quantum query complexity of some quadratic-size DNF, the argument of [CKK12] would translate this into a lower bound for $\mathrm{LC}^0_d$, matching our upper bound. Unfortunately, no linear lower bound on the quantum query complexity of any polynomial-size DNF is known; we highlight this as an important open problem (the same problem was previously posed by Troy Lee with different motivations [Lee12]).

###### Open Problem 1.

Is there a polynomial-size DNF with $\Omega(n)$ quantum query complexity?

The quantum query complexity of depth-2 $\mathrm{LC}^0$, or linear-size DNFs, also remains open. The best upper bound is $\tilde{O}(n^{3/4})$, but the best lower bound is $\Omega(n^{0.555})$ [CKK12]. Any improvement in the lower bound would also imply, in a black-box way, an improved lower bound for the Boolean matrix product verification problem. Improving the lower bound all the way to $\Omega(n^{3/4})$ would imply optimal lower bounds for all of $\mathrm{LC}^0$ using the argument in [CKK12]. We conjecture that there is a linear-size DNF with quantum query complexity $\tilde{\Omega}(n^{3/4})$, matching the known upper bound.

#### Approximate degree of LC0 and DNFs.

For approximate degree, we obtain the upper bound , and prove a new lower bound of . The reason our approximate degree lower bound approaches more slowly than the quantum query lower bound from [CKK12] is that, while the quantum query complexity of is known to be , such a result is not known for approximate degree. This remains an important open problem.

###### Open Problem 2.

Is there a problem in $\mathrm{AC}^0$ with approximate degree $\Omega(n)$?

Our lower bound argument would translate, in a black-box manner, any linear lower bound on the approximate degree of a general $\mathrm{AC}^0$ circuit into a nearly tight lower bound for $\mathrm{LC}^0$.

Alternatively, it would be very interesting if one could improve our approximate degree upper bound for . Even seemingly small improvements to our upper bound would have major implications. Specifically, standard techniques (see, e.g., [Kop13]) imply that there are approximate majority functions computable by depth- circuits of size , where the constant hidden in the Big-Oh notation is independent of . This means that, for sufficiently large constant , if one could improve our upper bound on the approximate degree of from to , one would obtain a sublinear upper bound on the approximate degree of some total function computing an approximate majority. This would answer a question of Srinivasan [FHH14], and would be considered a very surprising result, as approximate majorities are currently the only natural candidate functions that may exhibit linear approximate degree [BKT18].

### 1.6 Paper organization and notation

This paper is organized so as to be accessible to readers without familiarity with quantum algorithms. Section 2 assumes the reader is somewhat familiar with quantum query complexity and Grover’s algorithm [Gro96], but only uses Grover’s algorithm as a black box. In Section 2 we show our main result on the quantum query complexity of shared-input compositions (Theorem 1). Section 3 proves our result about the approximate degree of shared-input compositions (Theorem 2). Section 4 uses the results of these sections (in a black-box manner) to upper bound the quantum query complexity and approximate degree of $\mathrm{LC}^0$ circuits, and proves related lower bounds. Section 5 uses the results of Section 4 to obtain algorithms to agnostically PAC learn $\mathrm{LC}^0$ circuits.

In this paper we use the and notation to suppress logarithmic factors. More formally, means there exists a constant such that , and similarly means there exists a constant such that . For a string , we use to denote the Hamming weight of , i.e., the number of entries in equal to . For any positive integer , we use to denote the set . Given two functions , let denote their block composition, i.e., , where for every , is a -bit string.

## 2 Quantum algorithm for composed functions

### 2.1 Preliminaries

As described in the introduction, our quantum algorithm only uses variants of Grover’s algorithm [Gro96] and is otherwise classical. To make this section accessible to those without familiarity with quantum query complexity, we only state the minimum required preliminaries to understand the algorithm. Furthermore, we do not optimize the logarithmic factors in our upper bound to simplify the presentation. For a more comprehensive introduction to quantum query complexity, we refer the reader to the survey by Buhrman and de Wolf [BdW02].

In quantum or classical query complexity, the goal is to compute some known function $f: \{0,1\}^n \to \{0,1\}$ on some unknown input $x \in \{0,1\}^n$ while reading as few bits of $x$ as possible. Reading a bit of $x$ is also referred to as “querying” a bit of $x$, and hence the goal is to minimize the number of queries made to the input.

For example, the deterministic query complexity of a function $f$ is the minimum number of queries needed by a deterministic algorithm in the worst case. A deterministic algorithm must be correct on all inputs, and can decide which bit to query next based on the input bits it has seen so far. Another example of a query model is the bounded-error randomized query model. The bounded-error randomized query complexity of a function $f$, denoted $R(f)$, is the minimum number of queries made by a randomized algorithm that computes the function correctly with probability greater than or equal to $2/3$ on each input. In contrast to a deterministic algorithm, such an algorithm has access to a source of randomness, which it may use in deciding which bits to query.

The bounded-error quantum query complexity of $f$, denoted $Q(f)$, is similar to bounded-error randomized query complexity, except that the algorithm is now quantum. In particular, this means the algorithm may query the inputs in superposition. Since quantum algorithms can also generate randomness, for all functions $f$ we have $Q(f) \le R(f)$.

An important example of the difference between the two models is provided by the $\mathrm{OR}_n$ function, which asks if any of the $n$ input bits is equal to 1. We have $R(\mathrm{OR}_n) = \Theta(n)$, because intuitively if the algorithm only sees a small fraction of the input bits and they are all $0$, we do not know whether or not the rest of the input contains a $1$. However, Grover’s algorithm is a quantum algorithm that solves this problem with only $O(\sqrt{n})$ queries [Gro96]. The algorithm is also known to be tight, and we have $Q(\mathrm{OR}_n) = \Theta(\sqrt{n})$ [BBBV97].

There are several variants of Grover’s algorithm that solve related problems and are sometimes more useful than the basic version of the algorithm. Most of these can be derived from the basic version of Grover’s algorithm (and this sometimes adds logarithmic overhead).

In this work we need a variant of Grover’s algorithm that finds a $1$ in the input faster when there are many $1$s. Let the Hamming weight of the input be $t$. If we know $t$, then we can use Grover’s algorithm on a randomly selected subset of the input of size $O(n/t)$, and one of the $1$s will be in this set with high probability. Hence the algorithm will have query complexity $O(\sqrt{n/t})$. With some careful bookkeeping, this can be done even when $t$ is unknown, and the algorithm will have expected query complexity $O(\sqrt{n/t})$. More formally, we have the following result of Boyer, Brassard, Høyer, and Tapp [BBHT98].
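The subsampling step above rests on a purely classical fact: a uniformly random subset of size proportional to $n/t$ is likely to contain at least one of the $t$ ones. The sketch below computes that probability exactly. It is a classical calculation only, not a simulation of Grover’s algorithm, and `hit_probability` is a hypothetical helper name.

```python
from math import comb


def hit_probability(n, t, s):
    """Probability that a uniformly random s-subset of n positions
    contains at least one of t marked positions."""
    # comb(n - t, s) counts the s-subsets that avoid every marked position.
    return 1 - comb(n - t, s) / comb(n, s)


# With n = 100 positions and t = 10 ones, compare subsets of
# size n/t and 3n/t.
print(hit_probability(100, 10, 10))
print(hit_probability(100, 10, 30))
```

Increasing the constant in front of $n/t$ pushes the hit probability toward 1, while the size of the search space passed to Grover’s algorithm, and hence the $O(\sqrt{n/t})$ query count, changes only by a constant factor.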

###### Lemma 8.

Given query access to a string $x \in \{0,1\}^n$, there is a quantum algorithm that, when $|x| > 0$, always outputs an index $i$ such that $x_i = 1$ and makes $O(\sqrt{n/|x|})$ queries in expectation. When $|x| = 0$, the algorithm does not terminate.

Note that because we do not know $|x|$, we only have a guarantee on the expected query complexity of the algorithm, not the worst-case query complexity. Note also that this variant of Grover’s algorithm is a zero-error algorithm in the sense that it always outputs a correct index $i$ with $x_i = 1$ when such an index exists.

In our algorithm we use an amplified version of the algorithm of Lemma 8, which adds a log factor to the running time and always terminates in time.

###### Lemma 9.

Given query access to a string , there is a quantum algorithm that

1. when , the algorithm always outputs “”,

2. when , it outputs an index with with probability , and

3. terminates after queries with probability .

###### Proof.

This algorithm is quite straightforward. We simply run instances of the algorithm of Lemma 8 in parallel and halt if any one of them halts. If we reach our budget of queries, then we halt and output “”.

Let us argue that the algorithm has the claimed properties. First, since the algorithm of Lemma 8 does not terminate when , our algorithm will correctly output “” at the end for such inputs. When , we know that the algorithm of Lemma 8 will find an index with with high probability after time. The probability that copies of this algorithm do not find such an is exponentially small in , or polynomially small in . Finally, our algorithm makes only queries when by construction. When , we know that the algorithm of Lemma 8 terminates after an expected queries, and hence halts with high probability after queries by Markov’s inequality. The probability that none of copies of the algorithm halt after making queries each is inverse polynomially small in again. ∎

### 2.2 Quantum algorithm

We are now ready to present our main result for quantum query complexity, which we restate below.

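###### Theorem 1 (restated).

Let $h: \{0,1\}^n \to \{0,1\}$ be computed by a depth-2 circuit where the top gate is a function $f: \{0,1\}^m \to \{0,1\}$ and the bottom level gates are $\mathrm{AND}$ gates on a subset of the input bits and their negations (as depicted in Figure 1). Then we have $Q(h) = O\left(\sqrt{Q(f) \cdot n} \cdot \log^2(mn)\right)$.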

While Theorem 1 allows the bottom gates to depend on negated variables, it will be without loss of generality in the proof to assume that all input variables are unnegated. This is because we can instead work with the function obtained by treating the positive and negative versions of a variable separately, increasing our final quantum query upper bound by a constant factor.

We now define some notation that will aid with the description and analysis of the algorithm. We know that our circuit has gates and input bits . We say an gate has high fan-in if the number of inputs to that gate is greater than or equal to . Note that if our circuit has no high fan-in gates, then we are done, because we can simply use the upper bound for block composition, i.e., , to compute , since we will have .

Our goal is to reduce to this simple case. More precisely, we will start with the given circuit , make some queries to the input, and then simplify the given circuit to obtain a new circuit . The new circuit will have no high fan-in gates, but will still have on the given input . Note that and have the same output only for the given input , and not necessarily for all inputs.

For any such circuit , let be the set of all high fan-in gates, and let be the total fan-in of , which is the sum of fan-ins of all gates in . In other words, it is the total number of wires incident to the set . Since the set only has gates with fan-in at least , we have

$$w(S) \geq \frac{n\,|S|}{Q(f)}. \tag{6}$$

We now present our first algorithm, which is a subroutine in our final algorithm. This algorithm’s goal is to take a circuit , with high fan-in gates and wires incident on , and reduce the size of by a factor of . Ultimately we want to have , and hence if we can decrease the size of by , we can repeat this procedure logarithmically many times to get .

###### Lemma 10.

Let be a depth-2 circuit where the top gate is a function and the bottom-level gates are AND gates on a subset of the input bits and their negations (as depicted in Figure 1). Let be the total fan-in of all high fan-in gates in (i.e., gates with fan-in $\geq n/Q(f)$).

Then there is a quantum query algorithm that makes queries to and outputs a new circuit of the same form such that , where is the total fan-in of all high fan-in gates in , and such that with probability we have .

###### Proof.

The overall structure of the claimed algorithm is the following: We query some well-chosen input bits, and on learning the values of these bits, we simplify the circuit accordingly. If an input bit is 0, then we delete all the gates that use that input bit. If an input bit is 1, we delete all outgoing wires from that input bit, since a 1-input does not affect the output of an AND gate.
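The two simplification rules above can be sketched in a few lines of Python. This is only an illustration under assumed conventions: the names `simplify` and `evaluate` are hypothetical, each AND gate is stored as a frozenset of input indices, and a deleted gate is kept as `None` (meaning "outputs 0") so that the top function still receives the same number of arguments.

```python
from itertools import product

def simplify(gates, i, bit):
    """Return the gate list after learning that input i has value `bit`.

    gates: list whose entries are a frozenset of input indices (an AND gate)
           or None (a gate already known to output 0).
    """
    out = []
    for g in gates:
        if g is None or i not in g:
            out.append(g)          # gate does not read input i: unchanged
        elif bit == 0:
            out.append(None)       # an AND gate with a 0 input outputs 0
        else:
            out.append(g - {i})    # a 1 input does not affect an AND gate
    return out

def evaluate(f, gates, x):
    """Evaluate the top function f on the outputs of the AND gates."""
    z = [0 if g is None else int(all(x[j] for j in g)) for g in gates]
    return f(z)

# Example with an assumed top function (parity) and overlapping gates.
parity = lambda z: sum(z) % 2
example_gates = [frozenset({0, 1}), frozenset({1, 2}), frozenset({0, 2, 3})]
for x in product((0, 1), repeat=4):
    simplified = simplify(example_gates, 0, x[0])
    assert evaluate(parity, simplified, x) == evaluate(parity, example_gates, x)
```

As in the text, the simplified circuit agrees with the original only on inputs consistent with the bits that were queried.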

Since the circuit will change during the algorithm, let us define to be the initial set of high fan-in gates in (i.e., gates with fan-in $\geq n/Q(f)$).

We also define the degree of an input , denoted , to be the number of high fan-in gates that it is an input to. Note that this is not the total number of outgoing wires from , but only those that go to high fan-in gates, i.e., gates in the set . With this definition, note that , for any circuit. We say an input bit is high degree if . This value is chosen since it is at least half the average degree of all in the initial circuit . As the algorithm progresses, the circuit will change, and some inputs that were initially high degree may become low degree, but a low-degree input will never become high degree. Note, however, that the definition of a high-degree input bit does not change, since it only depends on and , which are fixed for the duration of the algorithm.

Finally, we call an input bit $x_i$ marked if $x_i = 0$. We are now ready to describe our algorithm by the following pseudocode (see Algorithm 1).

In more detail, we repeatedly use the version of Grover’s algorithm in Lemma 9 to find a high-degree marked input, which is an input such that and . If we find such an input, we delete all the gates that use as an input, and repeat this procedure. Note that when we repeat this procedure, the circuit has changed, and hence the set of high-degree input bits may become smaller. The algorithm halts when Grover’s algorithm is unable to find any high-degree marked inputs. At this point, all the high-degree inputs are necessarily unmarked with very high probability, which means they are set to 1. We can now delete all these input bits and their outgoing wires because AND gates are unaffected by input bits set to 1.
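The loop described above can be mimicked classically to see what it does to the circuit. The sketch below is a simulation, not the quantum algorithm: a plain scan stands in for the Grover search of Lemma 9, the function name `reduce_high_degree` is hypothetical, and the high-degree cutoff is passed in as the parameter `threshold` because its exact value is not fixed here. For simplicity the given gates are treated as the high fan-in set throughout.

```python
def reduce_high_degree(gates, x, threshold):
    """Classically mimic the reduction loop on the input x.

    gates: dict gate_id -> set of input indices (the high fan-in AND gates).
    threshold: assumed cutoff; input i is "high degree" if it feeds at least
               this many of the remaining gates.
    Returns (remaining gates with some wires removed, ids of gates fixed to 0).
    """
    gates = {g: set(s) for g, s in gates.items()}
    killed = set()
    n = len(x)

    def degree(i):
        return sum(1 for s in gates.values() if i in s)

    while True:
        # Stand-in for Grover search: a high-degree input that equals 0.
        marked = [i for i in range(n) if x[i] == 0 and degree(i) >= threshold]
        if not marked:
            break
        i = marked[0]
        for g in [g for g, s in gates.items() if i in s]:
            killed.add(g)          # this gate has a 0 input, so it outputs 0
            del gates[g]

    # Every remaining high-degree input equals 1: drop its outgoing wires.
    for i in range(n):
        if degree(i) >= threshold:
            for s in gates.values():
                s.discard(i)
    return gates, killed
```

After the call, no input feeds `threshold` or more of the surviving gates, and each surviving gate computes the same value on `x` as before, since only wires from inputs equal to 1 were removed.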

Let us now argue that this algorithm is correct. Let denote the set of high fan-in gates in the new circuit obtained at the end of the algorithm, and be the total fan-in of gates in . Note that when the algorithm terminates, there are no high-degree inputs (marked or unmarked). Hence every input bit that has not been deleted has . Since there are at most input bits, we have

$$w(S') = \sum_{i \in [n]} \deg(i)$$

But we also know that we started with , since each gate in has fan-in at least . Hence , which proves that the algorithm is correct.

We now analyze the query complexity of this algorithm. Let the loop in the algorithm execute times. It is easy to see that because each time a high-degree marked input is found, we delete all the gates that use it as an input, which is at least gates. Since there were at most gates to begin with, this procedure can only repeat times.

When we run Grover’s algorithm to search for a high-degree marked input bit in the first iteration of the loop, suppose there are high-degree marked inputs. Then the variant of Grover’s algorithm in Lemma 9 finds a marked high-degree input and makes queries with probability . In the second iteration of the loop, the number of high-degree marked inputs, , has decreased by at least one. It can also decrease by more than 1 since we deleted several gates, and some high-degree inputs can become low-degree. In this iteration, our variant of Grover’s algorithm (Lemma 9) makes queries, and we know that . This process repeats and we have . Since there was at least one high-degree marked input in the last iteration, . Combining these facts we have for all , . Thus the total expected query complexity is

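$$O\left(\sum_{j=1}^{r} \sqrt{\frac{n}{k_j}}\,\log n\right) = O\left(\sum_{j=1}^{r} \sqrt{\frac{n}{r-j+1}}\,\log n\right) = O\left(\sqrt{n}\,\sum_{j=1}^{r} \frac{1}{\sqrt{j}}\,\log n\right) = O\left(\sqrt{nr}\,\log n\right), \tag{8}$$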

which is $O(\sqrt{n\,Q(f)}\,\log n)$, since $r = O(Q(f))$. We now have a quantum query algorithm that satisfies the conditions of the lemma with probability at least . ∎
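The last step of (8) relies on the standard estimate $\sum_{j=1}^{r} 1/\sqrt{j} \le 2\sqrt{r}$. A quick numerical sanity check, with hypothetical helper names and arbitrarily chosen values of $n$ and $r$, might look as follows.

```python
import math

def sum_inverse_sqrt(r):
    """Return sum_{j=1}^{r} 1/sqrt(j)."""
    return sum(1.0 / math.sqrt(j) for j in range(1, r + 1))

def worst_case_search_cost(n, r):
    """Sum of sqrt(n / k_j) when each k_j takes its smallest value r - j + 1."""
    return sum(math.sqrt(n / (r - j + 1)) for j in range(1, r + 1))

# The partial sums stay below 2*sqrt(r), so the total is O(sqrt(n*r)).
for r in (1, 10, 100, 10_000):
    assert sum_inverse_sqrt(r) <= 2 * math.sqrt(r)
    assert worst_case_search_cost(4096, r) <= 2 * math.sqrt(4096 * r)
```

The check ignores the $\log n$ factor and the constants hidden in the $O(\cdot)$ notation; it only illustrates why the sum collapses to $\sqrt{nr}$.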

We are now ready to prove Theorem 1.

###### Proof of Theorem 1.

We start by applying the algorithm in Lemma 10 to our circuit as many times as needed to ensure that the set is empty. Since each run of the algorithm reduces by a factor of 2, and can start off being as large as , where is the number of gates and is the number of inputs, we need to run the algorithm times. Since the algorithm of Lemma 10 is correct with probability , we do not need to boost the success probability of the algorithm. The total number of queries needed to ensure is empty is .

Now we are left with a circuit with no high fan-in gates. That is, all gates have fan-in at most . We now evaluate using the standard composition theorem for disjoint sets of inputs, which has query complexity

$$O\left(Q(f) \cdot Q(\mathrm{AND}_{n/Q(f)})\right) = O\left(Q(f) \cdot \sqrt{n/Q(f)}\right) = O\left(\sqrt{Q(f) \cdot n}\right). \tag{9}$$

The total query complexity is . ∎

Note that we have not attempted to reduce the logarithmic factors in this upper bound. We believe it is possible to make the quantum upper bound match the upper bound for approximate degree with a more careful analysis and slightly different choice of parameters in the algorithm.

## 3 Approximating polynomials for composed functions

### 3.1 Preliminaries

We now define the various measures of Boolean functions and polynomials that we require in this section. Since we only care about polynomials approximating Boolean functions, we focus without loss of generality on multilinear polynomials, as any polynomial over the domain $\{0,1\}^n$ can be converted into a multilinear polynomial (since it never helps to raise a Boolean variable to a power greater than $1$).

The approximate degree of a Boolean function, commonly denoted , is the minimum degree of a polynomial that entrywise approximates the Boolean function. It is a basic complexity measure and is known to be polynomially related to a host of other complexity measures such as decision tree complexity, certificate complexity, and quantum query complexity [BdW02]. We also use another complexity measure of polynomials, which is the sum of absolute values of all the coefficients of the polynomial. This is the query analogue of the so-called -norm used in communication complexity [LS09, Definition 2.7]. We now formally define these measures.

###### Definition 11.

Let be a multilinear polynomial

$$p(x_1, \ldots, x_n) = \sum_{s \in \{0,1\}^n} \alpha_s\, x_1^{s_1} \cdots x_n^{s_n}. \tag{10}$$

We define the following complexity measures of the polynomial :

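$$\deg(p) = \max\Bigl\{\sum_{i=1}^{n} s_i : \alpha_s \neq 0\Bigr\}, \qquad \mu(p) = \sum_{s \in \{0,1\}^n} |\alpha_s|. \tag{11}$$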

For a Boolean function , we define the following complexity measures:

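$$\deg_\varepsilon(f) = \min\{\deg(p) : \forall x \in \{0,1\}^n,\ |f(x) - p(x)| \leq \varepsilon\} \tag{12}$$

$$\mu_\varepsilon(f) = \min\{\mu(p) : \forall x \in \{0,1\}^n,\ |f(x) - p(x)| \leq \varepsilon\} \tag{13}$$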

Finally, we define and .
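To make the two polynomial measures concrete, here is a small sketch that computes the degree and the coefficient sum of a multilinear polynomial. The representation (a dict mapping a frozenset of variable indices to a coefficient) and the function names are assumptions made for this illustration only.

```python
from itertools import product

def degree(p):
    """Largest monomial size among the nonzero coefficients."""
    return max((len(s) for s, a in p.items() if a != 0), default=0)

def mu(p):
    """Sum of absolute values of all coefficients."""
    return sum(abs(a) for a in p.values())

def evaluate(p, x):
    """Evaluate the multilinear polynomial p at a 0/1 point x."""
    return sum(a * all(x[i] for i in s) for s, a in p.items())

def max_error(f, p, n):
    """Largest pointwise gap between f and p over the Boolean cube."""
    return max(abs(f(x) - evaluate(p, x)) for x in product((0, 1), repeat=n))

# Exact polynomial for OR on two bits: x0 + x1 - x0*x1.
or2 = {frozenset({0}): 1, frozenset({1}): 1, frozenset({0, 1}): -1}
assert degree(or2) == 2
assert mu(or2) == 3
assert max_error(lambda x: int(x[0] or x[1]), or2, 2) == 0
```

A polynomial witnesses an upper bound on the $\varepsilon$-approximate measures of $f$ whenever `max_error(f, p, n)` is at most $\varepsilon$; finding the minimum over all such polynomials is a separate optimization problem that this sketch does not attempt.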

We use the following standard relationship between the two measures in our results.

###### Lemma 12.

For any multilinear polynomial such that for all , we have

$$\log \mu(p) = O(\deg(p) \log n). \tag{14}$$

Consequently, for any Boolean function and , we have

$$\log \mu_\varepsilon(f) = O(\deg_\varepsilon(f) \log n). \tag{15}$$

###### Proof.

First let us switch to the representation instead of the representation we have used so far. Let , and replace every occurrence of in the polynomial with to obtain a multilinear polynomial . In this representation, a coefficient is simply the expectation over the hypercube of the product of and a parity function, and hence is at most in magnitude. Since there are only monomials, the sum of absolute values of all coefficients is .

When we switch from this representation back to the representation, we replace every with . Consider this transformation on a single monomial with coefficient . This converts the monomial of degree into a polynomial over those variables, such that the sum of coefficients in this polynomial is at most . Thus the sum of absolute values of all coefficients is , which proves (14).

Now consider any Boolean function , and a multilinear polynomial that minimizes . We can apply (14) to this polynomial to obtain . Since by assumption, and , since minimizes over all -approximating polynomials, we get . ∎
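The change of representation in this proof can be checked on a single monomial. The sketch below assumes the usual substitution $y_i = 1 - 2x_i$ between $\pm 1$ and $0/1$ variables (the substitution used in the text is not legible here, so this choice is an assumption) and expands a degree-$d$ monomial $\prod_{i \in T} y_i$ into a multilinear polynomial in $x$. Under that assumption the absolute values of the coefficients sum to exactly $3^d$.

```python
from itertools import combinations, product

def expand_parity_monomial(T):
    """Coefficients of prod_{i in T} (1 - 2*x_i) as a polynomial in x.

    Returns a dict mapping each subset U of T to its coefficient (-2)**|U|.
    """
    T = sorted(T)
    return {frozenset(U): (-2) ** k
            for k in range(len(T) + 1)
            for U in combinations(T, k)}

def evaluate(p, x):
    """Evaluate a multilinear polynomial (dict of subsets) at a 0/1 point."""
    return sum(a * all(x[i] for i in s) for s, a in p.items())

for d in range(6):
    coeffs = expand_parity_monomial(range(d))
    # Sum over k of C(d, k) * 2**k equals 3**d.
    assert sum(abs(a) for a in coeffs.values()) == 3 ** d
    # The expansion agrees with the product on every 0/1 point.
    for x in product((0, 1), repeat=d):
        direct = 1
        for xi in x:
            direct *= 1 - 2 * xi
        assert evaluate(coeffs, x) == direct
```

Since each monomial of degree at most $\deg(p)$ blows up by a factor that is only exponential in $\deg(p)$, the logarithm of the coefficient sum grows by $O(\deg(p))$, which is consistent with the bound (14).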

This shows that is at most (up to log factors). However, may be much smaller than , as evidenced by the polynomial . Similarly, may be much smaller than , as evidenced by the function on bits, which has [NS94], but .

### 3.2 Polynomial upper bound

In this section we prove Theorem 2, which follows from the following more general composition theorem.

###### Theorem 13.

Let be computed by a depth-2 circuit where the top gate is a function and the bottom-level gates are AND gates on a subset of the input bits and their negations (as depicted in Figure 1). Then

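$$\deg_\varepsilon(h) = O\left(\sqrt{n \log \mu_\varepsilon(f)} + \sqrt{n \log(1/\varepsilon)}\right) = O\left(\sqrt{n \deg_\varepsilon(f) \log m} + \sqrt{n \log(1/\varepsilon)}\right). \tag{16}$$

###### Proof.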

Let us first fix some notation. We will use to refer to the input of the full circuit . Let the inputs to the top gate be called .

Let be a polynomial that minimizes . Thus we have for all , . More explicitly, , where , and each is the AND of some subset of bits in . Since the product of ANDs of variables is just an AND of all the variables involved in the product, for each , there is a subset such that .

Using this we can replace all the variables in the polynomial , to obtain

$$q(x) = \sum_{s \in \{0,1\}^m} \alpha_s \bigwedge_{i \in T_s} x_i. \tag{17}$$

Since was an approximation to , is an approximation to . Now we can replace every occurrence of with a low-error approximating polynomial for the AND of the bits in . We know that the approximate degree of the AND function to error is [BCdWZ99]. If we approximate each to error , then by the triangle inequality the total error incurred by this approximation is at most . Choosing , each is approximated by a polynomial of degree . Hence the resulting polynomial has this degree and approximates the function to error . By standard error reduction techniques [BNRdW07], we can make this error smaller than at a constant factor increase in the degree. This establishes the first equality in (16), and the second equality follows from Lemma 12. ∎
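The substitution step that produces (17) is mechanical and can be sketched directly. In the hypothetical helper below, a polynomial over the gate outputs is a dict from a frozenset of gate indices to a coefficient, and `gate_inputs[j]` is the set of input bits read by gate `j`; each monomial is mapped to the AND of the union of the corresponding input sets. All names and the data layout are assumptions made for illustration.

```python
from itertools import product

def compose_with_ands(p, gate_inputs):
    """Rewrite a polynomial over gate outputs as a sum of ANDs over inputs.

    A product of ANDs is the AND of the union of their input sets, so each
    monomial s of p becomes a single AND over T_s.
    """
    q = {}
    for s, alpha in p.items():
        T = frozenset().union(*(gate_inputs[j] for j in s))
        q[T] = q.get(T, 0) + alpha   # monomials with the same T_s merge
    return q

def evaluate(p, x):
    """Evaluate a sum of weighted ANDs (dict of subsets) at a 0/1 point."""
    return sum(a * all(x[i] for i in s) for s, a in p.items())

# Top polynomial z0 + z1 - z0*z1 (exact OR) over two overlapping AND gates.
p = {frozenset({0}): 1, frozenset({1}): 1, frozenset({0, 1}): -1}
gate_inputs = [frozenset({0, 1}), frozenset({1, 2})]
q = compose_with_ands(p, gate_inputs)
for x in product((0, 1), repeat=3):
    z = [int(all(x[i] for i in g)) for g in gate_inputs]
    assert evaluate(q, x) == evaluate(p, z)
```

Merging monomials can only shrink the sum of absolute coefficients, so the coefficient sum of `q` is at most that of `p`. This is what keeps the accumulated error under control when each AND is later replaced by an approximating polynomial.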

## 4 Applications to linear-size AC0 circuits

### 4.1 Preliminaries

A Boolean circuit is defined via a directed acyclic graph. Vertices of fan-in 0 represent input bits, vertices of fan-out 0 represent outputs, and all other vertices represent one of the following logical operations: a NOT operation (of fan-in 1), or an unbounded fan-in AND or OR operation. The size of the circuit is the total number of AND and OR gates. The depth of the circuit is the length of the longest path from an input bit to an output bit.

For any constant integer , refers to the class of all such circuits of polynomial size and depth .