Agreement tests are a type of PCP test and capture a fundamental local-to-global phenomenon. In this paper, we study an agreement testing question that is a new extension of direct product testing to higher dimensions.
It is a basic fact of computation that any global computation can be broken down into a sequence of local steps. The PCP theorem [AS98, ALM98] says that moreover, this can be done in a robust fashion, so that as long as most steps are correct, the entire computation checks out. At the heart of this is a local-to-global argument that allows deducing a global property from local pieces that fit together only approximately.
A key example is the low degree test in the proof of the PCP theorem. In the PCP construction, a function on a large vector space is replaced by an ensemble of (supposed) restrictions to all possible affine lines. These restrictions are supplied by a prover and are not a priori guaranteed to agree with any single global function. This is taken care of by the “low degree test”, which checks that restrictions on intersecting lines agree with each other, i.e. they give the same value to the point of intersection. The crux of the argument is the fact that the local agreement checks imply agreement with a single global function. Thus, the low degree test captures a local-to-global phenomenon.
In what other scenarios does such a local-to-global theorem hold? This question was first asked by Goldreich and Safra [GS00] who studied a combinatorial analog of the low degree test. Let us describe the basic framework of agreement testing in which we will study this question. In agreement testing, a global function is given by an ensemble of local functions. There are two key aspects of agreement testing scenarios:
Combinatorial structure: for a given ground set of size , the combinatorial structure is a collection of subsets such that for each we get a local function. For example, if is the points of a vector space then can be the collection of affine lines.
Allowed functions: for each subset , we can specify a space of functions on that are allowed. The input to the agreement test is an ensemble of functions such that for every , . For example, in the line vs. line low degree test we only allow local functions on each line that have low degree.
Given the ensemble , the intention is that is the restriction to of a global function . Indeed, a local ensemble is called global if there is a global function such that
An agreement check for a pair of subsets checks whether their local functions agree, denoted . Formally,
A local ensemble which is global passes all agreement checks. The converse is also true: a local ensemble that passes all agreement checks must be global.
An agreement test is specified by giving a distribution over pairs (or triples, etc.) of subsets. We define the agreement of a local ensemble to be the probability of agreement:
An agreement theorem shows that if is a local ensemble with then it is close to being global.
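The framework above can be made concrete with a small sketch. The representation is an assumption made for illustration only: a local function is a dictionary from elements to values, an ensemble is a dictionary from subsets to local functions, and the test distribution is an explicit list of weighted pairs. All function names are hypothetical.

```python
from itertools import combinations

def restrict(f, subset):
    """Restrict a local function (dict: element -> value) to `subset`."""
    return {x: f[x] for x in subset}

def agree(ensemble, s1, s2):
    """Agreement check: the two local functions coincide on the intersection."""
    common = s1 & s2
    return restrict(ensemble[s1], common) == restrict(ensemble[s2], common)

def agreement(ensemble, weighted_pairs):
    """Probability of agreement under a distribution given as (s1, s2, weight)."""
    total = sum(w for _, _, w in weighted_pairs)
    passed = sum(w for s1, s2, w in weighted_pairs if agree(ensemble, s1, s2))
    return passed / total

def induced_ensemble(global_f, subsets):
    """The global ensemble obtained by restricting one global function."""
    return {s: restrict(global_f, s) for s in subsets}

# Toy instance (assumed parameters): ground set {0,...,4}, all 3-element
# subsets, uniform distribution over ordered pairs meeting in one element.
n, k, t = 5, 3, 1
subsets = [frozenset(c) for c in combinations(range(n), k)]
pairs = [(a, b, 1.0) for a in subsets for b in subsets if len(a & b) == t]

global_f = {x: x % 2 for x in range(n)}
ens = induced_ensemble(global_f, subsets)

# Corrupt a single local function by flipping all of its values.
ens_bad = dict(ens)
s0 = subsets[0]
ens_bad[s0] = {x: 1 - v for x, v in ens[s0].items()}

print(agreement(ens, pairs))      # a global ensemble passes every check
print(agreement(ens_bad, pairs))  # strictly smaller once one piece is corrupted
```

The exhaustive enumeration of pairs is only feasible for toy parameters; for realistic sizes one would estimate the agreement by sampling.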
Example: direct product tests
Perhaps the simplest agreement test to describe is the direct product test, in which contains all possible -element subsets of . For each , we let be all possible functions on , that is . The input to the test is an ensemble of local functions , and a natural testing distribution is to choose so that they intersect on elements. Suppose that . Is there a global function such that for most subsets ? This is the content of the direct product testing theorem of Dinur and Steurer [DS14]:
Theorem 1.1 (Agreement theorem, dimension 1).
There exist constants such that for all satisfying , all positive integers satisfying and and , and all finite alphabets , the following holds: Let be an ensemble of local functions satisfying , that is,
where is the uniform distribution over pairs of -sized subsets of of intersection exactly .
Then there exists a global function satisfying .
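As an illustration of the test distribution in the theorem, the following sketch samples a uniformly random pair of subsets of a prescribed size whose intersection has a prescribed size. The function name and the parameter names `n`, `k`, `t` (ground set size, subset size, intersection size) are assumptions chosen for this example.

```python
import random

def sample_pair(n, k, t, rng=random):
    """Sample (S1, S2): k-subsets of range(n) with |S1 & S2| == t.

    A uniformly random ordered selection of 2k - t distinct elements is
    split into three blocks: the common part, the rest of S1, the rest of S2.
    By symmetry every valid pair is equally likely.
    """
    if not (0 <= t <= k and 2 * k - t <= n):
        raise ValueError("need 0 <= t <= k and 2k - t <= n")
    chosen = rng.sample(range(n), 2 * k - t)
    common, only1, only2 = chosen[:t], chosen[t:k], chosen[k:]
    return frozenset(common + only1), frozenset(common + only2)

rng = random.Random(0)
s1, s2 = sample_pair(20, 5, 2, rng)
print(sorted(s1), sorted(s2), sorted(s1 & s2))
```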
The qualitatively strong aspect of this theorem is that in the conclusion, the global function agrees perfectly with of the local functions. Achieving a weaker result where perfect agreement is replaced by approximate agreement would be significantly easier but also less useful. Quantitatively, this is manifested in that the fraction of local functions that end up disagreeing with the global function is at most and is independent of and . It would be significantly easier to prove a weaker result where the closeness is (via a union bound on the event that ). This theorem is proven in [DS14] by imitating the proof of the parallel repetition theorem [Raz98]. This theorem is also used as a component in the recent work on agreement testing on high dimensional expanders [DK17].
In order to motivate our extension of Theorem 1.1, let us describe it in a slightly different form. The global function can be viewed as specifying the coefficients of a linear form over variables . For each , the local function specifies the partial linear form only over the variables in . This is supposed to be equal to on the part of the domain where for all . Given an ensemble whose elements are promised to agree with each other on average, the agreement theorem allows us to conclude the existence of a global linear function that agrees with most of the local pieces.
This description naturally leads to the question of extending this to higher degree polynomials. Now, the global function is a degree polynomial with coefficients in , namely , where we sum over subsets , . The local functions will be polynomials of degree , supposedly obtained by zeroing out all variables outside . Two local functions are said to agree, denoted , if every monomial that is induced by has the same coefficient in both polynomials. Our new agreement theorem says that in this setting as well, local agreement implies global agreement.
Theorem 1.2 (Main).
For every positive integer and alphabet , there exists a constant such that for all satisfying and all positive integers satisfying and and , the following holds: Let be an ensemble of local functions satisfying , that is,
where is the uniform distribution over pairs of -sized subsets of of intersection exactly .
Then there exists a global function satisfying .
Here, refers to the restriction .
Furthermore, we may assume that the global function is the one given by “popular vote”, namely for each set to be the most frequently occurring value among (breaking ties arbitrarily).
For , this theorem is precisely Theorem 1.1 (except for the “furthermore” clause). The additional “furthermore” clause strengthens our theorem by naming the popular vote function as a candidate global function that explains most of the local functions. This addendum also strengthens Theorem 1.1 and turns out to be important for an application [DFH17] of our theorem, which we describe later in the introduction.
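The popular vote function from the “furthermore” clause can be sketched as follows. The representation is an assumption made for illustration: each local function is a dictionary keyed by the subsets of its domain of size at most `d`, and ties are broken by whichever value the counter lists first. The names `popular_vote` and `sets_up_to` are hypothetical.

```python
from collections import Counter
from itertools import combinations

def sets_up_to(s, d):
    """All subsets of `s` of size at most d, as frozensets."""
    return [frozenset(a) for size in range(d + 1)
            for a in combinations(sorted(s), size)]

def popular_vote(ensemble, d):
    """For each set A of size <= d, the most frequent value of f_S(A)
    among the sets S in the ensemble that contain A."""
    votes = {}
    for s, f_s in ensemble.items():
        for a in sets_up_to(s, d):
            votes.setdefault(a, Counter())[f_s[a]] += 1
    return {a: c.most_common(1)[0][0] for a, c in votes.items()}

# Toy instance (assumed parameters): dimension 1, ground set {0,...,4},
# all 3-element subsets, and one corrupted local value.
n, k, d = 5, 3, 1
global_f = {a: sum(a) % 2 for a in sets_up_to(range(n), d)}
ensemble = {frozenset(s): {a: global_f[a] for a in sets_up_to(s, d)}
            for s in combinations(range(n), k)}
ensemble[frozenset({0, 1, 2})][frozenset({0})] = 1  # a single wrong vote

decoded = popular_vote(ensemble, d)
print(decoded == global_f)  # the lone wrong vote is outvoted
```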
Let us spell out how this theorem fits into the framework described above. The ground set is , and the collection of subsets is the collection of all induced hypergraphs on elements. In particular, if we focus on , we can view the local function of a subset , , as specifying a hypergraph on the vertices of with hyperedges of size up to . The theorem says that if these small hypergraphs agree with each other most of the time, then there is a global hypergraph that they nearly all agree with.
For the special case of and , we get an interesting statement about combining small pieces of a graph into a global one.
Corollary 1.3 (Agreement test for graphs).
There exists a constant such that for all satisfying and for all positive integers satisfying , and the following holds:
Let be an ensemble of graphs, where is a element subset of and is a graph on vertex set . Suppose that
Then there exists a single global graph satisfying .
Here too we emphasize that the strength of the statement is in that the conclusion talks about exact agreement between the global graph and the local graphs, i.e. and not , for a fraction of of the sets . It is also important that there is no dependence in the on either or . A similar agreement testing statement can be made for hypergraphs of any uniformity .
A technical component in our proof which we wish to highlight is a new hypergraph pruning lemma, which may be of independent interest. The lemma can be interpreted by viewing a hypergraph as specifying the minterms of a monotone DNF (of width at most ). The lemma allows us to prune the DNF so that the new sub-DNF still has similar density (the fraction of inputs on which it is ), but also has a structural property which we call bounded branching factor and which implies that for typical inputs, only a single minterm is responsible for the function evaluating to .
Lemma 1.4 (hypergraph pruning lemma).
Fix constants and . There exists (depending on ) such that for every satisfying and every -uniform hypergraph on there exists a subhypergraph obtained by removing hyperedges such that
For every , .
Here is the hypergraph induced on the vertices of .
We illustrate an application of this lemma later on.
Context and Motivation
Agreement tests were first studied implicitly in the context of PCP theorems. In fact, every PCP construction that has a composition step invariably relies on an agreement theorem. This is because in a typical PCP construction, the proof is broken into small pieces that are further encoded e.g. by composition or by a gadget. The soundness analysis decodes each gadget separately, thereby obtaining a collection of local views. Then, essentially through an agreement theorem, these are stitched together into one global NP witness. Similar to locally testable codes, agreement tests are a combinatorial question that is related to PCPs. Interestingly, this relation has recently been made formal by Dinur et al. [DKK16], where it is proved that a certain agreement test (whose correctness is hypothesized there) formally implies a certain rather strong unique games PCP theorem. Such a formal connection is not known to exist between LTCs and PCPs. For example, even if someone manages to construct the “ultimate” locally testable codes with linear length and distance, and testable with a constant number of queries, this is not known to have any implications for constructing linear size PCPs (although one may hope that such codes will be useful toward that goal).
Beyond their role in PCPs, we believe that agreement tests capture a fundamental local-to-global phenomenon, and merit study on their own. Exploring new structures that support agreement theorems seems to be an important direction.
Application for structure theorems
In a very recent work [DFH17], the authors have found a totally different application for agreement tests (in particular, for Theorem 1.2) that is outside the PCP domain. Theorem 1.2 is applied towards proving a certain structure theorem on Boolean functions in the -biased hypercube. Given a function on the -biased hypercube, the key is to look at restrictions of the global function to small sub-cubes that are identical to the uniform hypercube. On the uniform hypercube, there are previously known structure theorems which give us a local approximation of our function separately on each sub-cube. One ends up with an ensemble of simple functions (juntas, actually) that locally approximate the function, and then Theorem 1.2 is used to stitch all of the local junta approximations into one nice global function.
The interplay between the global structure of a function and how it behaves on (random) restrictions is a powerful tool that is well studied for proving circuit complexity lower bounds. Although agreement tests have not so far been useful in that arena, this seems like an interesting possibility.
Relation to Property Testing
Agreement testing is similar to property testing in that we study the relation between a global object and its local views. In property testing we have access to a single global object, and we restrict ourselves to look only at random local views of it. In agreement tests, we don’t get access to a global object but rather to an ensemble of local functions that are not a priori guaranteed to come from a single global object. Another difference is that unlike in property testing, in an agreement test the local views are pre-specified and are a part of the problem description, rather than being part of the algorithmic solution.
Still, there is an interesting interplay between Corollary 1.3, which talks about combining an ensemble of local graphs into one global graph, and graph property testing. Suppose we focus on some testable graph property, and suppose further that the test proceeds by choosing a random set of vertices and reading all of the edges in the induced subgraph, and checking that the property is satisfied there (many graph properties are testable this way, for example bipartiteness [GGR98]). Suppose we only allow ensembles where for each subset , the local graph satisfies the property (e.g. it is bipartite). This fits into our formalism by specifying the space of allowed functions to consist only of accepting local views. This is analogous to requiring, in the low degree test, that the local function on each line has low degree as a univariate polynomial. By Corollary 1.3, we know that if these local graphs agree with each other with probability , there is a global graph that agrees with of them. In particular, this graph passes the property test, so must itself be close to having the property! At this point it is absolutely crucial that the agreement theorem provides the stronger guarantee that (and not ) for of the ’s. We can thus conclude that not only is there a global graph , but actually that this global is close to having the property.
This should be compared to the low degree agreement test, where we only allow local functions with low degree, and the conclusion is that there is a global function that itself has low degree.
Our proof of Theorem 1.2 proceeds by induction on the dimension . For , this is the direct product test theorem of Dinur and Steurer [DS14], which we reprove in a way that more readily generalizes to higher dimension. Given an ensemble , it is easy to define the global function , by popular vote (“majority decoding”). The main difficulty is to prove that for a typical set , agrees with on all elements (and later on all -sets).
Our proof doesn’t proceed by defining as majority vote right away. Instead, like in many previous proofs [DG08, IKW12, DS14], we condition on a certain event (focusing say on all subsets that contain a certain set , and such that for a certain value of ), and define a “restricted global” function, for each , by taking majority just among the sets in the conditioned event. This boosts the probability of agreement inside this event. After this boost, we can afford to take a union bound and safely get agreement with the restricted global function . The proof then needs to perform another agreement step which stitches the restricted global functions into a completely global function. The resulting global function does not necessarily equal the majority vote function , and a separate argument is then carried out to show that the conclusion is correct also for .
In higher dimensions , these two steps of agreement (first to restricted global and then to global) become a longer sequence of steps, where at each step we are looking at restricted functions that are defined over larger and larger parts of the domain.
The main technical difficulty is that a single event consists of little events, namely for all , that each have some probability of failure. We thus need an even larger boost, to bound the failure probability by about so that we can afford to take a union bound on the different sub-events. How do we get this large boost? Our strategy is to proceed by induction, where at each stage, we condition on the global function from the previous stage, boosting the probability of success further.
Hypergraph pruning lemma
An important component that yields this boosting is the hypergraph pruning lemma (Lemma 1.4) that was described earlier. The lemma allows approximating a given hypergraph by a subhypergraph that has a bounded branching factor.
Definition 1.5 (branching factor).
For any , a hypergraph over a vertex set is said to have branching factor if for all subsets and integers , there are at most hyperedges in of cardinality containing .
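The definition can be checked mechanically on small hypergraphs. The sketch below rests on an assumed reading of the definition, since the symbols are not reproduced here: with branching factor `rho`, a set A may be contained in at most `rho ** j` hyperedges of cardinality |A| + j, for every A and every integer j >= 0. The function name is hypothetical.

```python
from itertools import combinations

def has_branching_factor(hyperedges, rho):
    """Check the (assumed) condition: every set A lies in at most rho**j
    hyperedges of cardinality |A| + j, for all j >= 0."""
    edges = {frozenset(e) for e in hyperedges}
    counts = {}
    # Only subsets of some hyperedge can have a nonzero count, so it is
    # enough to enumerate the subsets of each hyperedge.
    for e in edges:
        for size in range(len(e) + 1):
            for a in combinations(sorted(e), size):
                key = (frozenset(a), len(e) - size)
                counts[key] = counts.get(key, 0) + 1
    return all(c <= rho ** j for (_, j), c in counts.items())

# A star has one vertex of high degree, a matching has none.
star = [{0, i} for i in range(1, 6)]
matching = [{0, 1}, {2, 3}, {4, 5}]

print(has_branching_factor(star, 3))      # the centre lies in 5 > 3 edges
print(has_branching_factor(star, 5))
print(has_branching_factor(matching, 2))
```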
Our proof of the hypergraph pruning lemma produces a sub-hypergraph with branching factor . The branching factor is responsible for the second item in the lemma, which guarantees that usually if a set contains a hyperedge from , it contains a unique hyperedge from .
The importance of this is roughly for “inverting union bound arguments”. It essentially allows us to estimate the probability of an event of the form “contains some hyperedge of ” as the sum, over all hyperedges, of the probability that contains a specific hyperedge.
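A toy computation illustrates what “inverting the union bound” means. The sketch compares, for a uniformly random set of a fixed size, the exact probability of containing some hyperedge with the sum over hyperedges of the probability of containing that hyperedge. The parameters and the two example hypergraphs are assumptions chosen for illustration and are not taken from the proof.

```python
from fractions import Fraction
from itertools import combinations

def containment_probabilities(n, k, hyperedges):
    """For a uniform k-subset S of range(n), return the pair
    (Pr[S contains some hyperedge], sum over e of Pr[S contains e])."""
    edges = [frozenset(e) for e in hyperedges]
    total = hit_some = hit_count = 0
    for s in combinations(range(n), k):
        s = frozenset(s)
        c = sum(1 for e in edges if e <= s)
        total += 1
        hit_some += 1 if c > 0 else 0
        hit_count += c
    return Fraction(hit_some, total), Fraction(hit_count, total)

n, k = 12, 3
matching = [{2 * i, 2 * i + 1} for i in range(6)]  # low branching
star = [{0, i} for i in range(1, 7)]               # high branching at vertex 0

print(containment_probabilities(n, k, matching))
print(containment_probabilities(n, k, star))
```

For the matching, a 3-element set can contain at most one hyperedge, so the two quantities coincide; for the star, sets through the centre are counted several times and the sum overshoots the true probability.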
The proof of the lemma is subtle and proceeds by induction on the dimension . It essentially describes an algorithm for obtaining from and the proof of correctness uses the FKG inequality. We illustrate how Lemma 1.4 is used by its application to majority decoding.
The most natural choice for the global function in the conclusion of Theorem 1.2 is the majority decoding, where is the most common value of over all containing . This is the content of the “furthermore” clause in the statement of the theorem. Neither the proof strategy of [DS14] nor our generalization promises that the produced global function is the majority decoding. Our inductive strategy produces a global function which agrees with most local functions, but we cannot guarantee immediately that this global function corresponds to majority decoding. What we are able to show is that if there is a global function agreeing with most of the local functions then the function obtained via majority decoding also agrees with most of the local functions. We outline the argument below. Suppose that is an ensemble of local functions that mostly agree with each other, and suppose that they also mostly agree with some global function . Let be the function obtained by majority decoding: is the most common value of over all containing . Our goal is to show that also mostly agrees with the local functions, and we do this by showing that and mostly agree.
Suppose that . We consider two cases. If the distribution of is very skewed toward , then will happen very often. If the distribution of is very spread out, then will happen very often. Since both events and are known to be rare, we would like to conclude that happens for very few ’s.
Here we face a problem: the bad events (either or ) corresponding to different ’s are not necessarily disjoint. A priori, there might be many different ’s such that , but the bad events implied by them could all coincide.
The hypergraph pruning lemma enables us to overcome this difficulty. Let , and apply the hypergraph pruning lemma to obtain a subhypergraph . The lemma states that with constant probability, a random set sees at most one disagreement between and . This implies that the bad events considered above can be associated, with constant probability, with a unique . In this way, we are able to obtain an upper bound on the probability that disagree on an input from . The hypergraph pruning lemma then guarantees that the probability that disagree (on any input) is also bounded.
The rest of this paper is organized as follows. We begin by reproving Theorem 1.1 of Dinur and Steurer [DS14] in Section 2 in a manner that generalizes to higher dimension. We then generalize the proof of the theorem to higher dimensions (Theorem 3.1) in Section 3. This almost proves Theorem 1.2, except for the “furthermore” clause. In Section 4, we prove the hypergraph pruning lemma, a crucial ingredient in the generalization to higher dimensions. Finally, in Section 5, we use the hypergraph pruning lemma (again) to prove the “furthermore” clause of Theorem 1.2, thus completing the proof of our main result. We also show how the agreement theorem can be extended to the biased setting in Section 5.
2 One-dimensional agreement theorem
In this section, we prove the following direct product agreement testing theorem for dimension one in the uniform setting. This theorem is a special case of the more general theorem (Theorem 3.1) proved in the next section and also follows from the work of Dinur and Steurer [DS14]. However, we give the proof for the dimension one case as it serves as a warmup to the general dimension case.
Theorem 1.1 (Restated) (Agreement theorem, dimension 1). There exist constants such that for all satisfying , all positive integers satisfying and and , and all finite alphabets , the following holds: Let be an ensemble of local functions satisfying , that is,
where is the uniform distribution over pairs of -sized subsets of of intersection exactly .
Then there exists a global function satisfying .
The distribution is the distribution induced on the pair of sets by first choosing uniformly at random a set of size and then two sets and of size of uniformly at random conditioned on . We can think of picking these two sets as first choosing uniformly at random a set of size , then a random element , setting and then choosing two sets and such that . Clearly, the probability that the functions and disagree is the sum of the probabilities of the following two events: (A) , (B) but . This motivates the following definitions for any and .
It is easy to see that for a typical , both and are . This suggests the following strategy to prove Theorem 1.1. For each typical , construct a “global” function based on the most popular value of among the ’s that agree on (see Section 2.2 for details) and show that most ’s agree with each other. More precisely, we prove the theorem in three steps. In the first step (Section 2.1), we bound and for typical ’s and . In the second step (Section 2.2), we construct, for a typical , a “global” function that explains most “local” . In the final step (Section 2.3), we show that the global functions corresponding to most pairs of typical ’s agree with each other, thus demonstrating the existence of a single global function (in particular a random global function ) that explains most of the “local” functions, even those corresponding to ’s which do not contain .
2.1 Step 1: Bounding and
We begin by showing that for a typical of size , we can upper bound and .
We have and .
For a non-negative integer , let be the probability that the functions and corresponding to a pair of sets picked according to the distribution disagree on exactly elements in . By assumption of Theorem 1.1, we have . Furthermore, it is easy to see that and . The lemma follows from these observations. ∎
We will need the following auxiliary lemma in our analysis.
Let and . Consider the bipartite inclusion graph between and (i.e., is an edge if ). Let and be such that for each , the set of neighbours of in (denoted by ) is of size at least . Then either
Let be a random set of size . To begin with, we can assume that since otherwise and we are done. Let be any element in . The probability that conditioned on the event that contains is given as follows:
Hence, for any , . It follows that
If the above is true for , it is also true for any . Now, if , then consider of size . Then applying the above inequality for , we have . Otherwise , and again appealing to the above inequality, we have . ∎
2.2 Step 2: Constructing global functions for typical ’s
We prove the following lemma in this section.
For all and positive integers satisfying and and alphabet the following holds: Let be an ensemble of local functions satisfying
then there exists an ensemble of global functions such that when a random and are chosen such that , then .
By Lemma 2.1, we know that a typical of size satisfies . We prove the above lemma, by constructing for each such typical a global function that explains most local functions for . For the rest of this section fix such a .
Given , let . Let and . For , let .
We now define the “global” function as follows. We first define the value of (we will drop the subscript when is clear from context) for and then for each . Define to be the most popular restriction of the functions for . In other words, is the function that maximizes . Let be the set of ’s that agree with this most popular value. For each , let . For each such , define to be the most popular value among . This completes the definition of the function .
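The two-stage construction just described can be sketched as follows, under assumed names and an assumed representation (a local function is a dictionary from elements to values, and the fixed set is passed as `t_set`). This is an illustration of the definition only, not of the analysis that follows.

```python
from collections import Counter
from itertools import combinations

def restricted_global_function(ensemble, t_set):
    """Stage 1: take the most popular restriction to t_set among the sets
    containing it. Stage 2: for every other element, take the most popular
    value among the sets that agree with that restriction."""
    t_set = frozenset(t_set)
    t_order = sorted(t_set)
    containing = [s for s in ensemble if t_set <= s]
    if not containing:
        raise ValueError("no set in the ensemble contains t_set")

    def on_t(s):
        return tuple(ensemble[s][x] for x in t_order)

    popular_on_t = Counter(on_t(s) for s in containing).most_common(1)[0][0]
    agreeing = [s for s in containing if on_t(s) == popular_on_t]

    g = dict(zip(t_order, popular_on_t))
    votes = {}
    for s in agreeing:
        for x in s - t_set:
            votes.setdefault(x, Counter())[ensemble[s][x]] += 1
    for x, c in votes.items():
        g[x] = c.most_common(1)[0][0]
    return g

# Toy instance (assumed parameters): one local function is corrupted
# everywhere, including on the fixed set, so it is discarded in stage 1.
n, k = 6, 3
h = {x: x % 2 for x in range(n)}
ensemble = {frozenset(s): {x: h[x] for x in s}
            for s in combinations(range(n), k)}
ensemble[frozenset({0, 1, 2})] = {0: 1, 1: 0, 2: 1}

g = restricted_global_function(ensemble, {0})
print(g == h)
```

Elements that appear in no agreeing set receive no value in this sketch; how such elements are handled is left unspecified here.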
We now show that if is small, then the function agrees with most functions .
This motivates the definition of the following quantities which we need to bound.
We now bound and in terms of and via the following (disagreement) probabilities.
Claim 2.4 (Bounding ).
By definition, we have since is the most popular value among for . The only difference between and is the distribution from which the pairs are drawn; for , is drawn uniformly from all pairs while for , is drawn from . To complete the argument, we choose in the following coupled fashion such that while . First choose at random, then choose and at random, and choose at random such that and . We now have . Clearly, if , then either or . Hence, . ∎
Claim 2.5 (Bounding ).
If , then .
The proof of this claim proceeds similarly to the proof of the previous claim. By definition, we have since is the most popular value among for . We then observe that
We now choose in a coupled fashion as follows. Let be the distribution of when are chosen at random from . First choose at random. Then choose , so . Choose disjoint sets disjoint from of sizes respectively, and let for . Here, we have used the fact that . The joint distribution satisfies that and conditioned on and . Furthermore, if (i.e., ) and then one of the following must hold:
and , or
(The first parts always hold, and the second parts cannot both fail.) This shows that is bounded above by
If and , then .
This follows from an application of Lemma 2.2 by setting and . Then, either or . ∎
We now return to bounding from (1) as follows:
If and , then