Combinatorial lower bounds for 3-query LDCs

11/25/2019 ∙ by Arnab Bhattacharyya, et al. ∙ National University of Singapore ∙ Indian Institute of Science

A code is called a $q$-query locally decodable code (LDC) if there is a randomized decoding algorithm that, given an index $i$ and a received word $w$ close to an encoding of a message $x$, outputs $x_i$ by querying at most $q$ coordinates of $w$. Understanding the tradeoffs between the dimension, length and query complexity of LDCs is a fascinating and unresolved research challenge. In particular, for 3-query binary LDCs of dimension $k$ and length $n$, the best known bounds are $2^{k^{o(1)}} \geq n \geq \tilde{\Omega}(k^2)$. In this work, we take a second look at binary 3-query LDCs. We investigate a class of 3-uniform hypergraphs that are equivalent to strong binary 3-query LDCs. We prove an upper bound on the number of edges in these hypergraphs, reproducing the known lower bound of $\tilde{\Omega}(k^2)$ for the length of strong 3-query LDCs. In contrast to previous work, our techniques are purely combinatorial and do not rely on a direct reduction to 2-query LDCs, opening up a potentially different approach to analyzing 3-query LDCs.


1 Introduction

A code $C$ is said to be a $q$-query locally decodable code (LDC) if it is possible to recover any symbol $x_i$ of a message $x$ by querying $C(x)$ on at most $q$ locations, such that even if a constant fraction of $C(x)$ is corrupted, the decoder returns $x_i$ with high probability. LDCs already appeared in the PCP literature (e.g., implicitly in [BFL+91]), but they were first explicitly formulated by Katz and Trevisan in [KT00]. LDCs have attracted attention not only because of their immediate relevance to data transmission and data storage but also because of their surprising connections to complexity theory and cryptography ([CKG+98, STV01, DS07, KS09]). In more recent years, the analysis of LDCs has led to a greater understanding of basic problems in incidence geometry, the construction of design matrices and the theory of matrix scaling, e.g. [BDY+11, DSW14b, DSW14a].
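To make the definition concrete, here is a minimal Python sketch of local decoding for the classical Hadamard code, which is a 2-query LDC. The Hadamard code is not discussed in this paper, and it uses two queries rather than three; it is included only as a familiar illustration. The function names `hadamard_encode`, `decode_with_z` and `local_decode` are hypothetical.

```python
import random


def hadamard_encode(x):
    """Encode a k-bit message x as the 2^k parities <x, z> mod 2, one per z in {0,1}^k."""
    k = len(x)
    return [sum(x[j] & (z >> j) & 1 for j in range(k)) % 2 for z in range(2 ** k)]


def decode_with_z(w, i, z):
    """Two-query decoder for a fixed query point z: reads w[z] and w[z xor e_i]."""
    return w[z] ^ w[z ^ (1 << i)]


def local_decode(w, i, k, rng=random):
    """Randomized decoder: pick z uniformly at random, then make the two queries."""
    return decode_with_z(w, i, rng.randrange(2 ** k))


if __name__ == "__main__":
    x = [1, 0, 1]
    w = hadamard_encode(x)
    w[5] ^= 1  # corrupt a single coordinate of the codeword
    # Only the two query points that touch coordinate 5 decode incorrectly.
    correct = sum(decode_with_z(w, 0, z) == x[0] for z in range(2 ** len(x)))
    print(correct, "of", 2 ** len(x), "query points decode x_0 correctly")
```

On an uncorrupted codeword every query point returns the right bit, because the two parities differ exactly by $x_i$. A single corrupted coordinate spoils only the two query points that read it, which is why a random choice of query point succeeds with high probability when few coordinates are corrupted.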

Although LDCs have been studied now for two decades, some basic questions remain stubbornly open. In particular, we have the following open question for 3-query LDCs:

Open Question 1.1.

What is the length of the shortest 3-query LDC that can encode all $k$-bit binary messages?

A wide variety of techniques have been used to study the problem. For a while, it was believed that the length should be exponential in $k$ for 3-query LDCs (indeed, for any constant number of queries). This belief was shattered by a breakthrough work of Yekhanin that designed 3-query LDCs of length subexponential in $k$ (conditional on some number-theoretic conjectures). Subsequent work ([EFR12, DGY11]) reformulated the construction in terms of matching vector codes and established an unconditional upper bound of $2^{k^{o(1)}}$ on the length.

As for lower bounds on the length of 3-query LDCs, which is the focus of this work, Katz and Trevisan [KT00] first gave a superlinear lower bound of $\Omega(k^{3/2})$, which was then improved to $\Omega(k^2/\log^2 k)$ by Kerenidis and de Wolf [Kd03] using quantum information theoretic techniques. The current state of the art is due to Woodruff [WOO07] from over a decade ago, where he showed that $n \geq \Omega(k^2/\log k)$.

Given the state of affairs, it is natural to try to prove lower bounds for stronger variants of LDCs where the task should be easier. (For instance, Woodruff in [WOO12] gave an $\Omega(k^2)$ lower bound for the special case of linear 3-query LDCs.) In this work, we study a restricted form of LDCs which seems to capture most of the challenges associated with general LDCs.

Definition 1.2.

For a given , a code is a -strong LDC if for every , there exists a set of disjoint triples in such that for every and for every triple , . Moreover, if , a triple in intersects a triple in in at most 1 coordinate.

Known constructions of 3-query LDCs are strong. Conceptually, the main restriction that the above definition makes is that each triple in the matching successfully decodes $x_i$ for every message $x$. (The decoding scheme of taking the product (xor) of the codeword bits is without loss of generality; see [WOO07]. The additional condition in Definition 1.2 about triples in different matchings intersecting only at single coordinates is made for technical convenience and should be avoidable.) On the other hand, Katz and Trevisan [KT00] show that general LDCs yield matchings such that each triple in the matching successfully decodes $x_i$ for most (not all) $x$'s.

We give a combinatorial proof of the known $\tilde{\Omega}(k^2)$ lower bound on the codeword length of strong 3-query LDCs. Here is the main theorem stating the lower bound.

Theorem 1.3.

Let be a -strong LDC. Then, .

1.1 Proof Overview

As we already noted, Theorem 1.3 follows from [WOO07]. Of more interest is our proof technique. Woodruff’s lower bound reduces 3-query LDCs to 2-query LDCs and applies known analytic proofs giving tight bounds for 2-query LDCs [Kd03]. On the other hand, our proof is purely combinatorial and does not seem to be a reduction to 2 queries.

Our starting point is the observation that strong 3-query LDCs are equivalent to even-colored 3-uniform hypergraphs. A 3-uniform hypergraph is called linear if any two edges intersect in at most one vertex.

Definition 1.4.

An -even-colored 3-uniform hypergraph is a linear edge-colored hypergraph on vertices with each edge having a color in such that:

  • For each , the edges of color form a matching of size at least , and

  • If is a subgraph of such that every vertex has even degree in , then there are an even number of edges of each color in .
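The structural conditions in this definition can be checked by brute force on small examples. The sketch below is illustrative only: the names `is_linear`, `color_classes_are_matchings` and `violating_even_subgraph` are hypothetical, the search is exponential in the number of edges, and the lower bound on the matching sizes is not checked because its parameters are not legible in the text above.

```python
from itertools import combinations


def is_linear(edges):
    """True if any two hyperedges share at most one vertex."""
    return all(len(set(a) & set(b)) <= 1 for a, b in combinations(edges, 2))


def color_classes_are_matchings(edges, colors):
    """True if edges of the same color are pairwise disjoint."""
    used_by_color = {}
    for edge, color in zip(edges, colors):
        used = used_by_color.setdefault(color, set())
        if used & set(edge):
            return False
        used |= set(edge)
    return True


def violating_even_subgraph(edges, colors):
    """Return edge indices of an even subgraph in which some color occurs an
    odd number of times, or None if no such subgraph exists (brute force)."""
    m = len(edges)
    for mask in range(1, 2 ** m):
        chosen = [j for j in range(m) if (mask >> j) & 1]
        degree = {}
        for j in chosen:
            for v in edges[j]:
                degree[v] = degree.get(v, 0) + 1
        if any(d % 2 for d in degree.values()):
            continue  # some vertex has odd degree, so this is not an even subgraph
        count = {}
        for j in chosen:
            count[colors[j]] = count.get(colors[j], 0) + 1
        if any(c % 2 for c in count.values()):
            return chosen
    return None


if __name__ == "__main__":
    # Pasch configuration: linear, and every vertex lies in exactly two edges.
    pasch = [(1, 2, 3), (1, 4, 5), (2, 4, 6), (3, 5, 6)]
    colors = [0, 1, 2, 3]
    print(is_linear(pasch), color_classes_are_matchings(pasch, colors))
    print(violating_even_subgraph(pasch, colors))
```

The Pasch configuration is a useful negative example. Its four edges pairwise intersect, so each color class can hold at most one of them, and the whole configuration is an even subgraph in which every color appears exactly once. It therefore cannot occur inside a hypergraph satisfying the second condition.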

Given a strong LDC, define the hypergraph which is the union of the matchings given by Definition 1.2, and let the color of an edge be the matching it comes from. Then, it is easy to check that both conditions (i) and (ii) are met (see Claim 2.1). The correspondence naturally goes in the other direction too, although this is not needed in the present work.

We prove an upper bound on the number of edges of even-colored 3-uniform hypergraphs, proving our main theorem. To motivate our proof, let us sketch the corresponding argument for 2-query LDCs (which is also new to the best of our knowledge). Suppose we have a (2-uniform) graph which is the union of $k$ matchings, with edges from the $i$'th matching having color $i$. Analogously to condition (ii) of Definition 1.4, also suppose that any cycle contains an even number of edges of each color. Then, we prove that the number of vertices is exponential in $k$. For simplicity, suppose the matchings of each color are perfect. Our argument is through coding (ironically!). Fix an arbitrary vertex $u$. For any vertex $v$, let its signature be defined as $(s_1, \dots, s_k)$, where $s_i$ is the parity of the number of edges of color $i$ on a path from $u$ to $v$. We claim that the signature does not depend on the path chosen. This is because if two paths from $u$ to $v$ gave different signatures, this would yield a cycle in which some color occurred an odd number of times. On the other hand, there are at least $2^k$ different signatures because for any signature $(s_1, \dots, s_k)$, there is a path from $u$ with exactly $s_i$ edges of color $i$ (since the matchings are perfect). Hence, the number of vertices is at least $2^k$.
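The signature argument for the 2-query case can be checked mechanically on a small instance. The sketch below uses the $k$-dimensional hypercube as an assumed example (it is not taken from the paper): the edges that flip bit $i$ form a perfect matching of color $i$, and every cycle uses each color an even number of times. The names `hypercube_matchings` and `signatures` are hypothetical.

```python
from collections import deque


def hypercube_matchings(k):
    """Adjacency lists of the k-cube; the edge from v to v with bit i flipped has color i."""
    return {v: [(v ^ (1 << i), i) for i in range(k)] for v in range(2 ** k)}


def signatures(adj, root, k):
    """Breadth-first search from root. sig[v][i] is the parity of the number of
    color-i edges on the search path from root to v. Raises ValueError if some
    edge disagrees with these parities, i.e. if some cycle has an odd color count."""
    sig = {root: (0,) * k}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v, color in adj[u]:
            s = tuple(bit ^ (j == color) for j, bit in enumerate(sig[u]))
            if v not in sig:
                sig[v] = s
                queue.append(v)
            elif sig[v] != s:
                raise ValueError("cycle with an odd number of edges of some color")
    return sig


if __name__ == "__main__":
    k = 4
    sig = signatures(hypercube_matchings(k), root=0, k=k)
    print(len(set(sig.values())), "distinct signatures on", len(sig), "vertices")
```

Because every edge is checked against the parities already assigned, a successful run certifies that the signature of a vertex is the same along every path from the root. On the hypercube all $2^k$ signatures occur, matching the count in the argument above.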

Figure 1: A cherry formed from two edges intersecting at a single vertex.

Our proof for 3-uniform hypergraphs is in a similar spirit. Instead of a path to define a signature, we use a sequence of cherries, borrowing an idea from [DHŁ+12]. A cherry is a pair of hyperedges which intersect in exactly one vertex; see Figure 1. We observe that if the number of edges is sufficiently large, then there are many cherries. We then use this structure to show that there are even subgraphs (i.e., subgraphs in which all vertices have even degrees) which have an odd number of edges of some color. Namely, we construct a ‘cycle of cherries’ in which we know there is a color that appears on a unique edge, yielding the contradiction. More details follow.
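As a small illustration of cherries, the following sketch enumerates them in a 3-uniform hypergraph. The Fano plane is used as an assumed example (it is not taken from the paper) because it is linear and any two of its lines meet in exactly one point. The names `cherries` and `FANO` are hypothetical.

```python
from itertools import combinations


def cherries(edges):
    """Return a triple (e, f, v) for every pair of hyperedges e, f that
    intersect in exactly one vertex v."""
    found = []
    for e, f in combinations(edges, 2):
        common = set(e) & set(f)
        if len(common) == 1:
            found.append((e, f, next(iter(common))))
    return found


# The seven lines of the Fano plane on the points 0..6.
FANO = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5), (1, 4, 6), (2, 3, 6), (2, 4, 5)]

if __name__ == "__main__":
    found = cherries(FANO)
    per_center = {}
    for _, _, v in found:
        per_center[v] = per_center.get(v, 0) + 1
    print(len(found), "cherries;", per_center)
```

In a linear hypergraph, two edges through a common vertex share nothing else, so a vertex of degree $d$ is the center of exactly $d(d-1)/2$ cherries. This is why a large number of edges forces a large number of cherries.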

Formally, given an -even colored hypergraph which is a union of matchings , define the signature graph as follows. The vertex set , and there is an edge in between the vertices and whenever there exists a such that and are hyperedges forming a cherry in . Moreover, such an edge is labeled by the pair of colors if and . The signature graph enjoys the following useful structural property:

  • For any vertex in and for any color , there are at most edges incident to that have in their label. (*)

The proof of (*) follows from the definition of in terms of cherries (see Claim 2.4).

For the sake of contradiction, assume that for some large constant . This, along with a standard averaging argument, implies that there exists a large subgraph of the signature graph with minimum degree at least . Now fixing an arbitrary vertex as root, we iteratively grow a sequence of trees using edges in , while maintaining the following rainbow condition: For any vertex , no color appears more than once among the colors labeling the unique path from to in .

We explain how to construct from so that the above rainbow condition is met. Let denote the leaves of the tree , and let denote the neighbors of vertices so that the colors labeling the edge do not occur on the path from the root to in . A short argument (Claim 3.1) allows us to deduce that must be disjoint from , because otherwise, condition (ii) of Definition 1.4 is violated. Hence, the next tree can be built by letting and adding one edge from each vertex in to a vertex in .
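The level-by-level construction can be sketched as follows for a generic graph whose edges are labeled by sets of colors. This is a simplified illustration under stated assumptions, not the procedure of the paper: the function name `grow_rainbow_tree` and the adjacency format are hypothetical, vertices already in the tree are simply skipped (in the paper, Claim 3.1 shows that this case cannot arise), and the loop stops when no vertex can be added rather than at the paper's stopping criterion, which is not legible in the text.

```python
def grow_rainbow_tree(adj, root):
    """adj[u] is a list of (v, label) pairs, where label is a frozenset of colors.
    Grows a tree from root one level at a time so that no color repeats on any
    root-to-vertex path. Returns (parent, used, levels)."""
    parent = {root: None}
    used = {root: frozenset()}  # colors on the tree path from root to each vertex
    levels = [[root]]
    while True:
        next_level = []
        for u in levels[-1]:
            for v, label in adj[u]:
                if v in parent:
                    continue  # assumption: skip vertices that are already placed
                if label & used[u]:
                    continue  # this edge would repeat a color on the path to u
                parent[v] = u
                used[v] = used[u] | label
                next_level.append(v)
        if not next_level:  # assumption: stop once no new vertex can be added
            return parent, used, levels
        levels.append(next_level)


if __name__ == "__main__":
    a, b = frozenset({0, 1}), frozenset({2, 3})
    adj = {0: [(1, a)], 1: [(0, a), (2, b)], 2: [(1, b)]}
    parent, used, levels = grow_rainbow_tree(adj, root=0)
    print(levels, sorted(used[2]))
```

Storing the set of colors used on the path to each vertex makes the rainbow test a single set intersection per candidate edge.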

We continue this process until for some iteration . We now sketch how to arrive at a contradiction. From the stopping criterion, we know that for every we have , and therefore the depth of the tree is at most . Therefore, for any , the number of colors labeling the path from to is at most . From property (*), we get that there are at least neighbors of that are in (for some other constant ). Since , there exists a vertex with at least neighbors in . Again, invoking property (*), for large enough, there will be a neighbor such that the colors labeling do not appear among the labels of the path from to . From here, we can conclude that the unique path between and in along with the edge forms a cycle in in which some color appears exactly once. This structure corresponds to a subgraph in that violates condition (ii) of Definition 1.4.

In the rest of the paper, we present the argument formally with all the details. It is currently unclear how to extend the analysis to LDCs with more queries or how to improve the analysis for 3-query LDCs. But we remain hopeful that by looking at more intricate combinatorial structures than cherries, we can make some progress.

2 Preliminaries

In this section and later, we do not invoke the notion of even-colored subgraphs, and we define objects directly in reference to strong 3-query LDCs.

Given a -strong LDC, we define the recovery hypergraph , where and to be the -uniform hypergraph which is the union of matchings . For any edge , we say that the color of the hyperedge is if belongs to matching . We use the notation to denote the color of the hyperedge . We additionally assume that is linear, i.e. no two hyperedges of intersect in more than one element.

Let be a hypergraph. Then we define an augmentation of as follows: and is a multiset where each member also belongs to but can possibly have a higher multiplicity than the multiplicity of in . With respect to a hypergraph where a hyperedge is allowed to have multiplicity greater than 1, we denote by the multiplicity of in . We may drop the subscript , if the hypergraph under consideration is clear from the context. Also for . If for all is even, then is called an even hypergraph. We use to denote the multiset of colors associated with edges in .

If is an augmentation of the recovery hypergraph , for , we define

Claim 2.1.

Let be an augmentation of the recovery hypergraph . If is even, then for , is even.

Proof.

Suppose for contradiction that there exists , such that is odd. Recall that the indices of the code word bits correspond to the vertices of the recovery hypergraph. Let us assume that is the code word of a message , where and for . For an edge of the recovery hypergraph, . Now it is clear that , since is an even augmentation of .

On the other hand, by definition of the recovery hypergraph, if , then for . Therefore . Clearly since for the selected message for , we infer that , if is odd. This is a contradiction. We conclude that for , is even.

The Signature Graph.

We define a graph called the signature graph as follows: and an edge exists between two vertices and of if and only if and there exists a vertex such that . Note that since the recovery hypergraph is linear, if there exists an edge between two vertices and , there is a unique vertex such that . We may say that the vertex causes the edge . Given an edge , we define if causes the edge . We may abuse the notation and use to denote the corresponding unordered set. We define if and . Note that since cannot be in two different edges of the same matching.

Claim 2.2.

The number of edges in the signature graph is at least .

Proof.

Since each matching is of size at least , the number of hyperedges in is at least . It follows that . For any vertex consider a pair of incident edges, say and . Since is linear, . It is easy to see that based on this pair of incident edges, can cause 4 distinct edges of the signature graph . Therefore the vertex causes distinct edges of , where . As we have mentioned earlier, two different vertices and cannot cause the same edge in . Therefore . Recall that . Using the Cauchy-Schwarz inequality, we get . It follows that . Here we have used and , since . (Regarding the Cauchy-Schwarz step: for vectors , the Cauchy-Schwarz inequality states that . Now take , where and to get the required lower bound.)

For a subgraph of the signature graph , we define to be the augmentation of , with and . Note that when we take the union here, we retain multiple copies of a hyperedge if that hyperedge appears in multiple sets taking part in the union operation. Thus is by definition a multi-set. We extend some of the notation used for hypergraphs to subgraphs of signature graphs also in the following way: We use to denote the multiset . A hypergraph is rainbow colored with respect to an edge coloring if there exist no two hyperedges having the same color. (In particular, there will not be any hyperedge with multiplicity greater than 1.) A subgraph of the signature graph is rainbow colored, if is rainbow colored. We may also say (or ) is rainbow, shortening the phrase rainbow colored.

Claim 2.3.

Let be an even subgraph of the signature graph , i.e. , is even. Then is an even augmentation of .

Proof.

Recall that each edge corresponds to exactly two edges in , namely the two edges of , where is the unique vertex which caused the edge . We say that appears in the role of an intermediate vertex and and appear in the role of signature vertices in . It is easy to see that since in itself the is even, each vertex plays the role of an intermediate vertex an even number of times. Noting that in each vertex appears in the role of a signature vertex exactly once, it is easy to see that if is a vertex of , then (also ) plays the role of a signature vertex in (where denotes the set of edges incident on in ) exactly times. Since is even, it follows that and are even numbers. ∎

Claim 2.4.

Let be a vertex of the signature graph and let be the set of edges incident on in . Let be a subset of colors. Let . Then .

Proof.

Let . For an edge , contains 2 hyperedges, exactly one of which contains and the other one contains : Let us denote by and the hyperedges in that contain and respectively. For , , where for . First we will show that . To see this, note that if , then and contains as mentioned earlier. There is a unique hyperedge with these properties since is a matching. Let . Then either or could have caused the edge . If caused the edge , then contains both and , and there is a unique edge in that is a superset of since is linear. Similarly if caused , then is uniquely determined, since it should contain both and . It follows that for all , . A similar argument shows that . It follows that .

Now we are ready to prove Theorem 1.3.

3 Proof of Theorem 1.3

For contradiction, we shall assume that where is a sufficiently large constant. We can then lower bound the average degree of as follows. From Claim 2.2, we know that . On the other hand, the number of vertices in is at most . Therefore, for , for large enough, the average degree of is at least so that we can find a subgraph with minimum degree at least , where is a sufficiently large constant.

Now we fix a vertex , and we grow a rainbow tree rooted at in level by level as follows. Let be the tree consisting only of the root and be the tree consisting of and all its neighbors. At the th stage, we will have a tree where where is the set of vertices in level . Note that and consists of 2 levels, and . For two vertices and , the unique path in from to will be denoted by .

Moreover at the th stage, we will make sure that the tree satisfies the following property:

For every vertex of the tree, the unique path in the tree from the root to that vertex is rainbow. (1)

Claim 3.1.

If satisfies property (1), and if is an edge of such that and , where , then .

Proof.

Suppose not. Let be the least common ancestor of and in . Since is a rainbow path by property (1), is also rainbow. Since , clearly we have . It follows that there is at least one matching color, say in the cycle , such that . But since is an even subgraph of , is an even augmentation of by Claim 2.3. Then by Claim 2.1, should be an even number, a contradiction.

Now we describe how to construct from by adding a new level . For a vertex define . Observe that : This follows from Claim 3.1, since if there is an edge in from a vertex to a vertex , then , and therefore . Define . Clearly . If , define a bipartite graph such that for and , if and only if . Now for each , select one vertex from such that to be its parent thus obtaining the new tree . From the way we defined for , it is clear that property (1) is satisfied by . If , we proceed to add the next level. Otherwise we stop the procedure and define the final tree to be .

Let be the last level added to the tree. Clearly . We observe that . Otherwise , since for . Now consider the bipartite graph : For each vertex , we know by applying Claim 2.4 with that . But and therefore . Therefore has at least edges. Since , there exists a vertex such that its degree in is at least . Again by applying Claim 2.4, this time with , at most of these edges can have a common color with any edge in . It follows that if is taken large enough, has a neighbor such that . This contradicts Claim 3.1 applied to the tree . The situation is depicted in Figure 2. Thus we infer that , which in turn implies that .

Figure 2: The cycle formed by the concatenation of and corresponds to an even subgraph in with an odd number of edges having a particular color.

Acknowledgments

AB thanks Sivakanth Gopi, Nikhil Srivastava, and Luca Trevisan for many useful discussions about this problem. The authors would also like to thank the anonymous reviewers for their useful comments and suggestions.

References

  • [BFL+91] L. Babai, L. Fortnow, L. Levin, and M. Szegedy (1991) Checking computations in polylogarithmic time. In Proceedings of the 23rd Annual ACM Symposium on Theory of Computing, pp. 21–31.
  • [BDY+11] B. Barak, Z. Dvir, A. Yehudayoff, and A. Wigderson (2011) Rank bounds for design matrices with applications to combinatorial geometry and locally correctable codes. In Proceedings of the 43rd Annual ACM Symposium on Theory of Computing, pp. 519–528.
  • [CKG+98] B. Chor, E. Kushilevitz, O. Goldreich, and M. Sudan (1998) Private information retrieval. Journal of the ACM (JACM) 45 (6), pp. 965–981.
  • [DHŁ+12] D. Dellamonica, P. Haxell, T. Łuczak, D. Mubayi, B. Nagle, Y. Person, V. Rödl, M. Schacht, and J. Verstraëte (2012) On even-degree subgraphs of linear hypergraphs. Combinatorics, Probability and Computing 21 (1-2), pp. 113–127.
  • [DGY11] Z. Dvir, P. Gopalan, and S. Yekhanin (2011) Matching vector codes. SIAM Journal on Computing 40 (4), pp. 1154–1178.
  • [DSW14a] Z. Dvir, S. Saraf, and A. Wigderson (2014) Breaking the quadratic barrier for 3-LCC’s over the reals. In Proceedings of the 46th Annual ACM Symposium on Theory of Computing, pp. 784–793.
  • [DSW14b] Z. Dvir, S. Saraf, and A. Wigderson (2014) Improved rank bounds for design matrices and a new proof of Kelly’s theorem. In Forum of Mathematics, Sigma, Vol. 2, pp. e4.
  • [DS07] Z. Dvir and A. Shpilka (2007) Locally decodable codes with two queries and polynomial identity testing for depth 3 circuits. SIAM Journal on Computing 36 (5), pp. 1404–1434.
  • [EFR12] K. Efremenko (2012) 3-query locally decodable codes of subexponential length. SIAM Journal on Computing 41 (6), pp. 1694–1703.
  • [KT00] J. Katz and L. Trevisan (2000) On the efficiency of local decoding procedures for error-correcting codes. In Proceedings of the 32nd Annual ACM Symposium on Theory of Computing, pp. 80–86.
  • [KS09] N. Kayal and S. Saraf (2009) Blackbox polynomial identity testing for depth 3 circuits. In Proceedings of the 50th Annual IEEE Symposium on Foundations of Computer Science, pp. 198–207.
  • [Kd03] I. Kerenidis and R. de Wolf (2003) Exponential lower bound for 2-query locally decodable codes via a quantum argument. In Proceedings of the 35th Annual ACM Symposium on Theory of Computing, pp. 106–115.
  • [STV01] M. Sudan, L. Trevisan, and S. Vadhan (2001) Pseudorandom generators without the XOR lemma. Journal of Computer and System Sciences 62 (2), pp. 236–266.
  • [WOO12] D. P. Woodruff (2012) A quadratic lower bound for three-query linear locally decodable codes over any field. Journal of Computer Science and Technology 27 (4), pp. 678–686.
  • [WOO07] D. Woodruff (2007) New lower bounds for general locally decodable codes. In Electronic Colloquium on Computational Complexity (ECCC), 14.