We develop the notion of a double sampler, which is an “enhanced” sampler. A sampler is a bipartite graph such that for every function the global expectation is roughly equal to the local expectation for most . More formally, a -sampler is a graph such that at most a -fraction of the vertices have ; see [Zuc97] for more details. In this work we study a strong extension of samplers, called double samplers.
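To make the sampler condition concrete, here is a minimal Python sketch. It is not taken from the paper: it assumes uniform distributions on both sides of the graph, and the names `bad_fraction`, `subsets` and `eps` are hypothetical. For one fixed function, it measures the fraction of subsets whose local average deviates from the global average by more than `eps`; a sampler requires this fraction to be small for every function.

```python
from itertools import combinations

def bad_fraction(f, subsets, eps):
    """Fraction of subsets whose local mean of f differs from the global mean by more than eps."""
    global_mean = sum(f) / len(f)
    bad = 0
    for S in subsets:
        local_mean = sum(f[v] for v in S) / len(S)
        if abs(local_mean - global_mean) > eps:
            bad += 1
    return bad / len(subsets)

# Toy example: all 2-subsets of {0, 1, 2, 3}, with f the indicator of {0, 1}.
subsets = list(combinations(range(4), 2))
f = [1, 1, 0, 0]
print(bad_fraction(f, subsets, 0.25))  # only the pairs (0, 1) and (2, 3) deviate
```

In this toy collection only two of the six pairs see a local average far from the global one; checking the property for every function is of course much more expensive than this single evaluation.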
Towards defining double samplers we observe that, given a sampler , every can be identified with the set of its neighbors . In this way is a collection of subsets of . In the other direction, given a ground set and a collection of subsets , the graph pops out as the inclusion graph with an edge from to iff .
A double sampler consists of a triple where is the vertex set, is a collection of -subsets of , and is a collection of -subsets of , where . We say that is a double sampler if
The inclusion graphs on , and on are each samplers. (An inclusion graph is a graph where we connect two subsets by an edge if one contains the other; here a single vertex is also considered to be a singleton subset).
For every , let be the sets in that are contained in . Let be the bipartite inclusion graph connecting vertices in to subsets in . We require that for every , the graph is a sampler. We call this property the locality property of the double sampler. (See fig:double_samp for an illustration.)
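The local graph used in the locality property can be sketched as follows. This is an illustrative sketch only: the function name `local_view` and the argument name `mid_layer` are hypothetical, and sets are represented as Python frozensets. Given a top-layer set `T`, it collects the middle-layer sets contained in `T` together with the inclusion edges from elements of `T` to those sets.

```python
def local_view(T, mid_layer):
    """Return the sets of mid_layer contained in T and the inclusion edges (v, S)."""
    T = frozenset(T)
    inside = [frozenset(S) for S in mid_layer if frozenset(S) <= T]
    edges = [(v, S) for S in inside for v in sorted(S)]
    return inside, edges

# Toy example: four pairs over {0, 1, 2, 3}; three of them lie inside T = {0, 1, 2}.
mid_layer = [{0, 1}, {1, 2}, {2, 3}, {0, 2}]
inside, edges = local_view({0, 1, 2}, mid_layer)
```

The locality property asks that this small bipartite graph be a sampler for every choice of `T`, which is a much stronger requirement than asking only for the two global inclusion graphs to be samplers.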
Our definition of double samplers is stronger than the initial definition in [DK17] that was missing the locality property. [Footnote 1: The main result in [DK17] was proven directly from high dimensional expanders, and not from double samplers, so this locality property was used implicitly. It is possible that the result of [DK17] can be proven directly from our revised definition of double samplers.] Whereas the definition in [DK17] can be obtained e.g. by concatenating two samplers, the revised definition herein is much stronger and carries properties not known to be obtained by any random construction. It is quite remarkable that high dimensional expanders [LSV05, KO18] give rise to double samplers for which :
(Informal, see formal version in thm:doublesampler) For every pair , there is an explicit construction of a double sampler on vertices, such that , for infinitely many .
On random double samplers.
To appreciate the remarkableness of double samplers, think of concrete parameters such as . A random construction amounts to placing vertices in , a linear (in ) number of edges in and a linear number of triples in . In any -like model, most edges will be connected to at most one triple, or vice versa, most triples will be connected to at most one edge. In either case the inclusion graph on is highly disconnected, let alone a sampler. [Footnote 2: Observe that for the chosen parameters of and , there are obvious limits on the parameters of the sampler, since each triple is connected to at most edges.]
We elaborate more on the construction of double samplers towards the end of the introduction.
Samplers and distance amplification.
Alon, Bruck, Naor, Naor and Roth [ABN92] showed how to amplify the distance of any code, simply by pushing the symbols along edges of a sampler graph. Let us describe their transformation in notation consistent with the above. We think of the graph as a sampler , where is a collection of -sets of . Given an -bit string , we place on the -th vertex and then each subset “collects” all of the symbols of its elements and gets a short string . The resulting codeword is the sequence which can be viewed as a string of length over alphabet .
If the string came from an initial code with minimum distance , then is a new code. Assuming is a -sampler, the minimum distance of is at least . Of course the length of words in depends on the size of , so the shorter the better.
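The transformation described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the subsets are given explicitly, symbols are collected in sorted order of the indices, and the names `direct_product_encode` and `relative_distance` are hypothetical.

```python
def direct_product_encode(f, subsets):
    """Each subset S collects the symbols of f at its elements (in sorted order)."""
    return [tuple(f[i] for i in sorted(S)) for S in subsets]

def relative_distance(x, y):
    """Fraction of coordinates on which two equal-length sequences differ."""
    return sum(a != b for a, b in zip(x, y)) / len(x)

subsets = [{0, 1}, {1, 2}, {2, 3}]
f = [1, 0, 1, 1]
g = [1, 1, 1, 1]  # differs from f in one of four positions

enc_f = direct_product_encode(f, subsets)
enc_g = direct_product_encode(g, subsets)
print(relative_distance(f, g), relative_distance(enc_f, enc_g))
```

A single differing position of the original strings affects every subset that contains it, which is why the relative distance grows after encoding; how much it grows in general depends on the sampling quality of the collection of subsets.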
This elegant transformation from to is very local and easy to compute in the forward direction (from to ), and indeed it has been found useful in several coding theory constructions, e.g. [GI02, KMRS17]. In this work we study the inverse question, also known as decoding: given a noisy version of , find . Moreover, we wish to recover from as many errors as possible.
Decoding and list decoding
A decoding algorithm for gets as input a string , and needs to find a word such that for as many as possible. A natural approach is the “maximum likelihood decoding” algorithm: assign each vertex the most likely symbol, by looking at the “vote” of each of the subsets ,
and then run the unique decoding algorithm of on . Assuming is efficiently unique-decodable from errors, and assuming is a good sampler, this gives a decoding algorithm for that recovers from error rates close to .
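The voting step just described can be sketched as below. The sketch assumes uniform weights on the subsets and breaks ties arbitrarily; the name `majority_decode` is hypothetical, and the subsequent call to the unique decoder of the base code is not shown.

```python
from collections import Counter

def majority_decode(received, subsets, n):
    """For each position i, take the most common symbol reported by the subsets containing i."""
    votes = [Counter() for _ in range(n)]
    for S, local in zip(subsets, received):
        for i, symbol in zip(sorted(S), local):
            votes[i][symbol] += 1
    # Positions covered by no subset default to 0 (an arbitrary choice for this sketch).
    return [v.most_common(1)[0][0] if v else 0 for v in votes]

# All 2-subsets of {0, 1, 2, 3}; the first received symbol is corrupted.
subsets = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
received = [(0, 1), (1, 1), (1, 1), (0, 1), (0, 1), (1, 1)]
print(majority_decode(received, subsets, 4))
```

Here the original string is `[1, 0, 1, 1]` and one corrupted subset is outvoted at both of its positions. When the error rate is high enough that votes can tie at every position, this approach gives no information.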
Going beyond the unique decoding radius, the large distance of guarantees, via the Johnson bound, that it is (combinatorially) list decodable up to a radius where is the distance (see [GRS, Chapter 7]). However, the maximum likelihood decoder stops working in this regime: one cannot rule out the situation where for each vertex , both and symbols occur with equal likelihood, and it is not known, in general, how to recover . [Footnote 3: We remark that when the base code has additional special properties it is possible that more can be done (see e.g. [GI02]), but our focus is on a generic decoding mechanism that does not depend on the code at all. In fact, we show an approximate list decoding algorithm that works even when does not have any distance, e.g. for .]
Thus, it is natural to ask for an algorithm that list decodes up to radius close to . Our main result is a list decoding algorithm that goes beyond the unique-decoding barrier and works for error rates approaching . The algorithm works whenever the underlying graph is part of a double sampler, namely where there is a collection of sets of size so that the triple is a double sampler.
Theorem 1.2 (Main - informal, see thm:main).
Let be constants. Suppose is a code that is efficiently decodable from errors, and suppose is a double sampler (with parameters depending on ). The code defined over alphabet by
has block length , and is list-decodable from fraction of errors.
At this point the reader may be wondering how the double-sampler property helps facilitate list decoding. Roughly speaking, a double sampler is a collection of (small) subsets that have both large overlaps as well as strong expansion properties. The expansion properties are key for distance amplification, and the large overlaps, again with good sampling properties, are key for the list decoding algorithm.
1.1 The list decoding algorithm
Our algorithm starts out with a voting step, similar to the maximum likelihood decoder. Here we vote not on the value of each bit but rather on the value of an entire set . Since is also a sampler, a typical sees a noticeable fraction of ’s for which . Since the graph between and is a sampler (this is the locality property), we can come up with a short list of popular candidates for . This is done by looking at for all subsets . We define
Note that since has constant size, we are able to search exhaustively over all in constant time.
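The exhaustive search over local candidates can be sketched as follows. This is an illustration only: it assumes uniform weights on the subsets inside the set, an agreement threshold passed in as the hypothetical parameter `threshold`, and a received word stored as a dictionary from subsets to tuples of bits.

```python
from itertools import product

def local_list(T, subsets_in_T, w, threshold):
    """All assignments to T that agree with w on more than a threshold fraction of the subsets."""
    T = sorted(T)
    out = []
    for bits in product((0, 1), repeat=len(T)):
        sigma = dict(zip(T, bits))
        agree = sum(
            1 for S in subsets_in_T
            if tuple(sigma[i] for i in sorted(S)) == w[frozenset(S)]
        )
        if agree > threshold * len(subsets_in_T):
            out.append(sigma)
    return out

# Received word on the three pairs inside T = {0, 1, 2}; the value on {0, 2} is corrupted.
w = {frozenset({0, 1}): (1, 0), frozenset({1, 2}): (0, 1), frozenset({0, 2}): (0, 0)}
print(local_list({0, 1, 2}, [{0, 1}, {1, 2}, {0, 2}], w, 0.5))
```

The loop runs over all two-to-the-size-of-the-set assignments, which is affordable only because the set has constant size. Lowering the threshold lengthens the list, which is the trade-off the later pruning step has to manage.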
Given a list for each , we now need to stitch these lists together, and here we again use the fact that is a good sampler. Whenever is significantly large, we will match with iff . Moreover, the double sampler property allows us to come up with an expander graph whose vertex set is , and whose edges connect to when they have significant overlap. This guarantees that for almost all edges there is a matching between the list of and the list of .
At this point what we are looking at is a unique games instance, where the said expander is the constraint graph, and the said matchings are the unique constraints. [Footnote 4: For definitions, please see the preliminary section.] We now make two important observations. First, a word with noticeable correlation with the received word corresponds to a solution for the unique games instance with very high value (i.e., satisfying a large fraction of the constraints). Second, the algorithm of [AKK08] will find a high-value solution, because the underlying unique games constraint graph is an expander! It is important to understand that a more naive greedy belief propagation algorithm would fail miserably because it takes about steps to reach a typical point in an expander graph, and this accumulates an intolerable amount of error.
We are almost done: it remains to run the unique decoding algorithm of on each unique games solution, to remove any small errors, and this completes the list decoding.
The above high level description gives the rough idea for our algorithm, but the implementation brings up some subtle difficulties, which we explain below.
Every set induces a constant size “local view” on the code , which has no reason to be an error correcting code, and in particular has no distance. This makes the task of finding a short more difficult, since there could be several valid candidates that are very close in Hamming distance. Suppose differ only in a single bit, then for most , we don’t know which element in should be matched to and which to .
In order to solve this difficulty, we prune each and enforce minimal distance between every two list items. The pruning should also promise a “covering” property: if was in the initial list, then there exists some (where is with relation to ) in the final list.
For any predetermined distance parameter , one can come up with counterexamples showing that this is impossible. Our solution is to let the pruning algorithm choose dynamically. The algorithm starts with an initial distance , and gradually increases it until it reaches a radius at which both the distance and covering properties hold together (this is done in sec:app list dec).
Given with and the same radius , we match to if they are close (with respect to ) on . If however have different radii, we don’t know how to match these lists correctly. Therefore, our unique games instance is created on a subgraph containing only those vertices that share the same (most popular) radius . We show that there exists such a subgraph which is itself an expander.
1.2 Double Samplers and High dimensional expanders
Let us briefly explain how double samplers are constructed from high dimensional expanders (proving thm:ds exists informal). A high dimensional expander is a -dimensional simplicial complex , which is just a hypergraph with hyperedges of size and a closure property: for every hyperedge in the hypergraph, all of its subsets are also hyperedges in the hypergraph. The hyperedges with elements are denoted , and the complex is said to be an expander if certain spectral conditions are obeyed, see sec:doublesampler.
In [DK17] the authors prove that a (two-sided spectral) high dimensional expander gives rise to a multi-partite graph with interesting spectral expansion properties. The graph has vertices , and we place edges for inclusion. Namely, is connected by an edge to if . It is shown that the graph induced by focusing on layers and has . We show, in sec:doublesampler, that by narrowing our focus to three layers in this graph (namely, ) we get a double sampler. This is proven by observing that the spectral properties are strong enough to yield a sampler (an expander mixing lemma argument suffices since we are only seeking relatively weak sampling properties).
Better double samplers?
Double samplers with super-linear (polynomial and even exponential) size have appeared implicitly (or somewhat similarly as “intersection codes”) in the works of [IKW12, IJKW10]. Two concrete constructions were studied,
The first where , so .
The second where is identified with a vector space over some finite field and then consists of all -dimensional subspaces of . Here .
The current work is the first to construct double samplers with linear size. This raises the question of finding the best possible parameters for these objects. In particular, for given sampler parameters and , how small can be?
Our current construction is based on Ramanujan complexes of [LSV05] that are optimal with respect to the spectrum of certain Laplacian operators, and not necessarily with respect to obtaining best possible double samplers. It is an interesting challenge to meet and possibly improve upon these parameters through other constructions.
Unlike other pseudorandom objects, there is no known random construction of a double sampler. In particular, we cannot use it as a yardstick for the quality of our parameters. It remains to explore what possible parametric limitations there are for these objects.
We believe that double samplers capture a powerful feature of high dimensional expanders whose potential for TCS merits more study. Previously, in [DK17], it was shown that high dimensional expanders give rise to a very efficient de-randomization of the direct product code that is nevertheless still testable. Part of the contribution of the current work is a demonstration of the utility of these objects in a new context, namely of list decoding.
1.3 Derandomized Direct Product and Approximate List Decoding
Our list decoding algorithm can also be viewed in the context of decoding derandomized direct products. The direct product encoding takes and encodes it into where contains all possible -subsets of . An encoding with , as in this paper, is called a derandomized direct product encoding.
Direct products and derandomized direct products are important in several contexts, primarily for hardness amplification. One begins with a string that is viewed as a truth table of a function (here ), and analyzes the hardness of the new function defined by . A typical hardness amplification argument proceeds by showing that if no algorithm (in a certain complexity class) computes on more than of its inputs, then no algorithm computes on more than of its inputs. Namely, is much harder than .
Such a statement is proven, as first described in [Tre05, Imp03], through a (list-) decoding argument: given a hypothetical algorithm that computes successfully on at least fraction of inputs, the approximate list decoder computes on of its inputs. We prove,
Theorem 1.3 (Approximate list decoding - informal).
Let be constants. Suppose is a double sampler (with parameters depending on ). Let be the encoding that takes to . There is an algorithm (running in time ) that when given a word that is -correlated with for some , namely, , finds such that .
This theorem differs from thm:main-informal only in that does not come from any initial error correcting code . This is why we can only provide an approximate answer instead of .
Our list decoding result falls short of being useful for hardness amplification, because it is not local or efficient enough. We leave it as an open question whether a more efficient and local list decoding algorithm exists for these codes.
There is a significant technical hurdle that one faces, related to the diameter of the bipartite graph corresponding to . In the local list decoding constructions analyzed in [IKW12, IJKW10] (both derandomized and non-derandomized), the diameter is , and this is crucially used in the list decoding algorithm.
When we move to a linear size derandomized direct product encoding, as we do in this work, we pay by enlarging the diameter to become super-constant. This is what makes the approximate list decoding algorithm performed in our work much more challenging (even in the non-local setting), and the algorithm more complicated than the analogous task performed by [IKW12, IJKW10].
1.4 Future directions
The construction in this paper starts with a binary code and constructs a code over a larger alphabet that has efficient list decoding. The larger alphabet size arises because we use a derandomized direct product, i.e., each vertex collects the bit symbols from all indices . It is natural to consider the direct sum operation, where for each we output a single bit that is the sum (over ) of all values in the direct sum. A recent example of such a code is the construction in [Ta-17]. This code achieves close to optimal rate (and this is made possible because the direct product operator is replaced with a direct sum) and has explicit encoding (the sampler it uses is a variant of the sampler that is obtained through random walks on good expanders). One obvious shortcoming of the code of [Ta-17] is that while it has efficient encoding, it is not known how to efficiently decode (or list decode) it. It is still a major open problem to find a binary code with distance close to half, close to optimal rate (as in [Ta-17]) and efficient encoding and decoding.
It is possible that the results in this paper might help with finding such a code:
First, while the result in this paper uses direct product, it is conceivable that with a refinement of the double sampler notion one might do with direct sum. This is true, e.g., in a situation where for each , the values for , (and notice that now ) define a code, or even, an approximately list decodable code. This property is different from the locality property we require from our double samplers, but is close in spirit to it.
Then, having that, one needs to improve the parameters of the algorithms presented in the paper, and also find appropriate “double samplers” (with the stronger property we require). It is not known whether such objects exist, and even for double samplers it is not clear what the best possible parameters are. This question is quite intriguing as we cannot compare our desired explicit object to non-explicit random objects, simply because here random objects are no good.
Thus, while this approach seems right now technically challenging, and it is not even clear parameter-wise (or non-explicitly) whether it is possible, we believe it opens up a new, and exciting, research agenda that we hope will eventually lead to near-optimal binary codes that have both explicit encoding and decoding.
2 Preliminaries and Notations
2.1 Weighted graphs and expanders
(Weighted graph) We say that is a weighted graph if is an undirected graph, and is a weight function that associates with each edge a non-negative weight . We have the convention that non-edges have zero weight. Given the edge weights the weight of a vertex is defined as . The edge weights induce a distribution on edges (and vertices) which we denote by .
(Edge expansion) Let be a weighted graph. The edge expansion of is
where denotes the set of edges between and .
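As a concrete illustration of edge expansion, the following brute-force sketch computes one standard form of the quantity: the minimum, over vertex sets carrying at most half of the total vertex weight, of the cut weight divided by the weight of the set. The exact normalization used in the paper's definition may differ, so this should be read as an assumption; the name `edge_expansion` is hypothetical, self-loops are not supported, and the search is exponential in the number of vertices.

```python
from itertools import combinations

def edge_expansion(vertices, weight):
    """weight maps frozenset({u, v}) -> non-negative edge weight; non-edges are omitted."""
    def w_vertex(v):
        return sum(x for e, x in weight.items() if v in e)

    total = sum(w_vertex(v) for v in vertices)
    best = None
    for r in range(1, len(vertices)):
        for S in combinations(vertices, r):
            S = set(S)
            w_S = sum(w_vertex(v) for v in S)
            if w_S == 0 or 2 * w_S > total:
                continue  # only sets with at most half of the total weight
            cut = sum(x for e, x in weight.items() if len(e & S) == 1)
            ratio = cut / w_S
            best = ratio if best is None else min(best, ratio)
    return best

# Unit-weight 4-cycle: the worst set is a pair of adjacent vertices.
cycle = {frozenset({i, (i + 1) % 4}): 1 for i in range(4)}
print(edge_expansion([0, 1, 2, 3], cycle))
```

On the 4-cycle the minimum is attained by two adjacent vertices, which have total weight 4 and two outgoing edges.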
For every weighted graph , let the normalized adjacency matrix be defined as . Let be the second largest eigenvalue (in absolute value) of .
(Weights on -partite graphs) Let be a -partite graph and let be a distribution over . Define a joint distribution whose values are -partite paths chosen by sampling according to , then choosing a random neighbor of and so forth. We denote by the -th coordinate of .
We will say that to mean that is a random neighbor (in ) of .
(Sampler) Let be a bipartite graph with a distribution on . We say is an sampler, if the following holds for every ,
Definition 2.6 (Two-step walk graph).
Let be a bipartite graph with distribution on . The two-step walk of is the weighted graph whose edge weights are given by selecting and then two independent copies . More explicitly
The following simple fact is important,
If we choose a random edge in , and a random vertex in it, then is distributed according to .
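The two-step walk and the fact above can be checked on a toy example. The sketch below is illustrative only: it represents the bipartite graph by a joint distribution on pairs (one vertex from each side), uses exact rational arithmetic, and the name `two_step_walk` is hypothetical. The weight of a pair of vertices is obtained by picking a middle vertex from its marginal and then two independent neighbors conditioned on it.

```python
from collections import defaultdict
from fractions import Fraction

def two_step_walk(joint):
    """joint maps (T, S) -> probability. Returns w[(T1, T2)] = sum over S of P(S) P(T1|S) P(T2|S)."""
    p_S = defaultdict(Fraction)
    for (T, S), p in joint.items():
        p_S[S] += p
    w = defaultdict(Fraction)
    for (T1, S1), p1 in joint.items():
        for (T2, S2), p2 in joint.items():
            if S1 == S2:
                # P(S) * P(T1|S) * P(T2|S) = P(T1, S) * P(T2, S) / P(S)
                w[(T1, T2)] += p1 * p2 / p_S[S1]
    return dict(w)

joint = {("A", "x"): Fraction(1, 4), ("A", "y"): Fraction(1, 4), ("B", "y"): Fraction(1, 2)}
w = two_step_walk(joint)
```

In this example the marginal of "A" under the joint distribution is one half, and the total weight of ordered pairs starting at "A" in the walk graph is also one half, in line with the fact stated above.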
Theorem 2.8 (Every sampler contains an induced expander).
Let be such that . Let be an sampler. Let be the two-step walk graph of . Let be any set with . Then there exists a set such that:
Let be the induced graph of on . .
Furthermore, given , such a set can be found in time polynomial in .
The theorem is proven in app:expanding subset.
2.3 Double Samplers
(Inclusion graph) An inclusion graph with cardinalities is a tri-partite graph with vertices where for every and iff .
Given an inclusion graph and distribution over , recall from def:weights that induces a distribution over paths whose components are and .
Definition 2.10 (Double Sampler).
Let be an inclusion graph with a distribution over , let be the bipartite graph between .
We say that is a double sampler, if
is a sampler.
is a sampler.
For every , we define the weighted bipartite graph
where and for and iff , and . We require that is a sampler.
Furthermore, is called regular if is uniform over and for each , is uniform over .
Note that the distribution on is by definition the uniform distribution on .
Short of being uniform, we formulate a “flatness” property of the distributions involved in the double sampler,
A distribution over is said to be -flat for an integer if there is some such that for each .
This property will allow us to treat the distribution as uniform by duplicating each element at most times.
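The duplication trick can be sketched as follows. The sketch assumes that flatness means every probability is a small integer multiple of a common unit, which is our reading of the definition above; the name `to_multiset` is hypothetical. Each element is repeated in proportion to its probability, so that uniform sampling from the resulting list reproduces the original distribution.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def to_multiset(dist):
    """dist maps element -> Fraction probability; returns a list whose uniform distribution equals dist."""
    # Common denominator R of all probabilities.
    R = reduce(lambda a, b: a * b // gcd(a, b), (p.denominator for p in dist.values()), 1)
    out = []
    for v, p in dist.items():
        out.extend([v] * int(p * R))  # p * R is an integer by the choice of R
    return out

dist = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 2)}
print(to_multiset(dist))
```

In this example the largest multiplicity is 2, so the multiset is at most twice as large as the support.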
2.4 Double Samplers Exist
We prove, in sec:doublesampler, that linear size double samplers are implied by the existence of high dimensional expanders.
For every there exist and a family of explicitly constructible double samplers for infinitely many such that
is an inclusion graph where , for , with distribution over .
is a regular double sampler.
. The distributions are -flat.
2.5 UG Constraint Graphs
(UG constraint graph) Let be a weighted graph. We say is a UG constraint graph with labels if is a permutation. We say is an assignment for if . We say the assignment satisfies an edge if . The value of an assignment is the fraction of satisfied edges. We say is -satisfiable if there exists an assignment with value at least .
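The value of an assignment can be computed directly from the definition. The sketch below is illustrative: constraints are stored as dictionaries mapping labels to labels, one per directed edge, and the name `ug_value` is hypothetical. The value is taken to be the weighted fraction of satisfied edges.

```python
def ug_value(constraints, weights, assignment):
    """constraints[(u, v)] is a permutation (dict label -> label); weights[(u, v)] >= 0."""
    total = sum(weights.values())
    satisfied = sum(
        w for (u, v), w in weights.items()
        if constraints[(u, v)][assignment[u]] == assignment[v]
    )
    return satisfied / total

# Triangle with two labels: two identity constraints and one swap.
identity, swap = {0: 0, 1: 1}, {0: 1, 1: 0}
constraints = {("a", "b"): identity, ("b", "c"): identity, ("a", "c"): swap}
weights = {("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 2}
print(ug_value(constraints, weights, {"a": 0, "b": 0, "c": 0}))
```

In this triangle no assignment satisfies all three constraints, since the two identity constraints force equal labels while the swap constraint forbids them.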
Theorem 2.14 ([MM10, Theorem 10]).
Let be a regular graph with second smallest Laplacian eigenvalue and edge expansion . There exist positive absolute constants and and a polynomial time approximation algorithm that, given a satisfiable instance of UG on with , finds a solution of value .
We need a version of this theorem with two modifications:
The theorem, as stated, refers to unweighted, regular graphs. We need the same results for non-regular weighted graphs.
The theorem finds one assignment with high value. However, we need to get an approximation to all assignments with high value.
In the appendix we go over [AKK08, MM10] and show that the same result holds for weighted, non-regular graphs (see app:weighted ug). We also show how to output a list that contains an approximation to all assignments with high value. To do that we rerun the algorithm of thm:unique games several times, each time peeling off the solution that is found (see app:list ug). We prove:
Let be a weighted undirected UG constraint graph with labels such that . There exists an absolute constant and a polynomial time algorithm that outputs a list of assignments such that for every assignment with value there exists an assignment which satisfies .
2.6 Miscellaneous notation
Let , and let , then
In the case of , we omit the subscript .
3 The main theorem: encoding, list decoding and correctness proof
In this section we describe how to transform any uniquely decodable code to a list-decodable code, using double samplers. We begin by describing the new list decodable code through its encoding algorithm (sec:enc). We then describe the list decoding algorithm (sec:dec), and finally prove the correctness in sec:correctness.
3.1 The encoding
Let be an linear binary code. Let be a regular double sampler as in thm:doublesampler with , and parameters as follows: such that , for an absolute constant. We assume that is -flat for a parameter that depends exponentially on (where is the smallest of the ’s and is the smallest of the ’s).
As explained in the introduction, we wish to encode by . The double sampler assigns the set. If is not uniform, we view as a multiset that has multiple copies of each to account for the different probabilities. Choosing uniformly in the multiset is identical to choosing . The fact that is -flat guarantees that each needs to be repeated at most times. From now on we view as a multiset and let denote the number of elements in counted with multiplicity.
Denote . We define the encoding
Given we let be defined by
We let be the code .
The encoding is essentially the same as [ABN92], with weights on the sampler graph. We remark that the construction also works with any non-linear code , in which case the resulting code is also not (necessarily) linear.
Let be constants, and let be a double sampler, with , , , and an absolute constant.
Suppose is a code with an efficient unique-decoding algorithm from errors. Then is an code with an efficient list decoding algorithm, meaning there exists a time algorithm which, on input , returns all codewords in at distance from .
3.2 The list decoding algorithm
The input to the list decoding algorithm is a received word which we interpret as by setting . [Footnote 5: A subtle point is that if is repeated several times in , then can take several possibly different values, per each repetition of . In this case will consist not of a single “deterministic” value in but rather a distribution over the different values. For clarity of presentation we will ignore this issue and treat as if it were concentrated on one value for each . The reader can check that our voting scheme works just as well if were a distribution over values.] We are promised that there is some so that .
Approximate List Decoding of Local Views
For every we construct a list of size of well-separated elements such that every that has agreement with on is very close to one of the elements in the list. More precisely,
Let . There exists a decoding algorithm that given returns a list of size at most and a radius , such that:
If is such that then there exists such that
For every , .
We prove the lemma in sec:app list dec. We let be the set of all with .
Creating a UG Constraint Graph
We define the graph , to be the two step walk of the weighted bipartite graph (see def:G2).
We label every vertex with labels from .
For every we define a constraint permutation as follows. Suppose and . We choose according to the distribution . We go over : if there exists an unmatched such that
we set . At the end, for every unmatched we set to an arbitrary unmatched label.
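The greedy matching used to build a constraint can be sketched as below. The sketch is an illustration under assumptions: both lists are taken to have the same length, local candidates are dictionaries from positions to bits, closeness is measured by normalized Hamming distance on the shared positions, and the threshold `radius` and the name `build_constraint` are hypothetical.

```python
def build_constraint(list1, list2, S, radius):
    """Greedily match entries of list1 to unmatched entries of list2 that are radius-close on S.

    Returns pi as a list, where pi[i] is the index in list2 matched to list1[i].
    """
    def dist_on_S(a, b):
        return sum(a[i] != b[i] for i in S) / len(S)

    pi = [None] * len(list1)
    free = list(range(len(list2)))
    for i, sigma in enumerate(list1):
        for j in free:
            if dist_on_S(sigma, list2[j]) <= radius:
                pi[i] = j
                free.remove(j)  # safe: we leave the inner loop immediately
                break
    # Leftover entries are matched arbitrarily so that pi is a permutation.
    for i in range(len(list1)):
        if pi[i] is None:
            pi[i] = free.pop(0)
    return pi

list1 = [{0: 0, 1: 0, 2: 0}, {0: 1, 1: 1, 2: 1}]
list2 = [{1: 1, 2: 1, 3: 0}, {1: 0, 2: 0, 3: 1}]
print(build_constraint(list1, list2, S=[1, 2], radius=0))
```

Because the two lists are internally well separated, at most one entry of the second list can be close to a given entry of the first when the radius is chosen appropriately, which is what makes a greedy pass sufficient in this sketch.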
Finding a large expanding UG constraint subgraph
For every such that let be the induced subgraph of on . By thm:induced-expander we can find such that:
Denote by the induced subgraph of on . Then, .
Solving the Unique Constraints
For every as above, we apply thm:pre UG on and get a list of assignments. For each assignment in the list we define as follows: for every we pick and let . We run the unique decoding algorithm of the code on and output its result.
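Turning a unique games assignment into a global word can be sketched as follows. The sketch makes simplifying assumptions: for each position it uses the first covering set it encounters rather than a randomly chosen one, positions covered by no set are left as `None`, and the name `assemble_word` is hypothetical. The final call to the unique decoder of the base code is not shown.

```python
def assemble_word(n, lists, assignment):
    """lists[T] holds the local candidates (dict position -> bit) at T; assignment[T] indexes into it."""
    word = [None] * n
    for T, label in assignment.items():
        sigma = lists[T][label]
        for i in T:
            if word[i] is None:  # keep the value from the first set found that covers i
                word[i] = sigma[i]
    return word

T1, T2 = frozenset({0, 1}), frozenset({1, 2})
lists = {T1: [{0: 0, 1: 0}, {0: 1, 1: 0}], T2: [{1: 0, 2: 1}]}
print(assemble_word(4, lists, {T1: 1, T2: 0}))
```

The word produced this way may still contain a small fraction of wrong or missing positions, which is why the unique decoder is applied afterwards.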
The algorithm we described is randomized, but it can easily be derandomized since the random choices are local and we can enumerate over them in parallel. For example, in the step of constructing the constraint graph, each constraint is constructed randomly, but these random choices can clearly be enumerated over in parallel for all constraints.
3.3 Proof of correctness
(of thm:main) Let be the given input, such that the interpretation of as has agreement with some codeword of , i.e., there exists a function such that
For each , let be the list from item:app_loc of the decoding algorithm, and the radius. The list and radius satisfy the conclusion of lem:clustering above.
Let be the constraint graph described in item:constraint of the decoding algorithm, with the edge constraints for every . Recall that is a partition of according to the list radius .
A constraint for is correct with respect to if there exist and such that:
In sec:constraint graph we prove:
With high probability () there exists a such that the graph satisfies
By thm:induced-expander, there exists a subset such that , and (indeed the conditions for applying the theorem hold by our choice of parameters,