# Locally testable codes via high-dimensional expanders

Locally testable codes (LTCs) are error-correcting codes that have a local tester which can distinguish valid codewords from words that are "far" from all codewords by probing a given word at only very few (sublinear, typically constant) locations. Such codes form the combinatorial backbone of PCPs. A major open problem is whether there exist LTCs with positive rate and constant relative distance that are testable with a constant number of queries. In this paper, we present a new approach towards constructing such LTCs using the machinery of high-dimensional expanders. To this end, we consider the Tanner representation of a code, which is specified by a graph and a base code. Informally, our result states that if this graph is part of a high-dimensional expander then the local testability of the code follows from the local testability of the base code. This work unifies and generalizes the known results on testability of the Hadamard, Reed-Muller and lifted codes on the Subspace Complex, all of which are proved via local self correction. However, unlike previous results, constant rounds of self correction do not suffice, as the diameter of the underlying test graph can be logarithmically large in a high-dimensional expander and not constant as in all known earlier results. We overcome this technical hurdle by performing iterative self correction with logarithmically many rounds and tightly controlling the error in each iteration using properties of the high-dimensional expander. Given this result, the missing ingredient towards constructing a constant-query LTC with positive rate and constant relative distance is an instantiation of a base code that interacts well with a constant-degree high-dimensional expander.

## 1 Introduction

In this work, we study an approach to constructing locally testable codes (LTCs) based on high-dimensional expansion. LTCs are error-correcting codes that have a local tester which can test if a given word is a valid codeword or far (in Hamming distance) from all codewords, by probing the given word only at a very small (sublinear, typically constant) number of locations. Reed-Muller codes were the first codes shown to be locally testable [FriedlS1995, RubinfeldS1996]. These codes are based on low degree polynomial functions, and have inverse polynomial rate. Later on, LTCs with inverse poly-logarithmic rate were constructed by [BenSassonS2008, Dinur2007]. Obtaining an LTC family with rate that is not vanishing is a major open question in this area. Such codes are known as “good” LTCs or $c^3$-LTCs since they have constant rate, constant relative distance, and are testable with a constant number of queries [Goldreich2010]. This question is interesting in its own right, and also could potentially lead towards constructing linear-length PCPs (as LTCs are the combinatorial backbone of all PCP constructions). The problem of constructing $c^3$-LTCs is particularly difficult as we do not know if such good codes exist, even non-explicitly (say using a probabilistic argument). The difficulty stems from the fact that local testability requires redundancy in the constraints. In known LTCs, the constraints are highly overlapping, a property that in the past went hand in hand with relatively dense families of constraints. Alas, this density seems to significantly limit the rate. In contrast, high-dimensional expanders give sparse families of subsets that are heavily overlapping. Perhaps if we manage to find appropriate constraints on these subsets we may find higher rate LTCs.

In this work, the vague notion of “overlapping constraints” is captured through so-called agreement-expansion (which will be formally defined below).

Informally speaking, we show that if an error-correcting code is defined through a collection of local constraints that sit on an agreement expander, then to prove local testability of the entire code it suffices to prove local testability of the local components (which are of merely constant size in the case of constant-degree agreement expanders). This is similar in spirit to recent applications of high-dimensional expanders towards proving other local-to-global results. This passing from local to global is particularly important because known constructions of high-dimensional expanders are very difficult to analyze on a global level. So far, successful analyses focused on the local structure (in neighborhoods, or so-called links) of these objects. Through this work, the task of constructing global LTCs is reduced to the task of constructing LTCs on the local structure, which appears to be a much more reasonable task.

This work can be viewed as providing a generic scheme for constructing an LTC on a high-dimensional expander (or an agreement expander), and the (big) missing ingredient is an appropriate instantiation. We comment that the flagship example of an LTC, namely Reed-Muller codes, can be viewed as an instantiation of this scheme, with the underlying agreement expander being the Grassmannian complex and the base code being the Reed-Solomon code (see the applications section). The hope is that replacing the “dense” Grassmannian complex by a bounded-degree complex, together with finding an appropriate base code, could potentially lead to a $c^3$-LTC.

#### Tanner Codes

To elucidate the main result, we begin by recalling a well-studied family of codes, the Tanner codes [Gallager1960, Tanner1981]. A Tanner code is given by a family $T$ of (small, often constant-sized) subsets $t \subseteq V$ and, for each subset $t \in T$, a base code $C_t \subseteq \{0,1\}^t$. A string $w \in \{0,1\}^V$ is in the code if $w|_t \in C_t$ for each $t \in T$.¹

¹ A Tanner code is equivalently described on a bipartite graph (called the Tanner graph) with right vertices corresponding to the coordinates of the code and left vertices corresponding to the sets $t \in T$, with an edge between $t$ and $v$ if $v \in t$.

Many known codes, including Reed-Muller codes, lifted codes, tensor codes, and expander codes, are in fact Tanner codes. In all of these cases, there is a single base code such that for all , but this need not be the case.

The Tanner representation of a code also gives a natural candidate for a local test for checking whether a given word is in the code.

Natural Tanner Test: Choose a random and accept iff .

We say that is -locally-testable with the natural tester if

$$\rho \cdot \operatorname{dist}(w, C) \le \mathbb{P}[\text{Test fails}].$$

A family of codes is a locally testable code (LTC) if it satisfies the above inequality for some test (not necessarily the natural Tanner test) with a constant (that does not decrease with the block length of the code).
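The natural Tanner test described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the representation of a base code as a set of tuples, and the toy cycle instance are assumptions made here, not notation from the paper, and the constraint is drawn uniformly at random.

```python
import random

def restrict(word, subset):
    """Restriction w|_t, read in increasing coordinate order."""
    return tuple(word[i] for i in sorted(subset))

def in_tanner_code(word, constraints):
    """True iff every local view w|_t lies in its base code C_t."""
    return all(restrict(word, t) in base for t, base in constraints)

def natural_test(word, constraints, rng=random):
    """One run of the natural test: pick a uniformly random t, accept iff w|_t is in C_t."""
    t, base = rng.choice(constraints)
    return restrict(word, t) in base

def failure_probability(word, constraints):
    """Exact rejection probability of the natural test under the uniform distribution."""
    violated = sum(restrict(word, t) not in base for t, base in constraints)
    return violated / len(constraints)

# Toy instance (hypothetical): 4 coordinates on a cycle, each edge must carry equal bits.
REPETITION = {(0, 0), (1, 1)}
constraints = [({0, 1}, REPETITION), ({1, 2}, REPETITION),
               ({2, 3}, REPETITION), ({0, 3}, REPETITION)]

print(in_tanner_code((1, 1, 1, 1), constraints))       # True
print(failure_probability((0, 0, 0, 1), constraints))  # 0.5
```

In this toy instance the code consists of the all-zeros and all-ones words, and a single flipped bit violates the two constraints that touch it.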

Many Tanner codes, including expander codes and random LDPC codes, that are very good in terms of rate and distance, and can be characterized by “low density” constraints (that look at only a constant number of bits in the codeword) fail quite miserably at being LTCs [BenSassonHR2005].

Imagine that in addition to we also have a family of subsets of , such that each has constant size, but slightly larger than the size of the ’s. For each such we consider the ‘local’ Tanner code

$$C_s = \left\{ w \in \{0,1\}^s \;\middle|\; w|_t \in C_t,\ \forall t \in T,\ t \subset s \right\}.$$

(Of course, is non-trivial only if there are some contained in .)

In this work, we show that if for each , the code itself is locally testable with the natural Tanner test, then the code too must be locally testable with respect to the natural Tanner test. This holds as long as we assume some nice structure on the families and , namely that they are part of a “multi-layered agreement sampler”, MAS for short, which is described below.

Let us change point of view and look at the codes as a collection of base codes, giving rise to the Tanner code . Our main result is that local testability of the base codes lifts to local testability of the entire code , assuming an expander-like MAS condition on the underlying Tanner graph. This is analogous to the celebrated expander codes [SipserS1996] in which distance of the base codes gets lifted to distance of the entire code, assuming expansion of the underlying Tanner graph. Whereas expansion alone does not suffice for local testability, the MAS structure does.

#### High-dimensional expanders and Agreement Expanders

There are several interesting and non-equivalent definitions for high-dimensional expanders, the two main ones being topological definitions of coboundary or cosystolic expansion [LinialM2006, Gromov2010, DotterrerKW2018], and, more relevant to this work, random walk definitions either locally at the link level [KaufmanM2017, DinurK2017] or globally [DiksteinDFH2018, KaufmanO2017]. Without going into details, high-dimensional expansion has already been shown to imply some surprising local to global theorems. For example the trickling down theorem of [Oppenheim2018] proves global spectral expansion using local spectral expansion in the links (which are the neighborhoods of individual vertices). Another example is the list decoding of [DinurHKNT2019] which deduces global list decoding from list-decoding on the local pieces.

Yet another example, which is crucial for this work, is that high-dimensional expanders give rise to agreement expanders [DinurK2017, DiksteinD2019]. An agreement expander allows one to stitch together many mostly-consistent local functions into a single global function. We elaborate a little more on this notion. Let be a ground set of elements, and let be a collection of subsets of of some fixed size. Let be a graph whose vertices are the subsets in , and each edge is labeled by a subset . Let be a collection of subsets labelling the edges.

An agreement expander is given by and the edge-labelled graph . Suppose that for each we are given a local function . The agreement value of the ensemble is the probability of for a randomly chosen edge (this is notation for an edge between labeled by ) in the graph . Whenever there is a global function such that for all , the agreement value of is clearly . We say that is an -agreement expander if whenever an ensemble has agreement value there exists a global function such that for all but at most of . (See Section 2.5 for the full definition.)

Agreement expanders have been studied and used in the LTC and PCP literature for years (under different names such as direct product tests or sometimes low degree tests). However, prior to the recent connection with high-dimensional expanders, the only known agreement expanders were relatively dense. The existence of sparse such objects seems promising and could potentially lead to LTCs with positive rate. This work shows how agreement expansion can be useful for constructing LTCs.

#### Multilayered Agreement Samplers (MAS)

We now describe the MAS combinatorial structure needed for our LTC scheme. Let be a ground set of elements, and let be three families of subsets of of sizes . The system is said to be a -MAS if the following two conditions are met.

• are part of an -agreement-expander.

• The bipartite containment graph of vs. is a -sampler.

The above definition is stricter than what we actually need; see the more refined formal definition in Definition 3.1. We are now ready to state our main result.

#### Main Result

Let be a -MAS. Suppose that for each we have a local code . Let be the Tanner code defined by for all . Namely,

$$C := \left\{ w \in \{0,1\}^V \;\middle|\; w|_t \in C_t \text{ for every } t \right\}.$$

Similarly, for each , let be the Tanner code defined by , namely,

$$C_s = \left\{ w \in \{0,1\}^s \;\middle|\; w|_t \in C_t \text{ for every } t \subset s \right\},$$

and similarly define for each , .

###### Theorem 1.1.

Let be layers in a -MAS satisfying . Suppose has relative distance for all and suppose that is -locally testable with the natural Tanner tester. Then is locally testable (with the natural Tanner tester).

We state our full main theorem in Theorem 4.1.

#### Overview of proof

Our proof of local testability, like previous proofs of testability, goes via self correction. The main difficulty in our setting is that a single round of self-correction is insufficient to correct the word.

Let be a word that satisfies a -fraction of the constraints in the Tanner graph. We would like to show that there exists a such that . For specific codes, one could use the properties of the code to perform this self-correction (cf. Reed-Muller testing, one could use the properties of polynomials).

However, we cannot resort to such properties since we are working in an abstract setting. Instead, we rely on simple majority decoding: each vertex takes a value that satisfies the majority of the constraints it participates in. The main engine driving our proof is agreement expansion. Our proof strategy is as follows:

Construct a word from the received word via self correction (or otherwise) and show

1. is close to , and

2. is a valid codeword.

Property 1 is easy to show if is constructed via self correction using majority decoding. Property 2 is not very hard in the context of Hadamard testing and Reed-Muller testing: every vertex participates in a constraint with every other vertex (indeed the diameter of the Tanner graph is a constant), hence one round of self-correction results in a valid codeword . However, since our proof is general enough to work even for constant-degree Tanner graphs, whose diameter can be as large as logarithmic, one does not expect a single step of self correction via majority decoding to yield a codeword.

Our proof instead relies on a novel iterative self correction procedure that slowly corrects a given word in logarithmically many iterations. A standard problem that arises when using iterative procedures is that the error grows linearly in the number of iterations, which is prohibitively expensive in our setting. We use the properties of MAS to show that the number of unsatisfied constraints by the self-corrected word reduces by a constant factor in each iteration. This allows us to perform an arbitrary number of rounds in the iterative self-correction procedure till we reach a perfect codeword (actually a logarithmic number of rounds will suffice). This type of argument is new in the context of locally testable codes.
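The remark that logarithmically many rounds suffice can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that the test picks one of $|T|$ constraints uniformly at random and that each round at least halves the failure probability; the symbols $\varepsilon_i$ (failure probability at the start of round $i$) and $r$ (number of rounds) are notation introduced here.

$$\varepsilon_i \le \frac{\varepsilon}{2^i}, \qquad \varepsilon_i \in \left\{0, \tfrac{1}{|T|}, \tfrac{2}{|T|}, \dots\right\} \;\Longrightarrow\; \varepsilon_r = 0 \ \text{ as soon as } \ \frac{\varepsilon}{2^r} < \frac{1}{|T|}, \ \text{ i.e. } \ r > \log_2\left(\varepsilon |T|\right).$$

Under these assumptions at most about $\log_2 |T|$ rounds are needed, which is logarithmic in the number of constraints.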

Given this we can proceed with the proof overview as follows. Since satisfies -fraction of the constraints, an averaging argument shows that a -fraction of the ’s satisfy most of the constraints within them. Hence, by the local testability of the code we get that for most ’s, is close to a local codeword, say . Furthermore, it is not hard to show that these local codewords satisfy that for a typical and such that , we have . In other words, the ’s satisfy the hypothesis of the agreement test. From the agreement expansion of the MAS, there exists a “global” word that explains most of the ’s. Furthermore, it is not hard to show that is close to the original word . We then use the sampler property of the MAS to show that violates significantly fewer constraints than (in particular, violates at most -fraction of constraints).

We iteratively apply the above self-correction procedure to get a sequence of words such that violates at most -fraction of constraints and . Since the fraction of violated constraints cannot decrease indefinitely, we have that eventually for a large enough , and .

#### Relation to previous work

We begin by recalling the history of LTCs and the close connection between PCP and LTC constructions. LTCs were first studied in the context of program checking by Blum, Luby and Rubinfeld [BlumLR1993] and Gemmell et al. [GemmellLRSW1991]. The notion of LTCs is implicit in the work on locally checkable proofs by Babai et al. [BabaiFLS1991] and subsequent works on PCPs. The explicit definition appeared independently in the works of Rubinfeld and Sudan [RubinfeldS1996], Friedl and Sudan [FriedlS1995], Arora’s PhD thesis [Arora1994] and Spielman’s PhD thesis [Spielman1995]. A formal study of LTCs was initiated by Goldreich and Sudan [GoldreichS2006]. Most known constructions of PCPs yield LTCs with similar parameters. In fact, there is a generic transformation to convert a PCP of proximity (which is a PCP with more requirements) into an LTC with comparable parameters [BenSassonGHSV2006, Trevisan2004]. See a survey by Goldreich [Goldreich2010] for the interplay between PCP and LTC constructions. In fact, the current best construction of LTCs (constant-query, constant fractional distance and inverse polylogarithmic rate) is obtained from the PCP constructions of Ben-Sasson and Sudan [BenSassonS2008] and Dinur [Dinur2007]. PCP-based constructions are unlikely to yield LTCs with constant rate since PCP constructions typically involve at least a logarithmic overhead. Nevertheless LTC constructions that aren’t derived from PCPs perhaps have a better chance at achieving the coding-theory gold-standard of positive rate and distance.

Agreement expansion and the multilayered set system structure play a central role in our proof of local testability. Another application of agreement expansion towards local testability was studied in [DinurHKR2019], where it was used to enhance the local testability of a code in the context of the subspaces (Grassmannian) complex. We remark that the use of such multilayered agreement samplers in the context of locally testable codes is actually implicit in many previous constructions of locally testable codes. The Raz-Safra [RazS1997] proof of the local testability of the Reed-Muller codes works with the points-lines-planes structure, a subgraph of the Grassmannian complex which is an excellent agreement expander, as explained in detail in the applications section. The original proof due to Blum, Luby and Rubinfeld [BlumLR1993] (as well as subsequent improvements due to Coppersmith) of the local testability of the Hadamard codes, as well as Kaufman and Sudan’s proof of testability of affine-invariant codes [KaufmanS2008], relies on the three-layered structure comprising the points, the three-point tests and certain nine-point sets, sometimes referred to as "magic squares" [KaufmanS2008].

Our proof makes explicit this use of MAS to construct LTCs and shows that four-layered MAS are sufficient to transform “local” local testability to “global” local testability. In this sense, our proof can be viewed as bringing together these seemingly different proofs of local-testability under a common umbrella.

We already remarked that our construction has a similar paradigm as the Sipser-Spielman construction of expander codes [SipserS1996] which demonstrates that if the base code has good distance then the Tanner code also has good distance provided the graph is an expander. Another construction of the same flavor is the result of Dinur et al. [DinurHKNT2019] that demonstrates that if the local code is efficiently list-decodable then so is the global code defined by ABNNR distance amplification property via an expander [AlonBNNR1992], provided the expander is part of a large high-dimensional expander.

#### Further Discussion and Future Work

This work gives a general scheme for constructing an LTC. It needs to be instantiated with an appropriate MAS and base codes. As mentioned earlier, and explained in detail in the applications section, one such instantiation is to choose the Grassmannian complex as the MAS, and the Reed-Solomon code as the base code. This gives the well-studied locally testable codes called Reed-Muller codes, as well as the more recent so-called lifted codes.

The most interesting direction is to instantiate this scheme with an MAS that comes from some bounded-degree high-dimensional expander, and to combine it with appropriate choice of locally testable base code. The main hurdle in choosing the base codes is to be able to certify that the resulting Tanner code maintains positive rate. In some similar situations this is done by a simple counting of the number of constraints. However, such an argument cannot work in the setting of LTCs, and we leave it as an open question.

## 2 Preliminaries

### 2.1 Error Correcting Codes

Let be some finite set. A code is some . Let be a prime power and be an -dimensional vector space over a field with elements. We say that is a linear code when is a subspace of . The rate of the code is .

It is convenient to think about as functions . The distance between two functions , denoted by , is the fraction of so that . The distance of a code is defined to be . When is linear, this is the same as .
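The quantities defined above are easy to compute for small codes. The following Python sketch is illustrative only: the helper names are hypothetical, and the rate is computed as $\log_q |C| / n$, which is an assumption made here (it coincides with dimension over block length for linear codes).

```python
import math
from itertools import combinations

def rel_dist(f, g):
    """Fraction of coordinates on which the two words differ."""
    assert len(f) == len(g)
    return sum(a != b for a, b in zip(f, g)) / len(f)

def code_distance(code):
    """Minimum relative distance over all pairs of distinct codewords."""
    return min(rel_dist(f, g) for f, g in combinations(code, 2))

def rate(code, q):
    """log_q|C| / n; equals dim/n for a linear code over a field with q elements."""
    n = len(next(iter(code)))
    return math.log(len(code), q) / n

# Toy example: the even-weight (single parity check) code of length 3 over {0, 1}.
even_weight = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
print(code_distance(even_weight))  # relative distance 2/3
print(rate(even_weight, 2))        # rate 2/3
```

For this linear code the minimum distance also equals the minimum relative weight of a nonzero codeword, as noted in the text.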

### 2.2 Tanner Codes

A Tanner code [Gallager1960, Tanner1981] over an alphabet (also called a lifted code) is defined through two objects: a family of -element subsets of , and with each subset a base code . The code is given by

$$C = \left\{ w \in \Sigma^n \;\middle|\; w|_t \in C_t,\ t \in T \right\}.$$

The family is often described through a bipartite graph on vertex sets and connecting to whenever . Several well-known families of codes can be constructed as Tanner codes, including tensor codes, Reed-Muller codes, and the codes considered by Sipser and Spielman [SipserS1996]. A family of Tanner codes that is especially related to our context is the family of so-called lifted codes. Lifted codes were first introduced by Ben-Sasson, Maatouk, Shpilka and Sudan [BenSassonMSS2011] and their local testability was studied by Guo, Kopparty and Sudan [GuoKS2013]. These codes can be described as Tanner codes where is identified with points of a vector space and the family contains all possible affine subspaces of a prescribed dimension . The base code is taken to be affine invariant. A prime example for such codes is the Reed-Muller code.

### 2.3 Locally Testable Codes

A -local tester for the code is a probabilistic oracle algorithm that determines whether a word is in the code. It does the following: given oracle access to a function , it queries at input locations. Then if it accepts with probability . If it rejects with probability at least . Here is some constant parameter, and is the distance between and the closest codeword to it in .

For linear codes , [BenSassonHR2005] showed that without loss of generality, we can assume that the local tester picks a random subset according to some distribution, and accepts if and only if (that is, there exists some codeword so that ). Thus we formally define locally testable codes as follows:

###### Definition 2.1 (Locally Testable Codes).

Let be some finite set, and be some linear code on . Let be some distribution on subsets of , and suppose every set of is of size at most . Let . We say is -testable with respect to if

$$\rho \cdot \operatorname{dist}(f, C) \le \mathbb{P}_{t \sim D}\left[ f|_t \notin C|_t \right].$$

An alternate way of describing a locally testable code is using the Tanner graph representation of a code. In this representation, corresponds to the input locations in the codeword. corresponds to the subsets of indexes that are queried by the local tester. We connect and if .

A local tester that corresponds to this representation picks a random constraint and checks if the corresponding constraint is satisfied.

### 2.4 Sampler Graphs

Let be a bipartite graph, and assume that each edge carries a non-negative weight such that . This probability distribution induces a marginal probability distribution on U and similarly on V given by . For every set (and respectively) we denote by . As a slight abuse of notation, for a set and a vertex we denote by

$$\mathbb{P}\left[ B \mid v_0 \right] = \mathbb{P}_{uv \in E}\left[ u \in B \mid v = v_0 \right].$$

A sampler graph is a graph where for all , most of the vertices have that .

###### Definition 2.2 (λ-sampler).

Let be a bipartite graph. For any , we define . For we say is a -sampler if for every and every ,

$$\mathbb{P}[N] \le \frac{\lambda}{\delta^2}\, \mathbb{P}[B].$$

There is a tight connection between expander bipartite graphs and sampler graphs. For more on this, see [Goldreich2011-samp].
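The sampler condition can be checked directly on a small weighted bipartite graph. The Python sketch below rests on an assumption made here: the set $N$ is taken to be the set of vertices $v$ whose conditional probability $\mathbb{P}[B \mid v]$ deviates from $\mathbb{P}[B]$ by more than $\delta$. The function names and the two toy graphs are hypothetical.

```python
from collections import defaultdict

def sampler_stats(edges, B, delta):
    """edges: dict {(u, v): weight} with weights summing to 1; B: subset of left vertices.

    Returns (P[B], P[N], N), where N is ASSUMED to be the set of right vertices v
    with |P[B | v] - P[B]| > delta.
    """
    p_v = defaultdict(float)    # marginal weight of each right vertex
    p_Bv = defaultdict(float)   # weight of edges into v whose left endpoint is in B
    p_B = 0.0
    for (u, v), weight in edges.items():
        p_v[v] += weight
        if u in B:
            p_Bv[v] += weight
            p_B += weight
    N = {v for v in p_v if abs(p_Bv[v] / p_v[v] - p_B) > delta}
    p_N = sum(p_v[v] for v in N)
    return p_B, p_N, N

def satisfies_sampler_bound(edges, B, delta, lam):
    """Check P[N] <= (lam / delta^2) * P[B] for one particular choice of B and delta."""
    p_B, p_N, _ = sampler_stats(edges, B, delta)
    return p_N <= lam / delta**2 * p_B

# Complete bipartite graph: every right vertex sees B with exactly its global density.
complete = {(u, v): 0.25 for u in (0, 1) for v in ("a", "b")}
# Perfect matching: each right vertex sees a single left vertex, so it samples badly.
matching = {(0, "a"): 0.5, (1, "b"): 0.5}

print(satisfies_sampler_bound(complete, {0}, 0.25, 0.0))  # True
print(satisfies_sampler_bound(matching, {0}, 0.25, 0.1))  # False
```

A graph is a sampler only if the bound holds for every set $B$ and every $\delta$; the sketch checks a single pair, which is enough to refute the property but not to certify it.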

### 2.5 Agreement Expanders

Let be a finite universe, a collection of subsets of , and for each subset , a local function . An ensemble is perfectly global if it comes from a single global function , namely, for all . We denote by the collection of all perfectly global ensembles. An agreement tester is given by a non-negatively weighted graph with vertex set , and such that each edge is labelled by some . Without loss of generality we require that the weights sum to , so that the edges form a distribution over pairs . Given a collection of local functions, the tester selects an edge at random, and accepts if for each . We call this the value of under and denote it by ,

$$A(\{f_s\}) := \mathbb{P}_{s, s' \sim A}\left[ f_s(v) = f_{s'}(v),\ \forall v \in k \right].$$

It is clear that a perfectly global ensemble has value . Indeed for any pair and any , assuming that is the global function that agrees with . The graph is an agreement expander if a robust converse holds, namely any ensemble with has to be close to a perfectly global ensemble. Formally,

###### Definition 2.3.

Let be as above. We call an -agreement expander if for every ensemble of local functions

$$\alpha \cdot \operatorname{dist}(\{f_s\}, \mathcal{G}) \le 1 - A(\{f_s\}). \tag{2.1}$$

where the distance is the distance between and the closest perfectly global ensemble; where distance between two ensembles is defined as probability when is chosen from the marginal distribution of .
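The agreement value of an ensemble can be computed by enumerating the labelled edges of the tester. The following Python sketch is a minimal illustration; the data layout (edges as weight, two sets and a label) and the three-set toy tester are assumptions made here.

```python
def agreement_value(edges, ensemble):
    """edges: list of (weight, s1, s2, k) with weights summing to 1.

    ensemble maps each set s to a dict {v: value}, i.e. the local function f_s.
    Returns the probability that f_s1 and f_s2 agree on every v in the label k.
    """
    total = 0.0
    for weight, s1, s2, k in edges:
        if all(ensemble[s1][v] == ensemble[s2][v] for v in k):
            total += weight
    return total

def ensemble_from_global(global_fn, sets):
    """The perfectly global ensemble obtained by restricting one global function."""
    return {s: {v: global_fn[v] for v in s} for s in sets}

# Toy tester (hypothetical): three sets over {0, 1, 2, 3}, edges labelled by intersections.
s1, s2, s3 = frozenset({0, 1, 2}), frozenset({1, 2, 3}), frozenset({0, 2, 3})
edges = [(1 / 3, s1, s2, s1 & s2), (1 / 3, s1, s3, s1 & s3), (1 / 3, s2, s3, s2 & s3)]

global_fn = {0: 0, 1: 1, 2: 0, 3: 1}
good = ensemble_from_global(global_fn, [s1, s2, s3])
bad = ensemble_from_global(global_fn, [s1, s2, s3])
bad[s1][1] = 0  # corrupt one local value so that s1 and s2 disagree on vertex 1

print(agreement_value(edges, good))  # perfectly global ensemble: all edges agree
print(agreement_value(edges, bad))   # only two of the three edges agree
```

An agreement expander guarantees the converse direction: an ensemble with agreement value close to 1 must be close to some perfectly global ensemble.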

#### More refined notions of agreement-expansion

We also say that is an agreement expander with respect to -ensembles if (2.1) holds for every ensemble that is a -ensemble, namely, such that for every edge in the graph , we have either or else the Hamming distance between and is at least .

Furthermore, we allow a slightly weaker notion of distance from being perfectly global. We say that has soundness wrt ensembles if the following holds. Suppose that for every there is a distribution that samples that are subsets of . We say that is -sound wrt when

$$\alpha \cdot \min_{G \in \mathcal{G}} \mathbb{P}_{s \in S,\, k \sim D_s}\left[ f_s|_k \ne G|_k \right] \le 1 - A(\{f_s\}). \tag{2.2}$$

We say that is sound with respect to ensembles if (2.2) holds for all ensembles.

## 3 Multilayer Agreement Samplers

The structure we use to construct locally testable codes has a sampler component and an agreement expander component, that sit together in four layers. We call these structures Multilayer Agreement Samplers.

###### Definition 3.1 (Multilayer Agreement Samplers (MAS)).

Let . Let be a set of elements, and be families of subsets so that there is a non-degenerate Markov chain that samples from respectively, so that . (Spelling out the Markov chain requirement we have a distribution over such that the choice of is conditioned only on , the choice of is conditioned only on , and finally the choice of is conditioned only on .)

We say that are a -MAS if

1. There is an agreement expander with vertex set and edge labels so that:

• The marginal distribution of sampling a labeled edge in , and returning , is the same as the marginal distribution of the Markov chain.

• is -sound for -ensembles.

2. The bipartite containment graph of vs. , is a -sampler. Here the probability of sampling an edge is the probability of sampling together in the Markov chain.

A natural example for an MAS is the Grassmannian complex, that is, a four layer structure where and are affine spaces of of fixed dimensions. We elaborate on this example in the subsection below. The Grassmannian complex is dense, that is, the number of subspaces grows exponentially with the dimension. No known codes on the Grassmannian complex have good rate.

Currently known constant degree MASs arise from high-dimensional expanders which are simplicial complexes. However, we cannot use MASs that are directly simplicial complexes to construct any code with non-trivial rate and distance. It is conceivable that high-dimensional expanders that are not simplicial complexes² may yield good LTCs.

² These can still arise from high-dimensional expanders. For example, consider an MAS whose subsets are links of a high-dimensional expander.

### 3.1 MASs coming from the Grassmannian Complex

The set system for the Grassmannian MAS corresponds to points and affine subspaces of a vector space. Formally, let be some prime power, and be some integers greater than . Our ground set is , and over it we define the following set system where consist of all affine subspaces of dimensions and respectively. The Markov process of this set system is sampling uniformly.

The edge distribution of the test graph , is to sample a subspace , and then two subspaces independently, given that . We call this the -agreement test.

We claim that this set system is an MAS:

###### Lemma 3.2.

There is a universal constant so that the following holds. Let be as above, and assume . Let be any prime power. Let be as above. Then is a -MAS for every .

The constant above does not depend on , nor on .

###### Proof.

The sampling properties of the layers of a Grassmannian complex are folklore:

###### Fact 3.3.

Let be the graph where are subspaces of dimension and are subspaces of dimension , and if with uniform weights. This graph is a -sampler.

Agreement of the -agreement test graph was proven by [DiksteinD2019] (Theorem 6.2).

###### Theorem 3.4 (Agreement for Grassmannian).

There exists a constant such that for every prime power , , and integers such that the following holds. The -Grassmannian agreement test is -sound for -ensembles.

Combining these two statements together we get that there exists some so that for every , defined above are a -MAS. ∎

In the applications section, we use our main theorem, Theorem 4.1, to show that local testability of lifted codes on the Grassmannian complex is implied by the local testability of the base code.

## 4 Main Theorem - Locally Testable Codes on MASs

Given an and a set of base codes , the lifted code to is

$$C = \left\{ w : V \to \Sigma \;\middle|\; w|_t \in C_t,\ \forall t \in T \right\}.$$

Similarly, for every or , the local lifts to or are

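$$C_s = \left\{ w : s \to \Sigma \;\middle|\; w|_t \in C_t,\ \forall t \subseteq s \right\}, \qquad C_k = \left\{ w : k \to \Sigma \;\middle|\; w|_t \in C_t,\ \forall t \subseteq k \right\}.$$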

The next theorem is a reformulation of Theorem 1.1.

###### Theorem 4.1 (Main).

Let be a finite set and so that . Let be a -MAS. Let be a set of base codes, and let be the lifted code. Suppose that

1. Local Distance: has distance for every .

2. Local local testability: For every , the code is -locally testable with respect to sampling given that .

Then is -locally testable with respect to the distribution of choosing .

We encourage the readers to think of as some fixed constants of the set system. Then the theorem states that if have large relative distance , and are -locally testable for a large enough , then the lifted code is -locally testable.

### 4.1 Proof of the Main Theorem

###### Proof of Theorem 4.1.

Let be some word so that

$$\operatorname{Fail}(w_0) \stackrel{\text{def}}{=} \mathbb{P}_{t \in T}\left[ w_0|_t \notin C_t \right] = \varepsilon.$$

We need to find a word so that . We will find a word so that , and

$$\mathrm{Fail}(w_1) = \mathbb{P}_{t \in T}\bigl[w_1|_t \notin C_t\bigr] \leqslant \frac{1}{2}\varepsilon.$$

As a first step we define a function ensemble so that is the closest code word to (ties broken arbitrarily). For each both and , and since is a code with relative distance , we get that is a -ensemble.

We claim that the ensemble passes the agreement test with high probability.

###### Claim 4.2.
$$\mathbb{P}_{\{s_1, s_2\}_k \sim A}\bigl[f_{s_1}|_k = f_{s_2}|_k\bigr] = 1 - \frac{4\varepsilon}{\rho\delta}.$$

As there is an agreement expander that is -sound with respect to -ensembles, there exists some function so that

$$\mathbb{P}_{k \subset s}\bigl[w_1|_k = f_s|_k\bigr] = 1 - \frac{4\varepsilon}{\rho\delta\alpha}. \tag{4.1}$$

We claim that is close to , and that fails the test with probability .

###### Claim 4.4.

Modulo claim:new-is-close-to-old and claim:new-function-rarely-fails, we repeat the correction process times. In the beginning of the -th iteration we start with that fails the test with probability . In the end of the iteration, we find that fails the test with probability , and so that . Thus we obtain a sequence of functions that ends with that always passes the test. The distance we accumulate from is

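$$\mathrm{dist}(w_0, w_r) \leqslant \sum_{i=0}^{r-1} \mathrm{dist}(w_i, w_{i+1}) \leqslant \frac{8\varepsilon}{\rho\delta\alpha} \sum_{i=0}^{\infty} \frac{1}{2^i} = \frac{16\varepsilon}{\rho\delta\alpha}.$$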
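The arithmetic behind this accumulation can be checked with exact rationals. The sketch below assumes, as in the argument above, that the failure probability at least halves in each round and that one round moves the word by at most 8ε/(ρδα) relative to the current failure probability ε. The numeric parameter values are hypothetical and serve only to exercise the formula.

```python
from fractions import Fraction

def accumulated_distance_bound(eps, rho, delta, alpha, rounds):
    """Sum over rounds of the per-round bound 8*eps_i/(rho*delta*alpha), eps_i = eps/2**i."""
    per_unit = Fraction(8) / (rho * delta * alpha)
    return sum((per_unit * eps / 2**i for i in range(rounds)), Fraction(0))

# Hypothetical parameters, chosen only for illustration.
eps = Fraction(1, 10000)
rho, delta, alpha = Fraction(1, 2), Fraction(1, 4), Fraction(1, 10)

limit = 16 * eps / (rho * delta * alpha)   # closed form of the infinite geometric sum
for r in (1, 5, 20):
    partial = accumulated_distance_bound(eps, rho, delta, alpha, r)
    print(r, partial, partial < limit)
```

Every finite number of rounds stays strictly below the closed-form limit, which is why the number of rounds does not enter the final distance bound.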

###### Proof of claim:ensemble-has-high-agreement.

By the local testability of the base code ,

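$$\mathbb{E}_s\bigl[\mathrm{dist}(w_0|_s, C_s)\bigr] \leqslant \rho^{-1}\, \mathbb{E}_s\Bigl[\mathbb{P}_{t \subset s}\bigl[w_0|_t \notin C_t\bigr]\Bigr] = \rho^{-1}\, \mathbb{P}_t\bigl[w_0|_t \notin C_t\bigr] \leqslant \frac{\varepsilon}{\rho}. \tag{4.2}$$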

As is closest code word to ,

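$$\frac{\varepsilon}{\rho} \geqslant \mathbb{E}_s\bigl[\mathrm{dist}(w_0|_s, f_s)\bigr] = \mathbb{E}_s\Bigl[\mathbb{E}_{k \subset s}\bigl[\mathrm{dist}(w_0|_k, f_s|_k)\bigr]\Bigr].$$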

By Markov’s inequality, with probability of sampling , it holds that where is the closest codeword in to .

By the local distance assumption, has distance , and if , then

$$f_{s_1}|_k = f_{s_2}|_k.$$
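One way to read the Markov and union-bound steps above is sketched below. The threshold δ/2 and the assumption that each of the two sets in the agreement test is marginally distributed as a random s containing a random k are our reading of the argument; the corresponding symbols are not legible in this copy.

```latex
\begin{align*}
\mathbb{P}_{k \subset s}\Bigl[\operatorname{dist}(w_0|_k, f_s|_k) \geqslant \tfrac{\delta}{2}\Bigr]
  &\leqslant \frac{\varepsilon/\rho}{\delta/2} = \frac{2\varepsilon}{\rho\delta}
  && \text{(Markov)} \\
\mathbb{P}_{\{s_1,s_2\}_k \sim A}\Bigl[\exists\, i \in \{1,2\} :
  \operatorname{dist}(w_0|_k, f_{s_i}|_k) \geqslant \tfrac{\delta}{2}\Bigr]
  &\leqslant \frac{4\varepsilon}{\rho\delta}
  && \text{(union bound)} \\
\operatorname{dist}(f_{s_1}|_k, f_{s_2}|_k)
  &\leqslant \operatorname{dist}(f_{s_1}|_k, w_0|_k) + \operatorname{dist}(w_0|_k, f_{s_2}|_k) < \delta
  && \text{(otherwise)}
\end{align*}
```

In the last case both restrictions are codewords of the local lift to k, which has distance δ, so they must coincide.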

###### Proof of claim:new-is-close-to-old.

We note that

$$\mathrm{dist}(w_0, w_1) = \mathbb{E}_s\bigl[\mathrm{dist}(w_0|_s, w_1|_s)\bigr].$$

We show closeness by the triangle inequality. Fix , then

$$\mathrm{dist}(w_0|_s, w_1|_s) \leqslant \mathrm{dist}(w_0|_s, f_s) + \mathrm{dist}(f_s, w_1|_s).$$

By (4.2),

$$\mathbb{E}_s\bigl[\mathrm{dist}(w_0|_s, f_s)\bigr] \leqslant \frac{\varepsilon}{\rho}.$$

By the -soundness of the agreement expander ,

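$$\mathrm{dist}(w_1|_s, f_s) = \mathbb{E}_{k \subset s}\bigl[\mathrm{dist}(w_1|_k, f_s|_k)\bigr] \leqslant \mathbb{P}_{k \subset s}\bigl[w_1|_k \neq f_s|_k\bigr] = \frac{4\varepsilon}{\rho\delta\alpha}.$$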

By the triangle inequality, and using the fact that both

$$\mathrm{dist}(w_0, w_1) \leqslant \frac{8\varepsilon}{\rho\delta\alpha}.$$
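For completeness, the combination of the two bounds can be written out in one chain. The sketch below assumes that δ and α are both at most 1 (our reading of the truncated sentence above), which is what allows the first term to be absorbed into the second.

```latex
\begin{align*}
\operatorname{dist}(w_0, w_1)
  &= \mathbb{E}_s\bigl[\operatorname{dist}(w_0|_s, w_1|_s)\bigr] \\
  &\leqslant \mathbb{E}_s\bigl[\operatorname{dist}(w_0|_s, f_s)\bigr]
     + \mathbb{E}_s\bigl[\operatorname{dist}(f_s, w_1|_s)\bigr] \\
  &\leqslant \frac{\varepsilon}{\rho} + \frac{4\varepsilon}{\rho\delta\alpha}
   \leqslant \frac{5\varepsilon}{\rho\delta\alpha}
   \leqslant \frac{8\varepsilon}{\rho\delta\alpha}.
\end{align*}
```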

The proof of claim:new-function-rarely-fails relies on the -sampling property of the MAS.

###### Proof of claim:new-function-rarely-fails.

By assumption the containment graph between and has the -sampling property. Let . We observe the following:

1. By the agreement property, , and without loss of generality (if we want to show that the code is -locally testable, it is enough to consider so that ).

2. If contributes to the failure (i.e., ), then for all and . Thus all its neighbours are in .

Denote by the set of so that all of ’s neighbours are in . By item above we have that . We note that if we sample a neighbour of , we get some with probability . Thus by the -sampling property, we have that

$$\mathbb{P}_{t \in T}[N] \leqslant 4\lambda \cdot \frac{8\varepsilon}{\rho\delta\alpha}.$$

We chose , hence .

###### Remark 4.5.

The MAS has four layers. The vertex layer and the layer are required to define the lifted code itself. It is also natural to introduce a higher layer , since without any other requirements we can’t expect any lifted code to be locally testable.

However, the intermediate layer is possibly unneeded. While it is a crucial part of the proof, it is not needed for lifting the code, nor for the local tests. We believe it is interesting to understand whether it is enough to study a three-layered set system, namely . Are there similar properties, in terms of agreement, sampling and expansion, that also give us a similar result?

## 5 Local Testability in Vector Spaces

In this section we demonstrate how the main theorem fits in with, and generalizes, the known results on testability of Reed-Muller codes. In this case the MAS is the Grassmannian complex MAS described in lem:Grassmann-is-MAS for and being the collections of all affine subspaces of dimension respectively.

We define the code on by lifting base codes . Namely

$$C = \{\, w : \mathbb{F}_p^n \to \mathbb{F}_p \;\mid\; w|_t \in C_t,\ \forall t \in T \,\}.$$

One example of such a code is the -Reed-Muller code on . This code consists of all polynomials of degree . When we call this the Reed-Solomon code. Take to be the set of all affine lines (i.e., ), and let be the -Reed-Solomon code on every line. Lifting this code to results in all functions so that for every line , is a function of degree at most . For some parameters this results in the -Reed-Muller code. Surprisingly, [GuoKS2013] showed that for some other parameters the code lifted from the -Reed-Solomon code contains more than the -Reed-Muller code. Nevertheless, these codes are locally testable as well [GuoHS2015, HaramatyRS2015].
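The line condition in this example can be tested by brute force on a small space. The sketch below is a toy illustration, not the tester analysed in the paper: the parameters p = 5, n = 2, d = 2 and the function names are hypothetical, and the degree check uses repeated finite differences, which is valid only for d ≤ p − 2.

```python
from itertools import product
from fractions import Fraction

P, N, D = 5, 2, 2   # hypothetical toy parameters: field size, dimension, degree bound

def degree_at_most(values, d, p):
    """values[s] = f(s) for s in F_p; True iff f has degree <= d (requires d <= p - 2)."""
    diff = list(values)
    for _ in range(d + 1):
        # One finite-difference step lowers the degree of a non-constant polynomial by one.
        diff = [(diff[(s + 1) % p] - diff[s]) % p for s in range(p)]
    return not any(diff)

def fail(w, p, n, d):
    """Fraction of affine lines on which w restricts to a function of degree > d.

    Iterates over all parametrisations s -> x + s*y with y != 0; every line has the
    same number p*(p-1) of parametrisations, so this equals the uniform measure on lines.
    """
    points = list(product(range(p), repeat=n))
    bad = total = 0
    for x in points:
        for y in points:
            if not any(y):
                continue
            vals = [w[tuple((xi + s * yi) % p for xi, yi in zip(x, y))] for s in range(p)]
            total += 1
            bad += not degree_at_most(vals, d, p)
    return Fraction(bad, total)

# A polynomial of total degree 2, and a copy corrupted at the single point (0, 0).
w = {pt: (pt[0] ** 2 + pt[0] * pt[1]) % P for pt in product(range(P), repeat=N)}
corrupted = dict(w)
corrupted[(0, 0)] = (w[(0, 0)] + 1) % P

print(fail(w, P, N, D))          # 0
print(fail(corrupted, P, N, D))  # 1/5: exactly the lines through (0, 0) fail
```

Changing one value adds a point indicator of degree p − 1 to every line through that point, so those lines fail and all others pass. The failure probability is then 1/p^(n−1), while the distance from the original word is 1/p^n.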

Our main theorem states that to prove local testability of it is enough to prove that the local lift to each subspace is locally testable.

This gives rise to testability results for Reed-Muller codes (which are well studied, see [RubinfeldS1996, RazS1997, AroraS2003]) as well as for lifted codes as studied in [GuoKS2013] (given, of course, that we check their local testability in some small fixed space). Moreover, this statement continues to hold for more general sets of base codes : if the lifts of to dimension subspaces are locally testable (with good enough parameters), then the lifted code to dimension is also locally testable. This is particularly useful in the regime where are fixed, and tends to infinity. This includes the examples above, but is a more general statement.

###### Theorem 5.1.

There is a universal constant so that the following holds. Let be as above, and assume . Let be any prime power. Let be as above. Let be a set of base codes, and suppose that there exists some and so that:

1. For any dimensional space , has distance . (Footnote 3: [GuoKS2013] showed this holds, for example, whenever the base codes themselves have distance .)

2. For every dimensional space , is -locally testable.

Then for any , the lift of to is -locally testable.

The constant doesn’t depend on any of the other parameters, nor on the field size.

We encourage the readers to think of . Then for every fixed dimensions and field size there is some , so that for every lifted code that is -locally testable on spaces of dimension , the code is also -locally testable on all spaces of dimension (for a large enough ). Note that this theorem applies both to the regime where the field size is small (e.g. ), and where the field size goes to infinity. When grows, the conditions of the theorem become easier to satisfy; that is, the lower bound on becomes smaller.

###### Proof of thm:tanner-grassmann.

Let be the constant stated in lem:Grassmann-is-MAS. The system defined above is a -MAS, by lem:Grassmann-is-MAS, for that .

Denote by the lift of to . This code satisfies the distance and local local testability properties:

1. The lift of to an dimensional space has distance .

2. The lift of to a -dimensional space is -locally testable.

Hence by thm:LLTC-imples-GLTC, this code is -locally testable. ∎