Sublinear-Time Computation in the Presence of Online Erasures

We initiate the study of sublinear-time algorithms that access their input via an online adversarial erasure oracle. After answering each query to the input object, such an oracle can erase t input values. Our goal is to understand the complexity of basic computational tasks in extremely adversarial situations, where the algorithm's access to data is blocked during the execution of the algorithm in response to its actions. Specifically, we focus on property testing in the model with online erasures. We show that two fundamental properties of functions, linearity and quadraticity, can be tested for constant t with asymptotically the same complexity as in the standard property testing model. For linearity testing, we prove tight bounds in terms of t, showing that the query complexity is Θ(log t). In contrast to linearity and quadraticity, some other properties, including sortedness and the Lipschitz property of sequences, cannot be tested at all, even for t=1. Our investigation leads to a deeper understanding of the structure of violations of linearity and other widely studied properties.

1 Introduction

We initiate the study of sublinear-time algorithms that compute in the presence of an online adversary that blocks access to some data points in response to the algorithm’s queries. A motivating scenario is when a user wishes to remove their data from a dataset due to privacy concerns, as enabled by right to be forgotten laws such as the EU General Data Protection Regulation [Mantelero13]. The online aspect of our model suitably captures the case of individuals who are prompted to restrict access to their data after noticing an inquiry into their or others’ data. We choose to model such user actions as adversarial in order to perform worst-case analysis. Two other motivating scenarios are naturally adversarial. In one, an algorithm is trying to detect some fraud (e.g., tax fraud) and the adversary wants to obstruct access to data in order to make it hard to uncover any evidence. In the other scenario, an algorithm’s goal is to determine an optimal course of action (e.g., whether to invest in a stock or to buy an item), whereas the adversary leads the algorithm astray by adaptively blocking access to pertinent information.

In our model, after answering each query to the input object, the adversary can hide a small number of input values. Our goal is to understand the complexity of basic computational tasks in extremely adversarial situations, where the algorithm’s access to data is blocked during the execution of the algorithm in response to its actions. Specifically, we represent the input object as a function $f$ on an arbitrary finite domain (input objects such as strings, sequences, images, matrices, and graphs can all be represented as functions), which the algorithm can access by querying a point $x$ from the domain and receiving the answer $\mathcal{O}(x)$ from an oracle. At the beginning of computation, $\mathcal{O}(x) = f(x)$ for all points $x$ in the domain of the function.
We parameterize our model by a natural number $t$ that controls the number of function values the adversary can erase after the oracle answers each query. Mathematically, we represent the oracle and the adversary as one entity. However, it might be helpful to think of the oracle as the data holder and of the adversary as the obstructionist. A $t$-online-erasure oracle can replace values on up to $t$ points with a special symbol $\perp$, thus erasing them. The new values will be used by the oracle to answer future queries to the corresponding points. The locations of erasures are unknown to the algorithm. The actions of the oracle can depend on the input, the queries made so far, and even on the publicly known code that the algorithm is running, but not on future coin tosses of the algorithm.

We focus on investigating property testing in the presence of online erasures. In the property testing model, introduced by [RubinfeldS96, GGR98] with the goal of formally studying sublinear-time algorithms, a property is represented by a set $\mathcal{P}$ (of functions satisfying the desired property). A function $f$ is $\varepsilon$-far from $\mathcal{P}$ if $f$ differs from each function $g \in \mathcal{P}$ on at least an $\varepsilon$ fraction of domain points. The goal is to distinguish, with constant probability, functions $f \in \mathcal{P}$ from functions that are $\varepsilon$-far from $\mathcal{P}$. We call an algorithm a $t$-online-erasure-resilient $\varepsilon$-tester for property $\mathcal{P}$ if, given parameters $t$ and $\varepsilon$ and access to an input function $f$ via a $t$-online-erasure oracle, the algorithm accepts with probability at least 2/3 if $f \in \mathcal{P}$ and rejects with probability at least 2/3 if $f$ is $\varepsilon$-far from $\mathcal{P}$.

We study the query complexity of online-erasure-resilient testing of several fundamental properties. We show that for linearity and quadraticity of functions $f:\{0,1\}^d \to \{0,1\}$, the query complexity of $t$-online-erasure-resilient testing for constant $t$ is asymptotically the same as in the standard model. For linearity, we also prove tight bounds in terms of $t$, showing that the query complexity is $\Theta(\log t)$. A function $f(x)$ is linear if it can be represented as a sum of monomials of the form $x[i]$, where $x = (x[1], \dots, x[d])$ is a vector of $d$ bits; the function is quadratic if it can be represented as a sum of monomials of the form $x[i]$ or $x[i]x[j]$.

To understand the difficulty of testing in the presence of online erasures, consider the case of linearity and $t = 1$. The celebrated tester for linearity in the standard property testing model was proposed by Blum, Luby, and Rubinfeld [BlumLR93]. It looks for witnesses of non-linearity that consist of three points $x$, $y$, and $x \oplus y$ satisfying $f(x) + f(y) \neq f(x \oplus y)$, where addition is mod 2, and $\oplus$ denotes bitwise XOR. Bellare et al. [BellareCHKS96] show that if $f$ is $\varepsilon$-far from linear, then a triple $(x, y, x \oplus y)$ is a witness to non-linearity with probability at least $\varepsilon$ when $x, y$ are chosen uniformly and independently at random. In our model, after $x$ and $y$ are queried, the oracle can erase the value of $x \oplus y$. To overcome this, our tester considers witnesses with more points, namely, of the form $\sum_{x \in T} f(x) \neq f\big(\bigoplus_{x \in T} x\big)$ for sets $T$ of even size.

Witnesses of non-quadraticity are even more complicated. The tester of Alon et al. [AlonKKLR05] looks for witnesses consisting of points $x, y, z$, and all four of their linear combinations. We describe a two-player game that models the interaction between the tester and the adversary and give a winning strategy for the tester-player. We also consider witness structures in which all specified tuples are witnesses of non-quadraticity (to allow for the possibility of the adversary erasing some points from the structure). We analyze the probability of getting a witness structure under uniform sampling when the input function is $\varepsilon$-far from quadratic. Our investigation leads to a deeper understanding of the structure of witnesses for both properties, linearity and quadraticity.

In contrast to linearity and quadraticity, we show that several other properties, specifically, sortedness and the Lipschitz property of sequences, and the Lipschitz property of functions on $\{0,1\}^d$, cannot be tested in the presence of an online-erasure oracle, even with $t = 1$, no matter how many queries the algorithm makes.
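The obstacle for the BLR tester described above can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the class `OnlineErasureOracle`, the adversary callback `erase_xor_of_last_two`, and the encoding of points of the hypercube as Python integers (bitmasks) are hypothetical names and choices made here, not part of the paper. The erasure symbol is modeled as `None`.

```python
import random

ERASED = None  # stands in for the erasure symbol


class OnlineErasureOracle:
    """Answers queries to f; after each answer the adversary erases up to t points."""

    def __init__(self, f, t, adversary):
        self.f = f
        self.t = t
        self.adversary = adversary  # hypothetical callback: query history -> points to erase
        self.erased = set()
        self.history = []

    def query(self, x):
        answer = ERASED if x in self.erased else self.f(x)
        self.history.append(x)
        # The adversary reacts only after the answer is fixed, and is capped at t erasures.
        for point in list(self.adversary(self.history))[: self.t]:
            self.erased.add(point)
        return answer


def erase_xor_of_last_two(history):
    """Adversary for t = 1: erase the XOR of the two most recent queries."""
    if len(history) >= 2:
        return [history[-1] ^ history[-2]]
    return []


def blr_triples_blocked(f, d, trials, seed=0):
    """Run BLR-style triples (x, y, x XOR y) and count how many lose their third point."""
    rng = random.Random(seed)
    oracle = OnlineErasureOracle(f, t=1, adversary=erase_xor_of_last_two)
    blocked = 0
    for _ in range(trials):
        x, y = rng.getrandbits(d), rng.getrandbits(d)
        oracle.query(x)
        oracle.query(y)
        if oracle.query(x ^ y) is ERASED:
            blocked += 1
    return blocked
```

With this adversary, every triple is blocked regardless of the input function, which is why the tester needs witnesses with more points.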
Interestingly, witnesses for these properties have a much simpler structure than witnesses for linearity and quadraticity. Consider the case of sortedness of integer sequences, represented by functions $f:[n] \to \mathbb{Z}$. A sequence is sorted (or the corresponding function is monotone) if $f(x) \le f(y)$ for all $x < y$ in $[n]$. A witness of non-sortedness consists of two points $x < y$, such that $f(x) > f(y)$. In the standard model, sortedness can be $\varepsilon$-tested with an algorithm that queries $O(\sqrt{n/\varepsilon})$ uniform and independent points [FLNRRS02]. (The fastest testers for this property have query complexity $O(\log(\varepsilon n)/\varepsilon)$ [EKKRV00, DGLRRS99, BGJRW12, CS13, Belovs18], but they make correlated queries that follow a more complicated distribution.) Our impossibility result demonstrates that even the simplest testing strategy of querying independent points can be thwarted by an online adversary. To prove this result, we use sequences that are far from being sorted, but where each point is involved in only one witness, allowing the oracle to erase the second point of the witness as soon as the first one is queried. Using a version of Yao’s principle that is suitable for our model, we turn these examples into a general impossibility result for testing sortedness with a 1-online-erasure oracle.

Our impossibility result for testing sortedness uses sequences with many (specifically, ) distinct integers. We show that this is not a coincidence by designing a $t$-online-erasure-resilient sortedness tester that works for sequences that have distinct values. However, the number of distinct values does not have to be large to preclude testing the Lipschitz property in our model. A function $f:[n] \to \mathbb{Z}$, representing an $n$-integer sequence, is Lipschitz if $|f(x) - f(y)| \le |x - y|$ for all $x, y \in [n]$. Similarly, a function $f:\{0,1\}^d \to \mathbb{Z}$ is Lipschitz if $|f(x) - f(y)| \le \|x - y\|_1$ for all $x, y \in \{0,1\}^d$. We show that the Lipschitz property of sequences, as well as $d$-variate functions, cannot be tested even when the range has size 3, even with $t = 1$, no matter how many queries the algorithm makes.
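To make the sortedness obstacle concrete, here is a minimal sketch of one sequence with the stated structure: far from sorted, yet every point lies in exactly one witness. The specific sequence (adjacent pairs swapped), the function names, and the adversary that erases the partner index are illustrative assumptions; the paper's actual hard distribution, obtained via Yao's principle, may differ.

```python
import bisect
import random


def swapped_pairs(n):
    """Sequence 2, 1, 4, 3, ..., n, n-1 (n even): adjacent pairs are swapped."""
    return [i + 2 if i % 2 == 0 else i for i in range(n)]


def witnesses(seq):
    """All index pairs i < j with seq[i] > seq[j]."""
    return [(i, j) for i in range(len(seq))
            for j in range(i + 1, len(seq)) if seq[i] > seq[j]]


def distance_to_sorted(seq):
    """Fraction of entries to change: 1 - (longest non-decreasing subsequence) / n."""
    tails = []
    for value in seq:
        pos = bisect.bisect_right(tails, value)
        if pos == len(tails):
            tails.append(value)
        else:
            tails[pos] = value
    return 1 - len(tails) / len(seq)


def complete_witnesses_seen(seq, num_queries, seed=0):
    """Uniform queries against an adversary that erases the partner of each queried index."""
    rng = random.Random(seed)
    erased, seen = set(), {}
    for _ in range(num_queries):
        i = rng.randrange(len(seq))
        if i not in erased:
            seen[i] = seq[i]
        erased.add(i ^ 1)  # one erasure per query: the other point of the only witness of i
    return [(i, j) for (i, j) in witnesses(seq) if i in seen and j in seen]
```

On this sequence the distance to sortedness is 1/2, but no number of uniform queries ever reveals both points of a witness.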

Comparison to related models.

Our model is closely related to (offline) erasure-resilient testing of Dixit et al. [DixitRTV18]. In the model of Dixit et al., also investigated in [RV18, RRV19, BenFLR20, PallavoorRW21, LPRV21, NV20], the adversary performs all erasures to the function before the execution of the algorithm. An (offline) erasure-resilient tester is given a parameter $\alpha$, an upper bound on the fraction of the values that are erased. The adversary we consider is more powerful in the sense that it can perform erasures online, during the execution of the tester. However, in some parameter regimes, our oracle cannot perform as many erasures. Importantly, all three properties that we show are impossible to test in our model are testable in the model of Dixit et al. with essentially the same query complexity as in the standard model [DixitRTV18]. It is open if there are properties that have lower query complexity in the online model than in the offline model. The models are not directly comparable because the erasures are budgeted differently.

Another widely studied model in property testing is that of tolerant testing [PRR06]. As explained by Dixit et al., every tolerant tester is also (offline) erasure-resilient with corresponding parameters. As pointed out in [PRR06], the BLR tester is a tolerant tester of linearity for $\alpha$ significantly smaller than $\varepsilon$. Tolerant testing of linearity with distributional assumptions was studied in [KoppartyS09] and tolerant testing of low-degree polynomials over large alphabets was studied in [GuruswamiR05]. Tolerant testing of sortedness is closely related to approximating the distance to monotonicity and estimating the longest increasing subsequence. These tasks can be performed with a number of queries that is polylogarithmic in $n$ [PRR06, ACCL07, SaksS17]. As we show, sortedness is impossible to test in the presence of online erasures.

1.1 Our Results

We design -online-erasure-resilient testers for linearity and quadraticity, two properties widely studied because of their connection to probabilistically checkable proofs, hardness of approximating NP-hard problems, and coding theory. Our testers have 1-sided error, that is, they always accept functions with the property. They are also nonadaptive, that is, their queries do not depend on answers to previous queries.

Linearity.

Starting from the pioneering work of [BlumLR93], linearity testing has been investigated, e.g., in [BellareGLR93, BellareS94, FeigeGLSS96, BellareCHKS96, BellareGS98, Trevisan98, SudanT98, SamorodnitskyT00, HastadW03, Ben-SassonSVW03, Samorodnitsky07, SamorodnitskyT09, ShpilkaW06, KaufmanLX10] (see [RasR16] for a survey). Linearity can be -tested in the standard property testing model with queries by the BLR tester. We say that a pair violates linearity if . The BLR tester repeatedly selects a uniformly random pair of domain points and rejects if it violates linearity. A tight lower bound on the probability that a uniformly random pair violates linearity was proven by Bellare et al. [BellareCHKS96] and Kaufman et al. [KaufmanLX10]. We show that linearity can be -tested with queries with a -online-erasure oracle.

Theorem 1.1.

There exist a constant and a 1-sided error, nonadaptive, -online-erasure-resilient -tester for linearity of functions that works for and makes queries.

Our linearity tester has query complexity $O(1/\varepsilon)$ for constant $t$, which is optimal even in the standard property testing model, with no erasures. The tester looks for more general witnesses of non-linearity than the BLR tester does, namely, for tuples $(x_1, \dots, x_k)$ of elements from $\{0,1\}^d$ such that $\sum_{i \in [k]} f(x_i) \neq f\big(\bigoplus_{i \in [k]} x_i\big)$ and $k$ is even. We call such tuples violating. The analysis of our linearity tester crucially depends on the following structural theorem.

Theorem 1.2.

Let $(x_1, \dots, x_k)$ be a tuple of a fixed even size $k$, where each element of the tuple is sampled uniformly and independently at random from $\{0,1\}^d$. If a function $f:\{0,1\}^d \to \{0,1\}$ is $\varepsilon$-far from linear, then
\[ \Pr\Big[\sum_{i \in [k]} f(x_i) \neq f\Big(\bigoplus_{i \in [k]} x_i\Big)\Big] \geq \varepsilon. \]

Our theorem generalizes the result of [BellareCHKS96], which dealt with the case $k = 2$. We remark that Theorem 1.2 does not hold for odd $k$. Consider the function $f(x) = x[1] + 1 \pmod 2$, where $x[1]$ is the first bit of $x$. Function $f$ is $\frac{1}{2}$-far from linear, but has no violating tuples of odd size.

The core procedure of our linearity tester queries uniform points from $\{0,1\}^d$ to build a reserve and then queries sums of the form $\bigoplus_{i \in I} x_i$, where $(x_i)_{i \in I}$ is a uniformly random tuple of reserve elements such that $|I|$ is even. The quality of the reserve is the probability that $(x_i)_{i \in I}$ is violating. The likelihood that the procedure catches a violating tuple depends on the quality of the reserve (which is a priori unknown to the tester) and the number of sums queried. Instead of querying the same number of sums in each iteration of this core procedure, one can obtain a better query complexity by guessing different reserve qualities for each iteration and querying the number of sums that is inversely proportional to the reserve quality. We decide on the number of sums to query based on the work investment strategy by Berman, Raskhodnikova, and Yaroslavtsev [BermanRY14], which builds on an idea proposed by Levin and popularized by Goldreich [Goldreich14].

Next, we show that our tester has optimal query complexity in terms of the erasure budget $t$.
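The role of the parity of the tuple size can be checked by brute force on a tiny hypercube. The following sketch is an illustration only; the helper names, the choice of three input bits, and the convention that the first bit of a point is the least significant bit of an integer are assumptions made here.

```python
from itertools import product


def parity(x):
    return bin(x).count("1") % 2


def distance_to_linearity(f, d):
    """Minimum, over all linear functions x -> <a, x> mod 2, of the fraction of disagreements."""
    points = range(2 ** d)
    return min(sum(f(x) != parity(a & x) for x in points) for a in points) / 2 ** d


def violating_fraction(f, d, k):
    """Exact fraction of k-tuples with f(x_1) + ... + f(x_k) != f(x_1 XOR ... XOR x_k) mod 2."""
    violating = 0
    for xs in product(range(2 ** d), repeat=k):
        xor_sum = 0
        for x in xs:
            xor_sum ^= x
        if sum(f(x) for x in xs) % 2 != f(xor_sum):
            violating += 1
    return violating / (2 ** d) ** k


def and_of_two_bits(x):
    """A quadratic, non-linear example: the product of the first two bits."""
    return (x & 1) & ((x >> 1) & 1)


def flipped_first_bit(x):
    """The odd-size counterexample: first bit plus 1 (mod 2)."""
    return (x & 1) ^ 1
```

For `flipped_first_bit`, tuples of size 1 and 3 are never violating while every pair is; for `and_of_two_bits`, the fraction of violating tuples of size 2 and 4 is at least the distance to linearity, as the structural theorem predicts for even sizes.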

Theorem 1.3.

For all , every -online-erasure-resilient -tester for linearity of functions must make more than queries.

The main idea in the proof of Theorem 1.3 is that when a tester makes queries, the adversary has the budget to erase all linear combinations of the previous queries after every step. As a result, the tester cannot distinguish a random linear function from a random function.

Quadraticity.

Quadraticity and, more generally, low-degree testing have been studied, e.g., in [BabaiFL91, BabaiFLS91, GemmellLRSW91, FeigeGLSS96, FriedlS95, RubinfeldS96, RazS97, AlonKKLR05, AroraS03, MoshkovitzR08, Moshkovitz17, KaufmanR06, Samorodnitsky07, SamorodnitskyT09, JutlaPRZ09, BKSSZ10, HaramatySS13, Ron-ZewiS13, DinurG13]. Low-degree testing is closely related to local testing of Reed-Muller codes. The Reed-Muller code consists of codewords, each of which corresponds to all evaluations of a polynomial of degree at most . A local tester for a code queries a few locations of a codeword; it accepts if the codeword is in the code; otherwise, it rejects with probability proportional to the distance of the codeword from the code. In the standard property testing model, quadraticity can be -tested with queries by the tester of Alon et al. [AlonKKLR05] that repeatedly selects and queries on all of their linear combinations—the points themselves, the double sums , and the triple sum . The tester rejects if the values of the function on all seven queried points sum to 1, since this cannot happen for a quadratic function. A tight lower bound on the probability that the resulting 7-tuple is a witness of non-quadraticity was proved by Alon et al. [AlonKKLR05] and Bhattacharyya et al. [BKSSZ10]. We prove that quadraticity can be -tested with queries with a -online-erasure-oracle for constant . Our tester can be easily modified to give a local tester for the Reed-Muller code that works with a -online-erasure oracle.
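The seven-point check of the standard quadraticity tester can be written out directly. This is a small sketch of the rejection condition only, not of the online-erasure-resilient tester; the function names and the two example functions on three bits are assumptions for illustration, and points are encoded as integers.

```python
from itertools import product


def is_witness(f, x, y, z):
    """True if f on the seven nonzero combinations of x, y, z sums to 1 (mod 2)."""
    points = [x, y, z, x ^ y, x ^ z, y ^ z, x ^ y ^ z]
    return sum(f(p) for p in points) % 2 == 1


def count_witnesses(f, d):
    """Number of triples (x, y, z) that witness non-quadraticity of f."""
    return sum(is_witness(f, x, y, z) for x, y, z in product(range(2 ** d), repeat=3))


def bit(x, i):
    return (x >> i) & 1


def quadratic_example(x):
    """Sum of a degree-2 monomial and a degree-1 monomial (no constant term)."""
    return (bit(x, 0) & bit(x, 1)) ^ bit(x, 2)


def cubic_example(x):
    """A single degree-3 monomial, which is not quadratic."""
    return bit(x, 0) & bit(x, 1) & bit(x, 2)
```

The quadratic example has no witnesses at all, consistent with 1-sided error, while the cubic example is caught, for instance, by the three unit vectors.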

Theorem 1.4.

There exists a 1-sided error, nonadaptive, -online-erasure-resilient -tester for quadraticity of functions that makes queries for constant .

The main ideas behind our quadraticity tester are explained in Section 1.2.

Sortedness.

Sortedness testing (see [Enc1] for a survey) was introduced by Ergun et al. [EKKRV00]. Its query complexity has been pinned down to by [EKKRV00, Fis04, ChSe14, Belovs18]. We show that online-erasure-resilient testing of integer-valued sequences is, in general, impossible.

Theorem 1.5.

For all , no algorithm can -test sortedness of integer sequences when accessed via the -online-erasure oracle.

In the case without erasures, sortedness can be tested with uniform and independent queries [FLNRRS02]. Theorem 1.5 implies that a uniform tester for a property does not translate into the existence of an online-erasure-resilient tester, counter to the intuition that testers that make only uniform and independent queries should be less prone to adversarial attacks. Our lower bound construction demonstrates that the structure of violations to a property plays an important role in determining whether the property is testable. The hard sequences from the proof of Theorem 1.5 have distinct values. Pallavoor et al. [PRV18, Ramesh] considered the setting when the tester is given an additional parameter , the number of distinct elements in the sequence, and obtained an -query tester. Two lower bounds apply to this setting: for nonadaptive testers [BlaRY14] and for all testers for the case when [Belovs18]. Pallavoor et al. also showed that sortedness can be tested with uniform and independent queries. We extend the result of Pallavoor et al. to the setting with online erasures for the case when is small.

Theorem 1.6.

Let be a constant. There exists a 1-sided error, nonadaptive, -online-erasure-resilient -tester for sortedness of -element sequences with at most distinct values. The tester makes uniform and independent queries and works when .

Thus, sortedness is not testable with online erasures when is large and is testable in the setting when  is small. For example, for Boolean sequences, it is testable with queries.

The Lipschitz property.

Lipschitz testing, introduced by [JhaR13], was subsequently studied in [CS13, DixitJRT13, BermanRY14, AwasthiJMR16, ChakrabartyDJS17]. Lipschitz testing of functions can be performed with queries [JhaR13]. For functions , it can be done with queries [JhaR13, CS13]. We show that the Lipschitz property is impossible to test in the online-erasures model even when the range of the function has only 3 distinct values. This applies to both domains, and

Theorem 1.7.

For all , there is no 1-online-erasure-resilient -tester for the Lipschitz property of functions . The same statement holds when the domain is instead of

Yao’s minimax principle.

All our lower bounds use Yao’s minimax principle. A formulation of Yao’s minimax principle suitable for our online-erasures model is described in Appendix A.

1.2 The Ideas Behind Our Quadraticity Tester

One challenge in generalizing the tester of [AlonKKLR05] to work with an online-erasure oracle is that its queries are correlated. First, we want to ensure that the tester can obtain function values on tuples of the form $(x, y, z, x \oplus y, x \oplus z, y \oplus z, x \oplus y \oplus z)$. Then we want to ensure that, if the original function is far from the property, the tester is likely to catch such a tuple that is also a witness to not satisfying the property. Next, we formulate a two-player game that abstracts the first task. (Footnote: This game has been tested on real children, and they spent hours playing it.) In the game, the tester-player sees what erasures are made by the oracle-player. This assumption is made to abstract out the most basic challenge and is not used in the algorithms’ analyses.

Quadraticity testing as a two-player game.

Player 1 represents the tester and Player 2 represents the adversary. The players take turns drawing points, connecting points with edges, and coloring triangles specified by drawn points, each in their own color. Player 1 wins the game if it draws in blue all the vertices and edges of a triangle and colors the triangle blue. The vertices represent the points $x, y, z$, the edges are the sums $x \oplus y$, $x \oplus z$, $y \oplus z$, and the triangle is the sum $x \oplus y \oplus z$. A move of Player 1 consists of drawing a point or an edge between two existing non-adjacent points or coloring an uncolored triangle between three existing points (in blue). A move of Player 2 consists of at most $t$ steps; in each step, it can draw a red edge between existing points or color a triangle between three existing points (in red).

Figure 1: Stages in the quadraticity game for , played according to the winning strategy for Player 1: connecting the -decoys from the first tree to -decoys (frames 1-4); drawing and connecting it to -decoys and an -decoy (frames 5-6), and creation of a blue triangle (frames 7–8). Frame 5 contains edges from to two structures, each replicating frame 4. We depict only points and edges relevant for subsequent frames.

Our online-erasure-resilient quadraticity tester is based on a winning strategy for Player 1 with moves. At a high level, Player 1 first draws many decoys for . The -decoys are organized in full -ary trees of depth . The root for tree is , its children are , where , etc. We sketch the rest of the winning strategy for and depict it in Fig. 1. In this case, Player 1 does the following for each of two trees: it draws points ; connects to half of -decoys (w.l.o.g., ); draws point connects it to two of the -decoys adjacent to (w.l.o.g., and ); draws point connects it to two of (w.l.o.g., and ); draws and connects it to one of the roots (w.l.o.g., ), connects to one of and (w.l.o.g., ), connects to one of and (w.l.o.g., ), and finally colors one of the triangles and , thus winning the game. The decoys are arranged to guarantee that Player 1 always has at least one available move in each step of the strategy. For general $t$, the winning strategy is described in full detail in Algorithm 2. Recall that the -decoys are organized in full -ary trees of depth . For every root-to-leaf path in every tree, Player 1 draws edges from all the nodes in that path to a separate set of decoys for . After is drawn, the tester “walks” along a root-to-leaf path in one of the trees, drawing edges between and the -decoys on the path. The goal of this walk is to avoid the parts of the tree spoiled by Player 2. Finally, Player 1 connects to an -decoy that is adjacent to all vertices in the path, and then colors a triangle involving this -decoy, a -decoy from the chosen path, and . The structure of decoys guarantees that Player 1 always has options for its next move, only $t$ of which can be spoiled by Player 2.

From the game to a tester.

There are two important aspects of designing a tester that are abstracted away in the game: First, the tester does not actually know which values are erased until it queries them. Second, the tester needs to catch a witness demonstrating a violation of the property, not merely a tuple of the right form with no erasures. Here, we briefly describe how we overcome these challenges. Our quadraticity tester is based directly on the game. It converts the moves of the winning strategy of Player 1 into a corresponding procedure, making a uniformly random guess at each step about the choices that remain nonerased. There are three core technical lemmas used in the analysis of the algorithm. Lemma 4.4 lower bounds the probability that the tester makes correct guesses at each step about which edges (double sums) and triangles (triple sums) remain nonerased, thus addressing part of the first challenge. This probability depends only on the erasure budget . To address the second challenge, Lemma 4.3 gives a lower bound on the probability that uniformly random sampled points (the - and - decoys together with ) form a large violation structure, where all triangles that Player 1 might eventually complete violate quadraticity. Building on a result of Alon et al., we show that even though the number of triangles involved in the violation structure is large, namely , the probability of sampling such a structure is , where depends only on . Finally, Lemma 4.2 shows that despite the online adversarial erasures, the tester has a probability of of sampling the points of such a large violation structure and obtaining their values from the oracle. The three lemmas combined show that quadraticity can be tested with queries for constant .

1.3 Conclusions and Open Questions

We initiate a study of sublinear-time algorithms in the presence of online adversarial erasures. We design efficient online-erasure-resilient testers for several important properties (linearity, quadraticity, and, for the case of a small number of distinct values, sortedness). For linearity, we prove tight upper and lower bounds in terms of the erasure budget $t$. We also show that several basic properties, specifically, sortedness of integer sequences and the Lipschitz properties, cannot be tested in our model. We now list several open problems.

  • Sortedness is an example of a property that is impossible to test with online erasures, but is easy to test with offline erasures, as well as tolerantly. Is there a property that has smaller query complexity in the online-erasure-resilient model than in the (offline) erasure-resilient model of [DixitRTV18]?

  • We design a -online-erasure-resilient quadraticity tester that makes queries for constant . What is the query complexity of -online-erasure-resilient quadraticity testing in terms of and ?

  • The query complexity of -testing if a function is a polynomial of degree at most is [AlonKKLR05, BKSSZ10]. Is there a low-degree test for that works in the presence of online erasures?

2 An Online-Erasure-Resilient Linearity Tester

To prove Theorem 1.1, we present and analyze two testers. Our main online-erasure-resilient linearity tester (Algorithm 1) is presented in this section. Its query complexity has optimal dependence on and nearly optimal dependence on . Its performance is summarized in Theorem 2.1. To complete the proof of Theorem 1.1, we give a -query linearity tester in Section B. It has optimal query complexity for constant .

Theorem 2.1.

There exists and a 1-sided error, nonadaptive, -online-erasure-resilient -tester for linearity of functions  that works for and makes queries.

The -online-erasure-resilient tester guaranteed by Theorem 2.1 is presented in Algorithm 1.

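Algorithm 1 An Online-Erasure-Resilient Linearity Tester
1: Input: $\varepsilon$, erasure budget $t$, access to function $f$ via a $t$-online-erasure oracle.
2: Let .
3: for all :
4:     repeat times:
5:         for all :
6:             Sample $x_i$ uniformly at random from $\{0,1\}^d$ and query $f$ at $x_i$.
7:         repeat times:
8:             Sample a uniform nonempty subset $I$ of of even size and query $f$ at $\bigoplus_{i \in I} x_i$.
9:             Reject if $\sum_{i \in I} f(x_i) \neq f\big(\bigoplus_{i \in I} x_i\big)$ and all points $x_i$ for $i \in I$ and $\bigoplus_{i \in I} x_i$ are nonerased.
10: Accept.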
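A single round of the tester (Steps 5-9) can be sketched as follows. This is an illustration under explicit assumptions, not the authors' implementation: the reserve size `q` and the number of sums `num_sums` are free parameters here because their values are not recoverable from the listing above, `query` is any callable that returns 0, 1, or `None` for an erased point, and points are encoded as integers. The even-size subset is sampled by choosing an arbitrary subset of the first `q - 1` indices and adding the last index when needed to fix the parity, which is uniform over even-size subsets.

```python
import random


def sample_even_subset(q, rng):
    """Uniform nonempty even-size subset of range(q); requires q >= 2."""
    if q < 2:
        raise ValueError("need at least two reserve points")
    while True:
        subset = [i for i in range(q - 1) if rng.random() < 0.5]
        if len(subset) % 2 == 1:
            subset.append(q - 1)  # the last index completes the subset to even size
        if subset:
            return subset


def core_round(query, d, q, num_sums, rng):
    """Build a reserve of q uniform points, then query XOR-sums of even-size sub-tuples."""
    reserve = [rng.getrandbits(d) for _ in range(q)]
    values = [query(x) for x in reserve]
    for _ in range(num_sums):
        subset = sample_even_subset(q, rng)
        point = 0
        for i in subset:
            point ^= reserve[i]
        answer = query(point)
        tuple_values = [values[i] for i in subset]
        if answer is None or any(v is None for v in tuple_values):
            continue  # an erased point can never be used as evidence
        if sum(tuple_values) % 2 != answer:
            return "reject"
    return "accept"
```

Because a rejection requires an actual violating tuple with all values present, a linear function is never rejected, whatever the oracle erases.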

2.1 Proof of Theorem 1.2

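In this section, we prove Theorem 1.2, the main structural result needed for Theorem 2.1. Recall that a $k$-tuple $(x_1, \dots, x_k)$ of points from $\{0,1\}^d$ violates linearity if $\sum_{i \in [k]} f(x_i) \neq f\big(\bigoplus_{i \in [k]} x_i\big)$. (Addition is mod 2 when adding values of Boolean functions.) Theorem 1.2 states that if $f$ is $\varepsilon$-far from linear, then for all even $k$, with probability at least $\varepsilon$, $k$ independently and uniformly sampled points form a violating $k$-tuple. Our proof of Theorem 1.2 builds on the proof of [BellareCHKS96, Theorem 1.2], which is a special case of Theorem 1.2 for $k = 2$. The proof is via Fourier analysis. Next, we state some standard facts and definitions related to Fourier analysis. See, e.g., [OD1] for proofs of these facts. Consider the space of all real-valued functions on $\{0,1\}^d$ equipped with the inner-product
\[ \langle g, h \rangle = \mathbb{E}_{x \sim \{0,1\}^d}[g(x) h(x)], \]
where $g, h:\{0,1\}^d \to \mathbb{R}$. The character functions $\chi_S:\{0,1\}^d \to \{-1, 1\}$, defined as $\chi_S(x) = (-1)^{\sum_{i \in S} x[i]}$ for $S \subseteq [d]$, form an orthonormal basis for the space of functions under consideration. Hence, every function $g:\{0,1\}^d \to \mathbb{R}$ can be uniquely expressed as a linear combination of the functions $\chi_S$ where $S \subseteq [d]$. The Fourier coefficients of $g$ are the coefficients on the functions $\chi_S$ in this linear representation of $g$.

Definition 2.2 (Fourier coefficient).

For $g:\{0,1\}^d \to \mathbb{R}$ and $S \subseteq [d]$, the Fourier coefficient of $g$ on $S$ is
\[ \hat{g}(S) = \langle g, \chi_S \rangle = \mathbb{E}_{x \sim \{0,1\}^d}[g(x) \chi_S(x)]. \]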

We will need the following facts about Fourier coefficients.

Theorem 2.3 (Parseval’s Theorem).

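For all $g:\{0,1\}^d \to \mathbb{R}$, it holds that $\langle g, g \rangle = \sum_{S \subseteq [d]} \hat{g}(S)^2$. In particular, if $g:\{0,1\}^d \to \{-1, 1\}$ then $\sum_{S \subseteq [d]} \hat{g}(S)^2 = 1$.

Theorem 2.4 (Plancherel’s Theorem).

For all $g, h:\{0,1\}^d \to \mathbb{R}$, it holds that
\[ \langle g, h \rangle = \sum_{S \subseteq [d]} \hat{g}(S) \hat{h}(S). \]

A function $g:\{0,1\}^d \to \{-1, 1\}$ is linear if $g(x) g(y) = g(x \oplus y)$ for all $x, y \in \{0,1\}^d$.

Lemma 2.5.

The distance of $g:\{0,1\}^d \to \{-1, 1\}$ to linearity is $\frac{1}{2} - \frac{1}{2} \max_{S \subseteq [d]} \hat{g}(S)$.

Finally, we also use the convolution operation, defined below, and one of its key properties.

Definition 2.6 (Convolution).

Let $g, h:\{0,1\}^d \to \mathbb{R}$. Their convolution is the function $g * h:\{0,1\}^d \to \mathbb{R}$ defined by
\[ (g * h)(x) = \mathbb{E}_{y \sim \{0,1\}^d}[g(y) h(x \oplus y)]. \]

Theorem 2.7.

Let $g, h:\{0,1\}^d \to \mathbb{R}$. Then, for all $S \subseteq [d]$ it holds $\widehat{g * h}(S) = \hat{g}(S) \hat{h}(S)$.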

Proof of Theorem 1.2.

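Define $g:\{0,1\}^d \to \{-1, 1\}$ so that $g(x) = (-1)^{f(x)}$. That is, $g$ is obtained from the function $f$ by encoding its output with $\pm 1$ values. Note that the distance to linearity of $g$ is the same as the distance to linearity of $f$. Then the expression $\frac{1}{2} - \frac{1}{2} g(x_1) \cdots g(x_k) \, g(x_1 \oplus \cdots \oplus x_k)$ is an indicator for the event that points $x_1, \dots, x_k$ violate linearity for $f$. Define $g^{*k}$ to be the convolution of $g$ with itself $k$ times, i.e., $g^{*k} = g * \cdots * g$, where the $*$ operator appears $k - 1$ times. We obtain
\begin{align}
\mathbb{E}_{x_1, \dots, x_k}\big[g(x_1) \cdots g(x_k) \, g(x_1 \oplus \cdots \oplus x_k)\big]
&= \mathbb{E}_{x_1, \dots, x_{k-1}}\big[g(x_1) \cdots g(x_{k-1}) \, (g * g)(x_1 \oplus \cdots \oplus x_{k-1})\big] \tag{1} \\
&= \mathbb{E}_{x_1}\big[g(x_1) \, g^{*k}(x_1)\big] \tag{2} \\
&= \sum_{S \subseteq [d]} \hat{g}(S) \, \widehat{g^{*k}}(S) \tag{3} \\
&= \sum_{S \subseteq [d]} \hat{g}(S)^{k+1}, \tag{4}
\end{align}
where (1) holds by the definition of convolution, (2) follows by repeated application of the steps used to obtain (1), equality (3) follows from Plancherel’s Theorem (Theorem 2.4), and (4) follows from Theorem 2.7. Consequently, the probability that $x_1, \dots, x_k$ violate linearity is $\frac{1}{2} - \frac{1}{2} \sum_{S \subseteq [d]} \hat{g}(S)^{k+1}$. Note that $\hat{g}(S) \in [-1, 1]$ for all $S \subseteq [d]$, because $\hat{g}(S)$ is the inner product of two functions with range in $\{-1, 1\}$. In addition, $\hat{g}(T) = \max_{S \subseteq [d]} \hat{g}(S) \geq 0$ for $T$ such that $\chi_T$ is the closest linear function to $g$. Then, for even $k$,
\[ \sum_{S \subseteq [d]} \hat{g}(S)^{k+1} \leq \hat{g}(T)^{k-1} \sum_{S \subseteq [d]} \hat{g}(S)^2 = \hat{g}(T)^{k-1} \leq \hat{g}(T), \]
where the equality follows from Parseval’s Theorem (Theorem 2.3). Hence, the probability that $x_1, \dots, x_k$ violate linearity is at least $\frac{1}{2} - \frac{1}{2} \hat{g}(T)$. By Lemma 2.5, the distance of $g$ to linearity is $\frac{1}{2} - \frac{1}{2} \hat{g}(T)$, which is at least $\varepsilon$, since $g$ is $\varepsilon$-far from linear. This concludes the proof. ∎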
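The identities used in this proof can be verified numerically on a small example. The sketch below is an illustration with assumed names (`chi`, `fourier_coefficients`, `violation_probability`) and an assumed encoding of sets $S \subseteq [d]$ and points as bitmasks; it uses exact rational arithmetic so that the identities can be compared with equality.

```python
from fractions import Fraction
from itertools import product


def chi(S, x):
    """Character chi_S(x) = (-1)^(sum of the bits of x indexed by S); S is a bitmask."""
    return -1 if bin(S & x).count("1") % 2 else 1


def fourier_coefficients(f, d):
    """Fourier coefficients of g = (-1)^f, indexed by the bitmask S."""
    n = 2 ** d
    return [Fraction(sum((-1) ** f(x) * chi(S, x) for x in range(n)), n)
            for S in range(n)]


def violation_probability(f, d, k):
    """Exact probability that k uniform points form a violating k-tuple."""
    n = 2 ** d
    count = 0
    for xs in product(range(n), repeat=k):
        xor_sum = 0
        for x in xs:
            xor_sum ^= x
        count += sum(f(x) for x in xs) % 2 != f(xor_sum)
    return Fraction(count, n ** k)
```

For a Boolean function `f` on a few bits, the squared coefficients sum to 1 (Parseval), the violation probability equals one half minus one half of the sum of the coefficients raised to the power `k + 1`, and for even `k` it is at least one half minus one half of the largest coefficient.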

2.2 Proof of Theorem 2.1

In this section, we prove Theorem 2.1 using Theorem 1.2. In Lemma 2.8, we analyze the probability of good events that capture, roughly, that the queries made in the beginning of each iteration have not already been “spoiled” by the previous erasures. Then we use the work investment strategy of [BermanRY14], stated in Lemma 2.9, together with Theorem 1.2 and Lemma 2.8 to prove Theorem 2.1. Each iteration of the outer repeat loop in Steps 4-9 of Algorithm 1 is called a round. We say a query $x$ is successfully obtained if it is nonerased when queried, i.e., the tester obtains $f(x)$ as opposed to the erasure symbol $\perp$.

Lemma 2.8 (Good events).

Fix one round of Algorithm 1. Consider the points queried in Step 6 of this round, where , and the set of all sums where is a nonempty subset of of even size. Let be the (good) event that all points in are distinct. Let be the (good) event that all points are successfully obtained and all points in are nonerased at the beginning of the round. Finally, let . Then for all adversarial strategies.

Proof.

First, we analyze event . Consider points , where , and are even. Since the points are distributed uniformly and independently, so are the sums and . The probability that two uniformly and independently sampled points are identical is . The number of sets of even size is because every subset of can be uniquely completed to such a set . By a union bound over all pairs of sums, . To analyze fix any adversarial strategy. The number of queries made by Algorithm 1 is at most

(5)

Hence, the oracle erases at most points. Since each point is sampled uniformly from ,

Additionally, before the queries are revealed to the algorithm, each sum is distributed uniformly at random. Therefore, for every ,

By a union bound over the points sampled in Step 6 and at most sums, we get

Since , we get and, consequently,

since , as stated in Theorem 2.1, and assuming is sufficiently small. ∎
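The counting step in the proof above, that every subset of all but one index can be uniquely completed to an even-size subset, can be verified by enumeration. The sketch below is an illustration under assumed notation: `q` stands for the number of points sampled in a round, indices are 0-based, and the function names are hypothetical. The count includes the empty set, so the number of even-size subsets of a q-element set comes out as 2^(q-1).

```python
from itertools import combinations


def even_subsets(q):
    """All even-size subsets of {0, ..., q-1}, including the empty set."""
    return [
        frozenset(c)
        for k in range(0, q + 1, 2)
        for c in combinations(range(q), k)
    ]


def complete_to_even(subset, q):
    """Complete a subset of {0, ..., q-2} to an even-size subset of {0, ..., q-1}.

    The last index q-1 is added exactly when the subset has odd size.
    """
    s = set(subset)
    if len(s) % 2 == 1:
        s.add(q - 1)
    return frozenset(s)


def all_subsets(m):
    """All subsets of {0, ..., m-1}."""
    return [frozenset(c) for k in range(m + 1) for c in combinations(range(m), k)]


q = 5
images = {complete_to_even(s, q) for s in all_subsets(q - 1)}
# The completion map is a bijection onto the even-size subsets.
print(images == set(even_subsets(q)), len(images) == 2 ** (q - 1))
```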

Next, we state the work investment lemma.

Lemma 2.9 (Lemma 2.5 of [BermanRY14]).

Let be a random variable taking values in . Suppose . Let and be the desired probability of error. For all , let and . Then

Proof of Theorem 2.1.

By (5), the query complexity of Algorithm 1 is . Algorithm 1 is nonadaptive and always accepts if is linear. Suppose now that is -far from linear and fix any adversarial strategy. We show that Algorithm 1 rejects with probability at least . Consider the last round of Algorithm 1. For points sampled in Step 6 of this last round, let denote the fraction of nonempty sets such that is even and violates linearity. Recall the event defined in Lemma 2.8. Let be the indicator random variable for the event for the last round.

Claim 2.10.

Let , where is as defined above. Then .

Proof.

For all nonempty , such that is even, let be the indicator for the event that violates linearity. By Theorem 1.2 and the fact that is even,

We obtain a lower bound on by linearity of expectation.

(6)

Observe that when occurs, and otherwise. By the law of total expectation,

where the inequality follows from (6), the fact that , and Lemma 2.8. ∎

Fix any round of Algorithm 1 and the value of used in this round (as defined in Step 3). Let be defined as , but for this round instead of the last one. The round is special if . Let and . Then , since the number of erasures only increases with each round. For each , there are rounds of Algorithm 1 that are run with this particular value of . Since Algorithm 1 uses independent random coins for each round, the probability that no round is special is at most

where the last inequality follows by Lemma 2.9 applied with and and Claim 2.10. Therefore, with probability at least , Algorithm 1 has a special round. Consider a special round of Algorithm 1 and fix the value of for this round. We show that Algorithm 1 rejects in the special round with probability at least . We call a sum violating if the tuple violates linearity. Since occurred, all points queried in Step 6 of Algorithm 1 were successfully obtained. So, the algorithm will reject as soon as it successfully obtains a violating sum. Since occurred, there are at least distinct sums that can be queried in Step 8, all of them nonerased at the beginning of the round. Algorithm 1 makes at most queries in this round, and thus the fraction of these sums erased during the round is at most

where in the first inequality we used that for and that for (note that ), in the second inequality we used , and in the third inequality we used and . Since the round is special, at least a fraction of the sums that can be queried in Step 8 are violating. Thus, the fraction of the sums that are violating and nonerased before each iteration of Steps 8–9 in this round is at least . Then, each iteration of Steps 8–9 rejects with probability at least . Since there are iterations with independently chosen sums, the probability that the special round accepts is at most

That is, the probability that Algorithm 1 rejects in the special round is at least . Since the special round exists with probability at least , Algorithm 1 rejects with probability at least . ∎
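The notion of a violating sum from the proof above can be made concrete in a setting without erasures. The sketch below is not Algorithm 1: it ignores the erasure oracle and the round structure, and all names (`violates_linearity`, `fraction_violating`, the encoding of points as integers with XOR as addition) are assumptions chosen for illustration. It checks whether a tuple of points violates linearity and computes the fraction of nonempty even-size index sets whose tuple is violating, the quantity tracked in the analysis.

```python
from functools import reduce
from itertools import combinations
from operator import xor


def parity(v):
    """Parity (mod-2 sum) of the bits of the nonnegative integer v."""
    return bin(v).count("1") & 1


def linear_function(a):
    """The linear function x -> <a, x> mod 2 on points encoded as integers."""
    return lambda x: parity(a & x)


def violates_linearity(f, points):
    """True if the sum of the values differs from the value at the sum."""
    point_sum = reduce(xor, points, 0)
    value_sum = reduce(xor, (f(x) for x in points), 0)
    return value_sum != f(point_sum)


def fraction_violating(f, points):
    """Fraction of nonempty even-size index sets whose tuple violates linearity."""
    q = len(points)
    subsets = [c for k in range(2, q + 1, 2) for c in combinations(range(q), k)]
    bad = sum(violates_linearity(f, [points[i] for i in s]) for s in subsets)
    return bad / len(subsets)


f = linear_function(0b1011)
g = lambda x: f(x) ^ 1  # f plus the constant 1: not linear
points = [3, 5, 6, 12]
print(fraction_violating(f, points), fraction_violating(g, points))
```

A linear function yields no violating tuples. Adding the constant 1 makes every even-size tuple violating, since the constant cancels in the sum of an even number of values but survives in the value at the sum.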

3 A Lower Bound for Online-Erasure-Resilient Linearity Testing

In this section, we prove Theorem 1.3, which shows that every -online-erasure-resilient -tester for linearity of functions must make more than queries.

Proof of Theorem 1.3.

The proof is via Yao’s minimax principle for the online-erasures model (stated in Theorem A.1 and Corollary A.4). Let be the uniform distribution over all linear Boolean functions on and be the uniform distribution over all Boolean functions on . We show that a function is -far from linear with probability at least . Let be a linear function, , and be the fraction of domain points on which and differ. Then, . By the Hoeffding bound, . By a union bound over the linear functions, . For large enough, this probability is at least . We fix the following strategy for a -online-erasure oracle : after responding to each query, erase sums of the form , where is a subset of the queries made so far, choosing the subsets in some fixed order. If at most queries are made, the adversary erases all the sums of queried points. Let be a deterministic algorithm that makes queries to the oracle . Assume w.l.o.g. that does not repeat queries. We describe two random processes and that interact with algorithm in lieu of oracle and provide query answers consistent with a random function from and , respectively. For each query of , both processes and return if the value has been previously erased by ; otherwise, they return or with equal probability. Thus, the distribution over query-answer histories when interacts with is the same as when interacts with . Next, we describe how the processes and assign values to the locations of that were either not queried by or erased when queried, and show that they generate and , respectively. After finishes its queries, sets the remaining unassigned locations (including the erased locations) of the function to be or with equal probability. Clearly, generates a function from the distribution . To describe fully, first let denote the queries of that are answered with a value other than . Since , by our choice of the oracle , the sum of any subset of vectors in is not contained in . Hence, the vectors in are linearly independent. Then, completes to a basis for and sets the value of on all vectors in independently to or with equal probability.
Since is a basis, each vector can be expressed as a linear combination of vectors in (with coefficients in ), that is, for some . The process sets , where addition is mod 2. The function is linear and agrees with all values previously assigned by to the vectors in . Moreover, is distributed according to , since one can obtain a uniformly random linear function by first specifying a basis for , and then setting the value of to be or with equal probability for each basis vector. Thus, generates linear functions, generates functions that are -far from linear with probability at least , and the query-answer histories for any deterministic algorithm that makes at most queries and runs against our -online-erasure oracle are identical under and . Consequently, Corollary A.4 implies the desired lower bound. ∎
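The final step of the construction above, completing linearly independent answered queries to a basis and extending by linearity, can be sketched with Gaussian elimination over GF(2). This is a minimal illustration, not the paper's procedure verbatim: the function name `extend_to_linear`, the encoding of vectors as n-bit integers, and the choice of unit vectors to complete the basis are all assumptions made here.

```python
import random


def extend_to_linear(n, assigned, rng=random):
    """Return a linear f on n-bit points that agrees with `assigned`.

    `assigned` maps linearly independent nonzero vectors (ints) to bits.
    Values on the vectors added to complete the basis are chosen uniformly.
    """
    pivots = {}  # leading-bit position -> (vector, value)

    def reduce_pair(v, b):
        # Eliminate v against the stored vectors, updating its value in step.
        for pos in range(n - 1, -1, -1):
            if (v >> pos) & 1 and pos in pivots:
                u, c = pivots[pos]
                v ^= u
                b ^= c
        return v, b

    for v, b in assigned.items():
        v, b = reduce_pair(v, b)
        if v == 0:
            raise ValueError("assigned vectors are not linearly independent")
        pivots[v.bit_length() - 1] = (v, b)

    # Complete to a basis with unit vectors at the missing leading positions.
    for pos in range(n):
        if pos not in pivots:
            pivots[pos] = (1 << pos, rng.randrange(2))

    def f(x):
        # x reduces to 0 against a full basis; the accumulated bit is f(x).
        return reduce_pair(x, 0)[1]

    return f


rng = random.Random(0)
answers = {0b0110: 1, 0b1010: 0, 0b0001: 1}
f = extend_to_linear(4, answers, rng)
print(all(f(v) == b for v, b in answers.items()))
```

Reducing each vector together with its value keeps the stored pairs consistent with a single linear function, so the extension agrees with every previously assigned value.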

4 An Online-Erasure-Resilient Quadraticity Tester

In this section, we state our online-erasure-resilient quadraticity tester (Algorithm 2) and prove Theorem 1.4. The main idea behind Algorithm 2 and its representation as a two-player game is given in Section 1.2, together with explanatory figures for the case when . We now give a high level overview of Algorithm 2. For a function and let