# Lower Bounds for Semi-adaptive Data Structures via Corruption

In a dynamic data structure problem we wish to maintain an encoding of some data in memory, in such a way that we may efficiently carry out a sequence of queries and updates to the data. A long-standing open problem in this area is to prove an unconditional polynomial lower bound of a trade-off between the update time and the query time of an adaptive dynamic data structure computing some explicit function. Ko and Weinstein provided such lower bound for a restricted class of semi-adaptive data structures, which compute the Disjointness function. There, the data are subsets x_1,…,x_k and y of {1,…,n}, the updates can modify y (by inserting and removing elements), and the queries are an index i ∈{1,…,k} (query i should answer whether x_i and y are disjoint, i.e., it should compute the Disjointness function applied to (x_i, y)). The semi-adaptiveness places a restriction in how the data structure can be accessed in order to answer a query. We generalize the lower bound of Ko and Weinstein to work not just for the Disjointness, but for any function having high complexity under the smooth corruption bound.

## Authors

• 7 publications
• 2 publications
• ### An Adaptive Step Toward the Multiphase Conjecture

In 2010, Pǎtraşcu proposed the following three-phase dynamic problem, as...
10/29/2019 ∙ by Young Kun Ko, et al. ∙ 0

• ### Derandomization of Cell Sampling

Since 1989, the best known lower bound on static data structures was Sie...
08/12/2021 ∙ by Alexander Golovnev, et al. ∙ 0

• ### A Lower Bound for Dynamic Fractional Cascading

We investigate the limits of one of the fundamental ideas in data struct...
11/01/2020 ∙ by Peyman Afshani, et al. ∙ 0

09/22/2021 ∙ by Shaleen Deep, et al. ∙ 0

• ### Conjunctive Queries with Theta Joins Under Updates

Modern application domains such as Composite Event Recognition (CER) and...
05/23/2019 ∙ by Muhammad Idris, et al. ∙ 0

• ### Smooth heaps and a dual view of self-adjusting data structures

We present a new connection between self-adjusting binary search trees (...
02/15/2018 ∙ by László Kozma, et al. ∙ 0

• ### Updatable Materialization of Approximate Constraints

Modern big data applications integrate data from various sources. As a r...
02/12/2021 ∙ by Steffen Kläbe, et al. ∙ 0

##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## 1 Introduction

A suitable setting to study data structures is the cell probe model [22]. Here we think of the memory divided into registers, or cells, where each cell can carry bits, and we measure efficiency by counting the number of memory accesses, or probes, needed for each query — the query time and each update — the update time . The main goal of this line of research is to understand the inherent trade-off between , and , for various interesting problems. Specifically, one would like to show lower bounds on for reasonable choices of (which is typically logarithmic in the size of the data).

The first lower bound for this setting was proven by Fredman and Saks [9], which proved for various problems. These lower bounds were successively improved [16, 18, 14, 15], and we are now able to show that certain problems with non-Boolean queries require , and certain problems with Boolean queries require .

The major unsolved question in this area is to prove a polynomial lower bound on . For example, consider the dynamic reachability problem, where we wish to maintain a directed -vertex graph in memory, under edge insertions and deletions, while being able to answer reachability queries (“is vertex connected to vertex ?”). Is it true that any scheme for the dynamic reachability problem requires , for some constant ? Indeed, such a lower bound is known under various complexity-theoretic assumptions111See [17, 1]. Strictly speaking, these conditional lower bounds only work if the preprocessing time, which is the time taken to encode the data into memory, is also bounded. But we will ignore this distinction., the question is whether such a lower bound may be proven unconditionally.

In an influential paper [19], Mihai Pătraşcu proposed an approach to this unsolved question. He defines a data structure problem, called the multiphase problem. Let us represent partial functions as total functions where if is not defined. Then associated with a partial Boolean function , and a natural number , we may define a corresponding multiphase problem of as the following dynamic process:

Phase I - Initialization.

We are given inputs , and are allowed to preprocess this input in time .

Phase II - Update.

We are then given another input , and we have time to read and update the memory locations from the data structure constructed in Phase I.

Phase III - Query.

Finally, we are given a query , we have time to answer the question whether . If is not defined, the answer can be arbitrary.

Typically we will have . Let us be more precise, and consider randomized solutions to the above problem.

[Scheme for the multiphase problem of ] Let be a partial Boolean function. A scheme for the multiphase problem of with preprocessing time , update time and query time is a triple , where:

• maps the input to the memory contents , where each of the memory locations holds bits. must be computed in time .

• For each ,

is a decision-tree of depth

, which reads and produces a sequence of updates.222In the usual way of defining the update phase, we have a read/write decision-tree which changes the very same cells that it reads. But when , this can be seen to be equivalent, up to constant factors, to the definition we present here, where we have a decision-tree that writes the updates on a separate location. In order to simulate a scheme that uses a read/write decision-tree, we may use a hash table with worst-case lookup time, such as cuckoo hashing. Then we have a read-only decision-tree whose output is the hash table containing all the which were updated by , associated with their final value in the execution of .

• For each , is a decision-tree of depth .333All our results will hold even if is allowed to depend arbitrarily on . This makes for a less natural model, however, so we omit this from the definitions.

• For all , , and ,

 f(xi,y)≠∗⟹Qi(E(x),Uy(E(x)))=f(xi,y).

In a randomized scheme for the multiphase problem of , each and are distributions over decision trees, and it must hold that for all , , and ,

 f(xi,y)≠∗⟹PrQi,Uy[Qi(E(x),Uy(E(x)))=f(xi,y)]≥1−ε.

The value is called the

error probability

of the scheme.

Pătraşcu [19] considered this problem where is the Disjointness function:

 DISJ(x,y)={0if there exists i∈[n] such that xi=yi=11otherwise

He conjectured that any scheme for the multiphase problem of must have for some constant .

Pătraşcu shows that such lower bounds on the multiphase problem for would imply polynomial lower bounds for various dynamic data structure problems. For example such lower bounds would imply that dynamic reachability requires . He also shows that these lower bounds hold true under the assumption that 3SUM has no sub-quadratic algorithms.

Finally, Pătraşcu then defines a 3-player Number-On-Forehead (NOF) communication game, such that lower bounds on this game imply matching lower bounds for the multiphase problem. The game associated with a function is as follows:

1. Alice is given and , Bob gets and and Charlie gets and .

2. Charlie sends a private message of bits to Bob and then he is silent.

3. Alice and Bob communicate bits and want to compute .

Pătraşcu [19] conjectured that if is , then has to be bigger than the communication complexity of . However, this conjecture turned out to be false. The randomized communication complexity of is [20, 11, 3], but Chattopadhyay et al. [7] construct a protocol for where both .

So the above communication model is more powerful than it appears at first glance.444The conjecture remains that if , then has to be larger than the maximum distributional communication complexity of under a product distribution. This is for Disjointness [2]. However, a recent paper by Ko and Weinstein [12] succeeds in proving lower bounds for a simpler version of the multiphase problem, which translate to lower bounds for a restricted class of dynamic data structure schemes. They manage to prove a lower bound of for the simpler version of the multiphase problem which is associated with the Disjointness function . The main contribution of our paper is to generalize their lower bound to any function which has large complexity according to the smooth corruption bound, under a product distribution. Disjointness is such a function [2], but so is the Inner Product, Gap Orthogonality, and Gap Hamming Distance [21]. Our proof method is significantly different: Ko and Weinstein use information complexity to derive their lower-bound (similar to [3, 4]), whereas we construct a large nearly-monochromatic rectangle. Our proof is reminiscent of [6], but via a more direct bucketing argument. We furthermore show that this lower bound holds also for randomized schemes, and not just for deterministic schemes.

Let us provide rigorous definitions.

[Semi-adaptive random data structure [12]] Let be a partial function. A scheme for the multiphase problem of is called semi-adaptive if any path on the decision-tree first queries the first part of the input (the part), and then queries the second part of the input (the part). If is randomized, then this property must hold for every randomized choice of . We point out that the reading of the cells in each part is completely adaptive. The restriction is only that the data structure can not read cells of if it already started to read cells of . Ko and Weinstein state their result for deterministic data structures, i.e., thus the data structure always returns the correct answer.

[Ko and Weinstein [12]] Let . Any semi-adaptive deterministic data structure that solves the multiphase problem of the function, must have either or . To prove the lower bound they reduce the semi-adaptive data structure into a low correlation random process.

[Ko and Weinstein [12]] Let

be random variables over

and each of them is independently distributed according to the same distribution and let be a random variable over distributed according to (independently of ). Let be a randomized semi-adaptive scheme for the multiphase problem for a partial function with error probability bounded by . Then, for any there is a random variable and such that:

1. .

2. .

3. .

4. .

Ko and Weinstein [12] proved Theorem 1.1 for the deterministic schemes for the function and in the case where . However, their proof actually works for any (partial) function and for any two, possibly distinct distributions and . Moreover, their proof also works for randomized schemes. The resulting statement for randomized schemes for any function is what we have given above. To complete the proof of their lower bound, Ko and Weinstein proved that if we set (and ) large enough so that then such random variable can not exist when is the function. It is this second step which we generalize.

Let be a function and be a distribution over . A set is a rectangle if there exists sets and such that . For and , we say the rectangle is -almost -monochromatic for under if . We say the distribution is a product distribution if there are two independent distribution over and over such that . For , the distribution is -balanced according to if . We will prove that the existence of a random variable given by Theorem 1.1 implies that, for any , any balanced product distribution and any function which is “close” to , there is a large (according to ) -almost -monochromatic rectangle for in terms of . This technique is known as smooth corruption bound [5, 6] or smooth rectangle bound [10]. We denote the smooth corruption bound of as . Informally, if there is and a partial function which is close555“Closeness” is measured by the parameter , see Section 2.1 for the formal definition. to such that any -almost -monochromatic rectangle for has size (under ) at most . We will define smooth corruption bound formally in the next section. Thus, if we use Theorem 1.1 as a black box we generalize Theorem 1.1 for any function of large corruption bound.

[Main Result] Let such that for . Let be a product distribution over such that is -balanced according to a partial function . Any semi-adaptive randomized scheme for the multiphase problem of , with error probability bounded by , must have either , or

 tq⋅w≥Ω(α⋅scbO(ε/α),λμ(f)).

We point out that and in the bound given above hide absolute constants independent of and .

As a consequence of our main result, and of previously-known bounds on corruption, we will are able to show new lower-bounds of against semi-adaptive schemes for the multiphase problem of the Inner Product, Gap Orthogonality and Gap Hamming Distance functions (where the gap is ). These lower-bounds hold assuming that . They follow from the small discrepancy of the Inner Product, and from a bound shown by Sherstov on the corruption of the Gap Orthogonality following by a reduction to the Gap Hamming Distance [21]. This result also gives an alternative proof of the same lower-bound proven by Ko and Weinstein [12], for the Disjointness function, of . This follows from the bound on corruption of Disjointness under a product distribution, shown by Babai et al. [2].

The paper is organized as follows. In Section 2 we give important notation, and the basic definitions from information theory and communication complexity. The proof of Theorem 1.1 appears in Section 3. The various applications appear in Section 4.

## 2 Preliminaries

We use a notational scheme where sets are denoted by uppercase letters, such as and , elements of the sets are denoted by the same lowercase letters, such as and , and random variables are denoted by the same lowercase boldface letters, such as and . We will use lowercase greek letters, such as , to denote distributions. If is a distribution over a product set, such as , and , then is the probability of seeing under . We will sometimes denote by , using non-italicized lowercase letters corresponding to . This allows us to to use the notation and to denote the and -marginals of , for example; then if we use the same notation with italicized lowercase letters, we get the marginal probabilities, i.e., for each and

 μ(x)=∑y,zμ(x,y,z)μ(y)=∑x,zμ(x,y,z).

If , then we will also use the notation to denote the -marginal of conditioned seeing the specific value . Then for each and , we have

 μ(x∣y)=∑zμ(x,y,z).

We will also write to mean that are random variables chosen according to the distribution , i.e., for all , . Naturally if , then . We let denote the support of , i.e., the set of with .

We now formally define the smooth corruption bound and related measures from communication complexity, and refer the book by Kushilevitz and Nisan [13] for more details. At the end of this section we provide necessary notions of information theory which are used in the paper, and for more details on these we refer to the book by Cover and Thomas [8].

### 2.1 Rectangle Measures

Let be a partial function and be a distribution over . We say that is -close to a function under if

 Pr(x,y)∼μ[f(x,y)≠g(x,y)]≤λ.

Let

be the set of -almost -monochromatic rectangles for under . The complexity measure quantifies how large almost -monochromatic rectangles can be [5]:

 monoρμ(f)=minb∈{0,1}maxR∈Rρbμ(R)

Using we can define the corruption bound of a function as and the smooth corruption bound as

 scbρ,λμ(f)=maxg: λ-close to f % under μcbρμ(g).

Thus, if then there is a and a function which -close to under such that for any -almost -monochromatic rectangle for under it holds that .

The notion is related to the discrepancy of a function:

 discμ(f)=maxR: rectangle of X×Y ∣∣μ(R∩f−1(0))−μ(R∩f−1(1))∣∣.

It is easy to see that for a total function holds that for any . Thus, Theorem 1.1 will give us lower bounds also for functions of small discrepancy.

### 2.2 Information Theory

We define several measures from information theory. If are two distributions such that , then the Kullback-Leibler divergence of from is

 DKL(μ′ ∥ μ)=∑zμ′(z)logμ′(z)μ(z).

With Kullback-Leibler divergence we can define the mutual information, which measures how close (according to KL divergence) is a joint distribution to the product of its marginals. If we have two random variables

, then we define their mutual information to be

If we have three random variables , then the mutual information of and conditioned by is

We present several facts about the mutual information, the proofs can be found in the book of Cover and Thomas [8].

###### Fact (Chain Rule).

For any random variables and holds that

 I(x1x2:y∣z)=I(x1:y∣z)+I(x2:y∣z,x1).

Since mutual information is never negative, we have the following corollary. For any random variables and holds that .

The -distance between two distributions is defined as

 ∥∥μ′(z)−μ(z)∥∥1=∑z∣∣μ′(z)−μ(z)∣∣.

There is a relation between -distance and Kullback-Leibler divergence.

###### Fact (Pinsker’s Inequality).

For any two distributions and , we have

 ∥∥μ′(z)−μ(z)∥∥1≤√2⋅DKL(μ′(z) ∥ μ(z))

## 3 The Proof of Theorem 1.1

Let be a partial function. Suppose there is a semi-adaptive random scheme for the multiphase problem of with error probability bounded by such that . Let be a product distribution over , such that is -balanced according to . Let and be a partial function which is -close to under . We will prove there is a large almost -monochromatic rectangle for .

Let be independent random variables each of them distributed according to and be an independent random variable distributed according to . Let the random variable and the index be given by Theorem 1.1 applied to the random variables and the function . For simplicity we denote .

We will denote the joint distribution of by . Note that here the notation is consistent, in the sense that for all . We will then need to keep in mind that is the -marginal of the joint distribution of .

By we denote the event that the random variable gives us the wrong answer on an input from the support of , i.e. and hold simultaneously. By Theorem 1.1 we know that Since and are -close under , we have that is still balanced according to and with small probability, as stated in the next observation.

###### Observation .

Let and . For the function it holds that

1. The distribution is -balanced according to .

2. .

###### Proof.

Let . We will bound .

 ~α≤Pr[f(x,y)=b′]= Pr[f(x,y)=b′,f(x,y)=g(x,y)] +Pr[f(x,y)=b′,f(x,y)≠g(x,y)] ≤ Pr[g(x,y)=b′]+λ.

Thus, by rearranging we get . The proof of the second bound is similar:

 Pr[g(x,y)≠∗zm]= Pr[f(x,y)≠∗zm,f(x,y)=g(x,y)] +Pr[g(x,y)≠∗zm,f(x,y)≠g(x,y)]≤~ε+λ=ε.\qed

Let be the bound on and given by Theorem 1.1. Since , we have We will prove that if we assume that and we choose large enough ( of Theorem 1.1) then we can find a rectangle such that is -almost -monochromatic for and for . Thus, we have and consequently

 scbO(ε/α),λμ(f)≤O(tq⋅wα).

By rearranging, we get the bound from Theorem 1.1.

Let us sketch the proof of how we can find such a rectangle . We will first fix the random variable to such that and are not very correlated conditioned on , i.e., the joint distribution is very similar to the product distribution of the marginals . Moreover, we will pick in such a way the probability of error is still small. Then, since is close to , the probability of error under the latter distribution will be small as well, i.e., if , then will also be small. Finally, we will find subsets of large mass (under the original distributions and ), while keeping the probability of error on the rectangle sufficiently small.

Let us then proceed to implement this plan. Let . We will show that is a lower bound for the probability that is equal to . Let be the bound on given by Theorem 1.1, i.e., .

There exists such that

1. .

2. .

3. .

4. .

###### Proof.

Note that

 α ≤Pr[g(x,y)=b]

Thus, by rearranging we get . By expanding the information we find

 γ≥I(x:y∣z)=Ez∼μ(z)[I(x:y∣z=z)] and by the Markov inequality we get that Prz∼μ(z)[I(x:y∣z=z)≥5β⋅γ]≤β5.

Similarly, for the information :

 c≥I(xy:z)≥I(x:z)=Ez∼μ(z)[DKL(μ(x∣z) ∥ μ(x))] and so

The bound for is analogous. Let . Then,

 ε≥Pr[g(x,y)≠∗zm]=∑z∈Zμ(z)⋅ez=Ez∼μ(z)[ez] Prz∼μ(z)[ez≥5β⋅ε]≤β5.

Thus, by a union bound we may infer the existence of the sought . ∎

Let us now fix from the previous lemma. Let be the distribution conditioned on , and let be the product of its marginals. Let be the support of , and let and be the supports of and , respectively, i.e., and are the projections of into and .

Then Pinsker’s inequality will give us that and are very close. Let .

###### Proof.

Indeed, by Pinsker’s inequality,

 ∥∥μz(x,y)−μ′z(x,y)∥∥1≤√2⋅DKL(μz(x,y) ∥ μ′z(x,y)).

The right-hand side is , which by definition of mutual information equals , and by Lemma 3 this is . ∎

For the sake of reasoning, let be random variables chosen according to to . Let . It then follows from Lemma 3 and Lemma 3 that:

.

###### Proof.

We prove that

 ∣∣Pr[g(x,y)≠∗zm∣z=z]−Pr[g(x′,y′)≠∗zm]∣∣≤δ.

Since by Lemma 3, the lemma follows. Let

 B={(x,y)∈Sx×Sy:g(x,y)≠zm,g(x,y)≠∗}.

Thus, we have the following.

 ∣∣Pr ≤∑(x,y)∈B∣∣μz(x,y)−μ′z(x,y)∣∣≤δ by triangle inequality and Lemma 3

Let . We will prove the ratio between and is larger than with only small probability (when ). The same holds for and .

###### Proof.

We prove the lemma for , the proof for is analogous. By Lemma 3 we know that . We expand the Kullback-Leibler divergence:

 c′≥DKL(μ′z(x) ∥ μ(x))=∑x∈Sxμ′z(x)logμ′z(x)μ(x)=E[logμ′z(x′)μ(x′)], and then use the Markov inequality: Pr[μ′z(x′)≥26c′⋅μ(x′)]=Pr[logμ′z(x′)μ(x′)≥6c′]≤16.\qed

We now split and into buckets and (for ), where the -th buckets are

 Cxℓ ={x∈Sx∣∣(ℓ−1)⋅μ(x)2c′<μ′z(x)≤ℓ⋅μ(x)2c′}, Cyℓ ={y∈Sy∣∣(ℓ−1)⋅μ(y)2c′<μ′z(y)≤ℓ⋅μ(y)2c′}.

In a bucket there are elements of such that their probability under is approximately -times bigger than their probability under . By Lemma 3, it holds that with high probability the elements are in the buckets and for . Thus, if we find a bucket