Lower Bounds for Semialgebraic Range Searching and Stabbing Problems

12/01/2020 ∙ by Peyman Afshani, et al. ∙ Aarhus Universitet

In the semialgebraic range searching problem, we are to preprocess n points in ℝ^d such that for any query range from a family of constant-complexity semialgebraic sets, all the points intersecting the range can be reported or counted efficiently. When the ranges are composed of simplices, the problem can be solved using S(n) space and Q(n) query time with S(n)Q(n)^d = Õ(n^d), and this trade-off is almost tight. Consequently, there exist low space structures that use Õ(n) space with O(n^{1-1/d}) query time and fast query structures that use O(n^d) space with O(log^d n) query time. However, for general semialgebraic ranges, only low space solutions are known, although the best of these match the trade-off curve for simplex queries. It has been conjectured that the same could be done for the fast query case, but this open problem has stayed unresolved. Here, we disprove this conjecture. We give the first nontrivial lower bounds for semialgebraic range searching and related problems. We show that any data structure for reporting the points between two concentric circles with Q(n) query time must use S(n) = Ω(n^{3-o(1)}/Q(n)^5) space; that is, for Q(n) = O(log^{O(1)} n), Ω(n^{3-o(1)}) space must be used. We also study the problem of reporting the points between two polynomials of the form Y = ∑_{i=0}^{Δ} a_i X^i, where a_0, ⋯, a_Δ are given at the query time. We show S(n) = Ω(n^{Δ+1-o(1)}/Q(n)^{Δ^2+Δ}). So for Q(n) = O(log^{O(1)} n), we must use Ω(n^{Δ+1-o(1)}) space. For the dual semialgebraic stabbing problems, we show that in linear space, any data structure that solves 2D ring stabbing must use Ω(n^{2/3}) query time. This almost matches the linearization upper bound. For general semialgebraic slab stabbing problems, we again show almost tight lower bounds.

1 Introduction

We address one of the biggest open problems of recent years in the range searching area. Our main results are lower bounds in the pointer machine model of computation that essentially show that the so-called “fast query” version of the semialgebraic range reporting problem is “impervious” to the algebraic techniques. Our main result reveals that to obtain polylogarithmic query time, the data structure requires Ω(n^{Δ+1-o(1)}) space, where the constant depends on Δ, n is the input size, and Δ+1 is the number of parameters of each “polynomial inequality” (these will be defined more clearly later). Here and below, Õ(·) hides polylogarithmic factors, and an o(1) term in an exponent absorbs n^{o(1)} factors. Thus, we refute a relatively popular recent conjecture that data structures with Õ(n^d) space and polylogarithmic query time could exist, where d is the dimension of the input points. Surprisingly, the proofs behind these lower bounds are simple, and these lower bounds could have been discovered years ago, as the tools we use already existed decades ago. Range searching is a broad area of research in which we are given a set of n points in ℝ^d and the goal is to preprocess them such that, given a query range, we can count or report the subset of the input points that lies in the range. Often the query range is restricted to a fixed family of ranges: e.g., in the simplex range counting problem, the query is a simplex in ℝ^d and the goal is to report the number of input points inside it, and in the halfspace range reporting problem, the query is a halfspace and the goal is to report the input points inside it. Range searching problems have been studied extensively and they have numerous variants. For an overview of this topic, we refer the readers to an excellent survey by Agarwal [toth2017handbook]. Another highly related problem, which can be viewed as the “dual” of this problem, is range stabbing: we are given a set of ranges as input and the goal is to preprocess them such that, given a query point, we can efficiently count or report the input ranges containing it. Here, we focus on the reporting version of range stabbing problems.

1.1 Range Searching: A Very Brief Survey

1.1.1 Simplex Range Searching

Simplices are one of the most fundamental families of queries. In fact, if the query is decomposable (such as range counting or range reporting queries), then simplices can be used as “building blocks” to answer more complicated queries: a query that is a polyhedral region of constant complexity can be decomposed into a constant number of disjoint simplices (with a constant that depends on the dimension), and thus answering it can be reduced to answering a constant number of simplicial queries. Simplicial queries were hotly investigated in the 1980s and this led to the development of two important tools in computational geometry: cuttings and the partition theorem. Both of them have found applications in areas not related to range searching.

Cuttings and Fast Data Structures

“Fast query” data structures can answer simplex range counting or reporting queries in polylogarithmic query time, but at the cost of roughly O(n^d) space, and they can be built using cuttings. In a nutshell, given a set H of n hyperplanes in ℝ^d, a (1/r)-cutting is a decomposition of ℝ^d into O(r^d) simplices such that each simplex is intersected by O(n/r) hyperplanes of H. These were developed by some of the pioneers in the range searching area, such as Clarkson [ClaDCG87], Haussler and Welzl [hw87], Chazelle and Friedman [Chazelle.Friedman], Matoušek [Matousek91cuttings], finally culminating in a result of Chazelle [Chazelle.cutting] who optimized various aspects of cuttings. Using cuttings, one can answer simplex range counting or reporting queries with O(n^d) space and polylogarithmic query time, plus O(k) for reporting, where k is the output size [matouvsek1993range]. The query time can be lowered to O(log n + k) by increasing the space slightly to O(n^{d+ε}) for any constant ε > 0 [chazelle1989quasi]. An interested reader can refer to a book on cuttings by Chazelle [Chazelle.book].

The Partition Theorem and Space-efficient Data Structures

At the opposite end of the spectrum, simplex range counting or reporting queries can be answered using linear space but with a higher query time of O(n^{1-1/d}), using partition trees and related techniques. This branch of techniques has a very interesting history. In 1982, Willard [Willardpartition] cleverly used the ham sandwich theorem to obtain a linear-sized data structure with query time of O(n^α), for some constant α < 1, for simplicial queries in 2D. After a number of attempts that either improved the exponent or generalized the technique to higher dimensions, Welzl [Welzlpartree82] in 1982 provided the first optimal exponent for the partition trees, then Chazelle et al. [chazelle1989quasi] provided the first near-linear size data structure with query time of roughly O(n^{1-1/d}). Finally, a data structure with O(n) space and O(n^{1-1/d}) query time was given by Matoušek [matouvsek1993range]. This was also simplified recently by Chan [chan2012optimal].

Space/Query Time Trade-off

It is possible to combine fast query data structures and linear-sized data structures to solve simplex queries with S(n) space and Q(n) query time such that S(n)Q(n)^d = Õ(n^d). This trade-off between space and query time is optimal, at least in the pointer machine model and in the semigroup model [afshani2012improved, chazelle1996simplex, chazelle1989lower].
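As a rough illustration of this trade-off, the sketch below solves S(n)·Q(n)^d = n^d for the query time, ignoring the polylogarithmic factors hidden by the Õ notation. The function name and its parameters are hypothetical and only serve the example.

```python
def simplex_query_time(n: int, space: float, d: int) -> float:
    """Query time Q on the curve S * Q^d = n^d (polylog factors ignored).

    Hypothetical helper: `space` is the space budget S, expected to lie
    between n (linear space) and n^d (fast-query regime).
    """
    if not (n <= space <= n ** d):
        raise ValueError("space should lie between n and n^d")
    # S * Q^d = n^d  =>  Q = n / S^(1/d)
    return n / space ** (1.0 / d)
```

With linear space in the plane (d = 2) this gives a query time of about √n, and with n^d space it gives a constant, which matches the two ends of the spectrum described above up to the ignored factors.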

Multi-level Structures, Stabbing and Other Related Queries

By using multi-level data structures, one can solve more complicated problems where both the input and the query shapes can be simplicial objects of constant complexity. The best multi-level data structures use one extra factor in space and query time per level [chan2012optimal] and there exist lower bounds that show space/query time trade-off should blow up by at least factor per level [AD.frechet17]. This means that problems such as simplex stabbing (where the input is a set of simplices and we want to output the simplices containing a given query point) or the simplex-simplex containment problem (where the input is a set of simplices, and we want to output the simplices fully contained in a query simplex) all have the same trade-off curve of S(n)Q(n)^d = Õ(n^d) between space S(n) and query time Q(n). Thus, simplex range searching, as well as its generalization to problems where both the input and the query ranges are “flat” objects, is very well understood. However, there are many natural query ranges that cannot be represented using simplices, e.g., when query ranges are spheres. This takes us to semialgebraic range searching.

1.1.2 Semialgebraic Range Searching

A semialgebraic set is defined as a subset of ℝ^d that can be described as the union or intersection of a constant number of ranges, where each range is defined by a d-variate polynomial inequality of degree at most Δ, which is in turn defined by a bounded number of values given at the query time; we call this number the parametric dimension. For instance, in the plane, with degree two and three values a, b, and c given at the query time, a circular query can be represented as (X-a)^2 + (Y-b)^2 ≤ c^2. In semialgebraic range searching, the queries are semialgebraic sets. Before the recent “polynomial method revolution”, the tools available to deal with semialgebraic range searching were limited, at least compared to the simplex queries. One way to deal with semialgebraic range searching is through linearization [YaoYaolinearization]. This idea maps the input points to a space of some potentially much larger dimension, such that each polynomial inequality can be represented as a halfspace. Consequently, semialgebraic range searching can be solved with the space/query time trade-off of simplex range searching in the lifted dimension. The exponent of the query time in the trade-off can be improved (increased) a bit by exploiting that, in the lifted space, the input set actually lies on a d-dimensional surface [agarwal1994range]. It is also possible to build “fast query” data structures but using space, but only in specific cases [agarwal1994range] (see [toth2017handbook] for details). In 2009, Zeev Dvir [Dvirkakeya] solved the discrete Kakeya problem with a very elegant and simple proof, using a polynomial method. Within a few years, this led to a revolution in discrete and computational geometry, one that was ushered in by Katz and Guth’s almost tight bound on the Erdős distinct distances problem [guth2015erdHos]. For a while, the polynomial method did not have many algorithmic consequences, but this changed with the work of Agarwal, Matoušek, and Sharir [agarwal2013range], who showed that, at least as long as linear-space data structures are considered, semialgebraic range queries can essentially be solved within the same time as simplex queries (ignoring some lower order terms). Later developments (and simplifications) of their approach by Matoušek and Patáková [MatousekZuzana] led to the current best results: a data structure with linear size and with query time of Õ(n^{1-1/d}).
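To make the linearization idea concrete, here is a minimal sketch for circular queries in the plane. Expanding (x-a)^2 + (y-b)^2 ≤ c^2 gives (x^2+y^2) - 2ax - 2by + (a^2+b^2-c^2) ≤ 0, which is linear in the lifted coordinates (x, y, x^2+y^2), so a disk becomes a halfspace in three dimensions. The function names and the coefficient layout are assumptions made for this example.

```python
def lift(x, y):
    """Map a planar point to its lifted image (x, y, x^2 + y^2)."""
    return (x, y, x * x + y * y)

def disk_as_halfspace(a, b, c):
    """Return (u, v, w, t) such that a lifted point (X, Y, Z) lies in the
    halfspace u*X + v*Y + w*Z + t <= 0 exactly when the original point
    lies in the disk with center (a, b) and radius c."""
    return (-2 * a, -2 * b, 1, a * a + b * b - c * c)

def in_disk(x, y, a, b, c):
    """Direct test of the degree-two polynomial inequality."""
    return (x - a) ** 2 + (y - b) ** 2 <= c * c

def in_halfspace(point3, halfspace):
    """Linear test in the lifted space."""
    X, Y, Z = point3
    u, v, w, t = halfspace
    return u * X + v * Y + w * Z + t <= 0
```

For integer inputs the two tests agree exactly, since both sides are the same polynomial written in two ways; with floating-point inputs, points very close to the boundary may be classified differently because of rounding.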

Fast Queries for Semialgebraic Range Searching: an Open Problem

Nonetheless, despite the breakthrough results brought on by the algebraic techniques, the fast query case still remained unsolved, even in the plane: e.g., the best known data structures for answering circular queries with polylogarithmic query time still use Õ(n^3) space, by using linearization to ℝ^3. The fast query case of semialgebraic range searching has been explicitly mentioned as a major open problem in multiple recent publications. (Footnote 2: To quote Agarwal et al. [agarwal2013range], “[a] very interesting and challenging problem is, in our opinion, the fast-query case of range searching with constant-complexity semialgebraic sets, where the goal is to answer a query in O(log n) time using roughly O(n^d) space.” The same conjecture is repeated in a different survey [Agarwal2017] and it is also emphasized that the question is even open for disks in the plane, “… whether a disk range-counting query in ℝ^2 be answered in O(log n) time using O(n^2) space?”.) In light of the breakthrough result of Agarwal et al. [agarwal2013range], it is quite reasonable to conjecture that semialgebraic range searching should have the same trade-off curve of S(n)Q(n)^d = Õ(n^d). Nonetheless, the algebraic techniques have failed to make sufficient advances to settle this open problem. Given that it took a revolution caused by the polynomial method to advance our knowledge of the “low space” case of semialgebraic range searching, it is not too outrageous to imagine that perhaps equally revolutionary techniques are needed to settle the “fast query” case of semialgebraic range searching.

1.1.3 Semialgebraic Range Stabbing

Another important problem is semialgebraic stabbing, where the input is a set of semialgebraic sets, i.e., “ranges”, and queries are points. The goal is to output the input ranges that contain a query point. Here, “fast query” data structures are sometimes possible, for example by observing that an arrangement of n disks in the plane has O(n^2) complexity and thus counting or reporting the disks stabbed by a query point can be done with O(n^2) space and O(log n) query time. However, it seems difficult to make advancements on the “low space” side of things; the only known data structure with linear space is one that uses linearization to 3D, which results in Õ(n^{2/3}) query time.

1.2 Our Results

Our main results are lower bounds in the pointer machine model of computation for four central problems defined below. In the 2D polynomial slab reporting problem, given a set of n points in ℝ^2, the task is to preprocess the points such that, given a query 2D polynomial slab, the points contained in the slab can be reported efficiently. Informally, a 2D polynomial slab is the set of points that lie between a univariate polynomial of degree Δ and a copy of it shifted vertically by a value (the width) given at the query time. In the 2D polynomial slab stabbing problem, the input is a set of polynomial slabs, the query is a point, and the goal is to report all the slabs that contain the query point. Similarly, in the 2D ring reporting problem, the input is a set of points in ℝ^2 and the query is a “ring”, the region between two concentric circles. Finally, in the 2D ring stabbing problem, the input is a set of rings, the query is a point, and the goal is to report all the rings that contain the query point. For polynomial slab queries, we show that if a data structure answers queries in Q(n)+O(k) time, where k is the output size, using S(n) space, then S(n) = Ω(n^{Δ+1-o(1)}/Q(n)^{Δ^2+Δ}); the hidden constants depend on Δ. So for “fast queries”, i.e., Q(n) = O(log^{O(1)} n), Ω(n^{Δ+1-o(1)}) space must be used. This is almost tight as the exponent matches the upper bounds obtained by linearization! Also, we prove that any structure that answers polynomial slab reporting queries in time must use space. In the “low space” setting, when , this gives . This is once again almost tight, as it matches the upper bounds obtained by linearization for when . For the ring reporting problem, our bound sharpens to S(n) = Ω(n^{3-o(1)}/Q(n)^5). For the ring stabbing problem, we show , e.g., in the “low space” setting when S(n) = O(n), we must have Q(n) = Ω(n^{2/3}); compare this with simplex stabbing queries, which can be solved with O(n) space and Õ(√n) query time. As before, this is almost tight, as it matches the upper bounds obtained by linearization to 3D for when . Somewhat disappointingly, no revolutionary new technique is required to obtain these results. We use novel ideas in the construction of “hard input instances” but otherwise we use the two widely used pointer machine lower bound frameworks by Chazelle [chazelle1990lower], Chazelle and Rosenberg [chazelle1996simplex], and Afshani [afshani2012improved]. Our results are summarized in Table 1.
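The two reporting bounds quoted in the abstract can be evaluated numerically. The sketch below is only an order-of-magnitude calculator: it drops the n^{-o(1)} term and all constants, and the function names are hypothetical.

```python
def slab_reporting_space_lb(n: float, q: float, delta: int) -> float:
    """Evaluate n^(delta+1) / q^(delta^2 + delta), the polynomial slab
    reporting bound with the n^{-o(1)} term and constants dropped."""
    return n ** (delta + 1) / q ** (delta * delta + delta)

def ring_reporting_space_lb(n: float, q: float) -> float:
    """Evaluate n^3 / q^5, the ring reporting bound with the n^{-o(1)}
    term and constants dropped."""
    return n ** 3 / q ** 5
```

For Δ = 2 the slab bound evaluates to n^3/Q^6 while the ring bound is n^3/Q^5, so for the same query time the ring bound is larger by a factor of Q.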

Problem Lower Bound Upper Bound
2D Polynomial Slab Reporting  [agarwal1994range, agarwal2013range, matouvsek1993range]
2D Ring Reporting  [agarwal1994range, agarwal2013range, matouvsek1993range]
2D Simplex Range Searching  [chazelle1996simplex, afshani2012improved]  [matouvsek1993range]
2D Polynomial Slab Stabbing (see footnote 3)
2D Ring Stabbing (see footnote 4)
2D Simplex Stabbing  [afshani2012improved]  [matouvsek1993range]
Table 1: Our Results, indicates this paper.

Footnote 3: The subdivision formed by degree polynomial slabs has complexity (for some constant depending on ). We partition the subdivision into vertical strips where, for any strip, any slab intersecting it fully spans the strip and the number of slab changes of adjacent strips is . Considering these strips from left to right, we are solving a special dynamic slab stabbing problem. We can solve this problem by building a persistent interval tree using space that answers each query in time . On the other hand, we can solve the problem in space and time by linearization. Combining these two solutions using [matouvsek1993range] gives the tradeoff.
Footnote 4: Similar to 2D polynomial slab stabbing.

2 Preliminaries

We first review the related lower bound frameworks for geometric reporting data structures. The model of computation we consider is (an augmented version of) the pointer machine model. In this model, the data structure is a directed graph. Each cell of the graph stores an input element and two pointers to other cells. Assume a query requires a subset of the input elements to be output. For the query, we only charge for the pointer navigations. Consider the smallest connected subgraph such that every element of the required output is stored in at least one of its cells. Clearly, the size of the whole graph is a lower bound for space and the size of this subgraph is a lower bound for the query time. Note that this grants the algorithm unlimited computational power as well as full information about the structure of the graph. In this model, there are two main lower bound frameworks, one for range reporting [chazelle1990lower, chazelle1996simplex], and the other for its dual, range stabbing [afshani2012improved]. We describe them in detail here.

2.1 A Lower Bound Framework for Range Reporting Problems

The following result by Chazelle [chazelle1990lower] and later Chazelle and Rosenberg [chazelle1996simplex] provides a general lower bound framework for range reporting problems. In the problem, we are given a set of points in and the queries are from a set of ranges. The task is to build a data structure such that given any query range , we can report the points intersecting the range, i.e., , efficiently. [Chazelle [chazelle1990lower] and Chazelle and Rosenberg [chazelle1996simplex]] Suppose there is a data structure for range reporting problems that uses at most space and can answer any query in where is the input size and is the output size. Assume we can show that there exists an input set of points satisfying the following: There exist subsets , where , is the output of some query and they satisfy the following two conditions: (i) for all , ; and (ii) the size of the intersection of every distinct subsets is bounded by some value , i.e., . Then

. To use this framework, we need to exploit the property of the considered problem and come up with a construction that satisfies the two conditions above. Often, the construction is randomized and thus one challenge is to satisfy condition (ii) in the worst-case. This can be done by showing that the probability that (ii) is violated is very small and then using a union bound to prove that with positive probability the construction satisfies (ii) in the worst-case.
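For small instances, the two conditions of the framework can be checked by brute force. The sketch below is an illustration only: the exact size threshold in condition (i) is not legible in this copy of the text, so it is passed in as the hypothetical parameter `min_size`, and `alpha` and `c` are hypothetical names for the number of subsets in condition (ii) and the bound on their common intersection.

```python
from itertools import combinations

def satisfies_conditions(subsets, min_size, alpha, c):
    """Brute-force check of the two framework conditions.

    (i)  every subset has at least `min_size` elements;
    (ii) every `alpha` distinct subsets share at most `c` elements.
    All parameter names are assumptions made for this sketch.
    """
    sets = [frozenset(s) for s in subsets]
    if any(len(s) < min_size for s in sets):
        return False
    for group in combinations(sets, alpha):
        if len(frozenset.intersection(*group)) > c:
            return False
    return True
```

The check enumerates all groups of `alpha` subsets, so it is exponential in `alpha` and only suitable for sanity-checking toy constructions, not for the randomized instances used in the proofs.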

2.2 A Lower Bound Framework for Range Stabbing Problems

Range stabbing problems can be viewed as the dual of range reporting problems. In this problem, we are given a set of ranges, and the queries are from a set of points. The task is to build a data structure such that given any query point , we can report the ranges “stabbed” by this query point, i.e., , efficiently. A recent framework by Afshani [afshani2012improved] provides a simple way to get the lower bound of such problems. [Afshani [afshani2012improved]] Suppose there is a data structure for range stabbing problems that uses at most space and can answer any query in where is the input size and is the output size. Assume we can show that there exists an input set of ranges that satisfy the following: (i) every query point of the unit square is contained in at least ranges; and (ii) the area of the intersection of every ranges is at most . Then . This is very similar to framework of Theorem 2.1 but often it requires no derandomization.

3 2D Polynomial Slab Reporting and Stabbing

We first consider the case when query ranges are 2D polynomial slabs. The formal definition of 2D polynomial slabs is as follows. Let , where , be a degree univariate polynomial. A 2D polynomial slab is a pair , where is called the base polynomial and the width of the polynomial slab. The polynomial slab is then defined as .
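A membership test for a polynomial slab can be sketched as follows. The exact inequality of the definition is not legible in this copy, so the code assumes the slab consists of the points (x, y) with P(x) ≤ y ≤ P(x) + w, where P is the base polynomial and w the width; this orientation, like the function names, is an assumption.

```python
def eval_poly(coeffs, x):
    """Evaluate a_0 + a_1 x + ... + a_D x^D by Horner's rule.

    coeffs[i] is the coefficient a_i of x^i.
    """
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result

def in_slab(point, coeffs, width):
    """Return True if `point` lies in the slab with base polynomial
    `coeffs` and width `width`, assuming the slab is the set of points
    with P(x) <= y <= P(x) + width."""
    x, y = point
    base = eval_poly(coeffs, x)
    return base <= y <= base + width
```

For example, with base polynomial x^2 (coefficients [0, 0, 1]) and width 0.5, the point (2, 4.25) lies in the slab while (2, 4.6) and (2, 3.9) do not.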

3.1 2D Polynomial Slab Reporting

We consider the 2D polynomial slab reporting problem in this section, where the input is a set of points in , and the query is a polynomial slab. This is an instance of semialgebraic range searching where we have two polynomial inequalities where each inequality has degree and it is defined by parameters given at the query time (thus, ). Note that is also the dimension of linearization for this problem, meaning, the 2D polynomial slab reporting problem can be lifted to the simplex range reporting problem in . Our main result shows that for fast queries (i.e., when the query time is polylogarithmic), this is tight, by showing an space lower bound, in the pointer machine model of computation. To do that, we will use Chazelle’s framework. In our construction of a hard input instance, a derandomization process will be needed. We do this using the following two general lemmas. For the proofs of these lemmas, we refer the readers to Appendix A and Appendix B. [] Let be a set of points chosen uniformly at random in a square of side length in . Let be a set of ranges in such that (i) the intersection area of any ranges is bounded by ; (ii) the total number of intersections is bounded by for . Then with probability , for all distinct ranges , . [] Let be a set of points chosen uniformly at random in a square of side length in . Let be a set of ranges in such that (i) the intersection area of any range and is at least for some constant and a parameter , where ; (ii) the total number of ranges is bounded by . Then with probability , for every range , . Given a univariate polynomial , the following simple lemma establishes the relationship between the coefficient of the maximum degree term and the maximum range within which its value is bounded. This lemma will be used to upper bound the intersection area of two polynomial slabs. For the proof of this lemma, we refer the readers to Appendix C. [] Let be a degree univariate polynomial where for some positive . 
Let be any positive value and be a parameter. If for all , then . With subsection 3.1 at hand, we now show a lower bound for polynomial slab reporting. Let be a set of points in . Let be the set of all 2D polynomial slabs . Then any data structure for that solves polynomial slab reporting for queries from with query time , where is the output size, uses space.

Proof.

We use Chazelle’s framework to prove this theorem. To this end, we will need to show the existence of a hard input instance. We do this as follows. In a square, we construct a set of special polynomial slabs with the following properties: (i) the intersection area of any two slabs is small; and (ii) the area of each slab inside the square is relatively large. Consequently, if we sample points uniformly at random in the square, then in expectation few points will be in the intersection of two slabs, and many points will be in each slab. Intuitively, this satisfies the two conditions of the framework of Subsection 2.1. By picking parameters carefully and applying a derandomization process, we get our theorem. Next, we describe the details. Consider a square . Let be some parameters to be specified later. We generate a set of polynomial slabs with

where for and . Note that we normalize the coefficients such that for any polynomial slab in range , a quarter of this slab is contained in if . To show this, it is sufficient to show that every polynomial is inside , for every . As all the coefficients of the polynomials are positive, it is sufficient to upper bound , among all the polynomials that we have generated. Similarly, this maximum is attained when all the coefficients are set to their maximum value, i.e., when and , resulting in the polynomial . Now it easily follows that . Then, the claim follows from the following simple observation.

Observation 1.

The area of a polynomial slab for when is .

Proof.

The claimed area is . ∎

Next, we bound the area of the intersection of two polynomial slabs. Consider two distinct slabs and . As each slab is created using two polynomials of degree , can have at most connected regions. Consider one connected region and let the interval , be the projection of onto the -axis. Define the polynomial and observe that we must have for all . We now consider the coefficient of the highest degree term of . Let (resp. ) be the coefficient of the degree term in (resp. ). Clearly, if , then the coefficient of in will be zero. Thus, to find the highest degree term in , we need to consider the largest index such that ; in this case, will have degree and coefficient of will have absolute value . When , by Lemma 3.1, . Next, by Observation 1, the area of the intersection of and is . We pick and , for a large enough constant . Then, the intersection area of any two polynomial slabs is bounded by . Since in total we have generated slabs, the total number of intersections they can form is bounded by . By subsection 3.1, with probability , the number of points of in any intersection of two polynomial slabs is at most . Also, as we have shown that the intersection area of every slab with is at least , by subsection 3.1, with probability more than , each polynomial slab has at least points of . It thus follows that with positive probability, both conditions of Theorem subsection 2.1 are satisfied, and consequently, we obtain the lower bound of

So for the “fast query” case data structure, by picking , we obtain a space lower bound of .

3.2 2D Polynomial Slab Stabbing

By small modifications, our construction can also be applied to obtain a lower bound for (the reporting version of) polynomial slab stabbing problems using the framework of Subsection 2.2. One modification is that we need to generate the slabs in such a way that they cover the entire square . The framework provided through Theorem 2.2 is more streamlined: derandomization is not needed and we can directly apply the “volume upper bound” obtained through Lemma 3.1. There is also no factor loss (our lower bound actually uses notation). The major change is that we need to use different parameters since we need to create polynomial slabs, as now they are the input. For the details refer to Appendix D. [] Given a set of 2D polynomial slabs , any data structure for solving the 2D polynomial slab stabbing problem with query time uses space, where is the output size. So for any data structure that solves the 2D polynomial slab stabbing problem using space, subsection 3.2 implies that its query time must be .

3.3 2D Ring Reporting

In this section, we show that any data structure that solves 2D ring reporting with query time log^{O(1)} n must use Ω(n^{3-o(1)}) space. Recall that a ring is the region between two concentric circles and the width of the ring is the difference between the radii of the two circles. In general, we show that if the query time is Q(n)+O(k), then the data structure must use Ω(n^{3-o(1)}/Q(n)^5) space. Note that this is also a better trade-off curve than what we obtained for the polynomial slab reporting problem when Δ = 2. We will still use Chazelle’s framework. We first present a technical geometric lemma which upper bounds the intersection area of two 2D rings. We will later use this lemma to show that with probability more than , a random point set satisfies the first condition of subsection 2.1. Lemma. Consider two rings of width with inner radii of , where , and . Let be the distance between the centers of two rings. When , the intersection area of two rings is bounded by , where . The proof sketch.    For the complete proof see Appendix E. When , the intersection region consists of two triangle-like regions. We only bound the triangle-like region in the upper half rings as shown in Figure 1. We can show that its area is asymptotically upper bounded by the product of its base length and its height . We bound by observing that is the area of triangle but we can also obtain its area using Heron’s formula, given its three side lengths. This gives . Since in this case , the intersection area is upper bounded by as claimed.

Figure 1: Intersections When is Small

When , the intersection region consists of two quadrilateral-like regions. Again we only consider in the upper half of the rings, which is contained in a partial ring, , as shown in Figure 2.

(a) Cover an Intersection by A Partial Ring
(b) Bound the Length of
Figure 2: Cover a Quadrilateral-like Region by a Partial Ring

We show the area of is asymptotically bounded by , where is the distance between the two endpoints of the inner arc. We upper bound by . We use the algebraic representation of the two rings, to bound the length of the projection of on the -axis by ; See Figure 2. We use Heron’s formula to bound the length of the projection of on the -axis by . The maximum of the length of the two projections yields the claimed bound. ∎
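The proof sketch computes the area of a triangle in two ways, once as half of base times height and once with Heron's formula from the three side lengths, and then solves for the height. A minimal sketch of that calculation follows; the function names are hypothetical and the code is not tied to the specific triangle in the figures.

```python
import math

def heron_area(a: float, b: float, c: float) -> float:
    """Area of a triangle with side lengths a, b, c (Heron's formula)."""
    s = (a + b + c) / 2.0
    # Clamp at zero so rounding on nearly degenerate triangles
    # does not produce a negative value under the square root.
    return math.sqrt(max(s * (s - a) * (s - b) * (s - c), 0.0))

def height_over_side(a: float, b: float, c: float) -> float:
    """Height of the triangle measured from side a, obtained from
    area = a * height / 2."""
    return 2.0 * heron_area(a, b, c) / a
```

For a 3-4-5 right triangle the area is 6, so the height over the side of length 5 is 12/5 = 2.4.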

We use Chazelle’s framework to obtain a lower bound for 2D ring reporting. Let and be two squares of side length that are placed distance apart and is directly to the left of . We generate the rings as follows. We divide into a grid where each cell is a square of side length . For each grid point, we construct a series of circles as follows. Let be a grid point. The first circle generated for must pass through a corner of and not intersect the right side of , as shown in Figure 3. Then we create a series of circles centered at by increasing the radius by increments of , as long as it does not intersect the left side of . Every consecutive two circles defines a ring centered on . We repeat this for every grid cell in and this makes up our set of queries. The input points are placed uniformly randomly inside .
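The way rings are generated from a single center can be sketched as follows. The concrete values used in the construction (the first radius, the increment, and the stopping rule tied to the two squares) are not legible in this copy, so the code takes them as the hypothetical parameters `first_radius`, `width`, and `count`.

```python
import math

def concentric_rings(center, first_radius, width, count):
    """Return `count` rings around `center`; ring i lies between the
    circles of radius first_radius + i*width and first_radius + (i+1)*width.

    Each ring is represented as (center, inner_radius, width).
    """
    return [(center, first_radius + i * width, width) for i in range(count)]

def in_ring(point, ring):
    """Return True if `point` lies between the two circles of `ring`."""
    (cx, cy), inner, width = ring
    dist = math.hypot(point[0] - cx, point[1] - cy)
    return inner <= dist <= inner + width
```

Consecutive circles around one center bound interior-disjoint rings, so a point that is not on a circle lies in at most one ring per center; the overlaps that matter for the lower bound come from rings with different centers.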

Figure 3: Generate a Family of Rings at Point

We now show that for the rings we constructed, the intersection of rings is not too large, for some we specify later. More precisely, we prove the following. [] There exists a large enough constant such that in any subset of rings, we can find two rings such that their intersection has area . The proof sketch.    For the complete proof see Appendix F. Let be a set of rings. Suppose for the sake of contradiction that we cannot find two rings in whose intersection area is . Since, by subsection 3.3, the intersection area of any two rings in our construction with distance is , the maximum distance between any two rings in must be .

Figure 4: Intersection of Two Rings

Let be a point in the intersection of rings in . Consider an arbitrary ring centered at and another ring centered at for some . For to contain , we must have for . See Figure 4 for an example. Also, by exploiting the shape of and applying subsection 3.3, we can compute an upper bound for the distance between and , namely, , where is the angle between and . This implies that must fit in a rectangle of size . Since the grid cell size is , only rings are contained in such a rectangle, a contradiction. ∎

We are now ready to plug in some parameters in our construction. We set . First, we claim that from each grid cell , we can draw circles; Let , and be the corners of sorted increasingly according to their distance to . As and are placed distance apart, an elementary geometric calculation reveals that and are vertices of the right edge of , meaning, the smallest circle that we draw from passes through and we keep drawing circles, by incrementing their radii by until we are about to draw a circle that is about to contain . We can see that and thus we draw circles from . As we have grid cells, it thus follows that we have rings in our construction. Also by our construction, the area of each ring within is . To see this, let be an arbitrary point in , let be the intersections of some circle centered at as in Figure 5.

Figure 5: The Angle of a Ring

We connect and let be the center of . Let . In the triangle , all the sides are within constant factors of each other and thus and so the area of the ring inside is at least a constant fraction of the area of the entire ring. Suppose we have a data structure that answers 2D ring reporting queries in time. We set for a large enough constant such that the area of each ring within is at least . By subsection 3.1, if we sample points uniformly at random in , then with probability more than , each ring contains at least points. Also by our construction, the total number of intersections of two rings is bounded by . Then by subsection 3.1, with probability , a point set of size picked uniformly at random in satisfies that the number of points in any intersection of rings is no more than . Now by the union bound, there exist point sets such that each set is the output of some 2D ring query and each set contains at least points. Furthermore, the intersection of any sets is bounded by . Then by subsection 2.1, we obtain a lower bound of

$$S(n)=\Omega\left(\frac{n^{3-o(1)}}{Q(n)^{5}}\right).$$

This proves the following theorem about 2D ring reporting. Any data structure that solves 2D ring reporting on a point set of size $n$ with query time $Q(n)+O(k)$, where $k$ is the output size, must use $S(n)=\Omega(n^{3-o(1)}/Q(n)^{5})$ space. So for any data structure that solves 2D ring reporting in time $O(\log^{O(1)} n)+O(k)$, this theorem implies that $\Omega(n^{3-o(1)})$ space must be used.
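The random sampling step used in the argument above can be illustrated numerically. The following is a minimal sketch, assuming hypothetical ring centers and radii that are not the parameters of the construction; the helper names `in_ring`, `ring_counts`, and `sample_unit_square` are likewise introduced only for illustration. It draws points uniformly at random in the unit square and counts how many land in each ring.

```python
import random


def in_ring(point, center, r_inner, r_outer):
    """Return True if `point` lies between the two concentric circles."""
    dx = point[0] - center[0]
    dy = point[1] - center[1]
    dist_sq = dx * dx + dy * dy
    return r_inner * r_inner <= dist_sq <= r_outer * r_outer


def ring_counts(points, rings):
    """Count the points contained in each ring given as (center, r_inner, r_outer)."""
    return [sum(in_ring(p, c, r1, r2) for p in points) for (c, r1, r2) in rings]


def sample_unit_square(n, seed=0):
    """Draw n points uniformly at random from the unit square [0, 1)^2."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n)]


if __name__ == "__main__":
    # Hypothetical rings whose centers lie to the left of the unit square.
    rings = [((-2.0, 0.5), 2.2, 2.3), ((-2.0, 0.2), 2.5, 2.6)]
    points = sample_unit_square(10_000)
    print(ring_counts(points, rings))
```

With enough sample points, the count in each ring concentrates around the number of points times the area of the ring inside the square, which is the behavior the sampling lemmas quantify.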

3.4 2D Ring Stabbing

Modifications similar to those done in Subsection 3.2 can be used to obtain the following lower bound. See Appendix G for details. Any data structure that solves the 2D ring stabbing problem with query time $Q(n)+O(k)$, where $k$ is the output size, must use space. So for any data structure that solves the 2D ring stabbing problem using $O(n)$ space, this bound implies that its query time must be $\Omega(n^{2/3})$.

4 Conclusion and Open Problems

We investigated lower bounds for range searching with polynomial slabs and rings in $\mathbb{R}^2$. We showed space-time trade-off bounds of $S(n)=\Omega(n^{\Delta+1-o(1)}/Q(n)^{\Delta^2+\Delta})$ and $S(n)=\Omega(n^{3-o(1)}/Q(n)^{5})$ for them, respectively. Both of these bounds are almost tight in the “fast query” case, i.e., when $Q(n)=\log^{O(1)} n$ (up to an $n^{o(1)}$ factor). This refutes the conjecture that there exists a data structure that solves semialgebraic range searching in $\mathbb{R}^d$ using $O(n^d)$ space and $O(\log^{O(1)} n)$ query time. We also studied the “dual” polynomial slab stabbing and ring stabbing problems. For these two problems, we obtained lower bounds and respectively. These bounds are tight when .

Our work, however, brings out some very interesting open problems. To get the lower bounds for the polynomial slabs, we only considered univariate polynomials of degree $\Delta$. In this setting, the number of coefficients is at most $\Delta+1$, and we have also assumed they are all independent. It would be interesting to see if similar lower bounds can be obtained under more general settings. In particular, as the maximum number of coefficients of a bivariate polynomial of degree $\Delta$ is $\binom{\Delta+2}{2}$, it would be interesting to see if an $\Omega(n^{\binom{\Delta+2}{2}-o(1)})$ space lower bound can be obtained for the “fast query” case. Based on our results, it is reasonable to conjecture that the “correct” bound for semialgebraic range searching in the “fast query” setting is where is the maximum number of parameters needed to specify each inequality. (In our lower bound, . It is not hard to generalize our results for any by considering a polynomial with only nonzero coefficients and obtain a space lower bound.) See [agarwal2013range] for a detailed discussion. We believe this is a very exciting open problem. It requires advances on both the upper bound and the lower bound fronts, but resolving it could essentially settle the semialgebraic range searching problem. Last but not least, it would be interesting to consider space-time trade-offs.
For instance, by combining the known “fast query” and “low space” solutions for 2D ring reporting, one can obtain data structures with trade-off curve ; however, our lower bound is $S(n)=\Omega(n^{3-o(1)}/Q(n)^{5})$, and it is not clear which of these bounds is closer to the truth. For the ring searching problem in $\mathbb{R}^2$, our lower bound proof considered a random input point set. Since in most cases a random point set is the hardest input instance and our analysis seems to be tight, we conjecture that our lower bound could be tight, at least when is small enough. In fact, we have some ideas to (possibly) obtain the trade-off curve of in an important special case, namely, when the problem is semigroup range searching in an idempotent semigroup and the input points are uniformly random in the unit square. Recall that in the proof of Figure 3, rings with their centers inside a specific rectangle have relatively large intersection. This property can be used to obtain an efficient data structure in this very special case. So we conjecture that the known upper bound can be sharpened to match our lower bound.
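To make the exponent arithmetic behind such trade-offs concrete, the sketch below takes the two reporting lower bounds as stated in the abstract, $S(n)=\Omega(n^{3-o(1)}/Q(n)^{5})$ for rings and $S(n)=\Omega(n^{\Delta+1-o(1)}/Q(n)^{\Delta^2+\Delta})$ for polynomial slabs, and computes the smallest query exponent they permit for a given space exponent. It ignores the $n^{o(1)}$ factors, and the function name `implied_query_exponent` is a hypothetical helper, not something from the paper.

```python
from fractions import Fraction


def implied_query_exponent(a, b, space_exponent):
    """Smallest q with Q(n) = n^q allowed by S(n) >= n^a / Q(n)^b when S(n) = n^s.

    Writing S(n) = n^s and Q(n) = n^q, the bound reads s >= a - b*q,
    so q >= (a - s) / b. Lower-order n^{o(1)} factors are ignored.
    """
    q = (Fraction(a) - Fraction(space_exponent)) / Fraction(b)
    return max(Fraction(0), q)


def ring_query_exponent(space_exponent):
    """Ring reporting bound from the abstract: a = 3, b = 5."""
    return implied_query_exponent(3, 5, space_exponent)


def slab_query_exponent(delta, space_exponent):
    """Polynomial slab bound from the abstract: a = delta + 1, b = delta^2 + delta."""
    return implied_query_exponent(delta + 1, delta * delta + delta, space_exponent)


if __name__ == "__main__":
    for s in (1, 2, 3):
        print("ring, space exponent", s, "-> query exponent >=", ring_query_exponent(s))
    print("slab, delta=2, linear space ->", slab_query_exponent(2, 1))
```

At space exponent 3 (respectively $\Delta+1$) the implied query exponent drops to zero, which matches the “fast query” reading of the bounds. These values apply only to the reporting bounds quoted above, not to the stabbing problems.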

Figure 6: The Intersection Region for Answering Queries
Acknowledgements.

The authors would like to thank Esther Ezra for sparking the initial ideas behind the proof.

References

Appendix A Proof of subsection 3.1


Proof.

Consider any intersection region of ranges with area . Let

be an indicator random variable with

Let . Clearly, . By Chernoff’s bound,

for any . Let , then

Now we pick , since for some constant , we have

Since the total number of intersections is bounded by , the number of cells in the arrangement is also bounded by and thus by the union bound, for sufficiently large , with probability , the number of points in every intersection region is less than . ∎
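The proof pattern here, a Chernoff tail bound for one region followed by a union bound over all regions, can be checked numerically. The sketch below is an illustration under stated assumptions: it uses one standard multiplicative form of the Chernoff bound, $\Pr[X \ge (1+\delta)\mu] \le \exp(-\delta^2\mu/(2+\delta))$ for $\delta > 0$, which may differ from the exact form used in the paper, and the numeric parameters are hypothetical.

```python
import math


def binomial_upper_tail(n, p, k):
    """Exact Pr[X >= k] for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def chernoff_upper_bound(mu, delta):
    """Standard Chernoff bound on Pr[X >= (1 + delta) * mu] for delta > 0."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    return math.exp(-delta * delta * mu / (2 + delta))


def union_bound(num_events, per_event_probability):
    """Upper bound on the probability that at least one of the events occurs."""
    return min(1.0, num_events * per_event_probability)


if __name__ == "__main__":
    # Hypothetical numbers: 200 sample points, a region of area 0.05.
    n, area, delta = 200, 0.05, 2.0
    mu = n * area
    threshold = math.ceil((1 + delta) * mu)
    exact = binomial_upper_tail(n, area, threshold)
    bound = chernoff_upper_bound(mu, delta)
    print("exact tail:", exact, "Chernoff bound:", bound)
    print("union bound over 1000 regions:", union_bound(1000, bound))
```

The exact binomial tail sits below the Chernoff bound, and multiplying the per-region bound by the number of regions gives the failure probability that the union bound controls.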

Appendix B Proof of subsection 3.1


Proof.

The proof of this lemma is similar to the one for subsection 3.1. We pick points in uniformly at random. Let

be the indicator random variable with

We know that the area of each range is at least . Then the expected number of points in each range is . Consider an arbitrary range, let , then by Chernoff’s bound

The second-to-last inequality follows from and the last inequality follows from . Since the total number of ranges is bounded by , by a standard union bound argument, the lemma holds. ∎
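The lower-tail version of the same argument, showing that every range receives enough points, can be sketched as follows. This is an illustration only: it assumes the standard bound $\Pr[X \le (1-\delta)\mu] \le \exp(-\delta^2\mu/2)$ for $0 < \delta < 1$, which may not be the exact form used in the proof, and all numeric parameters and function names are hypothetical.

```python
import math


def binomial_lower_tail(n, p, k):
    """Exact Pr[X <= k] for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(0, k + 1))


def chernoff_lower_bound(mu, delta):
    """Standard Chernoff bound on Pr[X <= (1 - delta) * mu] for 0 < delta < 1."""
    if not 0 < delta < 1:
        raise ValueError("delta must lie strictly between 0 and 1")
    return math.exp(-delta * delta * mu / 2)


def failure_probability_all_ranges(num_ranges, mu, delta):
    """Union bound on the probability that some range gets too few points."""
    return min(1.0, num_ranges * chernoff_lower_bound(mu, delta))


if __name__ == "__main__":
    # Hypothetical numbers: 200 sample points, each range has area 0.1.
    n, area, delta = 200, 0.1, 0.5
    mu = n * area
    exact = binomial_lower_tail(n, area, math.floor((1 - delta) * mu))
    print("exact tail:", exact, "Chernoff bound:", chernoff_lower_bound(mu, delta))
    print("all ranges:", failure_probability_all_ranges(1000, 200.0, delta))
```

Because the bound decays exponentially in the expected count, a modest increase in the expected number of points per range is enough to absorb a polynomial number of ranges in the union bound.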

Appendix C Proof of subsection 3.1


Proof.

First note that w.l.o.g., we can assume , because otherwise we can consider a new polynomial . Since is still a degree univariate polynomial with , and for all , , to bound , we only need to consider on the interval . Assume for the sake of contradiction that . We show that this will lead to . We pick points , where and , on the polynomial. Then can be expressed as

The coefficient of the degree term is therefore

We pick for and we therefore obtain

We now upper bound . We assume , the case for is symmetric. When ,

where the last inequality follows from and . Also by assumption, , we therefore have

where the last inequality follows from . However, in , , a contradiction. Therefore, . ∎
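The proof expresses the polynomial through points picked on it and then reads off the coefficient of the highest-degree term. A standard way to do this is Lagrange interpolation, where the leading coefficient of the interpolant through $(x_i, y_i)$ is $\sum_i y_i / \prod_{j \ne i}(x_i - x_j)$. The sketch below checks this identity with exact rational arithmetic; it assumes the Lagrange form is the one intended, and the function name `leading_coefficient` and the sample polynomials are illustrative only.

```python
from fractions import Fraction


def leading_coefficient(xs, ys):
    """Coefficient of x^(len(xs) - 1) in the polynomial interpolating (xs[i], ys[i]).

    Uses the Lagrange form: sum over i of ys[i] / prod_{j != i} (xs[i] - xs[j]).
    The xs must be pairwise distinct.
    """
    total = Fraction(0)
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        denom = Fraction(1)
        for j, xj in enumerate(xs):
            if j != i:
                denom *= Fraction(xi) - Fraction(xj)
        total += Fraction(yi) / denom
    return total


if __name__ == "__main__":
    # Sample P(x) = 3x^2 - x + 5 at three points and recover its leading coefficient.
    xs = [0, 1, 2]
    ys = [3 * x * x - x + 5 for x in xs]
    print(leading_coefficient(xs, ys))
```

Since each term is a sample value divided by a product of point gaps, bounding the sample values and the gaps bounds the leading coefficient, which is the kind of estimate the contradiction argument relies on.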

Appendix D Proof of subsection 3.2


Proof.

We use Afshani’s lower bound framework as described in subsection 2.2. First we generate polynomial slabs in a unit square as follows. Consider polynomial slabs with their base polynomials being:

where