An Efficient Algorithm for Generalized Polynomial Partitioning and Its Applications

12/26/2018 ∙ by Pankaj K. Agarwal, et al. ∙ Duke University ∙ NYU college ∙ The University of British Columbia

Guth showed that given a family S of n g-dimensional semi-algebraic sets in R^d and an integer parameter D ≥ 1, there is a d-variate partitioning polynomial P of degree at most D, so that each connected component of R^d ∖ Z(P) intersects O(n / D^{d-g}) sets from S. Such a polynomial is called a "generalized partitioning polynomial". We present a randomized algorithm that efficiently computes such a polynomial P. Specifically, the expected running time of our algorithm is only linear in |S|, where the constant of proportionality depends on d, D, and the complexity of the description of S. Our approach exploits the technique of "quantifier elimination" combined with that of "ε-samples". We present four applications of our result. The first is a data structure for answering point-location queries among a family of semi-algebraic sets in R^d in O(log n) time; the second is a data structure for answering range search queries with semi-algebraic ranges in O(log n) time; the third is a data structure for answering vertical ray-shooting queries among semi-algebraic sets in R^d in O(log^2 n) time; and the fourth is an efficient algorithm for cutting algebraic planar curves into pseudo-segments, i.e., into Jordan arcs, each pair of which intersect at most once.


1 Introduction

Background and related work.

In 2010, Guth and Katz [14] resolved the Erdős distinct distances problem in the plane. A major ingredient in their proof was a partitioning theorem for points in R^d. Specifically, they proved that, given a set of n points in R^d and an integer D ≥ 1, there is a d-variate “partitioning polynomial” P of degree at most D so that each connected component of R^d ∖ Z(P) contains O(n / D^d) points from the set. Their polynomial partitioning theorem has led to a flurry of new results in combinatorial and incidence geometry, harmonic analysis, and theoretical computer science.

The Guth-Katz result established the existence of a partitioning polynomial, but it did not give an effective way to compute such a polynomial given a set of points. In [3], Agarwal, Matoušek, and Sharir developed an efficient algorithm to compute partitioning polynomials, matching the degree bound obtained in [14] up to a constant factor. They used their algorithm to obtain a linear-size data structure for the problem of range searching with semi-algebraic sets in the “low storage / sublinear query” regime.

In 2015, Guth [13] generalized the Guth-Katz partitioning polynomial result from points to semi-algebraic sets. (Guth stated his result for the special case where the semi-algebraic sets are real algebraic varieties, but his proof in fact holds in the more general setting of semi-algebraic sets.) Recall that a semi-algebraic set in R^d is the locus of points in R^d that satisfy a Boolean formula over a set of polynomial inequalities. Informally, he proved that given a collection of n g-dimensional semi-algebraic sets in R^d (see [9, Chapter 2] for a formal definition of the dimension of a semi-algebraic set) and an integer D ≥ 1, there is a d-variate partitioning polynomial P of degree at most D so that each connected component of R^d ∖ Z(P) intersects O(n / D^{d-g}) semi-algebraic sets from the collection (the implicit constant in the O(·) notation depends on d and on the degree and number of polynomials required to define each semi-algebraic set). We refer to such a polynomial as a generalized partitioning polynomial.

As with the Guth-Katz result, Guth’s proof established the existence of a generalized partitioning polynomial, but it did not give an effective way to compute such a polynomial given a collection of semi-algebraic sets. In [4], the last three authors developed a computationally efficient way to construct a partitioning polynomial for a set of algebraic curves in R^3. For other settings, however, no effective method for computing a partitioning polynomial was known prior to the present work.

Our results.

Our main result is a computationally efficient implementation of Guth’s polynomial partitioning theorem for semi-algebraic sets (Theorem 4). Given a set of n semi-algebraic sets in R^d, our algorithm computes a partitioning polynomial of degree D in expected running time linear in n and singly exponential in D.

Next, we present four applications of our algorithm in Section 4:

  • Let be a family of semi-algebraic sets in , each of complexity at most for some constant (see Section 2 for the definition of the complexity of a semi-algebraic set). Each set in is assigned a weight that belongs to a semigroup. We present a data structure of size , for any constant , that can compute, in time, the cumulative weight of the sets in containing a query point. The data structure can be constructed in randomized expected time. This is a significant improvement over the best known data structure by Koltun [16], for , that used space.

  • Let be a set of points in , each of which is assigned a weight, and let be a (possibly infinite) family of semi-algebraic sets in . Suppose that there exists a positive integer and an injection so that for each , the set is a semi-algebraic set in of complexity at most . We can construct in randomized expected time a data structure of size , for any constant , that can compute in time the cumulative weight of for a query range . The previous best known data structure used space.

  • Given a family of semi-algebraic sets in , we present a data structure of size , for any constant , that can answer vertical ray shooting queries in time. The data structure can be constructed in randomized expected time.

  • Finally, we follow the technique of Sharir and Zahl [19] to cut algebraic planar curves into a collection of pseudo-segments (that is, a collection of Jordan arcs, each pair of which intersects at most once), where the constant of proportionality depends on the degree of the curves. By exploiting Theorem 4, we show that this collection can be constructed within a comparable time bound.

2 Preliminaries

In what follows, the complexity of a semi-algebraic set S in R^d is the minimum value b so that S can be represented as the locus of points x ∈ R^d satisfying a Boolean formula with at most b atoms of the form P(x) < 0 or P(x) = 0, with each P being a d-variate polynomial of degree at most b.

Hereafter we write to mean that there exists a constant depending only on so that , for all positive integers .

Our analysis makes extensive use of concepts and results from real algebraic geometry and random sampling. We review them below.

2.1 Polynomials, partitioning, and quantifier elimination

Sign conditions.

Consider polynomials P_1, …, P_ℓ ∈ R[x_1, …, x_d]. A sign condition on (P_1, …, P_ℓ) is an element of {−1, 0, 1}^ℓ. A strict sign condition on (P_1, …, P_ℓ) is an element of {−1, 1}^ℓ. A sign condition σ = (σ_1, …, σ_ℓ) is realizable if the set

{ x ∈ R^d : sign(P_j(x)) = σ_j for each j = 1, …, ℓ }   (1)

is non-empty. A realizable strict sign condition is defined analogously. The set (1) is called the realization of the sign condition. The set of realizations of sign conditions (resp., realizations of strict sign conditions) corresponding to the tuple (P_1, …, P_ℓ) is the collection of all non-empty sets of the above form. The realizations of sign conditions are pairwise disjoint and partition R^d, by definition.

While a tuple of polynomials has sign conditions and strict sign conditions, not all of them may be realizable. In fact, Milnor and Thom (see, e.g., [8, 12]) showed that any  polynomials in of degree at most (with ) have at most  realizable sign conditions.
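As a small illustration of realizable versus non-realizable sign conditions, the following sketch enumerates the sign vectors realized by two univariate polynomials. It is a toy under stated assumptions, not part of the paper's algorithm: the polynomials x and x − 1, the helper names `sign` and `realized_sign_conditions`, and the choice of test points are all hypothetical. For univariate polynomials the signs are constant between consecutive roots, so testing every root plus one point in each interval between and around the roots is exhaustive.

```python
from fractions import Fraction
from itertools import product


def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def realized_sign_conditions(polys, test_points):
    """Collect the sign vectors attained by the polynomials at the test points."""
    return {tuple(sign(p(x)) for p in polys) for x in test_points}


# Hypothetical example: P_1(x) = x and P_2(x) = x - 1 on the real line.
polys = [lambda x: x, lambda x: x - 1]

# The roots 0 and 1, plus one point in each open interval they delimit.
test_points = [Fraction(-1), Fraction(0), Fraction(1, 2), Fraction(1), Fraction(2)]

realized = realized_sign_conditions(polys, test_points)
all_conditions = set(product((-1, 0, 1), repeat=len(polys)))
strict_realized = {s for s in realized if 0 not in s}

print(len(all_conditions), "sign conditions,", len(realized), "realizable")
print("realizable strict sign conditions:", sorted(strict_realized))
```

In this toy instance only some of the 3^2 sign vectors are attained; for example, no point satisfies x < 0 and x − 1 > 0 simultaneously.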

Polynomials and partitioning.

The set of polynomials in R[x_1, …, x_d] of degree at most D is a real vector space of dimension (D+d choose d); we identify this vector space with R^(D+d choose d). For a point a ∈ R^(D+d choose d), let P_a be the corresponding polynomial of degree at most D.

Remark 1.

Consider the polynomial Q ∈ R[a, x] given by Q(a, x) = P_a(x). Since we can write P_a(x) = Σ_I a_I x^I, where each x^I is a monomial of degree at most D, Q has degree D + 1.
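The counting behind the identification of polynomials with coefficient vectors, and behind Remark 1, can be checked numerically. The sketch below is illustrative only; the function names `exponents` and `degree_of_joint_polynomial` are hypothetical, and d and D denote the number of variables and the degree bound. It enumerates the monomials of degree at most D in d variables, compares their number to the binomial coefficient (D+d choose d), and computes the degree of the polynomial obtained by treating the coefficients as additional variables.

```python
from itertools import product
from math import comb


def exponents(d, D):
    """All exponent vectors I of monomials x^I in d variables with total degree <= D."""
    return [e for e in product(range(D + 1), repeat=d) if sum(e) <= D]


def degree_of_joint_polynomial(d, D):
    """Degree of sum_I a_I * x^I viewed as a polynomial in both a and x.

    Each term has degree 1 in the coefficient variable a_I plus |I| in x.
    """
    return max(1 + sum(e) for e in exponents(d, D))


d, D = 3, 4
print(len(exponents(d, D)), comb(D + d, d))   # number of monomials vs. binomial
print(degree_of_joint_polynomial(d, D))       # D + 1
```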

For each positive integer , let be the smallest positive integer so that ; we have . For each , pick a -dimensional subspace of the vector space of polynomials in of degree at most . These subspaces will be fixed hereafter. For each positive integer , define the product space

(2)

We identify each point with a -tuple of polynomials where . For each , and thus .

Let , let be a collection of semi-algebraic sets in , and let . We say that is a -partitioning tuple for if has realizable strict sign conditions and the realization of each of them intersects at most sets from .

Guth [13] proved that, if is chosen appropriately, then a -partitioning tuple is guaranteed to exist:

Proposition 1 (Generalized Polynomial Partitioning [13]).

Let be a family of semi-algebraic sets in , each of dimension at most and complexity at most . For each , there exists a -partitioning tuple for , with

We also recall Theorem 2.16 from [7]:

Proposition 2 (Point Location in Semi-Algebraic Sets).

Let be a set of at most  polynomials in of degree at most . Then there is an algorithm that computes a set of points meeting every semi-algebraically connected component of every realizable sign condition on  in time . There is also an algorithm providing the list of signs of all the polynomials of at each of these points in time .

Singly exponential quantifier elimination.

Let and be non-negative integers and let . Let be a first-order formula given by

(3)

where is a block of free variables; is a block of variables, and is a quantifier-free Boolean formula with atomic predicates of the form , with

The Tarski-Seidenberg theorem states that the set of points satisfying the formula  is semi-algebraic. The next proposition is a quantitative version of this result that bounds the number and degree of the polynomial equalities and inequalities needed to describe the set of points satisfying . This proposition is known as a “singly exponential quantifier elimination,” and its more general form (where may contain a mix of and quantifiers) can be found in [7, Theorem 2.27].

Proposition 3.

Let be a set of at most polynomials, each of degree at most in real variables. Given a formula of the form (3), there exists an equivalent quantifier-free formula

(4)

where are polynomials in the variables , ,

(5)

and the degrees of the polynomials are bounded by .

2.2 Range spaces, VC dimension, and ε-samples

We first recall several standard definitions and results from [15, Chapter 5]. A range space is a pair Σ = (X, ℛ), where X is a set and ℛ is a collection of subsets of X. Let Σ = (X, ℛ) be a range space and let A ⊂ X be a set. We define the restriction of Σ to A, denoted by Σ_A, to be (A, ℛ_A), where ℛ_A = {A ∩ r : r ∈ ℛ}. If A is finite, then |ℛ_A| ≤ 2^|A|. If equality holds, then we say A is shattered. We define the shatter function by π_ℛ(z) = max_{|A| = z} |ℛ_A|. The VC dimension of Σ is the largest cardinality of a set shattered by ℛ. If arbitrarily large (finite) subsets can be shattered, we say that the VC dimension of Σ is infinite.
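The definitions of restriction, shattering, shatter function, and VC dimension can be made concrete with a brute-force computation on a tiny range space. The following sketch assumes a hypothetical example not taken from the paper: the ground set is five points on a line and the ranges are all intervals (including the empty one). The function names are illustrative, and the exhaustive search is only feasible for very small inputs.

```python
from itertools import combinations


def restriction(ranges, A):
    """The family {A ∩ r : r in ranges}, as a set of frozensets."""
    A = frozenset(A)
    return {A & r for r in ranges}


def is_shattered(ranges, A):
    """A is shattered when every subset of A arises as A ∩ r for some range r."""
    return len(restriction(ranges, A)) == 2 ** len(A)


def shatter_function(X, ranges, z):
    """Maximum size of a restriction over all z-element subsets of X."""
    return max(len(restriction(ranges, A)) for A in combinations(X, z))


def vc_dimension(X, ranges):
    """Largest size of a shattered subset of X (brute force)."""
    best = 0
    for k in range(1, len(X) + 1):
        if any(is_shattered(ranges, A) for A in combinations(X, k)):
            best = k
        else:
            break  # subsets of shattered sets are shattered, so we can stop
    return best


# Hypothetical range space: points 1..5, ranges = all integer intervals {i, ..., j-1}.
X = list(range(1, 6))
ranges = [frozenset(range(i, j)) for i in range(1, 7) for j in range(i, 7)]

print(vc_dimension(X, ranges))
print(shatter_function(X, ranges, 3))
```

For intervals, no three points a < b < c can be shattered because the subset {a, c} is never cut out by an interval, which is what the brute-force search detects.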

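Let Σ = (X, ℛ) be a range space, A a finite subset of X, and ε > 0. A set B ⊂ A is an ε-sample (also known as ε-approximation) of Σ_A (the restriction of Σ to A) if

| |A ∩ r| / |A| − |B ∩ r| / |B| | ≤ ε   for each range r ∈ ℛ.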

The following classical theorem of Vapnik and Chervonenkis [20] guarantees that if the VC dimension of a range space is finite, then for each positive ε, a sufficiently large random sample of a finite set A is likely to be an ε-sample. The bound stated below is not the strongest possible (see, e.g., [15, Chapter 7] for an improved bound), but it is sufficient for our purposes.

Proposition 4 (ε-Sample Theorem).

Let be a range space of VC dimension at most and let be finite. Let . Then a random subset of cardinality is an -sample for

with probability at least

.
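To make the ε-sample condition tangible, the sketch below checks it by brute force: a subset B of a finite set A is an ε-sample if, for every range r, the fraction of A inside r and the fraction of B inside r differ by at most ε. The example data (100 integer points, every tenth point as the sample, all intervals as ranges) and the function names are hypothetical choices made for illustration; exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction


def discrepancy(A, B, r):
    """| |A ∩ r|/|A| - |B ∩ r|/|B| | for a single range r."""
    A, B = set(A), set(B)
    return abs(Fraction(len(A & r), len(A)) - Fraction(len(B & r), len(B)))


def max_discrepancy(A, B, ranges):
    """Worst-case discrepancy over all ranges."""
    return max(discrepancy(A, B, r) for r in ranges)


def is_eps_sample(A, B, ranges, eps):
    """True when B is an eps-sample of A with respect to the given ranges."""
    return max_discrepancy(A, B, ranges) <= eps


# Hypothetical instance: A = {0, ..., 99}, B = every tenth point, ranges = intervals.
A = list(range(100))
B = list(range(0, 100, 10))
ranges = [set(range(i, j + 1)) for i in range(100) for j in range(i, 100)]

print(max_discrepancy(A, B, ranges))
print(is_eps_sample(A, B, ranges, Fraction(1, 10)))
```

Here the sample is chosen deterministically; the theorem concerns random samples, for which the same check would succeed with the stated probability once the sample is large enough.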

Proposition 5 ([12, 15]).

Let be a range space whose shatter function satisfies the bound , for all positive integers , where is a real parameter. Then has VC dimension at most .

We next closely follow the arguments in the proof of Corollary 2.3 from [12], and show the following theorem:

Theorem 1.

Let be a semi-algebraic set of complexity . For each , define . Then the range space has VC dimension at most .

Proof.

By assumption, there are polynomials and a Boolean formula , so that, for , if and only if .

Put . Fix a positive integer and let . Our goal is to bound

For each define

Let and suppose that there exists with . This means for each and for each , i.e., the semi-algebraic set consisting of those points satisfying the Boolean formula

is non-empty. Observe that if and are distinct subsets of , then and are disjoint and, in fact,

Each of the non-empty sets contains at least one realization of a sign condition of the polynomials

each of degree at most . By a result of Milnor and Thom stated in Section 2.1, these polynomials determine at most realizable sign conditions. Thus

(6)

Since (6) holds for every choice of , we conclude that

By Proposition 5, has VC dimension at most . ∎

3 Computing Generalized Polynomial Partition

In this section we obtain the main result of the paper: given a collection of semi-algebraic sets in , each of dimension at most  and complexity at most , a -partitioning tuple for can be computed efficiently. We obtain this result in several steps. First, we represent a semi-algebraic set in of complexity at most as a point in a parameter space —each point in corresponds to a tuple of polynomials in  variables, each of degree at most . We use the set (defined in Section 2.1) to parameterize the space of sign conditions specified by tuples of polynomials. With these parameterizations in place, the condition that a semi-algebraic set intersects a given sign condition is represented by a subset  of pairs of points from .

In Theorem 2, we prove that is semi-algebraic, and its complexity depends only on , , and . This means that for each semi-algebraic set , the set of tuples whose realization intersects is semi-algebraic. This in turn implies that if are semi-algebraic sets and if , then the set of tuples whose realization intersects at most of the sets is semi-algebraic. Unfortunately, however, the complexity of this subset of might be very large; in particular, it is likely to be exponential in .

To circumvent this problem, we use the theory of ε-samples. That is, we show that rather than considering a large number of semi-algebraic sets , it suffices to select a small number of these sets at random. If a tuple has the property that each of its realizable sign conditions intersects few sets from the random sample, then with high probability each of the realizable sign conditions will intersect few sets from the original collection. This property is shown by applying Theorems 1 and 2.

3.1 The parameter space of semi-algebraic sets

Fix positive integers , , and , and let . Hereafter we assume that , which can be enforced by choosing sufficiently large.

As above, we denote by a family of semi-algebraic sets in , each of dimension at most  and complexity at most . Let be a Boolean function. Let . We identify a point with the semi-algebraic set

Observe that each semi-algebraic set in is of the form for some choice of and a Boolean function . Let . For each , define , where is the tuple associated to . Define

Theorem 2.

The set is semi-algebraic, defined by polynomials, each of degree .

Before proceeding with the proof of the theorem, we note that the complexity of is only singly exponential in , which we will exploit in Section 3.3.

Proof.

Define . The condition is a Boolean condition on polynomials. By Remark 1, each of these polynomials has degree at most . Similarly, the condition consists of polynomial inequalities, each of degree at most . This means that there exists a set of polynomials of degree in the variables , and a Boolean function so that

With the above definitions

We now apply Proposition 3. We have a set of polynomials, each of degree at most . The variables and from the hypothesis of Proposition 3 are set to and , recall that is sufficiently larger than , and thus is a suitably chosen polynomial function of . With these assignments, Proposition 3 says that can be expressed as a quantifier-free formula of the form

(7)

where are polynomials in the variables , ,

(8)

where the degrees of the polynomials are bounded by .

Summarizing, the quantifier-free formula (7) for is a Boolean combination of polynomial inequalities, each of degree , as claimed. ∎

3.2 A singly-exponential algorithm

In this section, we discuss how to compute a -partitioning tuple (for an appropriate value of ) for a small number of semi-algebraic sets.

Theorem 3.

Let be a family of semi-algebraic sets in , each of dimension at most and complexity at most . Let and let . Then a -partitioning tuple for can be computed in time.

Proof.

Set . As above, we identify points in with tuples of polynomials. The argument in Theorem 2, as well as the fact that the class of semi-algebraic sets is closed under the operation of taking a projection, show that, for each and each ,

is a semi-algebraic set in that can be expressed as a Boolean combination of polynomials, each of degree .

Let be a constant to be specified later (the constant will depend only on and ) and let ; observe that . For each and for each set of cardinality , the set is a semi-algebraic set in that can be expressed as a Boolean combination of polynomials, each of degree , where . Therefore

(9)

is a semi-algebraic set in that can be expressed as a Boolean combination of

polynomials, each of degree . This and the fact that the class of semi-algebraic sets is closed under the operation of taking complement imply that

is a semi-algebraic set in that can be expressed as a Boolean combination of polynomials, each of degree . This means that the set

(10)

is a semi-algebraic set in that can be expressed as a Boolean combination of polynomials, each of degree . Recall that by assumption and . It thus follows that the degree is bounded by . Similarly, the dimension of the space is bounded by as well.

Proposition 1 guarantees that if is selected sufficiently large, then the set (10) is non-empty. By Proposition 2, it is possible to locate a point in this set in time, concluding the proof of the theorem. ∎

3.3 Speeding up the algorithm using ε-sampling

In this section we first state and prove the following lemma:

Lemma 1.

For every choice of positive integers and , there is a constant so that the following holds. Let be a positive integer. Let be a finite collection of semi-algebraic sets in , each of dimension at most and complexity at most . Let be a positive integer and let . Let be a randomly chosen subset of of cardinality at least and let be a -partitioning tuple for . Then with probability at least , each of the realizable sign conditions of intersects elements from .

Note that Lemma 1 states that it is sufficient to consider a random subset of size polynomial in in order to obtain an appropriate partitioning tuple for the entire collection , with reasonable probability.

Proof.

Define and as above, and let . For each , define the range

Define . By Theorems 1 and 2, the range space has VC dimension . Define , where the union is taken over the Boolean functions . Since the shatter function grows by at most a multiplicative factor of , the VC dimension of the range space is as well (this is a standard fact, see, e.g., [15, Chapter 5]).

We are now ready to prove the statement of the lemma. Set . Suppose that is an ε-sample of and that is a -partitioning tuple for . Then for each range , we have . Combining this with the ε-sample property, we obtain:

and by the choice of (with an appropriate constant of proportionality) we have:

and thus

The corresponding cardinality of is

We next proceed as follows. We select a random sample of of cardinality and use Theorem 3 to compute the corresponding partitioning tuple . This takes time. By Lemma 1, this tuple will be a -partitioning tuple for with probability at least . We can verify whether the partitioning tuple works in time. If the tuple does not produce the appropriate partition, we discard it and try again.

The verification step is done as follows. For each semi-algebraic set we compute the subset of sign conditions of with which it has a non-empty intersection. To this end, we restrict each of the polynomials to and apply Proposition 2 to this restricted collection, thereby obtaining a set of points meeting each semi-algebraically connected component of each of the realizable sign conditions, as well as the corresponding list of signs of the restricted polynomials for each of these points. This is done in time for a single semi-algebraic set , and overall time, over all sets. We refer the reader to [6] for further details concerning the complexity of the restriction of to .

We have thus shown:

Theorem 4.

Let be a finite collection of semi-algebraic sets in , each of which has dimension at most and complexity at most . Let and let . Then a -partitioning tuple for can be computed in expected time by a randomized algorithm.
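The sample, compute, verify, and retry pattern behind this result can be sketched in a much simpler setting. The code below is a one-dimensional toy under loud assumptions, not the paper's algorithm: the input sets are points on the real line (0-dimensional sets with d = 1), the zero set of a degree-D univariate polynomial is represented directly by its D roots (`cuts`), and the expensive quantifier-elimination step on the sample is replaced by taking sample quantiles. The names `partition_from_sample`, `max_cell_load`, `las_vegas_partition`, and the `slack` parameter are hypothetical.

```python
import bisect
import random


def partition_from_sample(sample, D):
    """Toy stand-in for the per-sample computation: D cut points at sample quantiles."""
    s = sorted(sample)
    return [s[(i * len(s)) // (D + 1)] for i in range(1, D + 1)]


def max_cell_load(points, cuts):
    """Largest number of input points in any open cell of the line minus the cuts."""
    cuts = sorted(set(cuts))
    loads = [0] * (len(cuts) + 1)
    for p in points:
        i = bisect.bisect_left(cuts, p)
        if i < len(cuts) and cuts[i] == p:
            continue  # p lies on the zero set, so it belongs to no open cell
        loads[i] += 1
    return max(loads)


def las_vegas_partition(points, D, sample_size, slack=2.0, rng=random):
    """Repeat: sample, partition the sample, verify on the full input."""
    target = slack * len(points) / (D + 1)
    attempts = 0
    while True:
        attempts += 1
        sample = rng.sample(points, min(sample_size, len(points)))
        cuts = partition_from_sample(sample, D)
        # Verification pass over the whole input; linear in len(points).
        if max_cell_load(points, cuts) <= target:
            return cuts, attempts


rng = random.Random(0)
points = [0.5 * i for i in range(10000)]
cuts, attempts = las_vegas_partition(points, D=4, sample_size=200, rng=rng)
print(cuts, attempts)
```

The design point mirrored here is that the costly computation touches only the small sample, while the full input is visited only by the cheap verification pass, so the expected total work stays linear in the input size when each attempt succeeds with constant probability.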

4 Applications

In this section we describe a few applications of Theorem 4, namely, point location amid semi-algebraic sets, semi-algebraic range searching with logarithmic query time, vertical ray shooting amid semi-algebraic sets, and cutting algebraic curves into pseudo-segments.

4.1 Point location

Let S be a set of n semi-algebraic sets in R^d, each of complexity at most b. Each set s ∈ S has a weight w(s). We assume that the weights belong to a semigroup, i.e., subtractions are not allowed, and that the semigroup operation can be performed in constant time. We wish to preprocess S into a data structure so that the cumulative weight of the sets in S that contain a query point can be computed in O(log n) time. Note that if the weight of each set is 1 and the semigroup operation is Boolean ∨, then the point-location query becomes an instance of a union-membership query: determine whether the query point lies in ⋃S. We follow a standard hierarchical partitioning scheme of space, e.g., as in [10, 1], but use Theorem 4 at each stage. Using this hierarchical partition, we construct a tree data structure T of O(log n) depth, and a query is answered by following a path in T.

More precisely, we fix sufficiently large positive constants and . If , consists of a single node that stores itself. So assume that . Using Theorem 4, we construct a tuple of -variate polynomials of degree at most , which have realizable sign conditions, each of which with a realization that meets the boundaries of at most sets of . For each realizable sign condition , let be the family of sets whose boundaries meet the realization of , and let be the family of sets that contain the realization of . We compute , , and , as follows: We first apply Proposition 2 to  to compute, in time, a representative point in each realization of a sign condition.

Next, fix a set and mark all realizations that meet the boundary of . This step is similar to the one described in the proof of Theorem 4, that is, we restrict each of the polynomials to the algebraic varieties representing the boundary of and apply Proposition 2 to this restricted collection. Each remaining realization is either contained in  or disjoint from it, which can be determined by testing, for each such realization, whether its representing point (computed earlier using Proposition 2 on the original collection ) is contained in . This task can be completed in overall time over all sets of .

We create the root of and store the tuple at . We then create a child for each realizable sign condition and store and at . We recursively construct the data structure for each and attach it to as its subtree.

Since each node of has degree at most and the size of the subproblem reduces by a factor of at each level of the recursion, a standard analysis shows that the total size of the data structure is , where is a constant that can be made arbitrarily small by choosing and to be sufficiently large. Similarly, the expected preprocessing time is also .

Given a query point , we compute the cumulative weight of the sets containing by traversing a path in the tree in a top-down manner: We start from the root and maintain a partial weight , which is initially set to . At each node , we find the sign condition  of the polynomial tuple at whose realization contains , add to , and recursively query the child of . The total query time is , where the constant of proportionality depends on (and thus on ). Putting everything together, we obtain the following:

Theorem 5.

Let be a set of semi-algebraic sets in , each of complexity at most for some constant , and let be the weight of each set that belongs to a semigroup. Assuming that the semigroup operation can be performed in constant time, can be preprocessed in randomized expected time into a data structure of size , for any constant , so that the cumulative weight of the sets that contain a query point can be computed in time.
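The hierarchical structure described in this section can be illustrated with a one-dimensional toy, offered as a sketch under explicit assumptions rather than as the paper's construction. The sets are weighted closed intervals on the line, the partition at each node is given by up to D cut points taken as quantiles of the interval endpoints (a stand-in for the partitioning tuple of Theorem 4), and each cell stores the combined weight of the intervals that contain the whole cell, while intervals with an endpoint inside the cell are handled recursively. All names (`build`, `query`, `combine`, `fold`, `leaf_size`) are hypothetical, and `None` marks an empty accumulated weight because a semigroup need not have an identity element.

```python
import bisect


def combine(x, y, op):
    """Semigroup combination that tolerates an empty (None) operand."""
    if x is None:
        return y
    if y is None:
        return x
    return op(x, y)


def fold(weights, op):
    acc = None
    for w in weights:
        acc = combine(acc, w, op)
    return acc


def build(sets, op, D=3, leaf_size=2):
    """sets: list of (lo, hi, weight) closed intervals; op: semigroup operation."""
    if len(sets) <= leaf_size:
        return {"leaf": sets}
    endpoints = sorted(e for lo, hi, _ in sets for e in (lo, hi))
    m = len(endpoints)
    cuts = sorted({endpoints[(i * m) // (D + 1)] for i in range(1, D + 1)})
    bounds = [float("-inf")] + cuts + [float("inf")]
    children = []  # cells alternate: open interval, cut point, open interval, ...
    for k in range(len(bounds) - 1):
        a, b = bounds[k], bounds[k + 1]
        # Intervals whose boundary meets the open cell (a, b): recurse on these.
        crossing = [s for s in sets if a < s[0] < b or a < s[1] < b]
        # Intervals containing the whole open cell: store their combined weight.
        inside = [s[2] for s in sets if s[0] <= a and b <= s[1]]
        children.append((fold(inside, op), build(crossing, op, D, leaf_size)))
        if k < len(cuts):  # the single-point cell {cuts[k]}
            on_point = [s[2] for s in sets if s[0] <= b <= s[1]]
            children.append((fold(on_point, op), {"leaf": []}))
    return {"cuts": cuts, "children": children}


def query(node, q, op):
    """Combined weight of the intervals containing q, or None if there are none."""
    acc = None
    while "leaf" not in node:
        cuts = node["cuts"]
        i = bisect.bisect_left(cuts, q)
        on_cut = i < len(cuts) and cuts[i] == q
        weight, node = node["children"][2 * i + 1 if on_cut else 2 * i]
        acc = combine(acc, weight, op)
    for lo, hi, w in node["leaf"]:
        if lo <= q <= hi:
            acc = combine(acc, w, op)
    return acc


sets = [(0, 4, 1), (2, 6, 1), (5, 9, 1)]
tree = build(sets, lambda x, y: x + y)
print(query(tree, 3, lambda x, y: x + y))
```

Passing `lambda x, y: x or y` as `op` with every weight set to `True` turns the same structure into a union-membership test, in line with the remark about the Boolean semigroup.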

4.2 Range searching

Next, we consider range searching with semi-algebraic sets: Let be a set of points in . Each point is assigned a weight that belongs to a semigroup. Again we assume that the semigroup operation takes constant time. We wish to preprocess so that for a query range , represented as a semi-algebraic set in , the cumulative weight of can be computed in time. Here we assume that the query ranges (semi-algebraic sets) are parameterized as described in Section 3.1. That is, we have a fixed -variate Boolean function . A query range is represented as a point , for some , and the underlying semi-algebraic set is . We refer to as the dimension of the query space, and to the range searching problem in which all query ranges are of the form as -semi-algebraic range searching.

For a point , let denote the set of semi-algebraic sets that contain , i.e., . It can be checked that is a semi-algebraic set whose complexity depends only on , and . Let . For a query range , we now wish to compute the cumulative weight of the sets in that contain . This can be done using Theorem 5. Putting everything together, we obtain the following:

Theorem 6.

Let be a set of points in , let be the weight of that belongs to a semigroup, and let be a fixed -variate Boolean function for some constant