Box Covers and Domain Orderings for Beyond Worst-Case Join Processing

09/26/2019
by Kaleb Alway, et al.

The recent beyond worst-case optimal join algorithm Minesweeper and its generalization Tetris have brought the theory of indexing and join processing together by developing a geometric framework for joins. These algorithms take as input an index B, referred to as a box cover, that stores output gaps that can be inferred from traditional indexes, such as B+ trees or tries, on the input relations. The performance of these algorithms depends heavily on the certificate of B, which is the smallest subset of gaps in B whose union covers all of the gaps in the output space of a query Q. Different box covers can have certificates of different sizes, and the sizes of both the box covers and their certificates depend heavily on the ordering of the domain values of the attributes in Q. We study how to generate box covers that contain small certificates, to guarantee efficient runtimes for these algorithms. First, given a query Q over a set of relations of size N and a fixed set of domain orderings for the attributes, we give an Õ(N)-time algorithm that generates a box cover for Q that is guaranteed to contain the smallest certificate of any box cover for Q. Second, we show that finding a domain ordering that minimizes the box cover size and certificate size is NP-hard, through a reduction from the 2 consecutive block minimization problem on boolean matrices. Our third contribution is an Õ(N)-time approximation algorithm to compute domain orderings under which one can compute a box cover of size Õ(K^r), where K is the size of the minimum box cover for Q under any domain ordering and r is the maximum arity of any relation. This guarantees certificates of size Õ(K^r). Our results provide several new beyond worst-case bounds, which on some inputs and queries can be unboundedly better than the bounds stated in prior work.
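
To make the dependence on domain orderings concrete, here is a minimal one-dimensional sketch in Python. It is not the paper's algorithm: it assumes a single unary relation, treats each maximal run of absent values as one gap, and uses a hypothetical helper name `gap_intervals`. It only illustrates that the same relation can need many gaps under one ordering and a single gap under another.

```python
from typing import Hashable, List, Sequence, Set, Tuple


def gap_intervals(
    present: Set[Hashable], ordering: Sequence[Hashable]
) -> List[Tuple[int, int]]:
    """Return the maximal runs of positions (inclusive) in `ordering`
    whose values do not appear in `present`.

    Each run plays the role of one gap in a one-dimensional box cover.
    """
    gaps: List[Tuple[int, int]] = []
    start = None  # position where the current run of absent values began
    for pos, value in enumerate(ordering):
        if value not in present:
            if start is None:
                start = pos
        elif start is not None:
            # A present value closes the current run.
            gaps.append((start, pos - 1))
            start = None
    if start is not None:
        # The ordering ended while a run was still open.
        gaps.append((start, len(ordering) - 1))
    return gaps


# Illustrative data (an assumption, not taken from the paper).
relation = {0, 2, 4, 6}
natural_order = [0, 1, 2, 3, 4, 5, 6, 7]
grouped_order = [0, 2, 4, 6, 1, 3, 5, 7]  # present values first

natural_gaps = gap_intervals(relation, natural_order)
grouped_gaps = gap_intervals(relation, grouped_order)

print(len(natural_gaps), "gaps under the natural ordering")
print(len(grouped_gaps), "gap under the grouped ordering")
```

Under the natural ordering every absent value is its own gap, while the grouped ordering places all absent values in one contiguous run. In higher dimensions the gaps are boxes rather than intervals, and a single ordering per attribute must serve all relations at once, which is where the hardness result in the abstract comes from.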

09/05/2017

Covers of Query Results

We introduce succinct lossless representations of query results called c...
12/02/2021

Worst-case Optimal Binary Join Algorithms under General ℓ_p Constraints

Worst-case optimal join algorithms have so far been studied in two broad...
12/29/2019

Worst-Case Optimal Radix Triejoin

Relatively recently, the field of join processing has been swayed by the...
03/21/2020

Covering the Relational Join

In this paper, we initiate a theoretical study of what we call the join ...
08/05/2019

Optimal Joins using Compact Data Structures

Worst-case optimal join algorithms have gained a lot of attention in the...
05/03/2022

Experiments with Unit Disk Cover Algorithms for Covering Massive Pointsets

Given a set of n points in the plane, the Unit Disk Cover (UDC) problem ...
11/13/2019

Optimal Algorithms for Ranked Enumeration of Answers to Full Conjunctive Queries

We study ranked enumeration of the results to a join query in order of d...