Size bounds and query plans for relational joins

11/10/2017
by Albert Atserias, et al.

Relational joins are at the core of relational algebra, which in turn is the core of the standard database query language SQL. As their evaluation is expensive and very often dominated by the output size, it is an important task for database query optimisers to compute estimates on the size of joins and to find good execution plans for sequences of joins. We study these problems from a theoretical perspective, both in the worst-case model, and in an average-case model where the database is chosen according to a known probability distribution. In the former case, our first key observation is that the worst-case size of a query is characterised by the fractional edge cover number of its underlying hypergraph, a combinatorial parameter previously known to provide an upper bound. We complete the picture by proving a matching lower bound, and by showing that there exist queries for which the join-project plan suggested by the fractional edge cover approach may be substantially better than any join plan that does not use intermediate projections. On the other hand, we show that in the average-case model, every join-project plan can be turned into a plan containing no projections in such a way that the expected time to evaluate the plan increases only by a constant factor independent of the size of the database. Not surprisingly, the key combinatorial parameter in this context is the maximum density of the underlying hypergraph. We show how to make effective use of this parameter to eliminate the projections.
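The worst-case bound described above can be illustrated on the triangle query R(A,B) ⋈ S(B,C) ⋈ T(A,C). The following is a minimal Python sketch, not the paper's own code. It assumes the standard form of the bound: for a fractional edge cover x (every attribute receives total weight at least 1 from the relations containing it), the output has at most ∏ |R_j|^(x_j) tuples. The function names and the small instance are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import product
import math


def is_fractional_edge_cover(edges, weights):
    """Return True if every attribute gets total weight >= 1
    from the relations (hyperedges) that contain it."""
    attributes = set().union(*edges.values())
    return all(
        sum(weights[name] for name, attrs in edges.items() if a in attrs) >= 1
        for a in attributes
    )


def size_bound(sizes, weights):
    """Product of |R_j| ** x_j over all relations (assumed form of the bound)."""
    return math.prod(sizes[name] ** float(weights[name]) for name in sizes)


def triangle_join(R, S, T):
    """Naive evaluation of R(A,B) JOIN S(B,C) JOIN T(A,C)."""
    return {
        (a, b, c)
        for (a, b) in R
        for (b2, c) in S
        if b == b2 and (a, c) in T
    }


# Hypergraph of the triangle query: one hyperedge per relation.
edges = {"R": {"A", "B"}, "S": {"B", "C"}, "T": {"A", "C"}}

# Weight 1/2 on each relation covers every attribute with total weight 1.
weights = {name: Fraction(1, 2) for name in edges}

# Illustrative instance: each relation is the full k x k grid, so it has
# k**2 tuples and the join returns all k**3 triples.
k = 4
grid = set(product(range(k), repeat=2))
R, S, T = grid, grid, grid
sizes = {"R": len(R), "S": len(S), "T": len(T)}

bound = size_bound(sizes, weights)
output = triangle_join(R, S, T)

print(is_fractional_edge_cover(edges, weights), bound, len(output))
```

On this instance each relation has N = 16 tuples, and the join output has 64 = N^(3/2) tuples, which meets the bound exactly. This is the kind of matching lower-bound instance the abstract refers to, shown here only for one query.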
