Simpli-Squared: A Very Simple Yet Unexpectedly Powerful Join Ordering Algorithm Without Cardinality Estimates

10/30/2021
by   Asoke Datta, et al.

The Join Order Benchmark (JOB) has become the de facto standard for assessing the performance of relational database query optimizers due to its complexity and completeness. To compute the optimal execution plan (the join order), existing solutions employ extensive data synopses and correlations (functional dependencies) between table attributes. These structures incur significant overhead to design, build, and maintain. In this paper, we present Simplicity Simplified (Simpli-Squared), a very simple join ordering algorithm that achieves unexpectedly good results. Simpli-Squared computes the join order without using any statistics or cardinality estimates. It takes as input only the referential integrity constraints declared at schema definition and the number of tuples (size) of each base table. The join order of a given query is computed by splitting the join graph along the many-to-many joins and sorting the tables by size. The tables involved in one-to-many joins are included greedily, based on their size and the query join graph. The resulting plan can be generated efficiently by a lightweight query rewriting procedure. Experiments on JOB in PostgreSQL show that Simpli-Squared achieves runtimes that are at most 16% higher, and sometimes even lower, than those of four state-of-the-art solutions that are considerably more intricate. Based on these results, we question whether JOB adequately tests query optimizers, or whether accurate cardinality estimation is really such a fundamental requirement for performing well on the benchmark.
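
To make the steps named in the abstract concrete, the following is a minimal Python sketch of one possible reading of them: drop the many-to-many edges from the join graph, treat each remaining connected component as a group, and order the tables inside a group greedily by size along the one-to-many edges. It is an illustration under stated assumptions and not the procedure from the full paper. The function names, the "1:n"/"m:n" edge labels, the rule for ordering the groups, and the toy table names and sizes are all hypothetical.

```python
from collections import defaultdict


def components(tables, edges):
    """Connected components of the graph restricted to the given edges."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, comps = set(), []
    for start in sorted(tables):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        comps.append(comp)
    return comps


def greedy_order(tables, edges, sizes):
    """Start at the smallest table, then repeatedly add the smallest
    table that joins with one already chosen (avoids cross products)."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    remaining, order = set(tables), []
    while remaining:
        frontier = {t for t in remaining if adj[t] & set(order)}
        candidates = frontier or remaining  # first pick has no neighbours yet
        nxt = min(candidates, key=lambda t: (sizes[t], t))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def sketch_join_groups(sizes, joins):
    """joins: (table_a, table_b, kind) with kind "1:n" or "m:n".

    Assumption: a join is labelled "1:n" when it follows a declared
    primary-key/foreign-key constraint, and "m:n" otherwise.
    """
    one_to_many = [(a, b) for a, b, kind in joins if kind == "1:n"]
    # Split the join graph along the many-to-many joins.
    groups = [greedy_order(c, one_to_many, sizes)
              for c in components(sizes, one_to_many)]
    # Assumption: groups are ordered by the size of their smallest table.
    groups.sort(key=lambda g: (sizes[g[0]], g[0]))
    return groups


# Hypothetical toy schema; the sizes are invented for illustration only.
sizes = {"person": 400, "cast": 3000, "keyword": 100, "keyword_link": 2000}
joins = [
    ("person", "cast", "1:n"),
    ("keyword", "keyword_link", "1:n"),
    ("cast", "keyword_link", "m:n"),
]
plan = sketch_join_groups(sizes, joins)
print(plan)
```

In this sketch each inner list is a group of tables connected only by one-to-many joins, and the groups would subsequently be combined along the many-to-many edges that were removed in the split. Only table sizes and key constraints are consulted, which mirrors the abstract's claim that no statistics or cardinality estimates are required.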

