Exact Selectivity Computation for Modern In-Memory Database Query Optimization

01/06/2019
by Jun Hyung Shin, et al.

Selectivity estimation remains a critical task in query optimization even after decades of research and industrial development. Optimizers rely on accurate selectivities when generating execution plans, and they maintain a wide range of statistical synopses to estimate those selectivities efficiently. Nonetheless, small estimation errors, which propagate exponentially, can lead to severely sub-optimal plans, especially for complex predicates. Database systems for modern computing architectures rely on extensive in-memory processing supported by massive multithreaded parallelism and vectorized instructions. However, they retain the same synopsis-based approach to query optimization as traditional disk-based databases. We introduce a novel query optimization paradigm for in-memory and GPU-accelerated databases based on exact selectivity computation (ESC). The central idea in ESC is to compute selectivities exactly by running queries during query optimization. To make the process efficient, we propose several optimizations targeting the selection and materialization of the tables and predicates to which ESC is applied. We implement ESC in the MapD open-source database system. Experiments on the TPC-H and SSB benchmarks show that ESC incurs a constant overhead of less than 30 milliseconds when running on GPU and generates improved query execution plans that are as much as 32X faster.
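The central idea, computing a selectivity exactly by running a counting query rather than consulting a synopsis, can be illustrated with a minimal sketch. This is not the MapD implementation described in the paper: it uses Python's built-in `sqlite3` module as a stand-in in-memory database, and the helper `exact_selectivity`, the toy `lineitem` table, and its data are hypothetical names and values chosen only for illustration. The example also contrasts the exact result for a conjunctive predicate with the estimate obtained under an attribute-independence assumption, a common simplification in synopsis-based estimation.

```python
import sqlite3


def exact_selectivity(conn, table, predicate):
    """Return the exact fraction of rows in `table` that satisfy `predicate`.

    Hypothetical helper for illustration; `table` and `predicate` are
    trusted strings here, so no SQL-injection protection is attempted.
    """
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    if total == 0:
        return 0.0
    matching = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {predicate}"
    ).fetchone()[0]
    return matching / total


# Toy in-memory table (assumed data): discount is fully correlated with
# quantity, which is the kind of case where independence-based estimates fail.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lineitem (quantity INTEGER, discount REAL)")
rows = [(q, 0.05 if q <= 50 else 0.10) for q in range(1, 101)]
conn.executemany("INSERT INTO lineitem VALUES (?, ?)", rows)

# Exact selectivities of the two simple predicates.
sel_quantity = exact_selectivity(conn, "lineitem", "quantity <= 50")
sel_discount = exact_selectivity(conn, "lineitem", "discount < 0.08")

# Exact selectivity of the conjunction, computed by a single counting query.
sel_both = exact_selectivity(
    conn, "lineitem", "quantity <= 50 AND discount < 0.08"
)

# Estimate a synopsis-based optimizer would produce if it assumed independence.
independence_estimate = sel_quantity * sel_discount

print(f"exact conjunction selectivity: {sel_both}")
print(f"independence-based estimate:   {independence_estimate}")
```

On this toy data the independence-based estimate is half the exact value. An error of that size on a single predicate is what can compound across the joins of a larger plan, which is the motivation the abstract gives for computing selectivities exactly.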


research
06/21/2018

Novel Selectivity Estimation Strategy for Modern DBMS

Selectivity estimation is important in query optimization, however accur...
research
02/04/2021

Online Sketch-based Query Optimization

Cost-based query optimization remains a critical task in relational data...
research
02/19/2020

Optimizing Federated Queries Based on the Physical Design of a Data Lake

The optimization of query execution plans is known to be crucial for red...
research
05/22/2018

Cache-based Multi-query Optimization for Data-intensive Scalable Computing Frameworks

In modern large-scale distributed systems, analytics jobs submitted by v...
research
08/22/2019

The Case for Deep Query Optimisation

Query Optimisation (QO) is the most important optimisation problem in da...
research
03/02/2022

Redefining The Query Optimization Process

Traditionally, query optimizers have been designed for computer systems ...
research
10/01/2020

Revisiting Runtime Dynamic Optimization for Join Queries in Big Data Management Systems

Query Optimization remains an open problem for Big Data Management Syste...
