Finding Performance Issues in Database Engines via Cardinality Estimation Testing

06/01/2023
by   Jinsheng Ba, et al.
0

Database Management Systems (DBMSs) process a given query by creating an execution plan, which is subsequently executed, to compute the query's result. Deriving an efficient query plan is challenging, and both academia and industry have invested decades into researching query optimization. Despite this, DBMSs are prone to performance issues, where a DBMS produces an inefficient query plan that might lead to the slow execution of a query. Finding such issues is a longstanding problem and inherently difficult, because no ground truth information on an expected execution time exists. In this work, we propose Cardinality Estimation Restriction Testing (CERT), a novel technique that detects performance issues through the lens of cardinality estimation. Given a query on a database, CERT derives a more restrictive query (e.g., by replacing a LEFT JOIN with an INNER JOIN), whose estimated number of rows should not exceed the number of estimated rows for the original query. CERT tests cardinality estimators specifically, because they were shown to be the most important component for query optimization; thus, we expect that finding and fixing such issues might result in the highest performance gains. In addition, we found that some other kinds of query optimization issues are exposed by the unexpected cardinality estimation, which can also be detected by CERT. CERT is a black-box technique that does not require access to the source code; DBMSs expose query plans via the EXPLAIN statement. CERT eschews executing queries, which is costly and prone to performance fluctuations. We evaluated CERT on three widely used and mature DBMSs, MySQL, TiDB, and CockroachDB. CERT found 13 unique issues, of which 2 issues were fixed and 9 confirmed by the developers. We expect that this new angle on finding performance bugs will help DBMS developers in improving DMBSs' performance.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
11/22/2017

Adaptive Cardinality Estimation

In this paper we address cardinality estimation problem which is an impo...
research
02/04/2021

Online Sketch-based Query Optimization

Cost-based query optimization remains a critical task in relational data...
research
02/21/2019

How I Learned to Stop Worrying and Love Re-optimization

Cost-based query optimizers remain one of the most important components ...
research
01/19/2018

PRESTO: Probabilistic Cardinality Estimation for RDF Queries Based on Subgraph Overlapping

In query optimisation accurate cardinality estimation is essential for f...
research
04/02/2021

An Empirical Evaluation of Cost-based Federated SPARQL Query Processing Engines

Finding a good query plan is key to the optimization of query runtime. T...
research
05/07/2020

Bitvector-aware Query Optimization for Decision Support Queries (extended version)

Bitvector filtering is an important query processing technique that can ...
research
07/16/2020

Detecting Optimization Bugs in Database Engines via Non-Optimizing Reference Engine Construction

Database Management Systems (DBMS) are used ubiquitously. To efficiently...

Please sign up or login with your details

Forgot password? Click here to reset