COOOL: A Learning-To-Rank Approach for SQL Hint Recommendations

04/10/2023
by   Xianghong Xu, et al.

Query optimization is a pivotal part of every database management system (DBMS) because it determines the efficiency of query execution. Numerous works have applied Machine Learning (ML) techniques to cost modeling, cardinality estimation, and end-to-end learned optimizers, but few have proven practical because of long training times, lack of interpretability, and integration cost. A recent study provides a practical method that optimizes queries by recommending per-query hints, but it suffers from two inherent problems. First, it follows the regression framework to predict the absolute latency of each query plan, which is very challenging because the latencies of the candidate plans for a single query may span multiple orders of magnitude. Second, it requires training a separate model for each dataset, which restricts the application of the trained models in practice. In this paper, we propose COOOL, which predicts Cost Orders of query plans to cOOperate with the DBMS by Learning-To-Rank. Instead of estimating absolute costs, COOOL uses ranking-based approaches to compute relative ranking scores for the costs of query plans. We show that COOOL is theoretically able to distinguish query plans with different latencies. We implement COOOL on PostgreSQL, and extensive experiments on join-order-benchmark and TPC-H data demonstrate that COOOL outperforms PostgreSQL and state-of-the-art methods on single-dataset tasks as well as with a unified model on multiple-dataset tasks. Our experiments also shed some light on why COOOL outperforms regression approaches from the representation learning perspective, which may guide future research.
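To make the contrast between regression and ranking concrete, the sketch below shows one common pairwise learning-to-rank objective, a RankNet-style logistic loss. This is an illustrative assumption, not the loss COOOL is confirmed to use; the abstract says only that ranking-based approaches are applied. The names `pairwise_ranking_loss` and `recommend_plan` are hypothetical, and the convention that a lower score means a faster predicted plan is chosen here for illustration.

```python
import math


def pairwise_ranking_loss(scores, latencies):
    """Mean logistic loss over ordered plan pairs (assumed RankNet-style).

    scores[i] is a model's relative score for candidate plan i, where a
    lower score is taken to mean a faster predicted plan (an assumption).
    Only the ORDER of the latencies is used, never their magnitude.
    """
    total, pairs = 0.0, 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if latencies[i] < latencies[j]:
                # Plan i is truly faster, so we want scores[i] < scores[j].
                total += math.log1p(math.exp(scores[i] - scores[j]))
                pairs += 1
    return total / pairs if pairs else 0.0


def recommend_plan(scores):
    """Return the index of the candidate plan with the lowest score."""
    return min(range(len(scores)), key=scores.__getitem__)


# Latencies (seconds) spanning several orders of magnitude for one query.
latencies = [0.02, 3.5, 120.0]
well_ordered = [-1.0, 0.0, 2.0]   # scores agree with the latency order
mis_ordered = [2.0, 0.0, -1.0]    # scores reverse the latency order

loss_good = pairwise_ranking_loss(well_ordered, latencies)
loss_bad = pairwise_ranking_loss(mis_ordered, latencies)
best = recommend_plan(well_ordered)
```

Because the loss depends only on which plan in each pair is faster, replacing the latencies `[0.02, 3.5, 120.0]` with `[1, 2, 3]` yields the same value. That is the property that sidesteps the difficulty of regressing on targets spanning multiple orders of magnitude.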

