Database Workload Characterization with Query Plan Encoders

05/26/2021
by Debjyoti Paul, et al.

Smart databases are adopting artificial intelligence (AI) technologies to achieve instance optimality, and in the future, databases will come with prepackaged AI models within their core components. The reason is that every database runs different workloads and demands specific resources and settings to achieve optimal performance. This makes it necessary to comprehensively understand the workloads running in the system along with their features, which we call workload characterization. To address this workload characterization problem, we propose query plan encoders that learn essential features and their correlations from query plans. Our pretrained encoders capture the structural properties and the computational performance of queries independently. We show that our pretrained encoders are adaptable to workloads, which expedites the transfer learning process. We performed independent assessments of the structural encoder and the performance encoders with multiple downstream tasks. For the overall evaluation of our query plan encoders, we design two downstream tasks: (i) query latency prediction and (ii) query classification. These tasks show the importance of feature-based workload characterization. We also performed extensive experiments on individual encoders to verify the effectiveness of representation learning and domain adaptability.

