"What makes my queries slow?": Subgroup Discovery for SQL Workload Analysis

08/09/2021
by Youcef Remil, et al.

Among the daily tasks of database administrators (DBAs), analysing query workloads to identify schema issues and improve performance is crucial. Although DBAs can easily pinpoint individual queries that repeatedly cause performance issues, it remains challenging to automatically identify subsets of queries that share some properties (a pattern) and, at the same time, exhibit unusual values of some target measure, such as execution time. Patterns are defined on combinations of query clauses, environment variables, database alerts and metrics, and they help answer questions such as: What makes SQL queries slow? What makes I/O communications high? Automatically discovering these patterns in a huge search space, and providing them as hypotheses that help localize issues and root causes, is important in the context of explainable AI. To tackle this problem, we introduce an original approach rooted in Subgroup Discovery. We show how to instantiate and develop this generic data-mining framework to identify potential causes of SQL workload issues. We believe that such a data-mining technique is not trivial for DBAs to apply. As such, we also provide a visualization tool for interactive knowledge discovery. We analyse a one-week workload from hundreds of databases from our company, make both the dataset and source code available, and experimentally show that insightful hypotheses can be discovered.
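To make the idea of a pattern and a target measure concrete, here is a minimal Python sketch of subgroup discovery on a toy workload. It is not the authors' implementation: the rows, the attribute names (`has_join`, `has_order_by`, `alert`), the exhaustive search up to two conditions, and the quality measure (subgroup coverage times the shift of mean execution time from the overall mean) are all assumptions chosen for illustration.

```python
from itertools import combinations

# Hypothetical toy workload: one row per query execution (illustrative values only).
WORKLOAD = [
    {"has_join": True,  "has_order_by": True,  "alert": "none", "exec_ms": 900},
    {"has_join": True,  "has_order_by": True,  "alert": "io",   "exec_ms": 1100},
    {"has_join": True,  "has_order_by": False, "alert": "none", "exec_ms": 200},
    {"has_join": False, "has_order_by": True,  "alert": "none", "exec_ms": 150},
    {"has_join": False, "has_order_by": False, "alert": "none", "exec_ms": 50},
    {"has_join": False, "has_order_by": False, "alert": "none", "exec_ms": 60},
    {"has_join": False, "has_order_by": False, "alert": "io",   "exec_ms": 80},
    {"has_join": True,  "has_order_by": False, "alert": "none", "exec_ms": 180},
]


def covers(pattern, row):
    """A pattern is a tuple of (attribute, value) conditions, all of which must hold."""
    return all(row[attr] == val for attr, val in pattern)


def quality(pattern, rows, target="exec_ms"):
    """Assumed quality measure: coverage * (subgroup mean - overall mean)."""
    subgroup = [r[target] for r in rows if covers(pattern, r)]
    if not subgroup:
        return 0.0
    overall_mean = sum(r[target] for r in rows) / len(rows)
    subgroup_mean = sum(subgroup) / len(subgroup)
    return (len(subgroup) / len(rows)) * (subgroup_mean - overall_mean)


def discover(rows, attributes, max_depth=2, top_k=3):
    """Exhaustively score every conjunction of up to max_depth conditions."""
    selectors = sorted({(a, r[a]) for r in rows for a in attributes}, key=repr)
    candidates = []
    for depth in range(1, max_depth + 1):
        for combo in combinations(selectors, depth):
            # Skip conjunctions that constrain the same attribute twice.
            if len({attr for attr, _ in combo}) < depth:
                continue
            candidates.append((quality(combo, rows), combo))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates[:top_k]


if __name__ == "__main__":
    for score, pattern in discover(WORKLOAD, ["has_join", "has_order_by", "alert"]):
        print(f"{score:8.2f}  {pattern}")
```

Each returned pattern is a hypothesis of the form "queries with these properties tend to be slow", ranked by the assumed quality measure. Exhaustive enumeration is only feasible for a toy example; a realistic workload with many attributes would need a pruned or heuristic search instead.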

