A Comparative Exploration of ML Techniques for Tuning Query Degree of Parallelism

05/18/2020
by Zhiwei Fan, et al.

A large body of recent work applies machine learning (ML) techniques to query optimization and query performance prediction in relational database management systems (RDBMSs). However, this work typically ignores the effect of intra-query parallelism, a key mechanism for boosting the performance of OLAP queries in practice, on query performance prediction. In this paper, we take a first step towards filling this gap by studying the problem of tuning the degree of parallelism (DOP) via ML techniques in Microsoft SQL Server, a popular commercial RDBMS that allows an individual query to execute using multiple cores. We cast DOP tuning as a regression task and examine how several popular ML models can help with query performance prediction in a multi-core setting. We explore the design space and perform an extensive experimental study that compares different models across a list of performance metrics, testing how well they generalize in four settings: (i) to queries from the same template, (ii) to queries from a new template, (iii) to instances of different scale, and (iv) to different instances and queries. Our experimental results show that a simple featurization of the input query plan that ignores cost model estimates can accurately predict query performance, capture the speedup trend with respect to the available parallelism, and help automatically choose an optimal per-query DOP.
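To make the "DOP tuning as regression" framing concrete, the following is a minimal sketch, not the models or the featurization studied in the paper. It assumes a hypothetical latency model of the form a + b/DOP + c*DOP (fixed cost, parallelizable work, coordination overhead), fits it with ordinary least squares on made-up measurements, and then picks the candidate DOP with the lowest predicted latency. The function names, the functional form, and the sample numbers are all illustrative assumptions.

```python
# Sketch only: a stand-in regression for predicting latency as a function of
# DOP. The feature map and the synthetic data are assumptions, not taken from
# the paper, which featurizes the query plan and compares several ML models.

def featurize(dop):
    """Map a DOP value to the features [1, 1/dop, dop]."""
    return [1.0, 1.0 / dop, float(dop)]

def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit(samples):
    """Least-squares fit via the normal equations; samples are (dop, latency)."""
    rows = [featurize(d) for d, _ in samples]
    ys = [t for _, t in samples]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)

def predict(weights, dop):
    """Predicted latency for a given DOP."""
    return sum(w * x for w, x in zip(weights, featurize(dop)))

def choose_dop(weights, candidates):
    """Return the candidate DOP with the lowest predicted latency."""
    return min(candidates, key=lambda d: predict(weights, d))

# Made-up measurements generated from latency = 2 + 40/dop + 0.5*dop (seconds).
samples = [(d, 2.0 + 40.0 / d + 0.5 * d) for d in (1, 2, 4, 8, 16)]
weights = fit(samples)
best_dop = choose_dop(weights, [1, 2, 4, 8, 16, 32])
print(best_dop)
```

In this toy setup the predicted latency falls as DOP grows until the overhead term dominates, so the chosen DOP sits at an interior value rather than at the maximum available parallelism. A realistic version would add plan-level features and a held-out evaluation across templates and instances, as the abstract describes.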


