Database-Agnostic Workload Management

08/25/2018
by Shrainik Jain, et al.

We present a system to support generalized SQL workload analysis and management for multi-tenant and multi-database platforms. Workload analysis applications are becoming more sophisticated to support database administration, model user behavior, audit security, and route queries, but the methods rely on specialized feature engineering, and therefore must be carefully implemented and reimplemented for each SQL dialect, database system, and application. Meanwhile, the size and complexity of workloads are increasing as systems centralize in the cloud. We model workload analysis and management tasks as variations on query labeling, and propose a system design that can support general query labeling routines across multiple applications and database backends. The design relies on the use of learned vector embeddings for SQL queries as a replacement for application-specific syntactic features, reducing custom code and allowing the use of off-the-shelf machine learning algorithms for labeling. The key hypothesis, for which we provide evidence in this paper, is that these learned features can outperform conventional feature engineering on representative machine learning tasks. We present the design of a database-agnostic workload management and analytics service, describe potential applications, and show that separating workload representation from labeling tasks affords new capabilities and can outperform existing solutions for representative tasks, including workload sampling for index recommendation and user labeling for security audits.
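
The separation the abstract describes, between a shared query representation and the labeling task built on top of it, can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the `embed` function is a hashed bag-of-tokens stand-in for the learned embeddings the paper proposes, the nearest-centroid labeler stands in for an off-the-shelf classifier, and the names `embed`, `fit_centroids`, `predict`, and `DIM`, along with the sample queries and user labels, are hypothetical.

```python
import hashlib
import math
import re
from collections import defaultdict

DIM = 64  # vector width; an arbitrary choice for this sketch


def embed(query):
    """Map a SQL string to a fixed-length unit vector.

    Stand-in only: this hashes tokens into buckets. The paper proposes
    *learned* embeddings; a trained model would replace this function
    without changing the labeling code below.
    """
    vec = [0.0] * DIM
    tokens = re.findall(r"[a-z_][a-z0-9_]*|\d+|[^\s\w]", query.lower())
    for tok in tokens:
        bucket = int(hashlib.md5(tok.encode("utf-8")).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def fit_centroids(queries, labels):
    """Average the vectors of each label (a nearest-centroid labeler)."""
    sums = defaultdict(lambda: [0.0] * DIM)
    counts = defaultdict(int)
    for query, label in zip(queries, labels):
        vec = embed(query)
        sums[label] = [a + b for a, b in zip(sums[label], vec)]
        counts[label] += 1
    return {lab: [x / counts[lab] for x in total] for lab, total in sums.items()}


def predict(centroids, query):
    """Return the label whose centroid has the largest dot product with the query vector."""
    vec = embed(query)
    return max(
        centroids,
        key=lambda lab: sum(a * b for a, b in zip(centroids[lab], vec)),
    )


# Hypothetical workload: one logged query per user (user labeling task).
train_queries = [
    "SELECT name FROM customers WHERE id = 7",
    "INSERT INTO audit_log (event, ts) VALUES ('login', 1)",
]
train_labels = ["alice", "bob"]

centroids = fit_centroids(train_queries, train_labels)
print(predict(centroids, "SELECT name FROM customers WHERE id = 7"))
```

Because the labeler only sees vectors, the same `fit_centroids` and `predict` code could serve a different task (for example, routing or error prediction) by changing the labels, and a different SQL dialect by changing only `embed`.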

research
01/17/2018

Query2Vec: An Evaluation of NLP Techniques for Generalized Workload Analytics

We consider methods for learning vector representations of SQL queries t...
research
07/12/2019

Detecting coherent explorations in SQL workloads

This paper presents a proposal aiming at better understanding a workload...
research
11/11/2020

Comprehensive and Efficient Workload Compression

This work studies the problem of constructing a representative workload ...
research
09/06/2019

Automating Cluster Management with Weave

Modern cluster management systems like Kubernetes and Openstack grapple ...
research
01/17/2018

Query2Vec: NLP Meets Databases for Generalized Workload Analytics

We propose methods for learning vector representations of SQL workloads ...
research
01/25/2019

Flexible Operator Embeddings via Deep Learning

Integrating machine learning into the internals of database management s...
research
09/02/2019

DeepDB: Learn from Data, not from Queries!

The typical approach for learned DBMS components is to capture the behav...
