ML-AQP: Query-Driven Approximate Query Processing based on Machine Learning

03/14/2020
by   Fotis Savva, et al.
0

As more and more organizations rely on data-driven decision making, large-scale analytics become increasingly important. However, an analyst is often stuck waiting for an exact result. As such, organizations turn to Cloud providers that have infrastructure for efficiently analyzing large quantities of data. But, with increasing costs, organizations have to optimize their usage. Having a cheap alternative that provides speed and efficiency will go a long way. Concretely, we offer a solution that can provide approximate answers to aggregate queries, relying on Machine Learning (ML), which is able to work alongside Cloud systems. Our developed lightweight ML-led system can be stored on an analyst's local machine or deployed as a service to instantly answer analytic queries, having low response times and monetary/computational costs and energy footprint. To accomplish this we leverage the knowledge obtained by previously answered queries and build ML models that can estimate the result of new queries in an efficient and inexpensive manner. The capabilities of our system are demonstrated using extensive evaluation with both real and synthetic datasets/workloads and well known benchmarks.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
08/13/2019

Adaptive Learning of Aggregate Analytics under Dynamic Workloads

Large organizations have seamlessly incorporated data-driven decision ma...
research
01/02/2022

Optimizing Machine Learning Inference Queries with Correlative Proxy Models

We consider accelerating machine learning (ML) inference queries on unst...
research
09/22/2022

Deep Lake: a Lakehouse for Deep Learning

Traditional data lakes provide critical data infrastructure for analytic...
research
04/12/2020

Complaint-driven Training Data Debugging for Query 2.0

As the need for machine learning (ML) increases rapidly across all indus...
research
12/18/2022

GAN-based Tabular Data Generator for Constructing Synopsis in Approximate Query Processing: Challenges and Solutions

In data-driven systems, data exploration is imperative for making real-t...
research
05/28/2020

Parallelizing Machine Learning as a Service for the End-User

As ML applications are becoming ever more pervasive, fully-trained syste...

Please sign up or login with your details

Forgot password? Click here to reset