Approximating Aggregated SQL Queries With LSTM Networks

10/25/2020
by Nir Regev, et al.

Despite continuous investments in data technologies, the latency of querying data still poses a significant challenge. Modern analytic solutions require near real-time responsiveness both to make them interactive and to support automated processing. Current technologies (Hadoop, Spark, Dataflow) scan the dataset to execute queries. They focus on providing scalable data storage to maximize task execution speed. We argue that these solutions fail to offer an adequate level of interactivity since they depend on continual access to data. In this paper we present a method for query approximation, also known as approximate query processing (AQP), that reduces the need to scan data during inference (query calculation), thus enabling a rapid query processing tool. We use an LSTM network to learn the relationship between queries and their results, and to provide a rapid inference layer for predicting query results. Our method (referred to as “Hunch”) produces a lightweight LSTM network which provides high query throughput. We evaluated our method using twelve datasets and compared it to state-of-the-art AQP engines (VerdictDB, BlinkDB) in terms of query latency, model weight, and accuracy. The results show that our method predicted query results with a normalized root mean squared error (NRMSE) ranging from approximately 1% to 4%, which for the majority of our datasets was better than the compared benchmarks. Moreover, our method was able to predict up to 120,000 queries per second (streamed together), with a single-query latency of no more than 2 ms.
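As an illustration of the accuracy metric quoted above, the following is a minimal sketch of how an NRMSE score could be computed for a batch of predicted query results. The abstract does not say how the error is normalized, so dividing the RMSE by the range of the true results is an assumption made here (normalizing by the mean is another common convention). The function name `nrmse` and the sample values are hypothetical and are not taken from the paper.

```python
import math


def nrmse(actual, predicted):
    """Return the RMSE divided by the range (max - min) of the actual values.

    Assumption: range normalization. The abstract does not specify
    which normalization the authors used.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean of the squared differences between true and predicted results.
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    value_range = max(actual) - min(actual)
    if value_range == 0:
        raise ValueError("actual values have zero range; NRMSE is undefined")
    return math.sqrt(mse) / value_range


# Hypothetical true results of four aggregate queries and a model's predictions.
true_results = [100.0, 200.0, 300.0, 400.0]
predicted_results = [104.0, 196.0, 304.0, 396.0]

score = nrmse(true_results, predicted_results)
print(f"NRMSE = {score:.2%}")
```

With these made-up values every prediction is off by 4, so the RMSE is 4 and the range is 300, giving an NRMSE of about 1.33%, which happens to fall inside the 1% to 4% band reported in the abstract.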

research
09/21/2023

QUEST: An Efficient Query Evaluation Scheme Towards Scan-Intensive Cross-Model Analysis

Modern data-driven applications require that databases support fast cros...
research
08/16/2020

DeepSampling: Selectivity Estimation with Predicted Error and Response Time

The rapid growth of spatial data urges the research community to find ef...
research
01/28/2022

Electra: Conditional Generative Model based Predicate-Aware Query Approximation

The goal of Approximate Query Processing (AQP) is to provide very fast b...
research
08/24/2023

Lightweight Materialization for Fast Dashboards Over Joins

Dashboards are vital in modern business intelligence tools, providing no...
research
04/20/2022

JanusAQP: Efficient Partition Tree Maintenance for Dynamic Approximate Query Processing

Approximate query processing over dynamic databases, i.e., under inserti...
research
04/03/2018

VerdictDB: Universalizing Approximate Query Processing

Despite 25 years of research in academia, approximate query processing (...
research
08/15/2023

Understanding DNS Query Composition at B-Root

The Domain Name System (DNS) is part of critical internet infrastructure...
