NeuroSketch: Fast and Approximate Evaluation of Range Aggregate Queries with Neural Networks

11/20/2022
by Sepanta Zeighami, et al.

Range aggregate queries (RAQs) are an integral part of many real-world applications, which often call for fast, approximate answers. Recent work has studied answering RAQs with machine learning (ML) models, where a model of the data is learned and then used to answer the queries. However, there is no theoretical understanding of why and when these ML-based approaches perform well. Furthermore, because they model the data, they fail to capitalize on query-specific information to improve performance in practice. In this paper, we focus on modeling “queries” rather than data and train neural networks to learn the query answers. This change of focus allows us to study our ML approach theoretically and to provide a distribution- and query-dependent error bound for neural networks answering RAQs. We confirm our theoretical results by developing NeuroSketch, a neural network framework for answering RAQs in practice. An extensive experimental study on real-world, TPC-benchmark, and synthetic datasets shows that NeuroSketch answers RAQs multiple orders of magnitude faster than the state of the art, and with better accuracy.
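To make the idea of modeling queries rather than data concrete, the following is a minimal sketch, not NeuroSketch itself. It assumes a hypothetical one-dimensional dataset, a range COUNT query expressed as a pair (lo, hi), and a tiny one-hidden-layer network trained with plain stochastic gradient descent. Every name here (`exact_count_fraction`, `sample_query`, `predict`, the layer size, the learning rate) is an illustrative assumption and does not come from the paper.

```python
import bisect
import math
import random

random.seed(0)

# Hypothetical 1-D dataset: Gaussian samples clipped to [0, 1], kept sorted.
data = sorted(min(1.0, max(0.0, random.gauss(0.5, 0.15))) for _ in range(2000))

def exact_count_fraction(lo, hi):
    """Ground-truth RAQ answer: fraction of points x with lo <= x <= hi."""
    return (bisect.bisect_right(data, hi) - bisect.bisect_left(data, lo)) / len(data)

def sample_query():
    """Draw a random range query (lo, hi) with 0 <= lo <= hi <= 1."""
    a, b = random.random(), random.random()
    return (min(a, b), max(a, b))

# Training set of (query, answer) pairs: the network learns the query
# function directly and never models the data distribution itself.
train_queries = [sample_query() for _ in range(500)]
train_answers = [exact_count_fraction(lo, hi) for lo, hi in train_queries]

# Assumed architecture: 2 inputs -> 16 tanh units -> 1 linear output.
HIDDEN = 16
w1 = [[random.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(HIDDEN)]
b1 = [0.0] * HIDDEN
w2 = [random.uniform(-0.5, 0.5) for _ in range(HIDDEN)]
b2 = 0.0

def forward(q):
    """Return the hidden activations and the scalar output for query q."""
    h = [math.tanh(w1[j][0] * q[0] + w1[j][1] * q[1] + b1[j]) for j in range(HIDDEN)]
    y = sum(w2[j] * h[j] for j in range(HIDDEN)) + b2
    return h, y

def predict(lo, hi):
    """Approximate answer for the range query (lo, hi)."""
    return forward((lo, hi))[1]

def train(epochs=100, lr=0.02):
    """Plain SGD on the squared error between predicted and exact answers."""
    global b2
    for _ in range(epochs):
        for q, target in zip(train_queries, train_answers):
            h, y = forward(q)
            err = y - target  # derivative of 0.5 * (y - target)**2 w.r.t. y
            for j in range(HIDDEN):
                # Backpropagate through tanh before touching w2[j].
                grad_pre = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                w1[j][0] -= lr * grad_pre * q[0]
                w1[j][1] -= lr * grad_pre * q[1]
                b1[j] -= lr * grad_pre
            b2 -= lr * err

train()
```

Once trained, answering a query is a single forward pass, for example `predict(0.4, 0.6)`, which can be compared against `exact_count_fraction(0.4, 0.6)`. The cost of that pass depends on the network size rather than on the number of data points, which is the intuition behind the speedups the abstract reports. How accurate such a small model is on a given query distribution is not something this sketch establishes.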

