DeepSampling: Selectivity Estimation with Predicted Error and Response Time

08/16/2020
by Tin Vu, et al.

The rapid growth of spatial data urges the research community to find efficient processing techniques for interactive queries on large volumes of data. Approximate Query Processing (AQP) is the most prominent technique that can provide real-time answers to ad-hoc queries based on a random sample. Unfortunately, existing AQP methods return an answer without any accuracy metrics because of the complex relationship between the sample size, the query parameters, the data distribution, and the result accuracy. This paper proposes DeepSampling, a deep-learning-based model that predicts the accuracy of a sample-based AQP algorithm, specifically selectivity estimation, given the sample size, the input distribution, and the query parameters. The model can also be reversed to estimate the sample size that would produce a desired accuracy. DeepSampling is the first system that provides a reliable tool for existing spatial databases to control the accuracy of AQP.
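
To make the setting concrete, the following is a minimal sketch of sample-based selectivity estimation for a rectangular range query, the kind of AQP algorithm whose accuracy the abstract says DeepSampling predicts. It is not the paper's model or code. The function names, the uniform synthetic data, the query rectangle, and the sample sizes are all assumptions chosen for illustration.

```python
import random


def exact_selectivity(points, query):
    """Fraction of all points that fall inside the query rectangle."""
    x1, y1, x2, y2 = query
    hits = sum(1 for x, y in points if x1 <= x <= x2 and y1 <= y <= y2)
    return hits / len(points)


def estimate_selectivity(points, query, sample_size, rng):
    """Estimate selectivity from a uniform random sample drawn without replacement."""
    sample = rng.sample(points, sample_size)
    return exact_selectivity(sample, query)


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical dataset: 100,000 points uniform in the unit square.
    data = [(rng.random(), rng.random()) for _ in range(100_000)]
    query = (0.2, 0.2, 0.6, 0.6)  # hypothetical query rectangle (x1, y1, x2, y2)
    truth = exact_selectivity(data, query)
    # The absolute error generally shrinks as the sample grows, but the exact
    # relationship depends on the data distribution and the query.
    for n in (100, 1_000, 10_000):
        est = estimate_selectivity(data, query, n, rng)
        print(f"sample_size={n:>6}  estimate={est:.4f}  abs_error={abs(est - truth):.4f}")
```

The loop only measures the error after the fact, for one dataset and one query. Predicting that error in advance from the sample size, the input distribution, and the query parameters is the problem the abstract describes.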

research
03/05/2020

LAQP: Learning-based Approximate Query Processing

Querying on big data is a challenging task due to the rapid growth of da...
research
07/29/2018

MISS: Finding Optimal Sample Sizes for Approximate Analytics

Nowadays, sampling-based Approximate Query Processing (AQP) is widely re...
research
10/25/2020

Approximating Aggregated SQL Queries With LSTM Networks

Despite continuous investments in data technologies, the latency of quer...
research
11/15/2018

Model-based Approximate Query Processing

Interactive visualizations are arguably the most important tool to explo...
research
01/28/2022

Electra: Conditional Generative Model based Predicate-Aware Query Approximation

The goal of Approximate Query Processing (AQP) is to provide very fast b...
research
10/23/2020

The Case for Distance-Bounded Spatial Approximations

Spatial approximations have been traditionally used in spatial databases...
research
08/12/2020

Sampling Based Approximate Skyline Calculation on Big Data

The existing algorithms for processing skyline queries cannot adapt to b...
