Towards Approximate Query Enumeration with Sublinear Preprocessing Time

01/15/2021
by Isolde Adler, et al.

This paper aims to provide extremely efficient algorithms for approximate query enumeration on sparse databases that come with performance and accuracy guarantees. We introduce a new model for approximate query enumeration on classes of relational databases of bounded degree. We first prove that, on databases of bounded degree, any local first-order definable query can be enumerated approximately with constant delay after a constant-time preprocessing phase. We extend this by showing that, on databases of bounded tree-width and bounded degree, every query expressible in first-order logic can be enumerated approximately with constant delay after a sublinear (more precisely, polylogarithmic) time preprocessing phase. Durand and Grandjean (ACM Transactions on Computational Logic, 2007) proved that exact enumeration of first-order queries on databases of bounded degree can be done with constant delay after a linear-time preprocessing phase. Hence we achieve a significant speed-up in the preprocessing phase. Since sublinear running time does not allow reading the whole input database even once, sacrificing some accuracy is inevitable for our speed-up. Nevertheless, our enumeration algorithms come with guarantees: with high probability, (1) only tuples that are answers to the query or 'close' to being answers are enumerated, and (2) if the proportion of tuples that are answers to the query is sufficiently large, then all answers will be enumerated. Here the notion of 'closeness' is a tuple edit distance in the input database. For local first-order queries, only actual answers are enumerated, strengthening (1). Moreover, both the 'closeness' and the proportion required in (2) are controllable. We combine methods from property testing of bounded-degree graphs with logic and query enumeration, which we believe can inspire further research.
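
Guarantee (2) is conditioned on the proportion of tuples that are answers being sufficiently large. As a rough illustration of why such a condition pairs naturally with preprocessing that does not read the whole database, the Python sketch below evaluates a standard sampling bound: if at least a fraction mu of all tuples are answers, then ceil(ln(1/delta) / mu) uniform samples contain at least one answer with probability at least 1 - delta, regardless of the database size. This is not the algorithm from the paper. The names `samples_needed`, `miss_probability`, `mu`, and `delta` are assumptions introduced here for illustration only.

```python
import math


def samples_needed(mu: float, delta: float) -> int:
    """Uniform samples sufficient to hit at least one answer with
    probability >= 1 - delta, assuming at least a mu-fraction of all
    tuples are answers (hypothetical parameters, not from the paper).

    Uses (1 - mu)^s <= exp(-mu * s) <= delta for s >= ln(1/delta) / mu.
    """
    if not (0.0 < mu <= 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("need 0 < mu <= 1 and 0 < delta < 1")
    return math.ceil(math.log(1.0 / delta) / mu)


def miss_probability(mu: float, samples: int) -> float:
    """Probability that `samples` independent uniform draws all miss,
    when exactly a mu-fraction of tuples are answers."""
    return (1.0 - mu) ** samples


if __name__ == "__main__":
    mu, delta = 0.01, 0.05
    s = samples_needed(mu, delta)
    # The sample count depends only on mu and delta, not on database size.
    print(s, miss_probability(mu, s))
```

The sample count depends only on the two chosen parameters, which is loosely analogous to the abstract's remark that the proportion required in (2) is controllable.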

