Querying Incomplete Numerical Data: Between Certain and Possible Answers

10/27/2022
by Marco Console, et al.

Queries with aggregation and arithmetic operations, as well as incomplete data, are common in real-world databases, but we lack a good understanding of how they should interact. On the one hand, systems based on SQL provide ad-hoc rules for numerical nulls; on the other, theoretical research largely concentrates on the standard notions of certain and possible answers. In the presence of numerical attributes and aggregates, however, these answers are often meaningless, returning either too little or too much. Our goal is to define a principled framework for databases with numerical nulls and for answering queries with arithmetic and aggregation over them. Towards this goal, we assume that missing values in numerical attributes are given by probability distributions associated with marked nulls. This yields a model of probabilistic bag databases in which tuples are not necessarily independent, since nulls can repeat. We provide a general compositional framework for query answering, and then concentrate on queries that resemble standard SQL with arithmetic and aggregation. We show that these queries are measurable, and that their outputs have a finite representation. Moreover, since the classical forms of answers provide little information in the numerical setting, we look at the probability that numerical values in output tuples belong to specific intervals. Even though the exact computation of these probabilities is intractable, we give efficient approximation algorithms for them.
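To make the setting concrete, here is a minimal sketch of the kind of question the abstract describes: estimating the probability that a SUM aggregate over a bag with marked nulls falls in a given interval. It uses naive Monte Carlo sampling, which is a stand-in chosen for illustration and not the approximation algorithm of the paper. The relation, the null names `n1` and `n2`, their distributions, and all function names are hypothetical.

```python
import random

# Hypothetical toy relation: a bag of (item, price) rows. A string price
# is a marked null; the same null name denotes the same unknown value,
# so rows "b" and "c" are not independent.
ROWS = [("a", 10.0), ("b", "n1"), ("c", "n1"), ("d", "n2")]

# Assumed distributions attached to the nulls (illustrative choices only).
NULL_DISTS = {
    "n1": lambda rng: rng.uniform(0.0, 10.0),
    "n2": lambda rng: rng.gauss(5.0, 1.0),
}


def sample_world(rows, dists, rng):
    """Draw one value per null and substitute it everywhere it occurs."""
    valuation = {name: draw(rng) for name, draw in dists.items()}
    return [(k, valuation[v] if isinstance(v, str) else v) for k, v in rows]


def estimate_sum_in_interval(rows, dists, lo, hi, trials=20000, seed=0):
    """Estimate P(lo <= SUM(price) <= hi) by sampling complete databases."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        world = sample_world(rows, dists, rng)
        total = sum(price for _, price in world)
        if lo <= total <= hi:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(estimate_sum_in_interval(ROWS, NULL_DISTS, 15.0, 35.0))
```

Because `n1` occurs twice, each sampled world assigns it a single value that contributes to the sum twice. This is the correlation between tuples that the model has to account for.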


