Distribution Constraints: The Chase for Distributed Data

03/02/2020
by Gaetano Geck, et al.

This paper introduces a declarative framework for specifying and reasoning about how data is distributed over computing nodes in a distributed setting. More specifically, it proposes distribution constraints, which are tuple-generating and equality-generating dependencies (tgds and egds) extended with node variables that range over computing nodes. In particular, they can express co-partitioning constraints and, by using comparison atoms, constraints about range-based data distributions. The main technical contribution is a study of the implication problem for distribution constraints. While implication is undecidable in general, the paper exhibits relevant fragments of so-called data-full constraints for which the corresponding implication problems are complete for EXPTIME, PSPACE and NP. These results yield bounds on deciding parallel-correctness for conjunctive queries in the presence of distribution constraints.
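
To make the idea of a co-partitioning constraint concrete, the following is a minimal sketch and not the paper's formalism. It assumes a toy model in which a distribution is a dictionary from node names to sets of facts, and it checks one illustrative constraint: whenever facts R(x, y) and S(x, z) agree on x, some node stores both. The relation names `R` and `S`, the node names, and the function `holds_copartitioning` are all hypothetical.

```python
# Toy model (an assumption for illustration): a distribution maps each node
# name to the set of facts stored there; a fact is (relation_name, values).

def holds_copartitioning(dist):
    """Return True if every pair R(x, y), S(x, z) in the global instance
    that agrees on x is stored together on at least one node."""
    # The global instance is the union of all local instances.
    all_facts = set().union(*dist.values())
    r_facts = [f for f in all_facts if f[0] == "R"]
    s_facts = [f for f in all_facts if f[0] == "S"]
    for r in r_facts:
        for s in s_facts:
            if r[1][0] != s[1][0]:
                continue  # different join key, nothing is required
            # Some node (the "node variable") must hold both facts.
            if not any(r in local and s in local for local in dist.values()):
                return False
    return True

# R(1, a) and S(1, x) meet on node n1, so the constraint holds.
good = {"n1": {("R", (1, "a")), ("S", (1, "x"))}, "n2": {("R", (2, "b"))}}
# R(1, a) and S(1, x) share the key 1 but live on different nodes.
bad = {"n1": {("R", (1, "a"))}, "n2": {("S", (1, "x"))}}

print(holds_copartitioning(good))
print(holds_copartitioning(bad))
```

This sketch only checks whether one fixed distribution satisfies one constraint. The implication problem studied in the paper asks the harder question of whether every distribution satisfying a set of constraints also satisfies another constraint.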


