Consistent data fusion with Parker

02/24/2022
by Antoon Bronselaer, et al.

When combining data from multiple sources, inconsistent data complicates the production of a coherent result. In this paper, we introduce a new type of constraint called edit rules under a partial key (EPKs). These constraints can model inconsistencies both within and between sources, but in a loosely coupled manner. We show that the well-known set cover methodology can be adapted to the setting of EPKs, which yields an efficient algorithm for finding minimal-cost repairs of sources. This algorithm is implemented in a repair engine called Parker. Empirical results show that Parker is several orders of magnitude faster than state-of-the-art repair tools. At the same time, the quality of its repairs, measured by F_1-score, ranges from comparable to better than that of these tools.
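
As a rough illustration of the set cover idea the abstract refers to, the sketch below applies the textbook greedy heuristic for weighted set cover to a toy repair problem. It is not Parker's algorithm and does not model EPKs: the function name, the violation labels, the attribute names, and the costs are all hypothetical. It also only approximates a minimal-cost cover, whereas the abstract speaks of minimal cost repairs. Each candidate stands for changing one attribute, the set attached to it lists the rule violations that change would resolve, and the goal is to resolve every violation at low total cost.

```python
from typing import Dict, List, Set


def greedy_weighted_set_cover(
    universe: Set[str],
    candidates: Dict[str, Set[str]],
    cost: Dict[str, float],
) -> List[str]:
    """Pick candidates until every element of `universe` is covered.

    At each step, choose the candidate with the lowest cost per newly
    covered element (the standard greedy rule for weighted set cover).
    """
    uncovered = set(universe)
    chosen: List[str] = []
    while uncovered:
        best_name = None
        best_ratio = float("inf")
        for name, covered in candidates.items():
            gain = len(uncovered & covered)
            if gain == 0:
                continue  # this candidate resolves nothing new
            ratio = cost[name] / gain
            if ratio < best_ratio:
                best_name, best_ratio = name, ratio
        if best_name is None:
            raise ValueError("some elements cannot be covered by any candidate")
        chosen.append(best_name)
        uncovered -= candidates[best_name]
    return chosen


# Hypothetical toy instance: three rule violations and three possible changes.
violations = {"v1", "v2", "v3"}
changes = {
    "zip": {"v1", "v2"},    # changing `zip` resolves v1 and v2
    "city": {"v2", "v3"},   # changing `city` resolves v2 and v3
    "state": {"v3"},        # changing `state` resolves v3 only
}
change_cost = {"zip": 1.0, "city": 3.0, "state": 1.0}

repair = greedy_weighted_set_cover(violations, changes, change_cost)
total_cost = sum(change_cost[name] for name in repair)
print(repair, total_cost)
```

In this toy instance the heuristic first picks `zip` (cost 0.5 per resolved violation) and then `state`, for a total cost of 2.0, which is cheaper than any cover that uses `city`.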

research
12/26/2017

Pattern-Driven Data Cleaning

Data is inherently dirty and there has been a sustained effort to come u...
research
07/31/2018

Improve3C: Data Cleaning on Consistency and Completeness with Currency

Data quality plays a key role in big data management today. With the exp...
research
08/25/2022

LinCQA: Faster Consistent Query Answering with Linear Time Guarantees

Most data analytical pipelines often encounter the problem of querying i...
research
07/18/2023

Rule-based Graph Repair using Minimally Restricted Consistency-Improving Transformations

Model-driven software engineering is a suitable method for dealing with ...
research
04/10/2020

On Multiple Semantics for Declarative Database Repairs

We study the problem of database repairs through a rule-based framework ...
research
07/08/2020

T-REx: Table Repair Explanations

Data repair is a common and crucial step in many frameworks today, as ap...
research
08/10/2020

Robot Action Selection Learning via Layered Dimension Informed Program Synthesis

Action selection policies (ASPs), used to compose low-level robot skills...
