Enhancing OBDA Query Translation over Tabular Data with Morph-CSV

01/24/2020
by David Chaves Fraga, et al.
Ontology-Based Data Access (OBDA) has traditionally focused on providing a unified view of heterogeneous datasets (e.g., relational databases, CSV, JSON), either by materializing integrated data into RDF or by performing on-the-fly integration via SPARQL-to-SQL query translation. In the specific case of tabular datasets composed of several CSV or Excel files, query translation approaches have been applied by taking as input a lightweight schema with table and column names, and treating each source as a single table that can be loaded into a relational database system (RDB). This naïve approach does not consider the implicit constraints in this type of data, e.g., referential integrity among data sources, datatypes, or data integrity. We propose Morph-CSV, a framework that enforces these constraints and can be used together with any SPARQL-to-SQL OBDA engine. Morph-CSV relies on both a Constraints component and a set of operators that apply each type of constraint to the input, with the aim of enhancing query completeness and performance. We evaluate Morph-CSV on a set of real-world open tabular datasets in the public transport domain, comparing it with existing approaches in terms of query result completeness and performance.
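The difference between the naïve loading strategy and a constraint-aware one can be illustrated with a minimal sketch. It is not Morph-CSV's implementation: the sample data, the table and column names, and the helpers `load_naive` and `load_constrained` are all hypothetical, and SQLite stands in for the RDB. The same SQL query is run over two loads of the same CSV content, once with untyped TEXT columns and once with declared datatypes and keys.

```python
import csv
import io
import sqlite3

# Hypothetical, GTFS-like sample data; not taken from the paper's evaluation.
STOPS_CSV = "stop_id,stop_name\nS1,Central\nS2,Harbour\n"
STOP_TIMES_CSV = "trip_id,stop_id,stop_sequence\nT1,S1,1\nT1,S2,2\nT1,S1,10\n"


def read_rows(text):
    """Parse CSV text into a header list and a list of row tuples."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    return header, [tuple(row) for row in reader]


def load_naive(conn):
    """Naive load: one table per CSV, every column TEXT, no keys."""
    for name, text in (("stops", STOPS_CSV), ("stop_times", STOP_TIMES_CSV)):
        header, rows = read_rows(text)
        cols = ", ".join(f"{c} TEXT" for c in header)
        conn.execute(f"CREATE TABLE {name} ({cols})")
        marks = ", ".join("?" for _ in header)
        conn.executemany(f"INSERT INTO {name} VALUES ({marks})", rows)


def load_constrained(conn):
    """Constraint-aware load: datatypes, a primary key, a foreign key, an index."""
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE stops (stop_id TEXT PRIMARY KEY, stop_name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE stop_times ("
        "trip_id TEXT NOT NULL, "
        "stop_id TEXT NOT NULL REFERENCES stops(stop_id), "
        "stop_sequence INTEGER NOT NULL)"
    )
    conn.execute("CREATE INDEX idx_stop_times_stop ON stop_times(stop_id)")
    _, stops = read_rows(STOPS_CSV)
    _, stop_times = read_rows(STOP_TIMES_CSV)
    conn.executemany("INSERT INTO stops VALUES (?, ?)", stops)
    conn.executemany("INSERT INTO stop_times VALUES (?, ?, ?)", stop_times)


# The kind of SQL a SPARQL-to-SQL engine might emit for a numeric filter.
QUERY = (
    "SELECT trip_id, stop_id, stop_sequence "
    "FROM stop_times WHERE stop_sequence > 9"
)


def run(loader):
    """Load the data into a fresh in-memory database and run QUERY."""
    conn = sqlite3.connect(":memory:")
    loader(conn)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


naive_rows = run(load_naive)
constrained_rows = run(load_constrained)
print("naive:", naive_rows)
print("constrained:", constrained_rows)
```

In the naïve load, SQLite compares `stop_sequence` as text, so the row with value `10` is not matched by the filter and the answer is incomplete. With the column declared as INTEGER the row is returned, and the declared keys and index give the engine information it can use when joining `stop_times` with `stops`.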
