A geospatial source selector for federated GeoSPARQL querying

by Antonis Troumpoukis, et al.

Background: Geospatial linked data brings into the scope of the Semantic Web and its technologies a wealth of datasets that combine semantically rich descriptions of resources with their geo-location. There are, however, various Semantic Web technologies where technical work is needed in order to achieve the full integration of geospatial data, and federated query processing is one of these technologies. Methods: In this paper, we explore the idea of annotating each data source with a bounding polygon that summarizes the spatial extent of the resources in that source, and of using this summary as an (additional) source selection criterion in order to reduce the set of sources that will be tested as potentially holding relevant data. We present our source selection method, and we discuss its correctness and implementation. Results: We evaluate the proposed source selection using three different types of summaries with different degrees of accuracy, against a baseline that does not use geospatial summaries. We use datasets and queries from a practical use case that combines crop-type data with water availability data for food security. The experimental results suggest that more complex summaries lead to slower source selection, but also to more precise exclusion of unneeded sources. Moreover, we observe that the source selection runtime is (partially or fully) recovered by shorter planning and execution runtimes. As a result, the federated sources are not burdened by pointless querying from the federation engine. Conclusions: The evaluation draws on data and queries from the agroenvironmental domain and shows that our source selection method substantially improves the effectiveness of federated GeoSPARQL query processing.
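The idea in the Methods part can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes the simplest possible summary, an axis-aligned bounding box, in place of a general bounding polygon, and the names `BBox`, `select_sources`, and the three example sources are hypothetical. A source is kept only if its summary intersects the region the query asks about; because a summary encloses every geometry in its source, a source whose summary is disjoint from the query region cannot contribute results to such a query.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class BBox:
    """Axis-aligned bounding box (an assumed, simplified spatial summary)."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def intersects(self, other: "BBox") -> bool:
        # Two boxes intersect unless one lies entirely to one side of the other.
        return not (
            self.max_x < other.min_x
            or other.max_x < self.min_x
            or self.max_y < other.min_y
            or other.max_y < self.min_y
        )


def select_sources(summaries: Dict[str, BBox], query_region: BBox) -> List[str]:
    """Keep only the sources whose summary overlaps the query region."""
    return [name for name, box in summaries.items() if box.intersects(query_region)]


# Hypothetical federation: each source is annotated with its spatial summary.
summaries = {
    "crops_north": BBox(0.0, 10.0, 10.0, 20.0),
    "crops_south": BBox(0.0, 0.0, 10.0, 9.0),
    "water": BBox(20.0, 0.0, 30.0, 20.0),
}

# Region referenced by the spatial filter of a (hypothetical) query.
query_region = BBox(2.0, 12.0, 4.0, 14.0)
selected = select_sources(summaries, query_region)
print(selected)
```

A tighter summary, such as a polygon that follows the data more closely, would exclude more sources but cost more per intersection test, which matches the trade-off the Results part reports.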




Heuristics-based Query Reordering for Federated Queries in SPARQL 1.1 and SPARQL-LD

The federated query extension of SPARQL 1.1 allows executing queries dis...

Optimizing Federated Queries Based on the Physical Design of a Data Lake

The optimization of query execution plans is known to be crucial for red...

SGMFQP: An Ontology-based Swine Gut Microbiota Federated Query Platform

Gut microbiota plays a crucial role in modulating pig development and he...

FedQPL: A Language for Logical Query Plans over Heterogeneous Federations of RDF Data Sources (Extended Version)

Federations of RDF data sources provide great potential when queried for...

A Framework for Federated SPARQL Query Processing over Heterogeneous Linked Data Fragments

Linked Data Fragments (LDFs) refer to Web interfaces that allow for acce...

An Empirical Evaluation of Cost-based Federated SPARQL Query Processing Engines

Finding a good query plan is key to the optimization of query runtime. T...

VoIDext: Vocabulary and patterns for enhancing interoperable datasets with virtual links

Semantic heterogeneity remains a problem when interoperating with data f...
