The Odyssey Approach for Optimizing Federated SPARQL Queries

05/17/2017
by   Gabriela Montoya, et al.

Answering queries over a federation of SPARQL endpoints requires combining data from more than one data source. Optimizing queries in such scenarios is particularly challenging, not only because (i) a large variety of possible query execution plans correctly answer the query, but also because (ii) there is only limited access to statistics about the schema and instance data of remote sources. To overcome these challenges, most federated query engines rely either on heuristics to reduce the space of possible query execution plans or on dynamic programming strategies to produce optimal plans. Nevertheless, the resulting plans may still exhibit a high number of intermediate results or high execution times because of the heuristics and inaccurate cost estimations. In this paper, we present Odyssey, an approach that uses statistics that allow for a more accurate cost estimation for federated queries and therefore enable Odyssey to produce better query execution plans. Our experimental results show that Odyssey produces query execution plans that are better in terms of data transfer and execution time than those of state-of-the-art optimizers. Our experiments using the FedBench benchmark show an average speedup in execution time of at least 25 times.
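
To make the role of cost estimation in dynamic programming concrete, the following is a minimal sketch of join ordering over triple patterns. It is not Odyssey's algorithm or its statistics: the function name `best_join_order`, the pattern labels, the cardinalities, the join selectivities, and the cost model (sum of estimated intermediate result sizes) are all assumptions chosen for illustration.

```python
from itertools import combinations


def best_join_order(cardinalities, selectivity):
    """Pick the cheapest bushy join plan by dynamic programming.

    cardinalities: dict mapping a pattern label to its estimated result size.
    selectivity: dict mapping frozenset({a, b}) to an estimated join
        selectivity; missing pairs default to 1.0 (a cross product).
    Returns (cost, estimated cardinality, plan); the plan is a nested tuple.
    All inputs are hypothetical estimates, not statistics from a real source.
    """
    patterns = sorted(cardinalities)
    # Base case: a single pattern has no intermediate results, so cost 0.
    best = {
        frozenset([p]): (0.0, float(cardinalities[p]), p) for p in patterns
    }
    for size in range(2, len(patterns) + 1):
        for combo in combinations(patterns, size):
            subset = frozenset(combo)
            # Try every split of the subset into two non-empty parts.
            for k in range(1, size):
                for left_combo in combinations(combo, k):
                    left = frozenset(left_combo)
                    right = subset - left
                    lcost, lcard, lplan = best[left]
                    rcost, rcard, rplan = best[right]
                    # Combine the selectivities of all pattern pairs
                    # that span the two sides of the join.
                    sel = 1.0
                    for a in left:
                        for b in right:
                            sel *= selectivity.get(frozenset((a, b)), 1.0)
                    card = lcard * rcard * sel
                    # Assumed cost model: total intermediate result size.
                    cost = lcost + rcost + card
                    if subset not in best or cost < best[subset][0]:
                        best[subset] = (cost, card, (lplan, rplan))
    return best[frozenset(patterns)]


if __name__ == "__main__":
    # Made-up estimates for three triple patterns.
    cards = {"tp1": 1000, "tp2": 10, "tp3": 100}
    sels = {
        frozenset(("tp1", "tp2")): 0.01,
        frozenset(("tp2", "tp3")): 0.05,
    }
    cost, card, plan = best_join_order(cards, sels)
    print(plan, cost, card)
```

With these made-up numbers, joining tp1 and tp3 first is heavily penalized because no selectivity is known for that pair and it is treated as a cross product. The chosen plan is only as good as the estimates fed into the search, which is why inaccurate estimates can lead even an exhaustive search to a poor plan.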

