gSMat: A Scalable Sparse Matrix-based Join for SPARQL Query Processing

07/20/2018
by   Xiaowang Zhang, et al.

Resource Description Framework (RDF) has been widely used to represent information on the web, and SPARQL is the standard query language for manipulating RDF data. A SPARQL query often contains many joins, which are the bottleneck of query processing efficiency. In addition, real RDF datasets often exhibit strong data sparsity: a resource typically relates to only a few other resources, even when the total number of resources is large. In this paper, we propose a sparse matrix-based (SM-based) SPARQL query processing approach over RDF datasets which considers both join optimization and data sparsity. Firstly, we present an SM-based storage for RDF datasets that improves storage efficiency by storing only valid edges, and then introduce a predicate-based hash index on the storage. Secondly, we develop a scalable SM-based join algorithm for SPARQL query processing. Finally, we analyze the overall cost by accumulating all intermediate results and design a query plan generation algorithm. We also extend our SM-based join algorithm to the GPU to parallelize SPARQL query processing. We have evaluated our approach against state-of-the-art RDF engines over benchmark RDF datasets, and the experimental results show that our proposal can significantly improve SPARQL query processing with high scalability.
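
To make the storage and join ideas in the abstract concrete, the following is a minimal sketch, not the paper's implementation. It assumes a dictionary-of-sets layout as a stand-in for a sparse matrix (only existing edges are stored), uses the outer dictionary as a simple predicate-based hash index, and joins two triple patterns that share a variable. The names `SparseTripleStore` and `join_chain` and the sample data are hypothetical.

```python
from collections import defaultdict


class SparseTripleStore:
    """Illustrative per-predicate sparse adjacency: only existing edges are kept."""

    def __init__(self):
        # predicate -> {subject -> set of objects}
        # The outer dict acts as a predicate-based hash index (an assumption
        # made for this sketch, not the paper's exact structure).
        self.by_predicate = defaultdict(lambda: defaultdict(set))

    def add(self, s, p, o):
        """Store one triple (s, p, o)."""
        self.by_predicate[p][s].add(o)

    def pairs(self, p):
        """Return all (subject, object) pairs stored for predicate p."""
        rows = self.by_predicate.get(p, {})
        return [(s, o) for s, objs in rows.items() for o in objs]


def join_chain(store, p1, p2):
    """Evaluate the pattern { ?x p1 ?y . ?y p2 ?z } and return sorted bindings."""
    right = store.by_predicate.get(p2, {})
    results = []
    for x, y in store.pairs(p1):
        # Look up only the row for y; absent rows cost nothing to skip.
        for z in right.get(y, ()):
            results.append((x, y, z))
    return sorted(results)


if __name__ == "__main__":
    store = SparseTripleStore()
    store.add("alice", "knows", "bob")
    store.add("bob", "knows", "carol")
    store.add("bob", "worksAt", "acme")
    print(join_chain(store, "knows", "worksAt"))
```

Because each predicate keeps only its non-empty rows, the join touches just the rows that the left pattern actually produces, which is the intuition behind exploiting sparsity during joins.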


