GSmart: An Efficient SPARQL Query Engine Using Sparse Matrix Algebra – Full Version

06/26/2021
by   Yuedan Chen, et al.

Efficient execution of SPARQL queries over large RDF datasets is a topic of considerable interest due to the increased use of RDF to encode data. Most prior work has followed either relational or graph-based approaches. In this paper, we propose an alternative query engine, called gSmart, based on matrix algebra. This approach can potentially better exploit the computing power of the high-performance heterogeneous architectures that we target. gSmart incorporates: (1) grouped incident edge-based SPARQL query evaluation, in which all unevaluated edges of a vertex are evaluated together using a series of matrix operations to fully utilize query constraints and narrow down the solution space; (2) a graph query planner that determines the order in which vertices in query graphs should be evaluated; (3) memory- and computation-efficient data structures, including the light-weight sparse matrix (LSpM) storage for RDF data and a tree-based representation for evaluation results; (4) a multi-stage data partitioner that maps the incident edge-based query evaluation onto heterogeneous HPC architectures and develops multi-level parallelism; and (5) a parallel executor that uses a fine-grained processing scheme, a pre-pruning technique, and a tree-pruning technique to lower inter-node communication and enable high throughput. Evaluations of gSmart on a CPU+GPU HPC architecture show execution time speedups of up to 46920.00x compared to existing SPARQL query engines on a single-node machine. Additionally, gSmart on the Tianhe-1A supercomputer achieves a maximum speedup of 6.90x when scaling from 2 to 16 CPU+GPU nodes.
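To make the matrix-algebra idea concrete, the following is a minimal sketch, not gSmart's implementation. It assumes each RDF predicate is stored as a sparse boolean matrix (here a plain dictionary from subject to a set of objects) and that a triple pattern is evaluated as a boolean vector-matrix product. The helper names (`build_predicate_matrices`, `vec_mat`, `transpose`), the toy data, and the query are all hypothetical; the LSpM format, the query planner, and the GPU kernels described in the abstract are not modeled.

```python
from collections import defaultdict

def build_predicate_matrices(triples):
    """One sparse boolean matrix per predicate: subject (row) -> set of objects (columns)."""
    matrices = defaultdict(lambda: defaultdict(set))
    for s, p, o in triples:
        matrices[p][s].add(o)
    return matrices

def transpose(matrix):
    """Swap rows and columns so the matrix can be traversed object -> subjects."""
    t = defaultdict(set)
    for row, cols in matrix.items():
        for col in cols:
            t[col].add(row)
    return t

def vec_mat(frontier, matrix):
    """Boolean sparse vector-matrix product: union of the rows selected by `frontier`."""
    result = set()
    for row in frontier:
        result |= matrix.get(row, set())
    return result

# Hypothetical toy dataset (assumption, not from the paper).
triples = [
    ("alice", "knows", "bob"),
    ("alice", "knows", "carol"),
    ("dave", "knows", "carol"),
    ("bob", "worksAt", "acme"),
    ("carol", "worksAt", "initech"),
]
M = build_predicate_matrices(triples)

# Query: SELECT ?x WHERE { ?x knows ?y . ?y worksAt acme }
# Both edges incident to ?y are evaluated together, so the constant "acme"
# prunes the candidates for ?y before ?x is expanded.
y_from_works = vec_mat({"acme"}, transpose(M["worksAt"]))  # subjects that work at acme
y_from_knows = set().union(*M["knows"].values())           # anything that is known by someone
y_candidates = y_from_works & y_from_knows

# Expand the remaining edge backwards from the pruned ?y candidates to ?x.
x_bindings = vec_mat(y_candidates, transpose(M["knows"]))
print(sorted(x_bindings))
```

Intersecting the candidate sets for `?y` before expanding to `?x` mirrors, in a very simplified form, the abstract's point that evaluating all incident edges of a vertex together narrows the solution space early.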


research
07/04/2017

Context-Free Path Querying by Matrix Multiplication

Graph data models are widely used in many areas, for example, bioinforma...
research
12/13/2020

Fast and Scalable Sparse Triangular Solver for Multi-GPU Based HPC Architectures

Designing efficient and scalable sparse linear algebra kernels on modern...
research
09/15/2022

MSREP: A Fast yet Light Sparse Matrix Framework for Multi-GPU Systems

Sparse linear algebra kernels play a critical role in numerous applicati...
research
07/24/2023

HiHGNN: Accelerating HGNNs through Parallelism and Data Reusability Exploitation

Heterogeneous graph neural networks (HGNNs) have emerged as powerful alg...
research
05/12/2020

Heterogeneous CPU/GPU co-execution of CFD simulations on the POWER9 architecture: Application to airplane aerodynamics

High fidelity Computational Fluid Dynamics simulations are generally ass...
research
02/03/2021

HiCOPS: High Performance Computing Framework for Tera-Scale Database Search of Mass Spectrometry based Omics Data

Database-search algorithms, that deduce peptides from Mass Spectrometry ...
research
04/07/2020

A GPU-friendly Geometric Data Model and Algebra for Spatial Queries: Extended Version

The availability of low cost sensors has led to an unprecedented growth ...
