STREAK: An Efficient Engine for Processing Top-k SPARQL Queries with Spatial Filters

10/20/2017
by   Jyoti Leeka, et al.

The importance of geo-spatial data in critical applications such as emergency response, transportation, and agriculture has prompted the adoption of the recent GeoSPARQL standard in many RDF processing engines. In addition to large repositories of geo-spatial data (e.g., LinkedGeoData and OpenStreetMap), spatial data is also routinely found in automatically constructed knowledge bases such as Yago and WikiData. While there have been research efforts on efficient processing of spatial data in RDF/SPARQL, very little effort has gone into building end-to-end systems that can holistically handle complex SPARQL queries along with spatial filters. In this paper, we present Streak, an RDF data management system designed to support a wide range of queries with spatial filters, including complex joins, top-k, and higher-order relationships over spatially enriched databases. Streak introduces several novel features: a careful identifier encoding strategy for spatial and non-spatial entities, a semantics-aware Quad-tree index that allows early termination, and adaptive query processing with zero plan-switch cost. We show that Streak can scale to some of the largest publicly available semantic data resources, such as Yago3 and LinkedGeoData, which contain spatial entities and quantifiable predicates useful for result ranking. For experimental evaluation, we focus on top-k distance join queries and demonstrate that Streak outperforms popular spatial join algorithms as well as state-of-the-art end-to-end systems such as Virtuoso and PostgreSQL.
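
To make the query class concrete, the following is a minimal brute-force sketch of a top-k distance join. It is not Streak's algorithm and uses neither the Quad-tree index nor adaptive query processing; it only illustrates what such a query asks for. The `Entity` type, its `score` field (standing in for a quantifiable predicate), planar Euclidean distance, and ranking by the sum of the two scores are all assumptions made for illustration.

```python
import heapq
import math
from typing import NamedTuple


class Entity(NamedTuple):
    # Hypothetical record: a named spatial entity with a rankable score.
    name: str
    x: float
    y: float
    score: float


def top_k_distance_join(left, right, max_dist, k):
    """Return the k highest-scoring (left, right) pairs within max_dist.

    Brute force over all pairs; a bounded min-heap keeps only the best k.
    """
    heap = []  # min-heap of (combined_score, left_name, right_name)
    for a in left:
        for b in right:
            # Spatial filter: skip pairs farther apart than max_dist.
            if math.hypot(a.x - b.x, a.y - b.y) > max_dist:
                continue
            item = (a.score + b.score, a.name, b.name)
            if len(heap) < k:
                heapq.heappush(heap, item)
            elif item > heap[0]:
                # Better than the current k-th best: replace it.
                heapq.heapreplace(heap, item)
    return sorted(heap, reverse=True)


hotels = [Entity("h1", 0, 0, 4.0), Entity("h2", 5, 5, 5.0)]
parks = [Entity("p1", 1, 0, 3.0), Entity("p2", 5, 6, 1.0), Entity("p3", 0, 2, 2.5)]
print(top_k_distance_join(hotels, parks, max_dist=2.0, k=2))
```

This version examines every pair, so its cost grows with the product of the two input sizes. An index-based engine would instead aim to prune pairs and stop early once the top k results can no longer change.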
