GeoFlink: A Distributed and Scalable Framework for the Real-time Processing of Spatial Streams

04/07/2020
by Salman Ahmed Shaikh, et al.

Apache Flink is an open-source system for scalable processing of batch and streaming data. Flink does not natively support efficient processing of spatial data streams, which many applications dealing with spatial data require. Other scalable spatial data processing platforms, including GeoSpark and Spatial Hadoop, do not support streaming workloads and can handle only static/batch workloads. To fill this gap, we present GeoFlink, which extends Apache Flink to support spatial data types, indexes, and continuous queries over spatial data streams. A grid-based index is introduced to enable efficient processing of continuous spatial queries and effective data distribution across Flink cluster nodes. GeoFlink currently supports spatial range, spatial kNN, and spatial join queries on the point data type. An extensive experimental study on real spatial data streams shows that GeoFlink achieves significantly higher query throughput than ordinary Flink processing.
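The idea behind a grid-based index for range queries can be illustrated with a minimal single-machine sketch. This is not GeoFlink's API or implementation: the function names (`cell_of`, `build_grid`, `range_query`), the uniform square cells, and the Euclidean distance are all assumptions made for illustration. Each point is mapped to a grid cell, and a range query inspects only the cells that overlap the query circle's bounding box before applying the exact distance check.

```python
import math
from collections import defaultdict

# Hypothetical helpers for illustration only; not part of GeoFlink.

def cell_of(x, y, min_x, min_y, cell_size):
    """Return the (column, row) of the uniform grid cell containing (x, y)."""
    return (math.floor((x - min_x) / cell_size),
            math.floor((y - min_y) / cell_size))

def build_grid(points, min_x, min_y, cell_size):
    """Group points by grid cell; in a cluster the cell key could drive partitioning."""
    grid = defaultdict(list)
    for p in points:
        grid[cell_of(p[0], p[1], min_x, min_y, cell_size)].append(p)
    return grid

def range_query(grid, q, r, min_x, min_y, cell_size):
    """Return all indexed points within Euclidean distance r of query point q."""
    qx, qy = q
    # Candidate cells: those overlapping the bounding box of the query circle.
    c_lo, r_lo = cell_of(qx - r, qy - r, min_x, min_y, cell_size)
    c_hi, r_hi = cell_of(qx + r, qy + r, min_x, min_y, cell_size)
    result = []
    for c in range(c_lo, c_hi + 1):
        for row in range(r_lo, r_hi + 1):
            # .get avoids creating empty entries in the defaultdict.
            for p in grid.get((c, row), ()):
                if math.hypot(p[0] - qx, p[1] - qy) <= r:
                    result.append(p)
    return result

points = [(1.0, 1.0), (2.5, 2.5), (9.0, 9.0)]
grid = build_grid(points, min_x=0.0, min_y=0.0, cell_size=1.0)
nearby = range_query(grid, q=(2.0, 2.0), r=1.5, min_x=0.0, min_y=0.0, cell_size=1.0)
```

In this sketch the far-away point at (9.0, 9.0) is never examined, because its cell lies outside the candidate range. That pruning is the general benefit a grid index offers over checking every point in the stream.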


research
02/27/2020

SWARM: Adaptive Load Balancing in Distributed Streaming Systems for Big Spatial Data

The proliferation of GPS-enabled devices has led to the development of n...
research
10/20/2017

STREAK: An Efficient Engine for Processing Top-k SPARQL Queries with Spatial Filters

The importance of geo-spatial data in critical applications such as emer...
research
10/26/2022

RMLStreamer-SISO: an RDF stream generator from streaming heterogeneous data

Stream-reasoning query languages such as CQELS and C-SPARQL enable query...
research
07/08/2019

In-memory Distributed Spatial Query Processing and Optimization

Due to the ubiquity of spatial data applications and the large amounts o...
research
01/19/2022

Tracking Where Events Take Place: Reverse Spatial Term Queries on Streaming Data

A large volume of content generated by online users is geo-tagged and th...
research
05/31/2021

System-aware dynamic partitioning for batch and streaming workloads

When processing data streams with highly skewed and nonstationary key di...
research
12/07/2021

Designing a Real-Time IoT Data Streaming Testbed for Horizontally Scalable Analytical Platforms: Czech Post Case Study

There is a growing trend for enterprise-level Internet of Things (IoT) a...
