Parallel Index-based Stream Join on a Multicore CPU

03/01/2019
by Amirhesam Shahvarani, et al.

There is increasing interest in using multicore processors to accelerate stream processing. For example, indexing sliding-window content, a common way to speed up streaming queries, benefits greatly from the computational capabilities of a multicore processor. However, designing an effective concurrency control mechanism for concurrent indexing in highly dynamic settings remains a challenge. In this paper, we introduce an index data structure, called the Partitioned In-memory Merge-Tree, to address the challenges that arise when indexing highly dynamic data, which are common in streaming settings. To complement the index, we design an algorithm that realizes a parallel index-based stream join and exploits the computational power of multicore processors. Our experiments on an octa-core processor show that our parallel stream join achieves up to 5.5 times higher throughput than a single-threaded approach.
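To make the idea of an index-based stream join concrete, the following is a minimal single-threaded sketch in Python. It is not the Partitioned In-memory Merge-Tree and not the parallel algorithm from the paper. It assumes count-based sliding windows, a band-join predicate |r - s| <= eps, and a sorted list standing in for a tree index. The names `IndexedWindow`, `band_join`, `window_size`, and `eps` are hypothetical and chosen only for illustration.

```python
import bisect
from collections import deque


class IndexedWindow:
    """Count-based sliding window with a sorted key index (hypothetical helper)."""

    def __init__(self, size):
        self.size = size
        self.arrivals = deque()   # keys in arrival order, oldest first
        self.sorted_keys = []     # the same keys kept sorted, acting as the index

    def insert(self, key):
        # Evict the oldest key once the window is full, removing it from the index too.
        if len(self.arrivals) == self.size:
            old = self.arrivals.popleft()
            del self.sorted_keys[bisect.bisect_left(self.sorted_keys, old)]
        self.arrivals.append(key)
        bisect.insort(self.sorted_keys, key)

    def probe(self, low, high):
        # Range lookup: all keys k in the window with low <= k <= high.
        lo = bisect.bisect_left(self.sorted_keys, low)
        hi = bisect.bisect_right(self.sorted_keys, high)
        return self.sorted_keys[lo:hi]


def band_join(events, window_size, eps):
    """Join streams 'R' and 'S' on |r - s| <= eps; returns (r, s) pairs."""
    windows = {"R": IndexedWindow(window_size), "S": IndexedWindow(window_size)}
    results = []
    for stream, key in events:
        other = "S" if stream == "R" else "R"
        # Probe the opposite window's index first, then insert into our own window.
        for match in windows[other].probe(key - eps, key + eps):
            results.append((key, match) if stream == "R" else (match, key))
        windows[stream].insert(key)
    return results


events = [("R", 10), ("S", 11), ("S", 30), ("R", 29), ("R", 12), ("S", 13)]
pairs = band_join(events, window_size=2, eps=1)
print(pairs)  # [(10, 11), (29, 30), (12, 11), (12, 13)]
```

Every arriving tuple triggers one index probe, one index insertion, and eventually one index deletion. This update-heavy pattern is the highly dynamic setting the abstract refers to. In the sketch each list update costs time linear in the window size, and nothing is safe for concurrent access. These are the limitations that a purpose-built concurrent index would need to address.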


