Stream Processing With Dependency-Guided Synchronization

04/09/2021
by Konstantinos Kallas, et al.

Real-time data processing applications with low latency requirements have led to the increasing popularity of stream processing systems. While such systems offer convenient APIs that can be used to achieve data parallelism automatically, they offer limited support for computations that require synchronization between parallel nodes. In this paper, we propose *dependency-guided synchronization (DGS)*, an alternative programming model and stream processing API for stateful streaming computations with complex synchronization requirements. In the proposed model, the input is viewed as partially ordered, and the program consists of a set of parallelization constructs which are applied to decompose the partial order and process events independently. Our API maps to an execution model called *synchronization plans* which supports synchronization between parallel nodes. Our evaluation shows that APIs offered by two widely used systems – Flink and Timely Dataflow – cannot suitably expose parallelism in some representative applications. In contrast, DGS enables implementations with scalable performance, the resulting synchronization plans offer throughput improvements when implemented manually in existing systems, and the programming overhead is small compared to writing sequential code.
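The idea of viewing the input as partially ordered can be illustrated with a small sketch. It is not the DGS API from the paper: the tag names (`value:a`, `value:b`, `barrier`), the `DEPENDS` relation, and the helper functions are all hypothetical, chosen only to show how a symmetric dependence relation on event tags determines which events must stay ordered and which can be processed independently.

```python
from itertools import combinations

# Hypothetical symmetric dependence relation on event tags (an assumption,
# not taken from the paper). A singleton set means that events with the
# same tag depend on each other and must keep their relative order.
DEPENDS = {
    frozenset({"value:a"}),
    frozenset({"value:b"}),
    frozenset({"barrier"}),
    frozenset({"value:a", "barrier"}),
    frozenset({"value:b", "barrier"}),
}


def dependent(tag1, tag2):
    """Return True if events with these two tags must be processed in order."""
    return frozenset({tag1, tag2}) in DEPENDS


def ordered_pairs(stream):
    """Return index pairs (i, j) with i < j whose events are dependent.

    Pairs that are absent from the result are unordered in the induced
    partial order, so those events could be handled by parallel workers
    without synchronization.
    """
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(stream), 2)
        if dependent(a, b)
    ]


stream = ["value:a", "value:b", "value:a", "barrier"]
pairs = ordered_pairs(stream)
print(pairs)
```

In this sketch the two `value:a` events stay ordered with respect to each other, every value event is ordered before the barrier, and events for different keys are left unordered, which is where parallelism could be exposed.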


