Scaling Stream Processing with Transactional State Management on Multicores

04/08/2019
by Shuhao Zhang, et al.

Transactional state management relieves users of the burden of maintaining state consistency themselves during stream processing. This paper introduces TStream, a highly scalable data stream processing system (DSPS) with built-in transactional state management. TStream is specifically designed for modern shared-memory multicore architectures. Its key contribution is a novel asynchronous state transaction processing paradigm. By detaching state accesses from the stream application's computation logic and postponing them, TStream minimizes unnecessary stalls caused by state management in stream processing. The postponed state accesses naturally form a batch, and we further propose an operation-chain-based execution model that aggressively extracts parallelism within each batch of state access operations while guaranteeing consistency without locks. To confirm the effectiveness of our proposal, we compared TStream against four alternative designs on a 40-core machine. Our extensive experimental study shows that TStream yields much higher throughput and scalability, with a limited latency penalty, when processing different types of workloads.
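
The abstract does not give implementation details, so the following is only an illustrative sketch of the general idea of postponing state accesses into a batch and running per-key operation chains in parallel. It is not TStream's actual code. All names here (`build_chains`, `run_chain`, `process_batch`) and the `(timestamp, key, function)` operation format are hypothetical, and the sketch assumes each operation touches a single key, so dependencies across keys are ignored.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def build_chains(batch):
    """Group postponed operations by key and order each group by timestamp."""
    chains = defaultdict(list)
    for ts, key, fn in batch:
        chains[key].append((ts, fn))
    for ops in chains.values():
        ops.sort(key=lambda op: op[0])  # timestamp order within a chain
    return chains


def run_chain(state, key, ops):
    """Apply one chain serially; no other worker handles this key."""
    value = state.get(key, 0)  # assumption: missing keys start at 0
    for _, fn in ops:
        value = fn(value)
    return key, value


def process_batch(state, batch, workers=4):
    """Run all chains of a batch in parallel, then install the results."""
    chains = build_chains(batch)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_chain, state, k, ops)
                   for k, ops in chains.items()]
        results = [f.result() for f in futures]
    # Each key belongs to exactly one chain, so no locks are needed.
    for key, value in results:
        state[key] = value
    return state


# Operations arrive out of order; they are applied in timestamp order per key.
batch = [
    (2, "a", lambda v: v * 2),
    (1, "a", lambda v: v + 5),
    (1, "b", lambda v: v + 1),
]
final_state = process_batch({"a": 1}, batch)
```

In this sketch, consistency comes from two properties: operations on the same key are serialized in timestamp order within one chain, and different chains never touch the same key, so they can run concurrently without locking.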


research
04/07/2019

BriskStream: Scaling Data Stream Processing on Shared-Memory Multicore Architectures

We introduce BriskStream, an in-memory data stream processing system (DS...
research
07/24/2023

MorphStream: Scalable Processing of Transactions over Streams on Multicores

Transactional Stream Processing Engines (TSPEs) form the backbone of mod...
research
07/17/2023

Harnessing Scalable Transactional Stream Processing for Managing Large Language Models [Vision]

Large Language Models (LLMs) have demonstrated extraordinary performance...
research
11/25/2021

STRETCH: Virtual Shared-Nothing Parallelism for Scalable and Elastic Stream Processing

Stream processing applications extract value from raw data through Direc...
research
11/08/2021

LMStream: When Distributed Micro-Batch Stream Processing Systems Meet GPU

This paper presents LMStream, which ensures bounded latency while maximi...
research
07/20/2023

TransNFV: Integrating Transactional Semantics for Efficient State Management in Virtual Network Functions

Managing shared mutable states in high concurrency state access operatio...
research
09/09/2023

Benne: A Modular and Self-Optimizing Algorithm for Data Stream Clustering

In various real-world applications, ranging from the Internet of Things ...
