Elasticutor: Rapid Elasticity for Realtime Stateful Stream Processing

11/03/2017
by   Li Wang, et al.
Elasticity is highly desirable for stream processing systems to guarantee low latency against workload dynamics, such as surges in data arrival rate and fluctuations in data distribution. Existing systems achieve elasticity following a resource-centric approach that uses dynamic key partitioning across the parallel instances, i.e., executors, to balance the workload and scale operators. However, such operator-level key repartitioning requires global synchronization and prohibits rapid elasticity. To address this problem, we propose an executor-centric approach, whose core idea is to avoid operator-level key repartitioning and instead make each executor the building block of elasticity. Following this new approach, we design the Elasticutor framework with two levels of optimization: i) a novel implementation of executors, i.e., elastic executors, that perform elastic multi-core execution via efficient intra-executor load balancing and executor scaling, and ii) a global model-based scheduler that dynamically allocates CPU cores to executors based on their instantaneous workloads. We implemented a prototype of Elasticutor and conducted extensive experiments. Our results show that, for a dynamic workload of real-world applications, Elasticutor doubles the throughput and achieves an average processing latency up to two orders of magnitude lower than previous methods.
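
The executor-centric idea described above can be illustrated with a minimal Python sketch. It is not the Elasticutor implementation: the abstract does not specify the algorithms, so every function name here (`executor_for`, `allocate_cores`, `balance_within_executor`), the notion of a "key group", and the greedy heuristics are assumptions chosen for illustration. In particular, the greedy core allocation stands in for the model-based scheduler, whose actual model is not given in the abstract.

```python
import zlib
from typing import Dict, List


def executor_for(key: str, num_executors: int) -> int:
    """Operator-level partitioning: a fixed key-to-executor mapping.

    The mapping never changes at runtime, so no global repartitioning
    (and no global synchronization) is needed. crc32 is used because it
    is deterministic across runs, unlike the built-in hash() on str.
    """
    return zlib.crc32(key.encode("utf-8")) % num_executors


def allocate_cores(loads: List[float], total_cores: int) -> List[int]:
    """Hypothetical stand-in for a global scheduler.

    Each executor gets one core; each remaining core goes to the
    executor with the highest load per allocated core.
    """
    if total_cores < len(loads):
        raise ValueError("need at least one core per executor")
    cores = [1] * len(loads)
    for _ in range(total_cores - len(loads)):
        busiest = max(range(len(loads)), key=lambda i: loads[i] / cores[i])
        cores[busiest] += 1
    return cores


def balance_within_executor(group_loads: Dict[int, float],
                            num_cores: int) -> Dict[int, int]:
    """Hypothetical intra-executor load balancing.

    Assigns key groups to cores greedily, heaviest group first, always
    picking the currently least-loaded core. Only this executor's local
    assignment changes; other executors are not involved.
    """
    core_load = [0.0] * num_cores
    assignment: Dict[int, int] = {}
    by_load = sorted(group_loads.items(), key=lambda kv: kv[1], reverse=True)
    for group, load in by_load:
        target = min(range(num_cores), key=lambda c: core_load[c])
        assignment[group] = target
        core_load[target] += load
    return assignment


if __name__ == "__main__":
    # Three executors with skewed (made-up) loads share ten cores.
    print(allocate_cores([10.0, 30.0, 60.0], total_cores=10))
    # One executor spreads four key groups over two cores.
    print(balance_within_executor({0: 5.0, 1: 3.0, 2: 2.0, 3: 2.0}, 2))
```

The point of the sketch is the separation of concerns: `executor_for` stays fixed, while a workload shift is absorbed by changing how many cores an executor holds and how its own key groups are spread over them.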
