Phoebe: QoS-Aware Distributed Stream Processing through Anticipating Dynamic Workloads

06/20/2022
by   Morgan K. Geldenhuys, et al.
0

Distributed Stream Processing systems have become an essential part of big data processing platforms. They are characterized by the high-throughput processing of near to real-time event streams with the goal of delivering low-latency results and thus enabling time-sensitive decision making. At the same time, results are expected to be consistent even in the presence of partial failures where exactly-once processing guarantees are required for correctness. Stream processing workloads are oftentimes dynamic in nature which makes static configurations highly inefficient as time goes by. Static resource allocations will almost certainly either negatively impact upon the Quality of Service and/or result in higher operational costs. In this paper we present Phoebe, a proactive approach to system auto-tuning for Distributed Stream Processing jobs executing on dynamic workloads. Our approach makes use of parallel profiling runs, QoS modeling, and runtime optimization to provide a general solution whereby configuration parameters are automatically tuned to ensure a stable service as well as alignment with recovery time Quality of Service targets. Phoebe makes use of Time Series Forecasting to gain an insight into future workload requirements thereby delivering scaling decisions which are accurate, long-lived, and reliable. Our experiments demonstrate that Phoebe is able to deliver a stable service while at the same time reducing resource over-provisioning.

READ FULL TEXT

page 1

page 8

research
09/06/2021

Khaos: Dynamically Optimizing Checkpointing for Dependable Distributed Stream Processing

Distributed Stream Processing systems are becoming an increasingly essen...
research
08/10/2021

Evaluation of Load Prediction Techniques for Distributed Stream Processing

Distributed Stream Processing (DSP) systems enable processing large stre...
research
09/14/2018

Auto-tuning Distributed Stream Processing Systems using Reinforcement Learning

Fine tuning distributed systems is considered to be a craftsmanship, rel...
research
02/02/2021

QoS-Aware Power Minimization of Distributed Many-Core Servers using Transfer Q-Learning

Web servers scaled across distributed systems necessitate complex runtim...
research
02/11/2021

Chiron: Optimizing Fault Tolerance in QoS-aware Distributed Stream Processing Jobs

Fault tolerance is a property which needs deeper consideration when deal...
research
02/20/2023

Profiling and Optimizing Java Streams

The Stream API was added in Java 8 to allow the declarative expression o...
research
01/31/2018

Henge: Intent-driven Multi-Tenant Stream Processing

We present Henge, a system to support intent-based multi-tenancy in mode...

Please sign up or login with your details

Forgot password? Click here to reset