Zero-Shot Cost Models for Distributed Stream Processing

07/08/2022
by Roman Heinrich, et al.

This paper proposes a learned cost estimation model for Distributed Stream Processing Systems (DSPS) with the aim of providing accurate cost predictions for executing queries. A major premise of this work is that the proposed learned model can generalize to the dynamics of streaming workloads out-of-the-box. This means that a model, once trained, can accurately predict performance metrics such as latency and throughput even if the characteristics of the data and workload or the deployment of operators to hardware change at runtime. That way, the model can be used to solve tasks such as optimizing the placement of operators to minimize the end-to-end latency of a streaming query or maximize its throughput, even under varying conditions. Our evaluation on a well-known DSPS, Apache Storm, shows that the model can predict accurately for unseen workloads and queries while generalizing across real-world benchmarks.
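
To make the operator-placement use case concrete, the following is a minimal sketch of how a latency predictor could drive a placement search. It is not the method from the paper: `best_placement`, `toy_latency`, the host names, and the delay constants are all hypothetical, and the toy cost function merely stands in for a trained cost model. The search is a plain exhaustive enumeration, which is only feasible for very small queries.

```python
from itertools import product
from typing import Callable, Dict, Sequence, Tuple

Placement = Dict[str, str]  # operator name -> host name


def best_placement(
    operators: Sequence[str],
    hosts: Sequence[str],
    predict_latency: Callable[[Placement], float],
) -> Tuple[Placement, float]:
    """Score every operator-to-host assignment and keep the cheapest one."""
    best, best_cost = None, float("inf")
    for assignment in product(hosts, repeat=len(operators)):
        placement = dict(zip(operators, assignment))
        cost = predict_latency(placement)  # a learned model would be called here
        if cost < best_cost:
            best, best_cost = placement, cost
    if best is None:
        raise ValueError("no candidate placement: at least one host is required")
    return best, best_cost


# Hypothetical stand-in for a trained cost model (assumed numbers, not from the paper):
# each operator pays a per-host processing delay, and every pair of consecutive
# operators placed on different hosts pays a fixed network-hop penalty.
HOST_DELAY_MS = {"edge": 5.0, "cloud": 2.0}
HOP_PENALTY_MS = 10.0


def toy_latency(placement: Placement) -> float:
    ops = list(placement)  # insertion order is treated as the linear query order
    compute = sum(HOST_DELAY_MS[placement[op]] for op in ops)
    hops = sum(placement[a] != placement[b] for a, b in zip(ops, ops[1:]))
    return compute + HOP_PENALTY_MS * hops


placement, latency = best_placement(
    ["source", "filter", "sink"], ["edge", "cloud"], toy_latency
)
print(placement, latency)
```

Because the predictor is passed in as a plain callable, the same search could be pointed at a throughput estimate instead (negating the score to maximize it), or re-run whenever the workload or hardware changes.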


