DynamiQ: Planning for Dynamics in Network Streaming Analytics Systems

by Rohan Bhatia, et al.

The emergence of programmable data-plane targets has motivated a new hybrid design for network streaming analytics systems that combine these targets' fast packet processing speeds with the rich compute resources available at modern stream processors. However, these systems require careful query planning; that is, specifying the minute details of executing a given set of queries in a way that makes the best use of the limited resources and programmability offered by data-plane targets. We use such an existing system, Sonata, and real-world packet traces to understand how executing a fixed query workload is affected by the unknown dynamics of the traffic that defines the target's input workload. We observe that static query planning, as employed by Sonata, cannot handle even small changes in the input workload, wasting data-plane resources to the point where query execution is confined mainly to userspace. This paper presents the design and implementation of DynamiQ, a new network streaming analytics platform that employs dynamic query planning to deal with the dynamics of real-world input workloads. Specifically, we develop a suite of practical algorithms for (i) computing effective initial query plans (to start query execution) and (ii) enabling efficient updating of portions of such an initial query plan at runtime (to adapt to changes in the input workload). Using real-world packet traces as input workload, we show that compared to Sonata, DynamiQ reduces the stream processor's workload by two orders of magnitude.


Zero-Shot Cost Models for Distributed Stream Processing

This paper proposes a learned cost estimation model for Distributed Stre...

SWARM: Adaptive Load Balancing in Distributed Streaming Systems for Big Spatial Data

The proliferation of GPS-enabled devices has led to the development of n...

TiLT: A Time-Centric Approach for Stream Query Optimization and Parallelization

Stream processing engines (SPEs) are widely used for large scale streami...

Quegel: A General-Purpose Query-Centric Framework for Querying Big Graphs

Pioneered by Google's Pregel, many distributed systems have been develop...

Move Fast and Meet Deadlines: Fine-grained Real-time Stream Processing with Cameo

Resource provisioning in multi-tenant stream processing systems faces th...

Trevor: Automatic configuration and scaling of stream processing pipelines

Operating a distributed data stream processing workload efficiently at s...

Elasticutor: Rapid Elasticity for Realtime Stateful Stream Processing

Elasticity is highly desirable for stream processing systems to guarante...
