TiLT: A Time-Centric Approach for Stream Query Optimization and Parallelization

01/27/2023
by   Anand Jayarajan, et al.

Stream processing engines (SPEs) are widely used for large-scale streaming analytics over unbounded time-ordered data streams. Modern streaming analytics applications exhibit diverse compute characteristics and impose strict latency and throughput requirements. Over the years, significant attention has gone into building hardware-efficient SPEs that support several query optimization, parallelization, and execution strategies to meet the performance requirements of large-scale streaming analytics applications. However, in this work, we observe that these strategies often fail to generalize well to many real-world streaming analytics applications due to several inherent design limitations of current SPEs. We further argue that these limitations stem from the fundamental design choices and the query representation model followed in modern SPEs. To address these challenges, we first propose TiLT, a novel intermediate representation (IR) that offers a highly expressive temporal query language amenable to effective query optimization and parallelization strategies. We subsequently build a compiler backend for TiLT that applies such optimizations to streaming queries and generates hardware-efficient code to achieve high performance on multi-core stream query execution. We demonstrate that TiLT achieves up to 326x (20.49x on average) higher throughput compared to state-of-the-art SPEs (e.g., Trill) across eight real-world streaming analytics applications. TiLT source code is available at https://github.com/ampersand-projects/tilt.git.

