Railgun: managing large streaming windows under MAD requirements

06/23/2021
by Ana Sofia Gomes, et al.

Some mission-critical systems, e.g., fraud detection, require accurate, real-time metrics over long sliding windows in applications that demand high throughput and low latency. As these applications need to run 'forever' and cope with large, spiky data loads, they must also run in a distributed setting. We are unaware of any streaming system that provides all of these properties. Instead, existing systems make large simplifications, such as implementing sliding windows as a fixed set of overlapping windows, jeopardizing metric accuracy (violating regulatory rules) or latency (breaching service agreements). In this paper, we propose Railgun, a fault-tolerant, elastic, and distributed streaming system supporting real-time sliding windows for scenarios requiring high loads and millisecond-level latencies. We benchmarked an initial prototype of Railgun using real data, showing significantly lower latency than Flink and low memory usage independent of window size. Further, we show that Railgun scales nearly linearly, respecting our millisecond-level latencies at high percentiles (<250 ms at the 99.9th percentile) at 1 million events per second.
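
The accuracy gap between a real sliding window and a fixed set of overlapping (hopping) windows can be illustrated with a small Python sketch. This is not Railgun's implementation, and all names here (`SlidingSum`, `hopping_sum`, the 60-unit window, the 10-unit hop) are hypothetical choices for illustration. The sliding version keeps every event in memory, which is exactly the cost a production system would need to avoid; the sketch only shows why the two approaches can report different metric values at the same instant.

```python
from collections import deque


class SlidingSum:
    """Exact sum over the interval (ts - window, ts], updated on every event."""

    def __init__(self, window):
        self.window = window
        self.events = deque()  # (timestamp, value) pairs, oldest first
        self.total = 0

    def add(self, ts, value):
        self.events.append((ts, value))
        self.total += value
        # Expire events that are no longer inside (ts - window, ts].
        while self.events and self.events[0][0] <= ts - self.window:
            _, old = self.events.popleft()
            self.total -= old
        return self.total


def hopping_sum(events, window, hop, now):
    """Approximation: sum over the latest hop-aligned window closed by `now`."""
    end = (now // hop) * hop  # last hop boundary at or before `now`
    start = end - window
    return sum(v for ts, v in events if start < ts <= end)


# Three transactions of 100 each, all within 60 time units of one another.
events = [(5, 100), (58, 100), (64, 100)]

sliding = SlidingSum(window=60)
for ts, value in events:
    exact = sliding.add(ts, value)

approx = hopping_sum(events, window=60, hop=10, now=64)
print(exact, approx)
```

At time 64 the sliding sum covers all three transactions, while the hop-aligned window ends at time 60 and misses the latest one. A rule such as "flag when the 60-unit total exceeds 250" would fire in the first case and not in the second.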

research
09/01/2020

Railgun: streaming windows for mission critical systems

Some mission critical systems, such as fraud detection, require accurate...
research
06/01/2019

Approximate Quantiles for Datacenter Telemetry Monitoring

Datacenter systems require efficient troubleshooting and effective resou...
research
09/15/2016

Virtualizing System and Ordinary Services in Windows-based OS-Level Virtual Machines

OS-level virtualization incurs smaller start-up and run-time overhead th...
research
10/05/2018

Memento: Making Sliding Windows Efficient for Heavy Hitters

Cloud operators require real-time identification of Heavy Hitters (HH) a...
research
07/17/2018

Optimization of the n-dimensional sliding window inter-channel correlation algorithm for multi-core architecture

Calculating the correlation in a sliding window is a common method of st...
research
12/08/2019

A study on Modern Messaging Systems- Kafka, RabbitMQ and NATS Streaming

Distributed messaging systems form the core of big data streaming, cloud...
research
12/19/2018

Fast Botnet Detection From Streaming Logs Using Online Lanczos Method

Botnet, a group of coordinated bots, is becoming the main platform of ma...
