Approximate Quantiles for Datacenter Telemetry Monitoring

06/01/2019
by   Gangmuk Lim, et al.

Datacenter systems require efficient troubleshooting and effective resource scheduling to minimize downtime and make efficient use of limited resources. To this end, datacenter operators employ streaming analytics to collect and process datacenter telemetry over a temporal window. The quantile operator is key in these systems because it can summarize both the typical and the abnormal behavior of the monitored system. Computing quantiles in real time is resource-intensive, as it requires processing hundreds of millions of events in seconds while providing high quantile accuracy. We overcome these challenges through workload-driven approximation. Our study uncovers three insights: (i) values are dominated by a set of recurring small values, (ii) the distribution of small values is consistent across different time scales, and (iii) tail values are dominated by a small set of large values. We propose QLOVE, an efficient and accurate quantile approximation algorithm that capitalizes on these insights. QLOVE minimizes the memory footprint of the quantile operator via compression and frequency-based summarization of small values. Although these summaries are stored and processed at sub-window granularity for memory efficiency, they can be extended to compute quantiles over user-defined temporal windows. Low value error for tail quantiles is achieved by retaining a few tail values per sub-window. QLOVE estimates quantiles with high throughput and less than 5% relative value error, whereas state-of-the-art algorithms either have a high relative value error (13-35%) or deliver lower throughput (15-92%).
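The ideas named in the abstract (frequency-based summaries per sub-window, plus a few retained tail values) can be illustrated with a minimal Python sketch. This is not the QLOVE implementation. The function names `compress`, `summarize`, and `window_quantile`, the digit-truncation compression scheme, and the parameters `digits` and `tail_k` are all assumptions made for illustration, and values are assumed to be non-negative integers such as latencies in microseconds.

```python
import math
from collections import Counter


def compress(value, digits=2):
    """Keep only the `digits` most significant decimal digits of a
    non-negative int (an assumed compression scheme, for illustration)."""
    width = len(str(value))
    if width <= digits:
        return value
    scale = 10 ** (width - digits)
    return (value // scale) * scale


def summarize(values, digits=2, tail_k=2):
    """Summarize one sub-window as (frequency table, exact top-k values)."""
    counts = Counter(compress(v, digits) for v in values)
    tail = sorted(values, reverse=True)[:tail_k]
    return counts, tail


def window_quantile(summaries, q, tail_k=2):
    """Estimate the q-quantile (0 < q <= 1) over all summarized sub-windows."""
    merged = Counter()
    tails = []
    for counts, tail in summaries:
        merged.update(counts)   # frequency tables merge by adding counts
        tails.extend(tail)
    total = sum(merged.values())
    if total == 0:
        raise ValueError("no data in window")
    rank = max(1, math.ceil(q * total))  # 1-based rank from the bottom
    above = total - rank                 # number of values ranked higher
    tails.sort(reverse=True)
    # The global top-k values are all contained in the per-sub-window
    # top-k lists, so ranks inside the top k can be answered exactly.
    if above < min(tail_k, len(tails)):
        return tails[above]
    # Otherwise walk the merged frequency table in ascending order.
    running = 0
    for value in sorted(merged):
        running += merged[value]
        if running >= rank:
            return value


# Two hypothetical sub-windows of latency samples.
sub_windows = [[10, 10, 12, 11, 10, 5000], [10, 11, 11, 10, 12, 9000]]
summaries = [summarize(w) for w in sub_windows]
print(window_quantile(summaries, 0.5))  # 11
print(window_quantile(summaries, 0.9))  # 5000
```

In this sketch the memory per sub-window depends on the number of distinct compressed values rather than on the number of events, which is where recurring small values help. Retaining `tail_k` exact values per sub-window keeps the highest quantiles free of compression error.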

research
06/23/2021

Railgun: managing large streaming windows under MAD requirements

Some mission critical systems, e.g., fraud detection, require accurate, ...
research
03/07/2020

Aion: Better Late than Never in Event-Time Streams

Processing data streams in near real-time is an increasingly important t...
research
03/16/2018

Quantile correlation coefficient: a new tail dependence measure

We propose a new measure related with tail dependence in terms of correl...
research
11/26/2020

TailCoR

Economic and financial crises are characterised by unusually large event...
research
01/17/2019

A Multi-Level Simulation Optimization Approach for Quantile Functions

Quantile is a popular performance measure for a stochastic system to eva...
research
03/06/2018

Moment-Based Quantile Sketches for Efficient High Cardinality Aggregation Queries

Interactive analytics increasingly involves querying for quantiles over ...
research
08/07/2023

Dirigo: Self-scaling Stateful Actors For Serverless Real-time Data Processing

We propose Dirigo, a distributed stream processing service built atop vi...
