Automated Cause Analysis of Latency Outliers Using System-Level Dependency Graphs

07/13/2022
by Sneh Patel, et al.

Detecting performance issues and identifying their root causes at runtime is a challenging task. Developers typically rely on logging and tracing to identify bottlenecks, but these approaches are time-consuming and require manual effort. In this paper, we propose a method that automates the detection of latency outliers using system-level traces and then compares them to identify the root cause(s). Our method uses dependency graphs to show the internal interactions between threads and system resources, which makes it possible to pinpoint where performance issues occur. However, a single trace can contain a large number of requests, each generating one graph. To automate the identification of outliers within the dataset, we use density-based machine learning models and statistical calculations such as the z-score. Our evaluation shows an accuracy greater than 97% on outlier detection, making these techniques appropriate for in-production servers and industry-level use cases.
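
As a rough illustration of the statistical side of the approach, the sketch below flags latency outliers with a z-score test. It is not the authors' implementation: the function name `zscore_outliers`, the threshold of 3.0, and the sample latencies are all assumptions made for this example, and the abstract does not say which features or thresholds the paper uses.

```python
from statistics import mean, pstdev


def zscore_outliers(latencies_ms, threshold=3.0):
    """Return the indices of requests whose latency z-score exceeds `threshold`.

    Hypothetical helper: the 3.0 default is a common rule of thumb,
    not a value taken from the paper.
    """
    mu = mean(latencies_ms)
    sigma = pstdev(latencies_ms)  # population standard deviation
    if sigma == 0:
        # All requests took the same time, so nothing stands out.
        return []
    return [
        i
        for i, latency in enumerate(latencies_ms)
        if abs(latency - mu) / sigma > threshold
    ]


# Made-up request latencies in milliseconds: twenty ordinary requests
# followed by one slow request.
base = [12.0, 11.5, 12.3, 11.9, 12.1, 12.4, 11.8, 12.2, 12.0, 11.7]
latencies = base * 2 + [95.0]

print(zscore_outliers(latencies))  # index of the slow request
```

With a population standard deviation, a single extreme value among n samples can reach a z-score of at most sqrt(n - 1), so a threshold of 3.0 can only fire once there are more than ten requests. A density-based model is one way to handle small or multi-modal datasets where a single global mean is a poor baseline.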

research
03/08/2021

DepGraph: Localizing Performance Bottlenecks in Multi-Core Applications Using Waiting Dependency Graphs and Software Tracing

This paper addresses the challenge of understanding the waiting dependen...
research
03/08/2021

Automatic Cause Detection of Performance Problems in Web Applications

The execution of similar units can be compared by their internal behavio...
research
02/08/2021

Feature Engineering for Scalable Application-Level Post-Silicon Debugging

We present systematic and efficient solutions for both observability enh...
research
02/11/2022

The Benefit of Hindsight: Tracing Edge-Cases in Distributed Systems

Today's distributed tracing frameworks only trace a small fraction of al...
research
10/21/2021

DeLag: Detecting Latency Degradation Patterns in Service-based Systems

Performance debugging in production is a fundamental activity in modern ...
research
12/05/2019

Causal structure based root cause analysis of outliers

We describe a formal approach to identify 'root causes' of outliers obse...
research
06/17/2019

Slicing the IO execution with ReLayTracer

Analyzing IO performance anomalies is a crucial task in various computin...
