Spectrum-Based Log Diagnosis

08/16/2020
by Carl Martin Rosenberg, et al.

We present and evaluate Spectrum-Based Log Diagnosis (SBLD), a method to help developers quickly diagnose problems found in complex integration and deployment runs. Inspired by Spectrum-Based Fault Localization, SBLD leverages the differences in event occurrences between logs for failing and passing runs to highlight events that are more strongly associated with failing runs. Using data provided by our industrial partner, we empirically investigate the following questions: (i) How well does SBLD reduce the effort needed to identify all failure-relevant events in the log for a failing run? (ii) How is the performance of SBLD affected by available data? (iii) How does SBLD compare to searching for simple textual patterns that often occur in failure-relevant events? We answer (i) and (ii) using summary statistics and heatmap visualizations, and for (iii) we compare three configurations of SBLD (with minimum, median, and maximum data, respectively) against a textual search using Wilcoxon signed-rank tests and the Vargha-Delaney measure of stochastic superiority. Our evaluation shows that (i) SBLD achieves a significant effort reduction for the dataset used, and (ii) SBLD benefits from additional logs for passing runs in general, and from additional logs for failing runs when there is a proportional amount of logs for passing runs in the data. Finally, (iii) SBLD and textual search are roughly equally effective at effort reduction, while textual search has slightly better recall. We investigate the cause of this difference and discuss how it stems from the characteristics of a specific part of our data. We conclude that SBLD shows promise as a method for diagnosing failing runs and that its performance is positively affected by additional data, but that it does not outperform textual search on the dataset considered. Future work includes investigating SBLD's generalizability on additional datasets.
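To make the core idea concrete, the following is a minimal sketch of ranking log events by how strongly they are associated with failing runs. The abstract does not state the scoring formula SBLD uses, so this sketch substitutes the Tarantula suspiciousness score from Spectrum-Based Fault Localization as a stand-in. The function name `rank_events` and the representation of each log as a list of event strings are assumptions made for illustration only.

```python
from collections import Counter


def rank_events(failing_logs, passing_logs):
    """Rank events seen in failing logs by a Tarantula-style score.

    Hypothetical helper: each log is a list of event strings. The score
    is fail_rate / (fail_rate + pass_rate), where each rate is the
    fraction of logs of that kind containing the event. This is a
    stand-in, not necessarily the scoring used by SBLD.
    """
    n_fail, n_pass = len(failing_logs), len(passing_logs)
    # Count each event at most once per log (occurrence, not frequency).
    fail_counts = Counter(e for log in failing_logs for e in set(log))
    pass_counts = Counter(e for log in passing_logs for e in set(log))

    scores = {}
    for event, in_failing in fail_counts.items():
        fail_rate = in_failing / n_fail
        pass_rate = pass_counts[event] / n_pass if n_pass else 0.0
        scores[event] = fail_rate / (fail_rate + pass_rate)

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


failing = [["start", "timeout", "retry"], ["start", "timeout"]]
passing = [["start", "retry"], ["start"]]
ranking = rank_events(failing, passing)
print(ranking)
```

In this toy input, `timeout` appears only in failing logs and is ranked first, while events that appear equally often in both kinds of logs score 0.5. A developer reading the failing log would then inspect the highest-ranked events first, which is the kind of effort reduction the evaluation measures.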


