TraceSim: A Method for Calculating Stack Trace Similarity

09/26/2020
by Roman Vasiliev, et al.

Many contemporary software products include subsystems for automatic crash reporting. However, it is well known that the same bug can produce slightly different reports. To manage this problem, reports are usually grouped, often manually by developers. Manual triaging becomes infeasible for products with large user bases, which has motivated many approaches to automating the task. Because the volume of reports to be processed is large, even a relatively small improvement in triaging quality can play a significant role in the overall accuracy of report bucketing. The majority of existing studies use some kind of stack trace similarity metric, based either on information retrieval techniques or on string matching methods. However, the quality of triaging is still insufficient. In this paper, we describe TraceSim, a novel approach to this problem that combines TF-IDF, Levenshtein distance, and machine learning to construct a similarity metric. Our metric has been implemented inside an industrial-grade report triaging system. An evaluation on a manually labeled dataset shows significantly better results compared to baseline approaches.
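To make the idea of combining TF-IDF with Levenshtein distance concrete, the following is a minimal sketch. It is not the authors' formulation. It assumes a stack trace is a list of frame names, weights each frame by a smoothed inverse document frequency, and uses those weights as edit costs. The function names (`idf_weights`, `weighted_levenshtein`, `similarity`), the smoothing term, and the sample traces are hypothetical, and the machine-learned part of the method mentioned in the abstract is omitted.

```python
import math
from collections import Counter


def idf_weights(traces):
    """Smoothed inverse document frequency of each frame over a corpus of traces."""
    n = len(traces)
    doc_freq = Counter(frame for trace in traces for frame in set(trace))
    # Assumption: log(n / df) + 1, so frames present in every trace still weigh 1.0.
    return {frame: math.log(n / df) + 1.0 for frame, df in doc_freq.items()}


def weighted_levenshtein(a, b, weights):
    """Edit distance over frames; inserting or deleting a frame costs its weight."""
    def w(frame):
        return weights.get(frame, 1.0)  # unseen frames get a default weight

    # prev[j] is the cost of turning an empty prefix of `a` into b[:j].
    prev = [0.0]
    for fb in b:
        prev.append(prev[-1] + w(fb))

    for fa in a:
        cur = [prev[0] + w(fa)]
        for j, fb in enumerate(b, start=1):
            substitute = 0.0 if fa == fb else w(fa) + w(fb)
            cur.append(min(
                prev[j] + w(fa),           # delete fa
                cur[j - 1] + w(fb),        # insert fb
                prev[j - 1] + substitute,  # match or substitute
            ))
        prev = cur
    return prev[-1]


def similarity(a, b, weights):
    """Map the weighted distance to [0, 1]; 1.0 means identical traces."""
    total = sum(weights.get(f, 1.0) for f in a) + sum(weights.get(f, 1.0) for f in b)
    if total == 0:
        return 1.0
    return 1.0 - weighted_levenshtein(a, b, weights) / total


# Hypothetical traces, outermost frame first.
traces = [
    ["main", "run", "parse", "readToken"],
    ["main", "run", "parse", "expandMacro"],
    ["main", "run", "render", "drawText"],
]
weights = idf_weights(traces)
sim_close = similarity(traces[0], traces[1], weights)
sim_far = similarity(traces[0], traces[2], weights)
```

In this sketch, frames that appear in most traces (such as `main`) contribute little to the distance, while rare frames dominate it, so `sim_close` comes out higher than `sim_far`. A bucketing system could then group reports whose similarity exceeds a threshold. That threshold, like the weighting scheme here, is a design choice that would need tuning on labeled data.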


