Not All Bugs Are the Same: Understanding, Characterizing, and Classifying the Root Cause of Bugs

by Gemma Catolino, et al.

Modern version control systems such as Git or SVN include bug tracking mechanisms through which developers can signal the presence of bugs via bug reports, i.e., textual descriptions of the problem and of the steps that led to a failure. Over the years, the research community has extensively investigated methods for easing bug triage, that is, the process of assigning the fixing of a reported bug to the most qualified developer. Nevertheless, only a few studies have addressed how to support developers in understanding the type of a reported bug, which is the first and most time-consuming step to perform before assigning a bug-fix operation. In this paper, we target this problem in two ways: first, we analyze 1,280 bug reports from 119 popular projects belonging to three ecosystems, namely Mozilla, Apache, and Eclipse, with the aim of building a taxonomy of the root causes of reported bugs; then, we devise and evaluate an automated classification model able to classify reported bugs according to the defined taxonomy. As a result, we found nine main common root causes of bugs across the considered systems. Moreover, our model achieves high F-Measure and AUC-ROC (64 overall, respectively).
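To make the idea of classifying bug reports by root cause more concrete, the following is a minimal sketch, not the model evaluated in the paper. The abstract does not name the learning algorithm, the features, or the nine categories, so everything here is an assumption: a multinomial Naive Bayes classifier over bag-of-words tokens stands in for the classifier, the three labels (`gui`, `network`, `configuration`) and the training sentences are hypothetical, and `f_measure` simply computes the standard F1 score for one class.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the report text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesBugClassifier:
    """Multinomial Naive Bayes with Laplace smoothing (illustrative stand-in)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token frequencies
        self.label_counts = Counter()            # label -> number of reports
        self.vocabulary = set()

    def fit(self, reports, labels):
        for text, label in zip(reports, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.label_counts[label] += 1
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_reports = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus smoothed log likelihood of every token.
            score = math.log(count / total_reports)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                seen = self.word_counts[label][token]
                score += math.log((seen + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def f_measure(true_labels, predicted_labels, positive):
    """F1 score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical training data; the labels are NOT the paper's taxonomy.
TRAIN = [
    ("button overlaps the toolbar when the window is resized", "gui"),
    ("dialog text is truncated on small screens", "gui"),
    ("connection timeout when the server is unreachable", "network"),
    ("socket closed unexpectedly during download", "network"),
    ("build fails because of a wrong path in the config file", "configuration"),
    ("missing property in settings file breaks startup", "configuration"),
]

classifier = NaiveBayesBugClassifier().fit(
    [text for text, _ in TRAIN], [label for _, label in TRAIN]
)

if __name__ == "__main__":
    print(classifier.predict("download fails with connection timeout"))
```

A realistic setup would train on a labeled corpus such as the 1,280 manually analyzed reports and would report per-category F-Measure and AUC-ROC on held-out data; the toy data above only illustrates the mechanics.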




