SZZ in the time of Pull Requests

09/07/2022
by Fernando Petrulio, et al.

In the multi-commit development model, programmers complete tasks (e.g., implementing a feature) by organizing their work in several commits and packaging them into a commit-set. Analyzing data from developers who use this model can help address challenging developer needs, such as knowing which features introduce a bug and assessing the risk of integrating certain features in a release. To do so, however, one first needs to identify fix-inducing commit-sets. The SZZ algorithm is the most natural candidate for this identification, but its performance has not yet been evaluated in the multi-commit context. In this study, we conduct an in-depth investigation of the reliability and performance of SZZ in the multi-commit model. To obtain a reliable ground truth, we take an existing SZZ dataset and adapt it to the multi-commit context. Moreover, we devise a second, more extensive dataset created directly by developers and Quality Assurance (QA) engineers at Mozilla. Based on these datasets, we (1) test the performance of B-SZZ and its non-language-specific SZZ variations in the context of the multi-commit model, (2) investigate the reasons behind their specific behavior, and (3) analyze the impact of non-relevant commits in a commit-set and automatically detect them before using SZZ.
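To make the notion of a fix-inducing commit-set concrete, the following is a minimal sketch of how commit-level SZZ output could be lifted to the commit-set level. It is not the procedure used in the study: the function name `inducing_commit_sets`, the `(fix_commit, inducing_commit)` pair format, the `commit_to_set` mapping, and the choice to drop links within the same commit-set are all assumptions made for illustration.

```python
from collections import defaultdict


def inducing_commit_sets(szz_pairs, commit_to_set):
    """Lift commit-level SZZ links to commit-set level (illustrative only).

    szz_pairs: iterable of (fix_commit, inducing_commit) hashes, in the
        form a commit-level SZZ run might report them (assumed format).
    commit_to_set: dict mapping a commit hash to the id of the commit-set
        (e.g., a pull request) that contains it (assumed input).
    Returns a dict: fix commit-set id -> set of inducing commit-set ids.
    """
    result = defaultdict(set)
    for fix_commit, inducing_commit in szz_pairs:
        fix_set = commit_to_set.get(fix_commit)
        inducing_set = commit_to_set.get(inducing_commit)
        # Assumption: skip commits that belong to no known commit-set, and
        # skip links where fix and blamed commit share one commit-set.
        if fix_set is None or inducing_set is None or fix_set == inducing_set:
            continue
        result[fix_set].add(inducing_set)
    return dict(result)


# Hypothetical data: two blamed commits from the same commit-set collapse
# into a single fix-inducing commit-set.
pairs = [("f1", "a1"), ("f1", "a2"), ("f2", "b1")]
mapping = {"f1": "PR-10", "a1": "PR-3", "a2": "PR-3", "f2": "PR-11", "b1": "PR-4"}
links = inducing_commit_sets(pairs, mapping)
```

Grouping by commit-set means that several blamed commits inside one commit-set count as a single fix-inducing unit, which is the granularity the abstract is concerned with.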


