PR-SZZ: How pull requests can support the tracing of defects in software repositories

06/20/2022
by Peter Bludau, et al.

The SZZ algorithm is a standard way to identify bug-fixing commits and their bug-inducing counterparts. It forms the basis for data sets used in numerous empirical studies. Since its creation, multiple extensions have been proposed to enhance its performance. For historical reasons, related work relies on commit messages to map bug tickets to possibly related code, and uses no additional data to trace inducing commits from these fixes. Therefore, we present an updated version of SZZ that utilizes pull requests, which are widely adopted today. We evaluate our approach against existing SZZ variants by conducting experiments and analyzing the usage of pull requests, inner commits, and merge strategies. We base our results on 6 open-source projects with more than 50k commits and 35k pull requests. With respect to bug fixing commits, on average 18 commit, resulting in an overall F-score of 0.75, an improvement of 40 percentage points. By selecting an inducing commit, we manage to reduce the false positives and increase precision by an average of 16 percentage points in comparison to existing approaches.
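
The difference between the two linking strategies mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Commit` and `PullRequest` types, the `#<number>` reference pattern, and both function names are assumptions made purely for illustration. The idea shown is that a commit whose message never mentions a bug ticket can still be recovered as a fix if the pull request containing it references that ticket.

```python
import re
from dataclasses import dataclass, field

# Assumption: bug tickets are referenced as "#<number>", as on many forges.
ISSUE_REF = re.compile(r"#(\d+)")


@dataclass
class Commit:
    sha: str
    message: str


@dataclass
class PullRequest:
    number: int
    text: str  # title and description combined
    commit_shas: list = field(default_factory=list)  # inner commits of the PR


def referenced_ids(text):
    """Return all ticket numbers referenced in a piece of text."""
    return {int(match) for match in ISSUE_REF.findall(text)}


def fixes_from_messages(commits, bug_ids):
    """Message-based linking: a commit is a fix if its message names a bug."""
    return {c.sha for c in commits if referenced_ids(c.message) & bug_ids}


def fixes_from_pull_requests(pull_requests, bug_ids):
    """PR-based linking: every inner commit of a PR that names a bug is a fix."""
    linked = set()
    for pr in pull_requests:
        if referenced_ids(pr.text) & bug_ids:
            linked.update(pr.commit_shas)
    return linked


# Hypothetical example data.
commits = [
    Commit("a1", "Fix NPE in parser, closes #12"),
    Commit("b2", "Handle empty input"),  # no ticket reference in the message
]
pull_requests = [
    PullRequest(7, "Fixes #34: crash on empty input", ["b2"]),
]
bug_ids = {12, 34}

by_message = fixes_from_messages(commits, bug_ids)
by_pull_request = fixes_from_pull_requests(pull_requests, bug_ids)
all_fixes = by_message | by_pull_request
```

In this toy data, commit `b2` is found only through its pull request, which is the kind of link a purely message-based mapping misses. A real pipeline would also have to account for merge strategies (for example, squashed commits), which this sketch ignores.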

research
04/06/2018

Traceability in the Wild: Automatically Augmenting Incomplete Trace Links

Software and systems traceability is widely accepted as an essential ele...
research
02/28/2022

ApacheJIT: A Large Dataset for Just-In-Time Defect Prediction

In this paper, we present ApacheJIT, a large dataset for Just-In-Time de...
research
09/24/2021

Broccoli: Bug localization with the help of text search engines

Bug localization is a tedious activity in the bug fixing process in whic...
research
02/05/2021

Evaluating SZZ Implementations Through a Developer-informed Oracle

The SZZ algorithm for identifying bug-inducing changes has been widely u...
research
04/20/2023

Finding Bug-Inducing Program Environments

Some bugs cannot be exposed by program inputs, but only by certain progr...
research
11/20/2019

Issues with SZZ: An empirical assessment of the state of practice of defect prediction data collection

Defect prediction research has a strong reliance on published data sets ...
research
09/07/2022

SZZ in the time of Pull Requests

In the multi-commit development model, programmers complete tasks (e.g.,...
