Do Code Review Measures Explain the Incidence of Post-Release Defects?

by Andrey Krutauz, et al.

Aim: In contrast to studies of defects found during code review, we aim to clarify whether code review measures can explain the prevalence of post-release defects. Method: We replicate a study by McIntosh et al. that uses additive regression to model the relationship between defects and code reviews. To increase external validity, we apply the same methodology to a new software project. We discuss our findings with the first author of the original study, McIntosh. We then investigate how to reduce the impact of correlated predictors in the variable selection process and how to increase understanding of the inter-relationships among the predictors by employing Bayesian Network (BN) models. Context: As in the original study, we use the measures that its authors obtained for the Qt project. We mine data from the version control system and issue tracker of Google Chrome and operationalize measures that are close analogs to the large collection of code, process, and code review measures used in the replicated study. Results: Both the data from the original study and the Chrome data showed high instability in the influence of code review measures on defects, with the results being highly sensitive to the variable selection procedure. Models without code review predictors had as good or better fit than those with review predictors. The replication, however, agrees with the bulk of prior work showing that prior defects, module size, and authorship have the strongest relationship to post-release defects. The application of BN models helped explain the observed instability by demonstrating that the review-related predictors do not affect post-release defects directly but instead show indirect effects. For example, changes that have no review discussion tend to be associated with files that have had many prior defects, which in turn increase the number of post-release defects.
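
The indirect-effect pattern described in the Results can be illustrated with a toy simulation. This is a minimal sketch, not the study's method: it swaps in ordinary least squares for the additive regression and BN models, and every variable name, coefficient, and data point is a synthetic assumption. Data are generated so that a "no review discussion" indicator influences post-release defects only through prior defects. The review predictor then looks influential when used alone and fades to roughly zero once prior defects enter the model, which is one way a predictor's apparent effect can depend on variable selection.

```python
import random


def fit_ols(predictors, y):
    """Ordinary least squares via the normal equations.

    Returns [intercept, coef_1, ..., coef_k] for the given predictor columns.
    """
    n = len(y)
    rows = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(rows[0])
    # Augmented system (X^T X | X^T y).
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda idx: abs(a[idx][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [v - f * w for v, w in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Synthetic data (all numbers are illustrative assumptions, not study data).
rng = random.Random(0)
n = 5000
# 1.0 means the change received no review discussion.
no_discussion = [1.0 if rng.random() < 0.5 else 0.0 for _ in range(n)]
# Assumed chain: no discussion is associated with more prior defects ...
prior_defects = [2.0 + 3.0 * x + rng.gauss(0, 1) for x in no_discussion]
# ... and post-release defects depend on prior defects only (no direct path).
post_defects = [1.0 + 1.5 * m + rng.gauss(0, 1) for m in prior_defects]

review_only = fit_ols([no_discussion], post_defects)
with_prior = fit_ols([no_discussion, prior_defects], post_defects)

print("review predictor alone:       ", round(review_only[1], 2))
print("review predictor given prior: ", round(with_prior[1], 2))
print("prior defects coefficient:    ", round(with_prior[2], 2))
```

Under these assumptions the first coefficient should land near 3.0 × 1.5 = 4.5, while the second should be close to zero and the third close to 1.5. A regression alone cannot tell a direct effect from a mediated one, which is the gap the BN models are used to fill.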



Variable Selection Using Bayesian Additive Regression Trees

Variable selection is an important statistical problem. This problem bec...

Why Security Defects Go Unnoticed during Code Reviews? A Case-Control Study of the Chromium OS Project

Peer code review has been found to be effective in identifying security ...

Does Code Review Promote Conformance? A Study of OpenStack Patches

Code Review plays a crucial role in software quality, by allowing review...

Assessing the Impact of File Ordering Strategies on Code Review Process

Popular modern code review tools (e.g. Gerrit and GitHub) sort files in ...

Deriving a Usage-Independent Software Quality Metric

Context: The extent of post-release use of software affects the number of...

Effect of Technical and Social Factors on Pull Request Quality for the NPM Ecosystem

Pull request (PR) based development, which is a norm for the social codi...
