Technical Debt in the Peer-Review Documentation of R Packages: a rOpenSci Case Study

03/16/2021
by   Zadia Codabux, et al.

Context: Technical Debt (TD) is a metaphor used to describe code that is "not quite right." Although TD studies have gained momentum, TD has yet to be studied as thoroughly in non-Object-Oriented (OO) or scientific software such as R. R is a multi-paradigm programming language whose popularity in data science and statistical applications has grown in recent years. Because R can be extended through user-contributed packages, several community-led organizations were created to organize and peer-review packages in a concerted effort to increase their quality. Nonetheless, it is well known that most R users come from a variety of disciplines and do not have a technical programming background.

Objective: The goal of this study is to investigate TD in the peer-review documentation of R packages led by rOpenSci.

Method: We collected over 5000 comments from 157 packages that had been reviewed and approved for publication at rOpenSci. We manually analyzed a sample of these comments, posted by package authors, rOpenSci editors, and reviewers during the review process, to investigate the TD types present in these reviews.

Results: The findings of our study include (i) a taxonomy of TD derived from our analysis of the peer reviews; (ii) documentation debt as the most prevalent type of debt; and (iii) evidence that different user roles are concerned with different types of TD. For instance, reviewers tend to report some TD types more than other roles do, and the TD types they report differ from those reported by the authors of a package.

Conclusion: TD analysis in scientific software or peer review is almost non-existent. Our study is a pioneering effort, though limited to the context of R packages. Nevertheless, given our public datasets, our findings can serve as a starting point for replication studies that perform similar analyses in other scientific software or investigate the rationale behind our findings.


