
The List is the Process: Reliable Pre-Integration Tracking of Commits on Mailing Lists

by Ralf Ramsauer, et al.

A considerable corpus of research on software evolution focuses on mining changes in software repositories, but omits their pre-integration history. We present a novel method for tracking this otherwise invisible evolution of software changes on mailing lists by connecting all early revisions of changes to their final version in repositories. Since artefact modifications on mailing lists are communicated only as updates to fragments (i.e., patches), identifying semantically similar changes is a non-trivial task that our approach solves in a language-independent way. We evaluate our method on high-profile open source software (OSS) projects like the Linux kernel, and validate its high accuracy using an elaborately created ground truth. Our approach can be used to quantify properties of OSS development processes, which is an essential requirement for using OSS in reliable or safety-critical industrial products, where certifiability and conformance to processes are crucial. To the best of our knowledge, the high accuracy of our technique makes it possible, for the first time, to determine quantitatively whether an open development process effectively aligns with given formal process requirements.


The Sound of Silence: Mining Security Vulnerabilities from Secret Integration Channels in Open-Source Projects

Public development processes are a key characteristic of open source pro...

The OCEAN mailing list data set: Network analysis spanning mailing lists and code repositories

Communication surrounding the development of an open source project larg...

Automated and manual testing as part of the research software development process of RCE

Research software is often developed by individual researchers or small ...

A Systematic Review on Learning and Suggesting Source Code Changes in Version History

Software systems are in continuous evolution through source code changes...

A Software Ecosystem Reshaped by a Paradigm Shift: the CSI-Piemonte Case

Context: Changes in the software development paradigm, when operated by ...

Guided Deep List: Automating the Generation of Epidemiological Line Lists from Open Sources

Real-time monitoring and responses to emerging public health threats rel...