Analysing Time-Stamped Co-Editing Networks in Software Development Teams using git2net

11/21/2019
by   Christoph Gote, et al.

Data from software repositories have become an important foundation for the empirical study of software engineering processes. A recurring theme in the repository mining literature is the inference of developer networks that capture, e.g., collaboration, coordination, or communication from the commit history of projects. Most of the studied networks are based on the co-authorship of software artefacts. Because this neglects detailed information on code changes and code ownership, we introduce git2net, a scalable Python software package that facilitates the extraction of fine-grained co-editing networks from large git repositories. It uses text mining techniques to analyse the detailed history of textual modifications within files. We apply our tool in two case studies using GitHub repositories of multiple Open Source projects as well as a commercial software project. Specifically, we use data on more than 1.2 million commits and more than 25,000 developers to test a hypothesis on the relation between developer productivity and co-editing patterns in software teams. We argue that git2net opens up a massive new source of high-resolution data on human collaboration patterns that can be used to advance theory in empirical software engineering, computational social science, and organisational studies.
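
To make the notion of a time-stamped co-editing network concrete, the following is a minimal sketch and not git2net's actual implementation or API. It assumes a hypothetical record type `LineEdit` that already lists, per commit, which author modified which line of which file. An edge is drawn from an editor to the previous author of a line whenever the two differ. The sketch also assumes that line numbers stay stable over time, which a real tool cannot rely on once lines are inserted or deleted.

```python
from collections import Counter
from typing import NamedTuple


class LineEdit(NamedTuple):
    """One modification of a single line (hypothetical record format)."""
    file: str
    line: int
    author: str
    timestamp: int  # e.g. Unix time of the commit


def co_editing_edges(edits):
    """Return time-stamped edges (editor, previous_author, timestamp).

    An edge is emitted whenever an author modifies a line that was
    last touched by a different author.
    """
    last_author = {}  # (file, line) -> author who last edited that line
    edges = []
    for e in sorted(edits, key=lambda e: e.timestamp):
        key = (e.file, e.line)
        prev = last_author.get(key)
        if prev is not None and prev != e.author:
            edges.append((e.author, prev, e.timestamp))
        last_author[key] = e.author
    return edges


def edge_weights(edges):
    """Aggregate time-stamped edges into a weighted static network."""
    return Counter((src, dst) for src, dst, _ in edges)


# Illustrative data only; names and timestamps are made up.
edits = [
    LineEdit("a.py", 1, "alice", 100),
    LineEdit("a.py", 2, "alice", 100),
    LineEdit("a.py", 1, "bob", 200),    # bob edits alice's line
    LineEdit("a.py", 1, "alice", 300),  # alice edits bob's line
    LineEdit("a.py", 2, "alice", 400),  # self-edit, no edge
]
edges = co_editing_edges(edits)
weights = edge_weights(edges)
```

Keeping the timestamp on every edge preserves the temporal ordering of co-edits, while `edge_weights` shows how the same data collapses into a static weighted network when the ordering is not needed.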

research
03/25/2019

git2net - Mining Time-Stamped Co-Editing Networks from Large git Repositories

Data from software repositories have become an important foundation for ...
research
04/28/2023

A Network Perspective on the Influence of Code Review Bots on the Structure of Developer Collaborations

Background: Despite a growing body of literature on the impact of softwa...
research
01/12/2022

Big Data = Big Insights? Operationalising Brooks' Law in a Massive GitHub Data Set

Massive data from software repositories and collaboration tools are wide...
research
01/17/2019

Mining Treatment-Outcome Constructs from Sequential Software Engineering Data

Many investigations in empirical software engineering look at sequences ...
research
04/17/2020

An Annotated Dataset of Stack Overflow Post Edits

To improve software engineering, software repositories have been mined f...
research
07/05/2020

Understanding coordination in global software engineering: A mixed-methods study on the use of meetings and Slack

Given the relevance of coordination in the field of global software engi...
research
02/24/2022

Should I Get Involved? On the Privacy Perils of Mining Software Repositories for Research Participants

Mining Software Repositories (MSRs) is an evidence-based methodology tha...
