Should I Get Involved? On the Privacy Perils of Mining Software Repositories for Research Participants

02/24/2022
by Melina Vidoni, et al.

Mining Software Repositories (MSR) is an evidence-based methodology that cross-links data to uncover actionable information about software systems. Empirical studies in software engineering often leverage MSR techniques because they allow researchers to uncover issues and flaws in software development and to analyse the factors contributing to them. Hence, fine-grained information about the repositories and sources being mined (e.g., server names and contributors' identities) is essential for the reproducibility and transparency of MSR studies. However, this can also introduce threats to participants' privacy, as their identities may be linked to flawed or sub-optimal programming practices (e.g., code smells, improper documentation), or vice versa. Moreover, these threats can extend to close collaborators and community members, who become "guilty by association". This position paper aims to start a discussion about indirect participation in MSR investigations, the dichotomy of 'privacy vs. utility' in sharing non-aggregated data, and its effects on privacy restrictions and ethical considerations for participant involvement.

research
07/16/2020

Privacy Engineering Meets Software Engineering. On the Challenges of Engineering Privacy ByDesign

Current day software development relies heavily on the use of service ar...
research
07/18/2022

Software Artifact Mining in Software Engineering Conferences: A Meta-Analysis

Background: Software development results in the production of various ty...
research
03/09/2021

PeQES: A Platform for Privacy-enhanced Quantitative Empirical Studies

Empirical sciences and in particular psychology suffer a methodological ...
research
11/21/2019

Analysing Time-Stamped Co-Editing Networks in Software Development Teams using git2net

Data from software repositories have become an important foundation for ...
research
03/02/2021

Stop Building Castles on a Swamp! The Crisis of Reproducing Automatic Search in Evidence-based Software Engineering

The evidence-based approach has increasingly been employed to synthesize...
research
06/20/2023

Fingerprinting and Building Large Reproducible Datasets

Obtaining a relevant dataset is central to conducting empirical studies ...
research
06/12/2021

Amplifying Privacy: Scaling Up Transparency Research Through Delegated Access Requests

In recent years, numerous studies have used 'data subject access request...
