Towards Identifying Paid Open Source Developers - A Case Study with Mozilla Developers

04/06/2018
by Maëlick Claes, et al.

Open source development contains contributions from both hired and volunteer software developers. Identifying this status is important when we consider the transferability of research results to the closed source software industry, which includes no volunteer developers. While many studies have taken the employment status of developers into account, this information is often gathered manually due to the lack of accurate automatic methods. In this paper, we present an initial step towards predicting paid and unpaid open source development using machine learning and compare our results with automatic techniques used in prior work. By relying on source code repository metadata from Mozilla and manually collected employment status, we built a dataset of the most active developers, both volunteers and those hired by Mozilla. We define a set of metrics based on developers' usual commit time patterns and use different classification methods (logistic regression, classification tree, and random forest). The results show that our proposed method identifies paid and unpaid commits with an AUC of 0.75 using random forest, which is higher than the AUC of 0.64 obtained with the best of the previously used automatic methods.
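To make the idea of a commit-time metric and its AUC evaluation concrete, here is a minimal sketch in Python. It is not the authors' feature set or pipeline: the `office_hours_share` feature, the 9:00 to 17:00 weekday window, the assumption that timestamps are already in each developer's local time, and the toy developers are all hypothetical choices made for illustration. The `auc` helper uses the standard rank interpretation of AUC (the probability that a randomly chosen paid developer scores higher than a randomly chosen volunteer).

```python
from datetime import datetime


def office_hours_share(timestamps, start_hour=9, end_hour=17):
    """Fraction of commits made Monday-Friday within [start_hour, end_hour).

    Assumes timestamps are in the developer's local time (an assumption of
    this sketch, not something stated in the abstract).
    """
    if not timestamps:
        return 0.0
    in_office = sum(
        1
        for t in timestamps
        if t.weekday() < 5 and start_hour <= t.hour < end_hour
    )
    return in_office / len(timestamps)


def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: developer -> (commit timestamps, 1 = paid, 0 = volunteer).
commits = {
    "dev_a": ([datetime(2018, 1, 8, 10, 0), datetime(2018, 1, 9, 14, 30)], 1),
    "dev_b": ([datetime(2018, 1, 8, 11, 0), datetime(2018, 1, 13, 22, 0)], 1),
    "dev_c": ([datetime(2018, 1, 13, 21, 0), datetime(2018, 1, 14, 23, 15)], 0),
    "dev_d": ([datetime(2018, 1, 9, 13, 0), datetime(2018, 1, 10, 20, 0)], 0),
}

scores = [office_hours_share(ts) for ts, _ in commits.values()]
labels = [label for _, label in commits.values()]
print("AUC of the office-hours feature on the toy data:", auc(scores, labels))
```

A single score like this can be evaluated directly as a ranking, which is why AUC is a convenient way to compare a simple time-based heuristic against a trained classifier: both only need to output a number per item, and no decision threshold has to be fixed in advance.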
