Identifying Experts in Software Libraries and Frameworks among GitHub Users

03/19/2019
by   Joao Eduardo Montandon, et al.

Software development increasingly depends on libraries and frameworks to increase productivity and reduce time-to-market. Despite this fact, we still lack techniques to assess developers' expertise in widely popular libraries and frameworks. In this paper, we evaluate the performance of unsupervised (based on clustering) and supervised machine learning classifiers (Random Forest and SVM) to identify experts in three popular JavaScript libraries: facebook/react, mongodb/node-mongodb, and socketio/socket.io. First, we collect 13 features about developers' activity on GitHub projects, including commits on source code files that depend on these libraries. We also build a ground truth including the expertise of 575 developers on the studied libraries, as self-reported by them in a survey. Based on our findings, we document the challenges of using machine learning classifiers to predict expertise in software libraries, using features extracted from GitHub. Then, we propose a method to identify library experts based on clustering feature data from GitHub; by triangulating the results of this method with information available on LinkedIn profiles, we show that it is able to recommend dozens of GitHub users with evidence of being experts in the studied JavaScript libraries. We also provide a public dataset with the expertise of 575 developers on the studied libraries.
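The clustering idea in the abstract can be illustrated with a minimal sketch. The abstract does not name the 13 features or the clustering algorithm, so everything here is an assumption: the two features, the developer names, the values, and the choice of plain k-means with k=2 are all hypothetical and only show the general shape of grouping developers by activity features.

```python
from math import dist
from statistics import mean

# Hypothetical feature vectors per developer (made-up values):
# (commits touching files that import the library, number of client projects).
developers = {
    "dev_a": (120.0, 6.0),
    "dev_b": (95.0, 4.0),
    "dev_c": (3.0, 1.0),
    "dev_d": (7.0, 1.0),
    "dev_e": (110.0, 5.0),
    "dev_f": (2.0, 2.0),
}


def min_max_scale(rows):
    """Scale every feature column to [0, 1] so no feature dominates distances."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        tuple((v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi))
        for row in rows
    ]


def k_means(points, k, iterations=20):
    """Plain k-means with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda i: dist(p, centroids[i])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centroids[i] = tuple(mean(col) for col in zip(*members))
    return labels, centroids


names = list(developers)
scaled = min_max_scale([developers[n] for n in names])
labels, centroids = k_means(scaled, k=2)

# Assumption: the cluster whose centroid shows the most scaled activity
# is treated as the candidate-expert cluster.
expert_cluster = max(range(2), key=lambda i: sum(centroids[i]))
candidates = sorted(n for n, lab in zip(names, labels) if lab == expert_cluster)
print(candidates)
```

In this toy data the high-activity cluster is only a list of candidates; the abstract describes checking such candidates against external evidence (LinkedIn profiles) rather than treating cluster membership as proof of expertise.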

research
03/16/2023

Intertwining Communities: Exploring Libraries that Cross Software Ecosystems

Using libraries in applications has helped developers reduce the costs o...
research
09/20/2018

Should I Bug You? Identifying Domain Experts in Software Projects Using Code Complexity Metrics

In any sufficiently complex software system there are experts, having a ...
research
08/16/2022

Identifying Source Code File Experts

In software development, the identification of source code file experts ...
research
01/16/2018

Why and How Java Developers Break APIs

Modern software development depends on APIs to reuse code and increase p...
research
07/04/2022

Do Not Take It for Granted: Comparing Open-Source Libraries for Software Development Effort Estimation

In the past two decades, several Machine Learning (ML) libraries have be...
research
04/16/2022

ZeroIn: Characterizing the Data Distributions of Commits in Software Repositories

Modern software development is based on a series of rapid incremental ch...
research
11/02/2021

Dazed and Confused: What's Wrong with Crypto Libraries?

Recent studies have shown that developers have difficulties in using cry...
