Bot Detection in GitHub Repositories
Contemporary social coding platforms like GitHub promote collaborative development. Many open-source software repositories hosted in these platforms use machine accounts (bots) to automate and facilitate a wide range of effort-intensive and repetitive activities. Determining if an account corresponds to a bot or a human contributor is important for socio-technical development analytics, for example, to understand how humans collaborate and interact in the presence of bots, to assess the positive and negative impact of using bots, to identify the top project contributors, to identify potential bus factors, and so on. Our project aims to include the trained machine learning (ML) classifier from the BoDeGHa bot detection tool as a plugin to the GrimoireLab software development analytics platform. In this work, we present the procedure to form a pipeline for retrieving contribution and contributor data using Perceval, distinguishing bots from humans using BoDeGHa, and visualising the results using Kibana.
READ FULL TEXT