An Exploratory Study of Bot Commits

by Tapajit Dey, et al.

Background: Bots help automate many of the tasks performed by software developers and are widely used to commit code on various social coding platforms. At present, it is not clear what types of activities these bots perform; understanding this may help design better bots and find application areas that might benefit from bot adoption. Aim: We aim to categorize bot commits by the type of change (files added, deleted, or modified), find the more commonly changed file types, and identify the groups of file types that tend to get updated together. Method: We examined 12,326,137 commits made by 461 popular bots (each with at least 1000 commits) to identify the frequency and type of files added, deleted, or modified by the commits, and used association rule mining to identify the types of files modified together. Result: The majority of bot commits modify an existing file, a few add new files, and deletion of a file is very rare. Commits involving more than one type of operation are even rarer. Files containing data, configuration, and documentation are most frequently updated, while HTML is the most common type in terms of the number of files added, deleted, and modified. Files of the types "Markdown", "Ignore List", "YAML", and "JSON" are the ones most frequently updated together with other types of files. Conclusion: We observe that the majority of bot commits involve single-file modifications, and that bots primarily work with data, configuration, and documentation files. Whether this reflects a limitation of the bots, and whether overcoming it would lead to different kinds of bots, remains an open question.
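To illustrate the association rule mining step mentioned in the Method, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not name the algorithm or thresholds used, so the function `pair_rules`, the `min_support` and `min_confidence` values, and the sample commits are all hypothetical. The sketch only considers rules between pairs of file types, which is a simplification of general association rule mining.

```python
from collections import Counter
from itertools import combinations


def pair_rules(commits, min_support=0.2, min_confidence=0.6):
    """Mine pairwise rules "lhs -> rhs" between file types.

    commits: list of sets, each holding the file types touched by one commit.
    Thresholds are illustrative assumptions, not values from the study.
    """
    n = len(commits)
    item_counts = Counter()  # number of commits touching each file type
    pair_counts = Counter()  # number of commits touching both types of a pair
    for types in commits:
        unique = set(types)
        item_counts.update(unique)
        pair_counts.update(combinations(sorted(unique), 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of all commits containing both types
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # confidence: share of commits with lhs that also contain rhs
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Made-up commits for illustration only; not data from the study.
commits = [
    {"Markdown", "YAML"},
    {"Markdown", "YAML", "JSON"},
    {"JSON"},
    {"Markdown"},
    {"HTML"},
]
rules = pair_rules(commits)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

On this toy input, only the Markdown and YAML pair passes both thresholds, giving one rule in each direction. A real analysis would more likely use an established itemset-mining algorithm over the full commit set than this pairwise count.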




