Generating GitHub Repository Descriptions: A Comparison of Manual and Automated Approaches

10/25/2021
by Jazlyn Hellman, et al.

Given the vast number of repositories hosted on GitHub, project discovery and retrieval have become increasingly important for GitHub users. Repository descriptions serve as one of the first points of contact for users who access a repository. However, repository owners often fail to provide a high-quality description: they use vague terms, explain the purpose of the repository poorly, or omit the description entirely. In this work, we examine the current practice of writing GitHub repository descriptions. Based on this investigation, we propose the LSP (Language, Software technology, and Purpose) template for formulating GitHub repository descriptions that are clear, concise, and informative. To understand the extent to which current automated techniques can support generating repository descriptions, we compare the performance of state-of-the-art text summarization methods on this task. Finally, our user study with GitHub users reveals that automated summarization is adequate for generating default descriptions for GitHub repositories, while descriptions that follow the LSP template offer the most effective instrument for communicating with GitHub users.
