9.6 Million Links in Source Code Comments: Purpose, Evolution, and Decay

01/22/2019
by Hideaki Hata, et al.

Links are an essential feature of the World Wide Web, and source code repositories are no exception. However, despite their many undisputed benefits, links can suffer from decay, insufficient versioning, and lack of bidirectional traceability. In this paper, we investigate the role of links contained in source code comments from these perspectives. We conducted a large-scale study of around 9.6 million links to establish their prevalence, and we used a mixed-methods approach to identify the links' targets, purposes, decay, and evolutionary aspects. We found that links are prevalent in source code repositories, that licenses, software homepages, and specifications are common types of link targets, and that links are often included to provide metadata or attribution. Links are rarely updated, but many link targets evolve. Almost 10% of the links included in source code comments are dead. We then submitted a batch of link-fixing pull requests to open source software repositories, resulting in most of our fixes being merged successfully. Our findings indicate that links in source code comments can indeed be fragile, and our work opens up avenues for future work to address these problems.
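To make the object of study concrete, the following is a minimal sketch of how links might be pulled out of source code comments. It is not the authors' extraction pipeline, which the abstract does not describe. The names `URL_RE`, `COMMENT_RE`, and `links_in_comments` are hypothetical, and the sketch assumes only `#` and `//` line comments and plain `http(s)` URLs.

```python
import re
from typing import List

# Assumption: a link is an http(s) URL that runs until whitespace
# or a common closing delimiter (quote, angle bracket, paren, bracket).
URL_RE = re.compile(r"https?://[^\s\"'<>)\]]+")

# Assumption: only '#' and '//' line comments are recognised.
# Block comments and string literals are not handled, so a '//' or '#'
# inside a string can produce false positives.
COMMENT_RE = re.compile(r"(?:#|//)(.*)$")


def links_in_comments(source: str) -> List[str]:
    """Return URLs found in line comments, in order of appearance."""
    links = []
    for line in source.splitlines():
        match = COMMENT_RE.search(line)
        if match is None:
            continue
        for url in URL_RE.findall(match.group(1)):
            # Drop trailing sentence punctuation that is rarely part of a URL.
            links.append(url.rstrip(".,;:"))
    return links


if __name__ == "__main__":
    sample = (
        "x = 1  # see https://example.com/spec.\n"
        "// License: http://www.apache.org/licenses/LICENSE-2.0\n"
        'y = "no comment"\n'
    )
    for link in links_in_comments(sample):
        print(link)
```

Deciding whether an extracted link is dead would additionally require issuing an HTTP request per link and interpreting the response, which this sketch leaves out.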


