Associating Natural Language Comment and Source Code Entities

12/13/2019
by Sheena Panthaplackel, et al.

Comments are an integral part of software development; they are natural language descriptions associated with source code elements. Understanding explicit associations can be useful in improving code comprehensibility and maintaining the consistency between code and comments. As an initial step towards this larger goal, we address the task of associating entities in Javadoc comments with elements in Java source code. We propose an approach for automatically extracting supervised data using revision histories of open source projects and present a manually annotated evaluation dataset for this task. We develop a binary classifier and a sequence labeling model by crafting a rich feature set which encompasses various aspects of code, comments, and the relationships between them. Experiments show that our systems outperform several baselines learning from the proposed supervision.
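
To make the task concrete, the following is a minimal sketch of a naive subtoken-overlap heuristic that links a noun phrase from a Javadoc comment to identifiers in a Java method. It is an illustration only and not the classifier or sequence labeling model described above; the Java snippet, the function names `subtokens` and `associate`, and the stopword list are all hypothetical.

```python
import re

# Hypothetical Java method whose Javadoc might read "@return the retry count".
JAVA_METHOD = """
public int getRetryCount(Config config) {
    int retryCount = config.lookup("retries");
    return retryCount;
}
"""

# Small illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "of"}


def subtokens(identifier):
    """Split a camelCase or snake_case identifier into lowercase subtokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return {p.lower() for p in parts}


def associate(noun_phrase, code):
    """Return identifiers in `code` that share a subtoken with `noun_phrase`.

    Identifiers are returned once each, in order of first appearance.
    """
    words = {w.lower() for w in noun_phrase.split()} - STOPWORDS
    matches = []
    for token in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", code):
        if subtokens(token) & words and token not in matches:
            matches.append(token)
    return matches


if __name__ == "__main__":
    print(associate("the retry count", JAVA_METHOD))
```

A heuristic like this only captures surface-level name overlap; the abstract's learned models draw on a richer feature set covering code, comments, and the relationships between them.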

research
04/25/2020

Learning to Update Natural Language Comments Based on Code Changes

We formulate the novel task of automatically updating an existing natura...
research
03/24/2023

PENTACET data – 23 Million Contextual Code Comments and 500,000 SATD comments

Most Self-Admitted Technical Debt (SATD) research utilizes explicit SATD...
research
04/01/2017

Topic modeling of public repositories at scale using names in source code

Programming languages themselves have a limited number of reserved keywo...
research
08/06/2018

Executable Trigger-Action Comments

Natural language elements, e.g., todo comments, are frequently used to c...
research
08/28/2020

CORAL: COde RepresentAtion Learning with Weakly-Supervised Transformers for Analyzing Data Analysis

Large scale analysis of source code, and in particular scientific source...
research
09/20/2021

To Automatically Map Source Code Entities to Architectural Modules with Naive Bayes

Background: The process of mapping a source code entity onto an architec...
research
05/26/2018

Splitting source code identifiers using Bidirectional LSTM Recurrent Neural Network

Programmers make rich use of natural language in the source code they wr...
