On the Generation, Structure, and Semantics of Grammar Patterns in Source Code Identifiers

07/15/2020
by   Christian D. Newman, et al.

Identifiers make up a majority of the text in code. They are one of the most basic mediums through which developers describe the code they create and understand the code that others create. Therefore, understanding the patterns latent in identifier naming practices and how accurately we are able to automatically model these patterns is vital if researchers are to support developers and automated analysis approaches in comprehending and creating identifiers correctly and optimally. This paper investigates identifiers by studying sequences of part-of-speech annotations, referred to as grammar patterns. This work advances our understanding of these patterns and our ability to model them by 1) establishing common naming patterns in different types of identifiers, such as class and attribute names; 2) analyzing how different patterns influence comprehension; and 3) studying the accuracy of state-of-the-art techniques for part-of-speech annotations, which are vital in automatically modeling identifier naming patterns, in order to establish their limits and paths toward improvement. To do this, we manually annotate a dataset of 1,335 identifiers from 20 open-source systems and use this dataset to study naming patterns, semantics, and tagger accuracy.
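To make the idea of a grammar pattern concrete, here is a minimal sketch of how an identifier might be reduced to a sequence of part-of-speech tags. It is not the taggers evaluated in the paper. The toy lexicon, the tag labels (V, P, N, NM for a noun used as a modifier), the rule that unknown words default to nouns, and the function names `split_identifier` and `grammar_pattern` are all assumptions made for illustration.

```python
import re

# Hypothetical toy lexicon (an assumption for this sketch). A real tagger
# would infer tags from context rather than from a fixed word list.
TOY_LEXICON = {
    "get": "V", "set": "V", "is": "V",
    "to": "P", "on": "P", "with": "P",
}


def split_identifier(name):
    """Split a camelCase or snake_case identifier into its words."""
    # Acronym runs (XML), capitalized or lowercase words, and digit runs.
    return re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)


def grammar_pattern(name):
    """Return a space-separated tag sequence for an identifier."""
    words = [w.lower() for w in split_identifier(name)]
    # Assumption: words missing from the lexicon are treated as nouns.
    coarse = [TOY_LEXICON.get(w, "N") for w in words]
    tags = []
    for i, tag in enumerate(coarse):
        # A noun directly followed by another noun acts as a modifier (NM);
        # the last noun in the run stays the head noun (N).
        if tag == "N" and i + 1 < len(coarse) and coarse[i + 1] == "N":
            tags.append("NM")
        else:
            tags.append(tag)
    return " ".join(tags)


print(grammar_pattern("getUserName"))      # V NM N
print(grammar_pattern("max_buffer_size"))  # NM NM N
```

Under these assumptions, `getUserName` yields a verb followed by a noun phrase, while `max_buffer_size` yields a noun phrase alone. Differences of this kind between method names and attribute names are the sort of pattern the study sets out to measure.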

research
02/28/2022

Understanding Digits in Identifier Names: An Exploratory Study

Before any software maintenance can occur, developers must read the iden...
research
09/01/2021

An Ensemble Approach for Annotating Source Code Identifiers with Part-of-speech Tags

This paper presents an ensemble part-of-speech tagging approach for sour...
research
06/01/2021

Studying Duplicate Logging Statements and Their Relationships with Code Clones

In this paper, we focus on studying duplicate logging statements, which ...
research
07/25/2020

Automated Query Generation for Design Pattern Mining in Source Code

Identifying which design patterns already exist in source code can help ...
research
03/16/2021

Using Grammar Patterns to Interpret Test Method Name Evolution

It is good practice to name test methods such that they are comprehensib...
research
07/12/2020

Editable AI: Mixed Human-AI Authoring of Code Patterns

Developers authoring HTML documents define elements following patterns w...
research
07/20/2021

On the Interplay of Smells Large Class, Complex Class and Duplicate Code

Bad smells have been defined to describe potential problems in code, pos...