Towards a Model to Appraise and Suggest Identifier Names

Identifiers in the source code of a software system play a vital, if often unrecognized, role in determining the quality of the system. Ambiguous and confusing identifier names not only lead developers to misunderstand the behavior of the code but also increase comprehension time, thereby causing a loss in productivity. Even though correcting poor names through rename operations is a viable option for solving this problem, renaming itself is an act of rework and is not immune to defect injection. In this study, we aim to understand the motivations that drive developers to name and rename identifiers and the decisions they make in determining the name. Using our results, we propose the development of a linguistic model that determines identifier names based on the behavior of the identifier. As a prerequisite to constructing the model, we conduct multiple studies to determine the features that should feed into the model. In this paper, we discuss findings from our completed studies and justify the continuation of research on this topic through further studies.

I Introduction

Software maintenance is the most costly phase of the software development lifecycle [8, 13], with a significant portion of this (about 58%) dedicated to source code comprehension [38]. Developers must use identifier names to comprehend the code that they will update. Identifiers are names (i.e., lexical tokens) that uniquely identify entities in the code (such as classes, methods, and variables). It has been estimated that identifiers account for, on average, 70% of a software system’s codebase [12]. It has also been shown that poor identifier names can cause developers to spend, on average, 19% more time on comprehension activities [17], a result supported by [22]. It has also been repeatedly shown that a good identifier name explicitly reflects the identifier’s role [18, 23].

The need for strong identifier names is reflected in standard software engineering practices, which provide guidelines [11, 20], best practices [27, 15], and quality metrics [32, 9, 31] to assist developers in naming identifiers. The idea is that, through the use of unambiguous and intent-revealing names, identifiers assist in communicating the purpose and behavior of the source code and eventually the functionality of the system to developers, which research has shown to be true [10].

Unfortunately, naming conventions and best practices can only guide developers to strong names; they cannot be used to provide a developer with a high-quality name, and they cannot be used to provide a holistic comparison of multiple candidate names for an identifier. Further, quality metrics for readability [32, 9, 31] do not work at the level of identifier names; they cannot inform a developer whether a name is high-quality. Instead, they look at source code structure and use static analysis to estimate, for example, complexity, and use that to measure comprehension. In short, there are currently no methods that can be used to determine whether an identifier name is high-quality or not. Furthermore, there are no models that accurately tell us how developers create an identifier name that reflects source code behavior.

The goal of this work is to begin the creation of a model that understands the relationship between the name of an identifier and the behavior of the source code entity it represents. To accomplish this, we first need to understand how developers choose names for identifiers. We can achieve this by studying instances where developers rename existing identifiers in the source code (i.e., rename refactorings). Developers perform renames to better reflect the meaning of the identifier, which can result in either a change or a preservation in meaning [6].

To automate the process of identifier name evaluation, the features that impact an identifier’s name will be used as input to our model. This will theoretically allow us to provide developers with real-time, context-aware suggestions and appraisals of identifier names. Our proposed model is intended to reduce the time and costs involved in software maintenance and to help ensure that production-ready code is readable and understandable before deployment. We also envision that the broader impact of this work will drive further research into program comprehension and result in improvements to software engineering tools, for example, in the area of source code generation.

II Rename Taxonomy

When renaming an identifier, a developer may add, remove, or replace terms in the original name. Any one of these actions (or a combination of them) updates the semantic meaning of the name. In other words, the act of renaming can either change the meaning of the name or preserve the original meaning. In this study, we utilize the taxonomy created by Arnaoudova et al. [6] to examine rename refactorings and categorize them into the different types prescribed by this taxonomy.

In its most basic form, preservation of meaning occurs if the terms in the name are reordered or special characters are included/excluded. In more complex forms, meaning is preserved if the replaced terms are synonyms or are a singular/plural of the original. An example of preservation in meaning occurred when a developer renamed pictureLock to photoLock. In this instance, the term ‘picture’ is a synonym of ‘photo’ and hence the rename preserves the meaning of the original name.

A change in meaning can occur for multiple reasons. Developers can specialize an identifier’s name by replacing a term in the original name with a hyponym or by adding an adjective or noun to the original name, as when a developer performed the following rename: button → customMediaRouteButton. In this example, the newly added terms are nouns; hence, the semantic change is considered a narrowing in meaning. By replacing a term with a hypernym or through the removal of terms (e.g., author_name → contact), a developer can generalize (i.e., broaden) the meaning of the identifier’s name. Adding new terms to the identifier’s name without causing a specialization (e.g., scanPortsButton → scanWellKnownPortsButton) leads to an addition to the meaning of the name. In this example, the newly added terms are an adverb and a verb; hence, the semantic change is not considered a specialization and is categorized as add meaning. Similarly, the removal of terms (without causing a generalization) removes details from the name’s meaning (e.g., mPendingDeletedMessages → mPendingMessages). Additionally, an identifier’s meaning is also modified if terms in the original name are replaced with antonyms (e.g., chartTop → chartBottom).
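The term-level comparison behind this taxonomy can be illustrated with a minimal sketch. It is not the REPENT implementation from [6]: the function names, the category labels, and the tiny SYNONYMS and ANTONYMS tables are all hypothetical and hand-written for this example. A real classifier would need a lexical database for synonyms, hypernyms, and hyponyms, and a part-of-speech tagger to separate narrowing from add meaning (and broadening from remove meaning), so the sketch only reports the coarser outcomes "terms added" and "terms removed". Singular/plural handling is also omitted.

```python
import re

# Hypothetical, hand-written word pairs used only for illustration.
SYNONYMS = {frozenset({"picture", "photo"})}
ANTONYMS = {frozenset({"top", "bottom"})}


def split_terms(name):
    """Split a camelCase or under_score identifier into lowercase terms."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def classify_rename(old, new):
    """Return a coarse label for the semantic change of a rename."""
    old_terms, new_terms = split_terms(old), split_terms(new)
    if sorted(old_terms) == sorted(new_terms):
        # Same terms, only reordered or written with different separators.
        return "preserve"
    removed = [t for t in old_terms if t not in new_terms]
    added = [t for t in new_terms if t not in old_terms]
    if removed and added:
        if len(removed) == 1 and len(added) == 1:
            pair = frozenset({removed[0], added[0]})
            if pair in SYNONYMS:
                return "preserve"
            if pair in ANTONYMS:
                return "change (antonym)"
        # Deciding between hypernym and hyponym needs a lexical database.
        return "replace terms"
    # Part-of-speech information would refine these two outcomes further.
    return "terms added" if added else "terms removed"
```

Under these assumptions, classify_rename("pictureLock", "photoLock") yields "preserve" and classify_rename("chartTop", "chartBottom") yields "change (antonym)", matching the examples in this section.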

III Motivating Example

A prerequisite to constructing the linguistic model is determining how and why identifier names change. To this end, a key feature that we can exploit from renames is that we can see the actions performed by developers before and after a rename. In other words, using a static analysis approach, we can determine the types of refactorings that occur before, with, or after a rename refactoring. Furthermore, by combining messages in the commit log with the type of semantic change that the identifier name undergoes, we can aim to contextualize the rationale behind the rename. For example, in [1] a developer moves a class from one package to another with the message: “Incremental changes, some package refactorings etc”. The next refactoring operation [2] on this class is the renaming of the class from JsonViewResult to JsonView. This rename broadens the meaning of the name by removing the term ‘Result’, making the identifier more general in meaning. The commit message for this rename is: “Cleaned up some file names for easier usage…”, suggesting that the developer was likely going through and renaming things after the move class refactoring. Given such patterns in the implementation lifecycle of the system, it may be possible to extract appropriate features from the source code to construct our proposed identifier name appraisal and recommendation model. We envision our model being available as an extension in the developer’s integrated development environment (IDE), thereby providing the developer with real-time suggestions and appraisals for identifier names during the implementation and maintenance phases.

IV Approach

Working towards our goal of constructing a high-performing linguistic model requires multiple studies that aim to determine the features that are most appropriate to feed into the model. Hence, in this section, we discuss the results of our completed studies and propose future studies on this topic.

IV-A Completed Studies

Contextualizing renames to commit messages: We briefly report on the findings of our prior study [30], in which we investigated how the different semantic changes of identifier names can be contextualized using commit messages. A topic modeling analysis using Latent Dirichlet Allocation (LDA) yielded interesting results but proved insufficient to pinpoint the developer’s intention. Words like ad and add frequently occurred with the narrow, broaden, and add meaning categories. This may indicate that the addition of code correlates with these types of changes. Interestingly, we observed that preserve and remove meaning lacked the words ad and add in their commit messages. If we assume that adding tends to modify meaning somehow (i.e., narrow, broaden, add), then preserve and remove meaning should not include these terms. Instead, terms like rename and refactor are more common in these categories than in others. Interestingly, remove meaning does not include terms like remove, delete, etc. Even though commit messages reveal interesting trends around semantic updates to identifier names, they cannot be solely relied on for contextualizing renames. In other words, our LDA analysis did not yield clear-cut topics for each of the semantic categories. Terms overlapped between some topics, and some topics were missing key terms that are usually associated with the semantic change. The topics generated by LDA were too high-level and were unable to provide us with fine-grained data about the context around renaming practices. Therefore, in the next study, we considered the co-occurrence of other refactorings with renames.

Co-occurrence of renames and other refactorings: This study [29] involved identifying refactorings preceding or following a rename. Our study of 800 systems showed that, in most scenarios, the renaming of an element does not seem to be influenced by, nor does it itself influence, another type of refactoring on the same element. However, there is a subset of renames that occur directly before or after another refactoring. From this subset, we observed that, a majority of the time, developers perform a refactoring operation just before the rename, and that these two operations happen within a short (commit) interval. We also showed that developers frequently change the semantic meaning of an identifier name when performing a rename after a refactoring. Contextualizing these refactorings with the commit log proved useful for filtering out a set of commit messages closely related to different types of renames. However, while the rationale for some semantic changes can be derived from the commit log in addition to the refactorings that occurred just prior to the rename, we still encountered high-level LDA topics. In other words, the level of detail that developers provide about their activities/tasks in the commit message is not sufficient for fine-grained NLP-based analysis. This made it difficult to fully understand the reasoning behind the application of renames. Hence, our findings show that a significant amount of work is needed to automatically derive these motivations more effectively from commit messages, other natural language software artifacts, and general source code changes. A final contribution of this study showed that developers with limited project experience are more inclined to perform only rename refactorings than other types of refactorings (that may alter the system’s design), indicating that renames are applied by developers who may not be very familiar with the system they are developing.
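To make the notion of a refactoring occurring just before a rename concrete, the following is a minimal sketch of how such pairs could be counted. It is not the tooling used in [29]. The Refactoring record, its field names, the "rename" and "move_class" labels, and the max_gap parameter are assumptions introduced for illustration, and the sketch assumes each element carries a stable identifier that survives the rename.

```python
from collections import Counter, namedtuple

# Hypothetical record: a stable element id, the position of the commit in the
# project history, and the kind of refactoring that was detected.
Refactoring = namedtuple("Refactoring", "element commit_index kind")


def refactorings_before_renames(events, max_gap=1):
    """Count the kinds of refactorings that directly precede a rename.

    A pair is counted when the rename is the next refactoring on the same
    element and happens at most max_gap commits after the other refactoring.
    """
    by_element = {}
    for ev in sorted(events, key=lambda e: e.commit_index):
        by_element.setdefault(ev.element, []).append(ev)

    counts = Counter()
    for history in by_element.values():
        for prev, cur in zip(history, history[1:]):
            close = cur.commit_index - prev.commit_index <= max_gap
            if cur.kind == "rename" and prev.kind != "rename" and close:
                counts[prev.kind] += 1
    return counts
```

For instance, a history in which a class is moved in commit 10 and renamed in commit 11 contributes one count to "move_class", whereas a rename that follows another refactoring by dozens of commits is ignored under the default gap.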

Abbreviations in source code: In addition to our work on renames, we also investigated abbreviation expansion. The act of expanding abbreviations is valuable for studying identifiers since it allows us to remove the threat that a developer might not be familiar with a given abbreviation, and makes it easier for tools such as part-of-speech taggers to work effectively. In [28], we studied the expansion of abbreviations that appear in the source code of five open-source systems. This study enabled us to understand the effectiveness of different abbreviation expansion techniques on systems with varying quality of documentation. Additionally, we manually created a gold set of over 850 abbreviation-expansion pairs. This will help us study the effect of abbreviations on comprehension and has the potential to increase the quality of our proposed model.
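As a rough illustration of how a set of abbreviation-expansion pairs could be put to use, the sketch below expands the terms of an already-split identifier through a plain dictionary lookup. The EXPANSIONS table and the expand_terms function are hypothetical and invented for this example; they are not taken from the gold set in [28], and the expansion techniques evaluated there are considerably more sophisticated than a lookup.

```python
# Hypothetical abbreviation-expansion pairs, invented for illustration only.
EXPANSIONS = {"msg": "message", "btn": "button", "cnt": "count"}


def expand_terms(terms, expansions=EXPANSIONS):
    """Replace each known abbreviation with its expansion; keep other terms."""
    return [expansions.get(term.lower(), term.lower()) for term in terms]
```

With these assumed pairs, expand_terms(["msg", "Cnt"]) returns ["message", "count"], which a part-of-speech tagger can handle more readily than the abbreviated terms.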

In summary, while results from our analysis of identifier renames, commits, and refactorings were promising, these results do not provide a complete picture of the motivations that drive developers to rename identifiers. In short, the results were too high-level to serve as data for project-specific rename contexts. Hence, to construct our model, we need to extend our investigation to look beyond static analysis and begin investigating renames from the perspective of developers. That is, we will study how developers apply renames and their thought process in determining a name. Additionally, we will need to interact with developers to understand their reasoning for changing each word in the target identifier. This data will assist us in determining which data sources are important to consider when building our model. It will also help us determine ways to automate the contextualization of renames, since we will know what developers tend to look for when applying renames.

IV-B Proposed Studies

As explained in our completed studies, even though we did notice patterns in identifier renames, trying to contextualize these renames using static analysis of source code is not straightforward. The results we obtained are very generic and, at most, provide a high-level outline as to why developers perform rename operations. While useful, these results cannot be directly incorporated into our proposed model. Hence, we need to further investigate developer implementation (and maintenance) activities to derive features for our model. As such, we need to shift away from empirical studies and focus on studies where we have explicit developer involvement. With developer involvement, we will be able to derive a more fine-grained rationale behind identifier renames, and also the thought process involved in deciding on a new, and more appropriate, name. Therefore, going forward, we propose a more identifier-oriented exploratory study on the developer’s viewpoint of identifier renames. This exploratory study will consist of an eye-tracking study on developer actions and reactions to identifier names in source code. From this study, we aim to answer the following research questions:

RQ1: What source code elements do developers look at when renaming an identifier?

RQ2: Do developers look at certain code elements more when applying different types of semantic changes?

RQ3: What are the trends in the types of semantic change applied to an identifier and the reason a developer applied that semantic change?

Eye-tracking in software engineering studies is not new. It is primarily utilized for studies that involve the comprehension of software artifacts such as models and source code [35]. Other studies have used this technology to study developer interactions in performing change tasks [21], defect identification [37], and debugging [24], among others. However, at present, there does not exist work that focuses on software refactorings, and more specifically rename refactorings. With eye-tracking, the medium of our study will be the developer’s environment, and we plan on utilizing iTrace [16] to integrate eye-tracking into this environment. This will provide us with the opportunity to determine elements or concepts in the source code (or even in the IDE) that developers rely on when either performing a rename or comprehending a rename. Unlike in static analysis, where we are presented with after-the-fact results associated with a rename, eye-tracking provides us with the ability to capture/measure concepts such as fixations, scanpaths, and areas of interest [34], which are, in reality, part of the implementation process. With these concepts, we can refine our proposed model. For example, while static analysis informs us of the refactorings that occur prior to a rename, eye-tracking will aid us in understanding the number and types of elements developers refer to before performing a rename. Additionally, this technique can act as a proxy to measure the comprehensibility of identifier names and their likelihood of undergoing a rename. For instance, a developer renaming an identifier after a relatively high gaze/fixation duration can act as an indicator of a poor name. Furthermore, patterns around the semantic change a name undergoes based on the gaze/fixation duration can be used to support developers in their naming activities.

Finally, we plan on interviewing the participants of the study to understand their rationale for performing renames (if any) and their thought process in deciding the new name for the identifier. As a means of mitigating risks involved in the experiment, we plan on conducting a series of trials to uncover shortcomings in our methodology (such as task clarity/complexity, participant behavior, and environmental factors). Additionally, each participant in the experiment will be allotted time to become familiar with the environment before the commencement of the experiment.
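The idea of treating a long fixation duration before a rename as a signal of a poor name can be sketched as follows. This is only an illustration under stated assumptions: the (area of interest, duration in milliseconds) record format is hypothetical and is not the output format of iTrace, the function names are invented, and the threshold is an arbitrary parameter rather than an empirically derived value.

```python
from collections import defaultdict


def total_fixation_ms(fixations):
    """Sum fixation durations per area of interest (here, an identifier name).

    fixations is an iterable of (area_of_interest, duration_ms) pairs.
    """
    totals = defaultdict(int)
    for area, duration_ms in fixations:
        totals[area] += duration_ms
    return dict(totals)


def long_gaze_renames(fixations, renamed, threshold_ms):
    """Return renamed identifiers whose total fixation time met the threshold."""
    totals = total_fixation_ms(fixations)
    return sorted(n for n in renamed if totals.get(n, 0) >= threshold_ms)
```

Given fixation records in which a participant looked at an identifier for a combined 1100 ms before renaming it, a call with threshold_ms=1000 would flag that identifier, while a renamed identifier that was only glanced at would not be flagged.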

V Literature Review

We divide our discussion of related work into two areas: studies that explore identifier renamings from a natural language perspective and studies that investigate the quality attributes that an identifier should exhibit.

V-A Identifier Renaming

A survey on identifier renaming conducted by Arnaoudova et al. [6] showed that developers primarily perform renames in conjunction with other refactorings, with most of the renamings due to updates in existing functionality. In the same study, the authors proposed REPENT, an approach to first detect identifier renamings in the source code, and then analyze and classify the detected renamings based on their semantic change. Through an empirical study, the authors demonstrate the high accuracy of their approach in the detection of identifier renamings and show the impact that proper naming has on minimizing software development effort.

Allamanis et al. [3] implemented NATURALIZE, a framework that utilizes statistical language models to mine natural source code naming conventions. The authors demonstrated the high accuracy of NATURALIZE by applying it to a sample set of open-source projects. NATURALIZE learns the coding conventions in the source code via syntactic restrictions and sub-grammars on existing identifier names, and utilizes this knowledge base to evaluate snippets of new code for potential variables that should be renamed. In an extension of their work [4], the authors proposed an approach to suggest renaming methods based on their bodies and renaming classes based on their methods. Their recommendation approach uses a neural probabilistic language model in which a set of input words is fed to a hidden layer of the neural network.

Liu et al. [25] proposed a simple approach to identify renaming opportunities arising from conducted rename refactorings. Their approach relies upon the source code containing identifiers that are similar to the renamed identifier. If a similar identifier is detected, their approach recommends that the developer rename these identifiers as well. An empirical study on four applications yielded high precision values for their approach. An exploratory study on the lexical similarities between method arguments and parameters by Liu et al. [26] demonstrated mixed results, i.e., either very low or very high similarities. The authors state that further research is needed.

Studies by Høst and Østvold [19, 18] on method names showed that even though method names and behavior are mutually dependent, there is more research required in this area to better determine high-quality names.

In summary, prior work in this area has focused on either the type of semantic change occurring in a renamed identifier or the identification of identifiers that are candidates for renaming based on similar identifiers in the codebase. However, research into the thought process of a developer in determining the rationale for performing a rename or, for that matter, determining the correct choice of a replacement name is lacking.

V-B Identifier Quality

In terms of metrics for source code readability and understandability, most research has focused on complete code snippets [31, 9]. However, within these quality models are components that focus on identifier name characteristics such as length, the number of dictionary terms the identifier comprises, and the broadness/specialization of the name. Studies on the length of identifier names by Lawrie et al. [22] and Hofmeister et al. [17] show that names consisting of abbreviations are harder to comprehend than full-word identifiers. Similarly, a study by Schankin et al. [33] shows that descriptive names improve program comprehension. Studies on systems using camel case and underscores for identifier names [7, 36] have shown that developer experience plays an important part in comprehending such names.

Arnaoudova et al. [5] proposed a catalog of linguistic anti-patterns in source code. The seventeen anti-patterns span methods and attributes. Contained in this catalog are anti-patterns that relate the name of the identifier to its purpose/behavior. Furthermore, via a developer survey, the authors confirmed that the presence of linguistic anti-patterns in source code is a poor programming practice. Using a subset of these linguistic anti-patterns, Fakhoury et al. [14] demonstrated that developers’ cognitive load increases when reviewing code containing such anti-patterns.

As described, current research has focused on the quality of code snippets/chunks and not on individual identifier names. While this corpus of studies considers structural characteristics of an identifier’s name, the studies do not consider the relationship between the name of the identifier and its intended purpose, nor do they provide a formal definition for high-quality names.
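The structural name characteristics mentioned in this subsection (length and number of dictionary terms) can be computed with a few lines of code, which also makes their limitation visible: none of them relates the name to the behavior of the entity it identifies. The sketch below is an assumption-laden illustration, not a component of any of the cited quality models; the name_features function and the tiny DICTIONARY word list are hypothetical.

```python
import re

# Hypothetical, deliberately tiny word list standing in for a real dictionary.
DICTIONARY = {"pending", "messages", "chart", "top", "photo", "lock"}


def name_features(name, dictionary=DICTIONARY):
    """Compute simple structural features of a single identifier name."""
    terms = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    terms = [t.lower() for t in terms]
    return {
        "length": len(name),
        "terms": len(terms),
        "dictionary_terms": sum(t in dictionary for t in terms),
    }
```

Under these assumptions, name_features("mPendingMessages") reports a length of 16, three terms, and two dictionary terms, yet says nothing about whether the name matches what the variable actually holds.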

VI Challenges and Constraints

This section highlights only the key challenges and constraints we encountered. Our research is constrained to Java, as the external tools used in our studies are Java-specific. Additionally, obtaining a representative dataset is challenging; we are constrained to open-source Java systems, and these systems vary vastly in size. Presently, there does not exist a gold set of high-quality identifier names for us to study and use as a benchmark. Similarly, a software engineering-specific set of stopwords is not available for text preprocessing activities, which is a prerequisite for topic modeling and n-gram analysis.
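To show where a domain-specific stopword list enters the pipeline, the following minimal sketch tokenizes a commit message, removes stopwords, and forms bigrams. The SE_STOPWORDS set is hypothetical and chosen only for illustration (as noted above, no established software engineering stopword list is available), and the function names are likewise assumptions.

```python
import re

# Hypothetical stopwords, picked by hand for this example only.
SE_STOPWORDS = {"the", "a", "for", "some", "etc", "fix", "update"}


def preprocess(message, stopwords=SE_STOPWORDS):
    """Lowercase a commit message, keep alphabetic tokens, drop stopwords."""
    tokens = re.findall(r"[a-z]+", message.lower())
    return [t for t in tokens if t not in stopwords]


def bigrams(tokens):
    """Return adjacent token pairs for a simple n-gram analysis (n = 2)."""
    return list(zip(tokens, tokens[1:]))
```

Applied to the commit message “Cleaned up some file names for easier usage”, preprocess drops ‘some’ and ‘for’, and bigrams then pairs the remaining tokens, starting with (cleaned, up). Which words count as stopwords directly shapes the resulting n-grams and topics, which is why the missing domain-specific list is a constraint.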

VII Conclusion

Identifiers play an essential role in informing developers about the behavior of the software system. Poor identifier names result in increased code comprehension time and, hence, a loss in developer productivity. To address this issue of poor names, developers perform renaming operations on identifiers. However, renames are considered rework and can hurt code quality and developer productivity. To help developers choose high-quality identifier names during implementation, we need to understand the thought process of the developer. Our static-analysis-based research has shown us that contextualizing semantic changes of identifier names with commit messages is not sufficient. Therefore, we propose studies to investigate concepts that cannot be captured through static code analysis; specifically, our future studies will involve eye-tracking. We envision our findings feeding into a linguistic model that provides developers with real-time, context-aware identifier name suggestions during implementation.

References

  • [1] Note: https://github.com/3wks/thundr/commit/53aaf15 Cited by: §III.
  • [2] Note: https://github.com/3wks/thundr/commit/9b02920 Cited by: §III.
  • [3] M. Allamanis, E. T. Barr, C. Bird, and C. Sutton (2014) Learning natural coding conventions. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, External Links: ISBN 978-1-4503-3056-5, Link, Document Cited by: §V-A.
  • [4] M. Allamanis, E. T. Barr, C. Bird, and C. Sutton (2015) Suggesting accurate method and class names. In 10th Joint Meeting on Foundations of Software Engineering, New York, NY, USA, pp. 38–49. External Links: ISBN 978-1-4503-3675-8, Link, Document Cited by: §V-A.
  • [5] V. Arnaoudova, M. Di Penta, and G. Antoniol (2016-02-01) Linguistic antipatterns: what they are and how developers perceive them. Empirical Software Engineering 21 (1), pp. 104–158. External Links: ISSN 1573-7616, Document, Link Cited by: §V-B.
  • [6] V. Arnaoudova, L. M. Eshkevari, M. D. Penta, R. Oliveto, G. Antoniol, and Y. Gueheneuc (2014-05) REPENT: analyzing the nature of identifier renamings. IEEE Trans. Softw. Eng. 40 (5), pp. 502–532. External Links: ISSN 0098-5589, Link, Document Cited by: §I, §II, §V-A.
  • [7] D. Binkley, M. Davis, D. Lawrie, and C. Morrell (2009-05) To camelcase or under_score. In IEEE 17th International Conference on Program Comprehension, Vol. , pp. 158–167. External Links: Document, ISSN 1092-8138 Cited by: §V-B.
  • [8] E. Burch and H. K. Hsiang-Jui Kungs (1997-10) Modeling software maintenance requests: a case study. In International Conference on Software Maintenance, Vol. , pp. 40–47. External Links: Document, ISSN 1063-6773 Cited by: §I.
  • [9] R. P. L. Buse and W. R. Weimer (2010) Learning a metric for code readability. IEEE Transactions on Software Engineering 36 (4). External Links: Document, ISSN 0098-5589 Cited by: §I, §I, §V-B.
  • [10] S. Butler (2009) The effect of identifier naming on source code readability and quality. In Proceedings of the Doctoral Symposium for ESEC/FSE on Doctoral Symposium, New York, NY, USA, pp. 33–34. External Links: ISBN 978-1-60558-731-8, Link, Document Cited by: §I.
  • [11] C# coding conventions (C# programming guide), Microsoft Docs. Note: https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/inside-a-program/coding-conventions Cited by: §I.
  • [12] F. Deissenbock and M. Pizka (2005-05) Concise and consistent naming [software system identifier naming]. In 13th International Workshop on Program Comprehension, Vol. , pp. 97–106. External Links: Document, ISSN 1092-8138 Cited by: §I.
  • [13] L. Erlikh (2000-05) Leveraging legacy system dollars for e-business. IT Professional 2 (3), pp. 17–23. External Links: Document, ISSN 1520-9202 Cited by: §I.
  • [14] S. Fakhoury, Y. Ma, V. Arnaoudova, and O. Adesope (2018) The effect of poor source code lexicon and readability on developers’ cognitive load. In Proceedings of the 26th Conference on Program Comprehension, New York, NY, USA, pp. 286–296. External Links: ISBN 978-1-4503-5714-2, Link, Document Cited by: §V-B.
  • [15] P. Goodliffe (2007) Code craft: the practice of writing excellent code. No Starch Press Series, No Starch Press. External Links: ISBN 9781593271190, LCCN 2006015575, Link Cited by: §I.
  • [16] D. T. Guarnera, C. A. Bryant, A. Mishra, J. I. Maletic, and B. Sharif (2018) ITrace: eye tracking infrastructure for development environments. In ACM Symposium on Eye Tracking Research & Applications, New York, NY, USA, pp. 105:1–105:3. External Links: ISBN 978-1-4503-5706-7, Link, Document Cited by: §IV-B.
  • [17] J. Hofmeister, J. Siegmund, and D. V. Holt (2017) Shorter identifier names take longer to comprehend. In International Conference on Software Analysis, Evolution and Reengineering, Cited by: §I, §V-B.
  • [18] E. W. Høst and B. M. Østvold (2009) Debugging method names. In Object-Oriented Programming, S. Drossopoulou (Ed.), Berlin, Heidelberg, pp. 294–317. External Links: ISBN 978-3-642-03013-0 Cited by: §I, §V-A.
  • [19] E. W. Høst and B. M. Østvold (2009) The java programmer’s phrase book. In Software Language Engineering, Berlin, Heidelberg, pp. 322–341. External Links: ISBN 978-3-642-00434-6 Cited by: §V-A.
  • [20] Java style guide. Note: https://google.github.io/styleguide/javaguide.html Cited by: §I.
  • [21] K. Kevic, B. M. Walters, T. R. Shaffer, B. Sharif, D. C. Shepherd, and T. Fritz (2015) Tracing software developers’ eyes and interactions for change tasks. In 10th Joint Meeting on Foundations of Software Engineering, New York, NY, USA, pp. 202–213. External Links: ISBN 978-1-4503-3675-8, Link, Document Cited by: §IV-B.
  • [22] D. Lawrie, C. Morrell, H. Feild, and D. Binkley (2007-12-01) Effective identifier names for comprehension and memory. Innovations in Systems and Software Engineering 3 (4), pp. 303–318. External Links: ISSN 1614-5054, Document, Link Cited by: §I, §V-B.
  • [23] B. Liblit, A. Begel, and E. Sweetser (2006) Cognitive perspectives on the role of naming in computer programs. In PPIG, Cited by: §I.
  • [24] Y. Lin, C. Wu, T. Hou, Y. Lin, F. Yang, and C. Chang (2016-08) Tracking students’ cognitive processes during program debugging—an eye-movement approach. IEEE Transactions on Education 59 (3), pp. 175–186. External Links: Document, ISSN 0018-9359 Cited by: §IV-B.
  • [25] H. Liu, Q. Liu, Y. Liu, and Z. Wang (2015-Sep.) Identifying renaming opportunities by expanding conducted rename refactorings. IEEE Transactions on Software Engineering 41 (9), pp. 887–900. External Links: Document, ISSN 0098-5589 Cited by: §V-A.
  • [26] H. Liu, Q. Liu, C. Staicu, M. Pradel, and Y. Luo (2016) Nomen est omen: exploring and exploiting similarities between argument and parameter names. In International Conference on Software Engineering, Vol. . External Links: Document, ISSN 1558-1225 Cited by: §V-A.
  • [27] R. C. Martin (2009) Clean code: a handbook of agile software craftsmanship. Pearson Education. Cited by: §I.
  • [28] C. Newman, M. Decker, R. Alsuhaibani, D. Kaushik, A. Peruma, and E. Hill (2019-Sep.) An empirical study of abbreviations and expansions in software artifacts. In International Conference on Software Maintenance and Evolution, Cited by: §IV-A.
  • [29] A. Peruma, M. W. Mkaouer, M. J. Decker, and C. D. Newman (2019) Contextualizing rename decisions using refactorings and commit messages. In International Working Conference on Source Code Analysis and Manipulation, Cited by: §IV-A.
  • [30] A. Peruma, M. W. Mkaouer, M. J. Decker, and C. D. Newman (2018) An empirical investigation of how and why developers rename identifiers. In International Workshop on Refactoring, External Links: Link, Document Cited by: §IV-A.
  • [31] D. Posnett, A. Hindle, and P. Devanbu (2011) A simpler model of software readability. In Proceedings of the 8th Working Conference on Mining Software Repositories, New York, NY, USA. External Links: ISBN 978-1-4503-0574-7, Link, Document Cited by: §I, §I, §V-B.
  • [32] S. Scalabrino, M. Linares-Vásquez, R. Oliveto, and D. Poshyvanyk (2018) A comprehensive model for code readability. Journal of Software: Evolution and Process 30 (6), pp. e1958. Note: e1958 smr.1958 External Links: Document, Link, https://onlinelibrary.wiley.com/doi/pdf/10.1002/smr.1958 Cited by: §I, §I.
  • [33] A. Schankin, A. Berger, D. V. Holt, J. C. Hofmeister, T. Riedel, and M. Beigl (2018) Descriptive compound identifier names improve source code comprehension. In Proceedings of the 26th Conference on Program Comprehension, New York, NY, USA. External Links: ISBN 978-1-4503-5714-2, Link, Document Cited by: §V-B.
  • [34] Z. Sharafi, T. Shaffer, B. Sharif, and Y. Guéhéneuc (2015-12) Eye-tracking metrics in software engineering. In Asia-Pacific Software Engineering Conference, Vol. , pp. 96–103. External Links: Document, ISSN 1530-1362 Cited by: §IV-B.
  • [35] Z. Sharafi, Z. Soh, and Y. Guéhéneuc (2015) A systematic literature review on the usage of eye-tracking in software engineering. Information and Software Technology 67, pp. 79 – 107. External Links: ISSN 0950-5849, Document, Link Cited by: §IV-B.
  • [36] B. Sharif and J. I. Maletic (2010-06) An eye tracking study on camelcase and under_score identifier styles. In IEEE 18th International Conference on Program Comprehension, Vol. , pp. 196–205. External Links: Document, ISSN 1092-8138 Cited by: §V-B.
  • [37] B. Sharif, M. Falcone, and J. I. Maletic (2012) An eye-tracking study on the role of scan time in finding source code defects. In Proceedings of the Symposium on Eye Tracking Research and Applications, New York, NY, USA, pp. 381–384. External Links: ISBN 978-1-4503-1221-9, Link, Document Cited by: §IV-B.
  • [38] X. Xia, L. Bao, D. Lo, Z. Xing, A. E. Hassan, and S. Li (2018) Measuring program comprehension: a large-scale field study with professionals. IEEE Transactions on Software Engineering 44 (10). External Links: Document, ISSN 0098-5589 Cited by: §I.