Automatically Generating Documentation for Lambda Expressions in Java

When lambda expressions were introduced to the Java programming language as part of the release of Java 8 in 2014, they were the language's first step into functional programming. Since lambda expressions are still relatively new, not all developers use or understand them. In this paper, we first present the results of an empirical study to determine how frequently developers of GitHub repositories make use of lambda expressions and how they are documented. We find that 11% of Java GitHub repositories use lambda expressions, and that only 6% of the lambda expressions are accompanied by source code comments. We then present a tool called LambdaDoc which can automatically detect lambda expressions in a Java repository and generate natural language documentation for them. Our evaluation of LambdaDoc with 23 professional developers shows that they perceive the generated documentation to be complete, concise, and expressive, while the majority of the documentation produced by our participants without tool support was inadequate. Our contribution is an important step towards automatically generating documentation for functional programming constructs in an object-oriented language.


I Introduction and Motivation

Modern programming languages enhance existing features and add new ones on a regular basis. For example, as part of the release of Java 8 in 2014, several new features and enhancements were introduced, ranging from improved type inference and method parameter reflection to the introduction of lambda expressions. According to Oracle (https://www.oracle.com/technetwork/java/javase/8-whats-new-2157071.html, last accessed 5 Jan 2019), lambda expressions “enable you to treat functionality as a method argument, or code as data. Lambda expressions let you express instances of single-method interfaces (referred to as functional interfaces) more compactly.”

The introduction of lambda expressions to Java was motivated by the suitability of functional programming for synchronised, parallel, and event-driven programming. Other programming languages, including Groovy, Scala, and Python, already supported functional programming. Lambda expressions are useful since they are shorter than anonymous classes and can result in a similar outcome.

The word lambda is derived from the Greek letter lambda (λ), which is used to denote function abstraction [Sharan2014]. Lambda expressions in Java can be defined as nameless blocks of code that consist of a list of formal parameters and a body, joined by an arrow (->). Listing 1 shows examples of lambda expressions in Java [Sharan2014]. A lambda expression that declares its parameter types is called an explicit lambda expression, and a lambda expression that does not declare its parameter types is called an implicit lambda expression [Sharan2014].

//Ex1: Explicit lambda expression
//Takes an integer parameter and returns
//the parameter value incremented by 1
(int x) -> x + 1
//Ex2: Implicit lambda expression
//Takes two integer parameters and returns
//the maximum of the two
(x, y) -> {
    int max = x > y ? x : y;
    return max;
}
Listing 1: Examples of lambda expressions
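
The fragments in Listing 1 only compile once they are assigned to a functional interface. The following is a minimal sketch of how the two examples could be embedded in a complete program. The class name LambdaKinds and the choice of IntUnaryOperator and BinaryOperator<Integer> as target types are assumptions made for illustration; they are not taken from the paper.

```java
import java.util.function.BinaryOperator;
import java.util.function.IntUnaryOperator;

// Hypothetical wrapper class; the names here are illustrative only.
final class LambdaKinds {
    // Ex1 (explicit): the parameter type is declared in the parameter list.
    static final IntUnaryOperator INCREMENT = (int x) -> x + 1;

    // Ex2 (implicit): the parameter types are inferred from BinaryOperator<Integer>.
    static final BinaryOperator<Integer> MAX = (x, y) -> {
        int max = x > y ? x : y;
        return max;
    };

    public static void main(String[] args) {
        System.out.println(INCREMENT.applyAsInt(41));
        System.out.println(MAX.apply(3, 7));
    }
}
```

In both cases the compiler needs a target type to decide which functional interface the lambda expression implements; only the explicit form repeats the parameter type in the expression itself.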

One of the primary objectives of using lambda expressions is to maintain a short syntax and to allow the compiler to deduce the details. However, along with the compiler, developers are also often left to deduce the details on their own. Understanding lambda expressions is not trivial, and good software documentation has long been known to be rare [Lethbridge2003]. To make lambda expressions more accessible to developers, we follow in a long line of work on automatically documenting particular parts of source code (e.g., test cases [Li2016] and database code [Linares-Vasquez2016]) to develop a novel approach called LambdaDoc to automatically document lambda expressions in Java. To the best of our knowledge, LambdaDoc is the first tool of its nature.

In this paper, we contribute:

  • An empirical study of how lambda expressions are used in 435 engineered software repositories hosted on GitHub.

  • An empirical study of the documentation of lambda expressions in GitHub repositories via source code comments.

  • LambdaDoc, a novel approach to automatically document lambda expressions in Java, along with an evaluation with 23 professional developers.

The remainder of this paper is structured as follows. We introduce our research questions as well as our methods for data collection and analysis in Section II. LambdaDoc is described in detail in Section III and our findings are reported in Section IV, separately for each of our research questions. We then discuss threats to validity and related work, before we conclude the paper and outline future work.

II Research Method

In this section, we outline our research questions as well as the data collection and analysis methods used to answer them.

II-A Research Questions

To study how lambda expressions are used by Java developers, we first needed to devise an approach for detecting lambda expressions in non-compilable code and code snippets in commits. We chose not to rely on existing static analysis approaches (e.g., PMD [Louridas2006]) since these approaches often require compilable code to identify lambda expressions. To evaluate our custom-built lambda expression detector, we first addressed the following research question:

RQ1 How accurate is our approach for detecting lambda expressions?

Since lambda expressions have been recently introduced in Java, little is known about the prevalence of lambda expression usage in Java projects. Hence, we aimed to establish how frequently lambda expressions are used by Java developers. In addition, we investigated the kind of lambda expressions that are commonly used:

RQ2 How are lambda expressions used?

RQ2.1 How many repositories use lambda expressions, and what is the amount of lambda expressions per repository?

RQ2.2 Are the same lambda expressions used in multiple repositories?

In addition to establishing how Java developers are making use of lambda expressions, a goal of our work was the automated generation of documentation for these expressions. To inspire our work on documentation generation, we next asked how lambda expressions are currently documented in the form of source code comments:

RQ3 How are lambda expressions documented?

RQ3.1 How many lambda expressions are accompanied by comments?

RQ3.2 What kind of comments accompany lambda expressions?

Finally, to evaluate LambdaDoc, our documentation generation approach for lambda expressions, we asked professional developers to assess the automatically generated documentation:

RQ4 How well can LambdaDoc document lambda expressions?

RQ4.1 How do developers document lambda expressions when asked to provide comments?

RQ4.2 How is the documentation generated using LambdaDoc perceived by developers?

II-B Data Collection

II-B1 Repositories

Motivated by previous work on the perils of mining GitHub [Kalliamvakou2014], e.g., that a large portion of repositories on GitHub are not for software development, we used the RepoReaper framework developed by Munaiah et al. [Munaiah2017] to select repositories for our work. RepoReaper was developed to address the difficulty of differentiating between repositories with engineered software projects and those with assignments and noise. A high ratio of unwanted repositories in a random sample could distort research and lead to illogical and possibly incorrect conclusions. RepoReaper contains repositories classified as organisation and utility.

To select repositories for our study, we first obtained all 51,392 Java repositories which had been classified as containing engineered software projects by the Random Forest classification of RepoReaper. We then randomly sampled from these 51,392 repositories in batches of 1,000 until we had obtained at least 400 repositories which contained at least one lambda expression detected by our lambda expression detector (see Section II-B2). This way, we were able to ensure that our conclusions concerning the ratio of repositories with a specific characteristic would generalise to the entire population of engineered Java repositories on GitHub containing lambda expressions with a confidence level of 95% and a confidence interval of 5 (https://www.surveysystem.com/sscalc.htm, last accessed 5 Jan 2019). After cloning and analysing 4,000 repositories (i.e., four batches of 1,000, the number it took to find at least 400 repositories containing at least one lambda expression), we had retrieved a total of 435 repositories containing lambda expressions, i.e., 11%. These 435 repositories are a statistically representative sample of all engineered Java repositories containing lambda expressions. They contained a total of 497,108 Java files, out of which 9,933 contained at least one lambda expression. In total, we collected 54,071 lambda expressions across the 435 Java repositories.
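
To make the sampling threshold concrete, the sketch below computes the sample size needed for a 95% confidence level and a confidence interval of 5 percentage points. It assumes the standard large-population formula n = z² · p · (1 − p) / e² with the most conservative proportion p = 0.5 and z = 1.96; whether the cited calculator uses exactly this formula and rounding is an assumption, and the class and method names are hypothetical.

```java
// Hypothetical helper, not part of the paper's tooling.
final class SampleSize {
    /** Large-population sample size n = z^2 * p * (1 - p) / e^2, rounded up. */
    static long required(double zScore, double proportion, double marginOfError) {
        double n = zScore * zScore * proportion * (1 - proportion)
                / (marginOfError * marginOfError);
        return (long) Math.ceil(n);
    }

    public static void main(String[] args) {
        // z = 1.96 corresponds to a 95% confidence level; 0.05 is the interval of 5 points.
        System.out.println(required(1.96, 0.5, 0.05));
    }
}
```

Under these assumptions the requirement is a little under 400 repositories, which is consistent with sampling until at least 400 repositories containing lambda expressions had been found.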

II-B2 Detection of Lambda Expressions

We developed an approach to detect lambda expressions and collect their metadata, i.e., start line number, start character position, end line number, end character position, number of lines, number of parameters, and type (explicit vs. implicit).

To identify lambda expressions in the source code of each Java file, our lambda expression detector first reads source code line-by-line until encountering a lambda arrow (->) that is not part of a comment or a string. The detector then checks whether the lambda expression covers multiple lines: a multi-line lambda expression starts with an opening parenthesis or an opening bracket and ends with the corresponding closing symbol, while a single-line lambda expression ends with a semicolon or a parenthesis. The detector then determines if the lambda expression is explicit or implicit by checking whether the parameter list contains parameter types. For example, in Listing 1, the lambda expression Ex1 is identified as a single-line lambda expression that has an explicit type, while the lambda expression Ex2 is identified as a multi-line lambda expression that has an implicit type. Finally, our lambda expression detector extracts source code comments written directly above the lambda expression. For multi-line lambda expressions, our detector also extracts source code comments written within the lambda expression.
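
The first and third detection steps can be sketched as follows. This is a simplified illustration, not the authors' detector: the class ArrowScanner and its methods are hypothetical, multi-line extent and comment extraction are omitted, and the explicit/implicit check is a whitespace heuristic that assumes a typed parameter always consists of at least two tokens.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical line-based scanner; Java text blocks are not handled.
final class ArrowScanner {
    // Block comments can span lines, so this state is kept between calls.
    private boolean inBlockComment = false;

    /** Returns the column of every "->" that is outside comments and literals. */
    List<Integer> arrowColumns(String line) {
        List<Integer> columns = new ArrayList<>();
        boolean inString = false;
        boolean inChar = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            char next = i + 1 < line.length() ? line.charAt(i + 1) : '\0';
            if (inBlockComment) {
                if (c == '*' && next == '/') { inBlockComment = false; i++; }
            } else if (inString) {
                if (c == '\\') i++;                 // skip the escaped character
                else if (c == '"') inString = false;
            } else if (inChar) {
                if (c == '\\') i++;                 // skip the escaped character
                else if (c == '\'') inChar = false;
            } else if (c == '/' && next == '/') {
                break;                              // rest of the line is a comment
            } else if (c == '/' && next == '*') {
                inBlockComment = true; i++;
            } else if (c == '"') {
                inString = true;
            } else if (c == '\'') {
                inChar = true;
            } else if (c == '-' && next == '>') {
                columns.add(i); i++;
            }
        }
        return columns;
    }

    /** Heuristic: a parameter written as "Type name" has more than one token. */
    static boolean isExplicit(String parameterList) {
        String inner = parameterList.trim().replaceAll("^\\(|\\)$", "").trim();
        if (inner.isEmpty()) return false;
        for (String parameter : inner.split(",")) {
            if (parameter.trim().split("\\s+").length > 1) return true;
        }
        return false;
    }
}
```

Applied to Listing 1, isExplicit("(int x)") yields true and isExplicit("(x, y)") yields false, matching the classification of Ex1 and Ex2 described above. Because the scanner never parses the file, it also works on code that does not compile, which is the property the paragraph above relies on.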

II-B3 Practitioner Survey

To evaluate the documentation generated by our automated approach, we employed a practitioner survey which followed a structure similar to the one used by Linares-Vásquez et al. [Linares-Vasquez2016]. We recruited participants through Amazon Mechanical Turk (https://www.mturk.com/, last accessed 22 Jan 2019), and the required qualification was “Employment Industry – Software & IT Services”. This methodology has been successfully used by previous work [Prana2019].

No. | Question | Answer options
1 | Is developing software part of your job? | Yes / No
2 | What is your job title? | Open-ended
3 | For how many years have you been developing software? | Less than one year / 1–2 years / 2–4 years / 4–6 years / More than 6 years
4 | How many years of experience do you have in Java development? | Less than one year / 1–2 years / 2–4 years / 4–6 years / More than 6 years
5 | How would you rate your expertise in Java? | Beginner / Intermediate / Expert
6 | How confident are you in your ability to WRITE lambda expressions in Java? | Not confident at all / Slightly confident / Somewhat confident / Fairly confident / Completely confident
7 | How confident are you in your ability to READ lambda expressions in Java? | Not confident at all / Slightly confident / Somewhat confident / Fairly confident / Completely confident
8 | What is your area of software development? (e.g., web, systems, embedded) | Open-ended
9 | Consider the following lambda expression in Java. Could you please write a one-sentence summary of what this lambda expression does? | Open-ended
10 | Consider now the following sentence which aims to describe the lambda expression. Only focusing on the content of the sentence without considering the way it has been presented, do you think the description is COMPLETE? | The sentence is considered to be complete / The sentence misses some important information to understand the lambda expression / The sentence misses the majority of the important information to understand the lambda expression
11 | Consider the same lambda expression and sentence. Only focusing on the content of the sentence without considering the way it has been presented, do you think the sentence is CONCISE? | The sentence is considered to be concise / The sentence contains some redundant/useless information / The sentence contains a lot of redundant/useless information
12 | Consider the same lambda expression and sentence. Only focusing on the content of the sentence without considering the completeness and conciseness, do you think the sentence is EXPRESSIVE? | The sentence is easy to read and understand / The sentence is somewhat readable and understandable / The sentence is hard to read and understand
13 | Please try LambdaDoc for any lambda expressions of your choice at link to LambdaDoc web application. Please provide feedback on LambdaDoc here. | Open-ended
14 | How likely is it that you would recommend this tool to a friend or colleague? | 5-point Likert scale from “not likely at all” to “extremely likely”
15 | How would you prefer this tool to be implemented? | As an Eclipse plugin / As a GUI Application / As a website/web service / Other (multiple-choice)
16 | Which of the following software engineering tasks would you use this type of documentation of lambda expressions in Java for? | Implementation / Testing / Documentation / Maintenance / Other (multiple-choice)
17 | Do you have any further comments about lambda expressions or this survey? | Open-ended
TABLE I: Survey questions. Each horizontal line indicates a page break of the survey. Note that Questions 9–12 were repeated five times, once for each of the lambda expressions and corresponding documentation shown in Table II.
\begin{table*}
\centering
\caption{Lambda expressions and generated documentation used in the survey}
\label{tab:fivesummaries}
\begin{tabular}{p{0.45\textwidth}p{0.5\textwidth}}
\toprule
Lambda expression & Generated documentation \\
\midrule
\vspace{-1.5\baselineskip}\begin{lstlisting}[language=java]
callInContext(REPO_USER_2, repo2.getId(), MASTER_BRANCH, () -> createNode(NodePath.ROOT, "repo2Node"));
\end{lstlisting} & This lambda expression does not take any parameter and returns the result of the execution of the ``create Node'' method with two parameters ``NodePath ROOT and ``repo2Node'' ''. \\
\vspace{-1.5\baselineskip}\begin{lstlisting}[language=java]
(Integer t, Integer t1) -> Double.compare(splitEvaluation[t], splitEvaluation[t1])
\end{lstlisting} & This lambda expression takes 2 parameters Integer t and Integer t1 and returns the result of the execution of Double's ``compared to'' method with two parameters element of ``splitEvaluation'' array t and element of ``splitEvaluation'' array t1. \\
\vspace{-1.5\baselineskip}\begin{lstlisting}[language=java]
.peek(batch -> count3 = count3 + batch.size())
\end{lstlisting} & This lambda expression takes 1 parameter batch and returns count3 equal count3 plus the result of the execution of the ``size'' method on it. \\
\vspace{-1.5\baselineskip}\begin{lstlisting}[language=java]
.beforeResolved(ExecutableComponent.class, ec -> ec.set("c"))
\end{lstlisting} & This lambda expression takes 1 parameter ec and returns the result of the execution of the ``set'' method on it with parameter ``c''. \\
\vspace{-1.5\baselineskip}\begin{lstlisting}[language=java]
return stream.flatMap(t -> Stream.of(value, t))
\end{lstlisting} & This lambda expression takes 1 parameter t and returns the result of the execution of Stream's of method with two parameters value and t. \\
\bottomrule
\end{tabular}
\end{table*}
Table~\ref{tab:surveyquestions} shows our survey questions as well as the answer options for each one.
The first section of the survey (Questions 1–8) collected demographic information and established the participants' experience with software development, Java, and reading and writing lambda expressions.
Then, Questions 9–12 were used to evaluate our generated documentation for each of the five lambda expressions that were randomly sampled from all detected lambda expressions of this work (see Table~\ref{tab:fivesummaries}).
Question 9 asked participants to write a summary of what the lambda expression shown in the question does.
Then, Questions 10–12 asked participants to evaluate our generated documentation for the lambda expression shown in Question 9.
The generated documentation was evaluated in terms of completeness, conciseness, and expressiveness.
Note that participants were not able to see our generated documentation until they had answered Question 9.
Table~\ref{tab:fivesummaries} shows the lambda expressions and our generated documentation that we used for our evaluation.
The last section of the survey provided participants with a link to a web application of \textsc{LambdaDoc} where users could submit a lambda expression and have documentation generated for it.
Finally, Questions 13–17 asked participants about the usefulness of our tool.
\begin{figure}
\centering
\begin{tikzpicture}
\begin{axis}[
    y=0.2cm,
    ylabel=participants,
    xlabel=software development experience in years,
    symbolic x coords={0–1,1–2,2–4,4–6,6+},
    xtick=data,
    ymin=0]
    \addplot[ybar,fill=white] coordinates {
    (0–1,2)
    (1–2,5)
    (2–4,8)
    (4–6,4)
    (6+,4)
    };
\end{axis}
\end{tikzpicture}
\caption{Software development experience of survey participants}
\label{fig:softwaredevexperience}
\end{figure}
\begin{figure}
\centering
\begin{tikzpicture}
\begin{axis}[
    y=0.2cm,
    ylabel=participants,
    xlabel=Java experience in years,
    symbolic x coords={0–1,1–2,2–4,4–6,6+},
    xtick=data,
    ymin=0]
    \addplot[ybar,fill=white] coordinates {
    (0–1,3)
    (1–2,8)
    (2–4,8)
    (4–6,3)
    (6+,1)
    };
\end{axis}
\end{tikzpicture}
\caption{Java experience of survey participants}
\label{fig:javaexperience}
\end{figure}
We obtained responses from 23 participants. Only one participant indicated that developing software was not part of their job. The job titles of participants varied from ``Data administrator'' to ``Junior software engineer'', but all titles were related to software development or information technology. Most participants specified either web development (\nicefrac{8}{23} = 35\%) or backend/systems development (\nicefrac{7}{23} = 30\%) as their area of software development, with other answers indicating various areas from computer vision to databases. The majority of participants had considerable software development and Java experience, as shown in Figures~\ref{fig:softwaredevexperience} and~\ref{fig:javaexperience}, i.e., 70\% and 52\% of participants had software development and Java experience of at least 2 years, respectively. Participants considered their expertise in Java to either be at the level of beginners (\nicefrac{10}{23} = 43\%) or intermediate (\nicefrac{11}{23} = 48\%), with only 2 experts (9\%).
Table~\ref{tab:confidence} shows that a considerable group of participants considered themselves fairly confident in both reading and writing lambda expressions (\nicefrac{8}{23} = 35\%), with other answers varying between slightly and somewhat confident. None of the participants reported being completely confident in their abilities to either read or write lambda expressions, further motivating our work. Only 2 participants (9\%) considered themselves not confident at all.
\begin{table}
\centering
\caption{Participant confidence in reading and writing lambda expressions}
\label{tab:confidence}
\begin{tabular}{llr}
\toprule
Confidence in reading & Confidence in writing & Count \\
\midrule
Not confident at all & Not confident at all & 2 \\
Slightly confident & Slightly confident & 2 \\
Fairly confident & Slightly confident & 2 \\
Slightly confident & Somewhat confident & 4 \\
Somewhat confident & Somewhat confident & 3 \\
Somewhat confident & Fairly confident & 2 \\
Fairly confident & Fairly confident & 8 \\
\bottomrule
\end{tabular}
\end{table}
\subsection{Data Analysis}
\label{sec:analysis}
In this section, we outline the data analysis methods used to answer our research questions.
\subsubsection{Accuracy of lambda expression detection}
To evaluate the accuracy of our tooling to detect lambda expressions in source code files and commits (i.e., lambda expression detector), two authors of this paper who were not involved in the implementation of the lambda expression detector manually annotated a sample of lines from source code files and commits which contained an arrow (\texttt{->}) to indicate whether the arrow marked the beginning of a lambda expression or served some other function (e.g., as part of a source comment or string). We hypothesise that detecting lambda expressions in commits is harder than in source code files since a developer might only commit part of a lambda expression, e.g., for those expressions which span multiple lines.
The two authors independently annotated 100 such randomly sampled lines from source code files and 100 such randomly sampled lines from commits. They achieved perfect agreement (100\%), and one of them further annotated 300 lines from each set. Based on the annotation of a total of 400 lines from source code files and 400 lines from commits, our conclusions regarding the accuracy of the lambda expression detection generalise to the population of all lines containing arrows with a confidence interval of 5 at a confidence level of 95\%.\footnote{\url{https://www.surveysystem.com/sscalc.htm}, last accessed 5 Jan 2019.} We then compared the manual annotation with the results from our lambda expression detector.
\subsubsection{Lambda expression usage}
To analyse the frequency with which lambda expressions are used in GitHub repositories, we analysed descriptive statistics of our data set. To investigate whether there are lambda expressions which are used in multiple repositories, we examined the number of unique lambda expressions.
Note that lambda expressions were normalised by removing white space before analysing unique lambda expressions and that we compared lambda expressions textually.
Furthermore, we investigated the characteristics of lambda expressions by analysing their metadata, e.g., the number of parameters and types.
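
The normalisation and textual comparison described above can be sketched in a few lines. The class LambdaUsage, its method names, and the input layout (a map from repository name to the lambda expressions found in it) are assumptions made for illustration and do not reflect the authors' actual scripts.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper for counting unique lambda expressions across repositories.
final class LambdaUsage {
    /** Removes all white space so that formatting differences do not matter. */
    static String normalise(String lambda) {
        return lambda.replaceAll("\\s+", "");
    }

    /** Maps each normalised lambda expression to the repositories that contain it. */
    static Map<String, Set<String>> repositoriesPerLambda(Map<String, List<String>> lambdasByRepository) {
        Map<String, Set<String>> result = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : lambdasByRepository.entrySet()) {
            for (String lambda : entry.getValue()) {
                result.computeIfAbsent(normalise(lambda), key -> new TreeSet<>())
                      .add(entry.getKey());
            }
        }
        return result;
    }
}
```

The size of the resulting map corresponds to the number of unique lambda expressions, and entries whose set contains more than one repository are the expressions shared across repositories. Removing all white space is a deliberately coarse choice: it also alters white space inside string literals, so two expressions that differ only there would be treated as identical.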
\subsubsection{Documentation of lambda expressions}
To investigate the extent to which lambda expressions are already documented in GitHub repositories, we quantitatively analysed how many of the lambda expressions were accompanied by a comment either directly above the expression (i.e., \textit{above comments}) or within the expression (i.e., \textit{within comments}). We then qualitatively analysed a statistically representative sample of the comments we found in order to determine whether the comments actually described the functionality of the lambda expressions. For the qualitative annotation, one of the authors established a coding scheme based on a preliminary analysis of 100 comments. Another author then used this coding scheme on the same data, allowing us to calculate inter-rater agreement. Two authors then applied this coding scheme to annotate a total of 200 above comments and 200 within comments.
\subsubsection{\textsc{LambdaDoc} evaluation}
To explore how developers document lambda expressions, we qualitatively analysed the responses to Question 9 of our survey (cf.~Table~\ref{tab:surveyquestions}). Since the goal of this analysis was to establish the level of detail with which developers document lambda expressions, we used the pre-defined categories ``adequate'', ``incomplete'', and ``inadequate'' for the annotation. One author annotated all 115 documentation attempts (23 participants $\times$ 5 \mbox{documentation attempts}).
Finally, to understand the perceptions of developers regarding our tool \textsc{LambdaDoc}, we quantitatively analysed the responses to survey Questions 10–16 (cf.~Table~\ref{tab:surveyquestions}).
\section{\textsc{LambdaDoc}}
\section{\textsc{LambdaDoc}}
\label{sec:lambdadoc}
\begin{algorithm}
    \caption{\textsc{LambdaDoc}(lambdaExpression)}
    \label{alg:lambdadoc}
    \begin{algorithmic}[1]
        \State $\text{doc} \gets \emptyset$
        \State $\text{intro} \gets \text{``This lambda expression''}$
        \State $\text{exp} \gets \text{lambdaExpression.split(``$->$'')}$
        \State $\text{pc} \gets \text{numberOfParameters(exp[0])}$
        \If{$\text{pc} == 0$}
            \State $\text{pText} \gets \text{``does not take any parameter''}$
        \Else
            \If{$\text{pc} == 1$}
                \State $\text{p} \gets \text{parameterName(exp[0])}$
                \State $\text{pText} \gets \text{``takes 1 parameter '' + p}$
            \Else
                \If{$\text{pc} > 1$}
                    \State $\text{ps} \gets \text{parameterNames(exp[0])}$
                    \State $\text{pText} \gets \text{``takes '' + pc + `` parameters '' + ps}$
                \EndIf
            \EndIf
        \EndIf
        \If{$\text{exp[1].contains(``.'')}$}
            \State $\text{i} \gets \text{0}$
            \State $\text{mText} \gets \emptyset$
            \For{$\text{i} < \text{exp[1].length()}$}
                \If{$\text{exp[1].charAt(i) == operator}$}
                    \State $\text{mText} \gets \text{mText + operatorToWord(operator)}$
                    \State $\text{i}++$
                \EndIf
                \If{$\text{exp[1].charAt(i)} == \text{``.''}$}
                    \State $\text{dot} \gets \text{i}$
                    \While{$\text{exp[1].charAt(i)} != \text{``(''}$}
                        \State $\text{mText} \gets \text{mText + exp[1].charAt(i)}$
                        \State $\text{i}++$
                    \EndWhile
                    \State $\text{j} \gets \text{dot}$
                    \State $\text{oText} \gets \emptyset$
                    \For{$\text{j} > 0$}
                        \While{$\text{exp[1].charAt(j).isDigitOrNum}$}
                            \State $\text{oText} \gets \text{oText + exp[1].charAt(j)}$
                            \State $\text{j}--$
                        \EndWhile
                        \State $\text{oText} \gets \text{reverseString(oText)}$
                        \State $\text{mText} \gets \text{camelCaseSplit(mText)}$
                        \State $\text{rText} \gets \text{oText + ``'s '' + mText}$
                    \EndFor
                \EndIf
            \EndFor
            \State $\text{doc} \gets \text{intro + pText + `` and returns '' + rText}$
        \Else
            \While{$\text{exp[1].contains(operator)}$}
                \State $\text{rText} \gets \text{operatorToWord(rText)}$
            \EndWhile
            \State $\text{doc} \gets \text{intro + pText + `` and returns '' + rText}$
        \EndIf
        \State $\text{return doc}$
    \end{algorithmic}
\end{algorithm}
In this section, we present our \textsc{LambdaDoc} approach to generate documentation for a lambda expression.
The documentation generated by \textsc{LambdaDoc} starts with the phrase ``This lambda expression'', followed by details of the parameters and the body of the lambda expression.
Table~\ref{tab:fivesummaries} shows examples of the documentation generated by \textsc{LambdaDoc}.
Algorithm~\ref{alg:lambdadoc} shows the algorithm of \textsc{LambdaDoc} in pseudocode.
The first step of the algorithm is to identify the parameters and the body of a lambda expression by dividing the expression at the arrow (\texttt{->}) and to determine the number of parameters.
If there are no parameters, the text ``does not take any parameter'' is added to the documentation.
Otherwise, the names of the parameters are determined, and a sentence fragment is generated which indicates the number of parameters along with their names.
The body is analysed to determine if it contains at least one method call based on the presence of a dot (\texttt{.}) not followed by a digit. If the body contains a method, its object, name, and parameters are identified. The method name is split based on camel case.
Then, the object, method name, and parameters are concatenated in the documentation. Operators (e.g., \texttt{+}) are replaced with a word representing them (e.g., ``plus'').
Note that \textsc{LambdaDoc} is designed to generate documentation for lambda expressions which contain a single statement.
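
The steps described above can be illustrated with a compact Java sketch. It is a simplified reading of the description, not the \textsc{LambdaDoc} implementation: the class LambdaDocSketch and all method names are hypothetical, only a body of the form receiver.method(arguments) is treated as a method call, only four arithmetic operators are translated, and the wording of the output is not guaranteed to match the tool's.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, simplified documentation generator for single-expression lambdas.
final class LambdaDocSketch {
    // Matches a body that is exactly one call such as Stream.of(value, t).
    private static final Pattern CALL = Pattern.compile("(\\w+)\\.(\\w+)\\((.*)\\)");

    static String document(String lambda) {
        int arrow = lambda.indexOf("->");
        if (arrow < 0) throw new IllegalArgumentException("not a lambda expression: " + lambda);
        String head = lambda.substring(0, arrow).trim();   // parameter list
        String body = lambda.substring(arrow + 2).trim();  // expression after the arrow
        return "This lambda expression " + describeParameters(head)
                + " and returns " + describeBody(body) + ".";
    }

    static String describeParameters(String head) {
        String inner = head.replaceAll("^\\(|\\)$", "").trim();
        if (inner.isEmpty()) return "does not take any parameter";
        String[] parameters = inner.split("\\s*,\\s*");
        if (parameters.length == 1) return "takes 1 parameter " + parameters[0];
        return "takes " + parameters.length + " parameters " + String.join(" and ", parameters);
    }

    static String describeBody(String body) {
        Matcher call = CALL.matcher(body);
        if (call.matches()) {
            String text = "the result of the execution of " + call.group(1) + "'s \""
                    + splitCamelCase(call.group(2)) + "\" method";
            String arguments = call.group(3).trim();
            return arguments.isEmpty() ? text : text + " with parameters " + arguments;
        }
        // No method call: replace operators with words and tidy the spacing.
        return body.replace("+", " plus ").replace("-", " minus ")
                .replace("*", " times ").replace("/", " divided by ")
                .replaceAll("\\s+", " ").trim();
    }

    /** Splits a camel-case name, e.g. createNode becomes "create Node". */
    static String splitCamelCase(String name) {
        return name.replaceAll("(?<=[a-z0-9])(?=[A-Z])", " ");
    }
}
```

For instance, document("(int x) -> x + 1") produces a sentence that names the parameter int x and describes the body as x plus 1. Parameter lists containing generic types with commas would be split incorrectly by this sketch, which is one reason a production tool needs more careful parsing.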
\section{Findings}
\label{sec:findings}
In this section, we report the findings for our \mbox{research questions}.
\subsection{Accuracy of lambda expression detection}
\label{sec:accuracy}
Out of the 400 lines from source code files which contained an arrow (\texttt{->}), our manual annotation revealed that 258 (65\%) of the lines contained the start of a lambda expression while the remaining 142 (35\%) lines contained arrows for other reasons, mostly as part of strings or in source code comments.
Using the same set of source code lines, we used our lambda expression detector to identify whether the lines contained a lambda expression or not.
We found that our lambda expression detector achieved a recall of 1, i.e., all of the 258 lines containing lambda expressions from the manual annotation result were identified as containing lambda expressions by the lambda expression detector.
Furthermore, our lambda expression detector achieved a precision of 1, i.e., all of the lines that were identified as containing lambda expressions by our lambda expression detector are those 258 lines containing lambda expressions from the \mbox{manual annotation}.
Similarly, our manual annotation of the 400 lines from commits which contained an arrow identified 269 (67\%) lines in which the arrow belonged to a lambda expression and 131 (33\%) lines where the arrow was not part of a lambda expression.
Based on this manual annotation result, our lambda expression detector achieved a recall and precision of 1. Note that this result could be affected by lambda expressions that do not contain the arrow character sequence (\texttt{->}), since such expressions would never appear in our sample of lines; however, we are not aware of such expressions.
\begin{tcolorbox}
\textbf{Summary:} Our approach for detecting lambda expressions is able to identify lambda expressions in source code files and commits with perfect precision and recall.
\end{tcolorbox}
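
For reference, the precision and recall values reported above follow from the usual definitions, precision = TP / (TP + FP) and recall = TP / (TP + FN). The sketch below applies them to the source code file sample, where all 258 annotated lambda lines were detected and no other line was flagged. The class and method names are hypothetical.

```java
// Hypothetical helper illustrating the standard precision and recall definitions.
final class DetectionAccuracy {
    /** Fraction of detected lines that really contain a lambda expression. */
    static double precision(int truePositives, int falsePositives) {
        return truePositives / (double) (truePositives + falsePositives);
    }

    /** Fraction of annotated lambda lines that the detector found. */
    static double recall(int truePositives, int falseNegatives) {
        return truePositives / (double) (truePositives + falseNegatives);
    }

    public static void main(String[] args) {
        // Source code file sample: 258 true positives, no false positives or negatives.
        System.out.println(precision(258, 0));
        System.out.println(recall(258, 0));
    }
}
```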
\subsection{Lambda expression use}
\label{sec:frequency}
As reported in Section~\ref{sec:collection}, 11\% (\nicefrac{435}{4,000}) of the Java repositories in our sample made use of lambda expressions at least once.
For these 435 repositories, Figure~\ref{fig:lambdasperrepository} shows the number of lambda expressions that are contained in each repository. The distribution has a long tail: a few of the repositories use lambda expressions extensively (e.g., 10\% of the repositories contain more than 100 expressions) while the majority of the repositories use them only occasionally. In our sample, aol/simple-react\footnote{\url{https://github.com/aol/cyclops}, last accessed 19 Jan 2019.} and elastic/elasticsearch\footnote{\url{https://github.com/elastic/elasticsearch}, last accessed 19 Jan 2019.} are the most prolific users of lambda expressions, with 18,754 and 11,886 expressions, respectively.
\begin{figure}
\centering
\begin{tikzpicture}
\begin{axis}[
y = 0.5cm,
yticklabels = {1, 10, 100, 1000, 10000},
ylabel = lambda expressions (log scale),
xlabel = repositories,
ymode = log,
xticklabels={,,},
ymin = 1]
\addplot[only marks] table [x index=0,y index=1,col sep=comma] {lambdasperrepository.dat};
\end{axis}
\end{tikzpicture}
\caption{Number of lambda expressions per repository, considering only the subset of repositories which use lambda expressions (log scale)}
\label{fig:lambdasperrepository}
\end{figure}
\begin{figure}
\centering
\begin{tikzpicture}
\begin{axis}[
y = 0.5cm,
yticklabels = {1, 10, 100},
ylabel = lines (log scale),
xlabel = lambda expressions,
ymode = log,
xticklabels={,,},
ymin = 1]
\addplot[only marks] table [x index=0,y index=1,col sep=comma] {linesperlambda.dat};
\end{axis}
\end{tikzpicture}
\caption{Number of lines per lambda expression (log scale)}
\label{fig:linesperlambda}
\end{figure}
\begin{figure}
\centering
\begin{tikzpicture}
\begin{axis}[
y = 0.25cm,
ylabel = parameters,
xlabel = lambda expressions,
xticklabels={,,},
ymin = 0]
\addplot[only marks] table [x index=0,y index=1,col sep=comma] {paramsperlambda.dat};
\end{axis}
\end{tikzpicture}
\caption{Number of parameters per lambda expression}
\label{fig:paramsperlambda}
\end{figure}
Out of the 54,071 lambda expressions in our data set, 33,916 were unique (after removing white space). Table~\ref{tab:commonlambdas} shows the three lambda expressions which were used in the largest number of the studied Java repositories.
As shown in the table, lambda expressions can play an important role in exception handling. An additional analysis of the 33,916 unique lambda expressions showed that 4,920 (15\%) contain the string “exception”.
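To make the exception-handling pattern concrete, the following sketch embeds the first two lambda expressions from Table~\ref{tab:commonlambdas} in a compilable class. Only the two lambda bodies come from the table; the class \texttt{HandledTasks}, the wrapper method names, and the body of \texttt{handle} (which merely counts exceptions) are hypothetical and chosen for illustration.

```java
import java.util.concurrent.Callable;

public class HandledTasks {

    // Hypothetical stand-in for a project-specific handler: it only counts exceptions.
    public static int handledCount = 0;

    static void handle(Exception e) {
        handledCount++;
    }

    /** Wraps a Callable so that exceptions are reported to handle(...) and then rethrown. */
    public static <T> Callable<T> wrapCallable(Callable<T> task) {
        return () -> { try { return task.call(); } catch (Exception e) { handle(e); throw e; } };
    }

    /** Wraps a Runnable so that exceptions are reported to handle(...) and swallowed. */
    public static Runnable wrapRunnable(Runnable task) {
        return () -> { try { task.run(); } catch (Exception e) { handle(e); } };
    }

    public static void main(String[] args) throws Exception {
        Callable<Integer> succeeding = wrapCallable(() -> 42);
        System.out.println(succeeding.call());

        Runnable failing = wrapRunnable(() -> { throw new IllegalStateException("boom"); });
        failing.run(); // the exception is handled inside the wrapper
        System.out.println(handledCount);
    }
}
```

Both wrappers take no parameters in their lambda expressions, which is consistent with the sizeable share of parameterless lambda expressions reported in this section.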
We now report common characteristics of lambda expressions in our data set.
Figure~\ref{fig:linesperlambda} shows that 70\% of the lambda expressions are single-line, while there are outliers with more than 300 lines in a single lambda expression.
Figure~\ref{fig:paramsperlambda} shows that the majority of the lambda expressions (57\%) have exactly one parameter,
while a sizeable minority of lambda expressions (27\%) do not have any parameters.
The other lambda expressions have two to eight parameters.
We also found that 98\% of the lambda expressions were implicit, i.e., they do not declare their parameter types.
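The following sketch illustrates the characteristics discussed above with three small lambda expressions: one without parameters, one with a single implicitly typed parameter, and one with two explicitly typed parameters. The class name \texttt{LambdaShapes} and the constants are made up for illustration and do not stem from the studied repositories.

```java
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.Supplier;

public class LambdaShapes {

    // No parameters.
    public static final Supplier<String> GREETING = () -> "hello";

    // One parameter, implicit: the type of s is inferred from Function<String, Integer>.
    public static final Function<String, Integer> LENGTH = s -> s.length();

    // Two parameters, explicit: the parameter types are declared in the lambda itself.
    public static final BiFunction<Integer, Integer, Integer> SUM =
            (Integer a, Integer b) -> a + b;

    public static void main(String[] args) {
        System.out.println(GREETING.get());
        System.out.println(LENGTH.apply("lambda"));
        System.out.println(SUM.apply(2, 3));
    }
}
```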
\begin{table}
\centering
\caption{Lambda expressions used across repositories}
\label{tab:commonlambdas}
\begin{tabular}{p{6.6cm}r}
\toprule
Lambda expression & Repositories \\
\midrule
\vspace{-1.5\baselineskip} \begin{lstlisting}[language=java]
return () -> { try { return task.call(); } catch (Exception e) { handle(e); throw e; } };\end{lstlisting} & 7 \\
\vspace{-1.5\baselineskip} \begin{lstlisting}[language=java]
return () -> { try { task.run(); } catch (Exception e) { handle(e); } }; \end{lstlisting} & 7 \\
\vspace{-1.5\baselineskip} \begin{lstlisting}[language=java]
.map(user -> new ResponseEntity<>(user, HttpStatus.OK)) \end{lstlisting} & 7 \\
\bottomrule
\end{tabular}
\end{table}
\begin{tcolorbox}
\textbf{Summary:} 11\% of Java GitHub repositories use lambda expressions at least once. Exception handling is a common purpose of using lambda expressions. Lambda expressions are usually implicit, single-line, and have one parameter.
% while 10\% of the projects use more than 100 expressions.
\end{tcolorbox}
\subsection{Documentation of lambda expressions}
\label{sec:documentation}
Out of the 54,071 lambda expressions in our data set, the vast majority (50,984 = 94\%) were not accompanied by any documentation, either right above the expression or within the expression. We found 1,531 (3\%) lambda expressions with a comment right above, 1,298 (2\%) lambda expressions with a comment within, and an additional 258 (0.5\%) lambda expressions with a comment above and a comment within.
To understand whether the comments which accompany lambda expressions actually document the expression, we manually annotated a randomly sampled set of 200 comments from above a lambda expression and another 200 comments from within a lambda expression. One author annotated 50 comments from each set to establish the following \mbox{coding schema}:
\begin{itemize}
\item high-level documentation: the comment describes the lambda expression, but at a very high level. An example is the comment “\texttt{// start bottom-up}” above a 53-line lambda expression—while it captures the core purpose of the expression, it does not explain how this purpose was achieved.
\item reasonably detailed explanation: the comment appears to explain the lambda expression reasonably well. An example is the comment “\texttt{// Increment the number of connections for this node by one}” above a one-line expression with an increment statement.
\item documentation of a detail: the comment seems relevant, but does not capture the lambda expression as a whole. An example is the comment “\texttt{// this exception should cause the link chain to explode}” within a 7-line lambda expression next to a \texttt{throw} statement—while the comment explains this statement, it does not explain the lambda expression as a whole.
\item source code fragment: the comment looks like source code (or pseudocode).
\item other: comments that do not fit into any of the above categories, e.g., TODO comments or comments describing expected output.
\end{itemize}
Another author then used this coding schema to annotate the same 100 comments independently, achieving an agreement of \nicefrac{46}{50} for comments above lambda expressions and \nicefrac{44}{50} for comments within lambda expressions (weighted kappa~\cite{Cohen1960} with five categories: 0.906,\footnote{\url{https://www.graphpad.com/quickcalcs/kappa1/?K=5}, last accessed 22 Jan 2019.} i.e., almost perfect agreement~\cite{Landis1977}). Most of the inconsistent annotations were about “source code fragment” vs.~“other” in cases where the comment \mbox{indicated values}.
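For readers unfamiliar with the statistic, the following sketch computes Cohen's kappa from a confusion matrix of two annotators. Note that it implements the plain (unweighted) variant, whereas the value reported above is a weighted kappa obtained with the cited online calculator; the class name \texttt{KappaSketch} and the $2 \times 2$ matrix in \texttt{main} are hypothetical and are not our annotation data.

```java
public class KappaSketch {

    /**
     * Unweighted Cohen's kappa for a square confusion matrix, where
     * confusion[i][j] counts items that annotator A assigned to category i
     * and annotator B assigned to category j.
     */
    public static double cohensKappa(int[][] confusion) {
        int k = confusion.length;
        long total = 0;
        long agreements = 0;
        long[] rowSums = new long[k];
        long[] colSums = new long[k];
        for (int i = 0; i < k; i++) {
            for (int j = 0; j < k; j++) {
                total += confusion[i][j];
                rowSums[i] += confusion[i][j];
                colSums[j] += confusion[i][j];
            }
            agreements += confusion[i][i];
        }
        // Observed agreement: share of items on the diagonal.
        double observed = (double) agreements / total;
        // Agreement expected by chance, from the marginal distributions.
        double expected = 0.0;
        for (int i = 0; i < k; i++) {
            expected += ((double) rowSums[i] / total) * ((double) colSums[i] / total);
        }
        return (observed - expected) / (1.0 - expected);
    }

    public static void main(String[] args) {
        // Hypothetical two-category example, for illustration only.
        int[][] confusion = {
            {20, 5},
            {10, 15}
        };
        System.out.println("kappa = " + cohensKappa(confusion));
    }
}
```

With five categories, the same method applies to a $5 \times 5$ matrix; a weighted variant would additionally penalise disagreements according to a weight per pair of categories.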
\begin{table}
\centering
\caption{Frequency of different kinds of source code comments accompanying lambda expressions}
\label{tab:codecomments}
\begin{tabular}{llr@{\hspace{0.2cm}}r}
\toprule
Location & Type & \multicolumn{2}{c}{Frequency} \\
\midrule
above & high-level documentation & 123 & (62\%)\\
above & reasonably detailed explanation & 22 & (11\%)\\
above & documentation of a detail & 5 & (3\%)\\
above & source code fragment & 11 & (6\%)\\
above & other & 39 & (20\%)\\
\midrule
within & high-level documentation & 0 & (0\%)\\
within & reasonably detailed explanation & 27 & (14\%)\\
within & documentation of a detail & 129 & (65\%)\\
within & source code fragment & 17 & (9\%)\\
within & other & 27 & (14\%)\\
\bottomrule
\end{tabular}
\end{table}
Given the almost perfect agreement, one author then annotated the remaining 150 comments from each group, for a total of 400 annotated source code comments. Table~\ref{tab:codecomments} shows the result of the annotation. The majority of comments above lambda expressions describe the expression, but only at a high level. The majority of comments within lambda expressions document a detail, but cannot be considered as documentation of the entire expression.
This observation encouraged us to employ a rule-based approach for the generation of documentation by \textsc{LambdaDoc}. Given the low quality of existing comments, a machine learning approach appeared infeasible, although future work should confirm this assumption.
\begin{tcolorbox}
\textbf{Summary:} Only 6\% of the lambda expressions in our data set are accompanied by source code comments. Most of these comments describe the lambda expression at a high level or document a detail within the expression.
\end{tcolorbox}
\subsection{\textsc{LambdaDoc} evaluation}
\label{sec:evaluation}
Our manual analysis of the 115 lambda expression documentation attempts produced by the survey participants (5 lambda expressions $\times$ 23 participants) confirmed our conjecture that many developers do not know how to read lambda expressions, which motivated our work on \textsc{LambdaDoc} in the first place. 57 (50\%) of the documentation attempts were inadequate, e.g., “calculations” and “It can be passed around as if it was an object and executed on demand”. Another 39 (34\%) were incomplete, e.g., “it does compare the variables declared” and “This expression takes two parameters and return[s] the result after execution of function”. Only 19 (17\%) of the lambda expression documentation attempts produced by our participants could be considered adequate. Such positive examples include “compares two integers using splitEvaluation” (for the second lambda expression in Table~\ref{tab:fivesummaries}) and “call set with c for the given ExecutableComponent” (for the fourth lambda expression in Table~\ref{tab:fivesummaries}).
\begin{figure}
\centering
\begin{tikzpicture}[bar width = 7pt]
\begin{axis}[
hide axis,
bar shift = -8pt,
y = 0.25cm,
ymin = 0,
ymax = 15,
ybar stacked,
enlarge x limits = 0.15,
ylabel = participants,
symbolic x coords = {lambda1, lambda2, lambda3, lambda4, lambda5},
xtick = data,
legend style={font=\footnotesize, draw=none, anchor=north west, at={(0.02,0.98)}},
legend entries={complete},
]
\addplot[color=black!60!green, fill=black!60!green] coordinates {(lambda1,9) (lambda2,6) (lambda3,7) (lambda4,8) (lambda5,8)};
\addplot[color=black!60!green] coordinates {(lambda1,3) (lambda2,4) (lambda3,3) (lambda4,4) (lambda5,3)};
\end{axis}
\begin{axis}[
hide axis,
y = 0.25cm,
ymin = 0,
ymax = 15,
ybar stacked,
enlarge x limits = 0.15,
ylabel = participants,
symbolic x coords = {lambda1, lambda2, lambda3, lambda4, lambda5},
xtick = data,
legend style={font=\footnotesize, draw=none, anchor=north, at={(0.5,0.98)}},
legend entries={misses some},
]
\addplot[color=black!60!yellow, fill=black!60!yellow] coordinates {(lambda1,4) (lambda2,7) (lambda3,5) (lambda4,4) (lambda5,4)};
\addplot[color=black!60!yellow] coordinates {(lambda1,7) (lambda2,2) (lambda3,6) (lambda4,6) (lambda5,5)};
\end{axis}
\begin{axis}[
bar shift = 8pt,
y = 0.25cm,
ymin = 0,
ymax = 15,
ybar stacked,
enlarge x limits = 0.15,
ylabel = participants,
symbolic x coords = {lambda1, lambda2, lambda3, lambda4, lambda5},
xtick = data,
legend style={font=\footnotesize, draw=none, anchor=north east, at={(0.98,0.98)}},
legend entries={misses most},
]
\addplot[color=black!60!red, fill=black!60!red] coordinates {(lambda1,0) (lambda2,0) (lambda3,1) (lambda4,1) (lambda5,1)};
\addplot[color=black!60!red] coordinates {(lambda1,0) (lambda2,4) (lambda3,1) (lambda4,0) (lambda5,2)};
\end{axis}
\end{tikzpicture}
\caption{Completeness ratings. Green: The sentence is considered to be complete; Yellow: The sentence misses some important information to understand the lambda expression; Red: The sentence misses the majority of the important information to understand the lambda expression; Solid: Not confident in reading lambda expressions.}
\label{fig:completeness}
\end{figure}
After asking our participants to produce their own documentation, we asked them to assess the documentation generated by \textsc{LambdaDoc} in terms of its completeness, conciseness, and expressiveness (cf.~Table~\ref{tab:surveyquestions}). For our analysis, we distinguished participants based on their self-assessed confidence with regard to reading lambda expressions (cf.~Table~\ref{tab:surveyquestions}, \mbox{Question 7}).
Figure~\ref{fig:completeness} shows the results for completeness. The non-filled part of each bar represents responses by participants who declared themselves to be at least fairly confident in reading lambda expressions. The most positive responses are shown in green, medium responses are shown in yellow, and negative responses are shown in red. As the figure shows, participants generally agreed that the documentation produced by \textsc{LambdaDoc} is complete: the response “The sentence is considered to be complete” was selected most often for all but one of the lambda expressions. Responses from participants who considered themselves not confident were slightly more positive, suggesting that \textsc{LambdaDoc} might be especially helpful to newcomers to functional programming in Java.
\begin{figure}
\centering
\begin{tikzpicture}[bar width = 7pt]
\begin{axis}[
hide axis,
bar shift = -8pt,
y = 0.25cm,
ymin = 0,
ymax = 20,
ybar stacked,
enlarge x limits = 0.15,
ylabel = participants,
symbolic x coords = {lambda1, lambda2, lambda3, lambda4, lambda5},
xtick = data,
legend style={font=\footnotesize, draw=none, anchor=north west, at={(0.02,0.98)}},
legend entries={concise},
]
\addplot[color=black!60!green, fill=black!60!green] coordinates {(lambda1,7) (lambda2,10) (lambda3,6) (lambda4,8) (lambda5,3)};
\addplot[color=black!60!green] coordinates {(lambda1,5) (lambda2,7) (lambda3,3) (lambda4,2) (lambda5,7)};
\end{axis}
\begin{axis}[
hide axis,
y = 0.25cm,
ymin = 0,
ymax = 20,
ybar stacked,
enlarge x limits = 0.15,
ylabel = participants,
symbolic x coords = {lambda1, lambda2, lambda3, lambda4, lambda5},
xtick = data,
legend style={font=\footnotesize, draw=none, anchor=north, at={(0.4,0.98)}},
legend entries={some redundancy},
]
\addplot[color=black!60!yellow, fill=black!60!yellow] coordinates {(lambda1,6) (lambda2,3) (lambda3,6) (lambda4,4) (lambda5,10)};
\addplot[color=black!60!yellow] coordinates {(lambda1,3) (lambda2,2) (lambda3,6) (lambda4,5) (lambda5,3)};
\end{axis}
\begin{axis}[
bar shift = 8pt,
y = 0.25cm,
ymin = 0,
ymax = 20,
ybar stacked,
enlarge x limits = 0.15,
ylabel = participants,
symbolic x coords = {lambda1, lambda2, lambda3, lambda4, lambda5},
xtick = data,
legend style={font=\footnotesize, draw=none, anchor=north east, at={(0.98,0.98)}},
legend entries={a lot of redundancy},
]
\addplot[color=black!60!red, fill=black!60!red] coordinates {(lambda1,0) (lambda2,0) (lambda3,1) (lambda4,1) (lambda5,0)};
\addplot[color=black!60!red] coordinates {(lambda1,2) (lambda2,1) (lambda3,1) (lambda4,3) (lambda5,0)};
\end{axis}
\end{tikzpicture}
\caption{Conciseness ratings. Green: The sentence is considered to be concise; Yellow: The sentence contains some redundant/useless information; Red: The sentence contains a lot of redundant/useless information; Solid: Not confident in reading lambda expressions.}
\label{fig:conciseness}
\end{figure}
Figure~\ref{fig:conciseness} shows the participant responses with regard to conciseness. For most of the lambda expressions, the positive answer “The sentence is considered to be concise” was selected most often, while the documentation generated for the third and fifth expression was considered to contain some \mbox{redundant/useless} information. As the documentation in Table~\ref{tab:fivesummaries} shows, we opted for \textsc{LambdaDoc} to produce precise and detailed documentation; future work should explore whether there exists a better balance between precision and conciseness. Very few participants indicated that \textsc{LambdaDoc} produces a lot of redundant/useless information.
\begin{figure}
\centering
\begin{tikzpicture}[bar width = 7pt]
\begin{axis}[
hide axis,
bar shift = -8pt,
y = 0.25cm,
ymin = 0,
ymax = 17,
ybar stacked,
enlarge x limits = 0.15,
ylabel = participants,
symbolic x coords = {lambda1, lambda2, lambda3, lambda4, lambda5},
xtick = data,
legend style={font=\footnotesize, draw=none, anchor=north west, at={(0.02,0.98)}},
legend entries={easy to read},
]
\addplot[color=black!60!green, fill=black!60!green] coordinates {(lambda1,7) (lambda2,8) (lambda3,7) (lambda4,9) (lambda5,6)};
\addplot[color=black!60!green] coordinates {(lambda1,6) (lambda2,3) (lambda3,4) (lambda4,5) (lambda5,5)};
\end{axis}
\begin{axis}[
hide axis,
y = 0.25cm,
ymin = 0,
ymax = 17,
ybar stacked,
enlarge x limits = 0.15,
ylabel = participants,
symbolic x coords = {lambda1, lambda2, lambda3, lambda4, lambda5},
xtick = data,
legend style={font=\footnotesize, draw=none, anchor=north, at={(0.5,0.98)}},
legend entries={somewhat readable},
]
\addplot[color=black!60!yellow, fill=black!60!yellow] coordinates {(lambda1,5) (lambda2,5) (lambda3,5) (lambda4,3) (lambda5,7)};
\addplot[color=black!60!yellow] coordinates {(lambda1,3) (lambda2,6) (lambda3,3) (lambda4,3) (lambda5,2)};
\end{axis}
\begin{axis}[
bar shift = 8pt,
y = 0.25cm,
ymin = 0,
ymax = 17,
ybar stacked,
enlarge x limits = 0.15,
ylabel = participants,
symbolic x coords = {lambda1, lambda2, lambda3, lambda4, lambda5},
xtick = data,
legend style={font=\footnotesize, draw=none, anchor=north east, at={(0.98,0.98)}},
legend entries={hard to read},
]
\addplot[color=black!60!red, fill=black!60!red] coordinates {(lambda1,1) (lambda2,0) (lambda3,1) (lambda4,1) (lambda5,0)};
\addplot[color=black!60!red] coordinates {(lambda1,1) (lambda2,1) (lambda3,3) (lambda4,2) (lambda5,3)};
\end{axis}
\end{tikzpicture}
\caption{Expressiveness ratings. Green: The sentence is easy to read and understand; Yellow: The sentence is somewhat readable and understandable; Red: The sentence is hard to read and understand; Solid: Not confident in reading lambda expressions.}
\label{fig:expressiveness}
\end{figure}
The answer “The sentence is easy to read and understand” was selected most often in response to our question about the expressiveness of the content generated by \textsc{LambdaDoc}, as shown in Figure~\ref{fig:expressiveness}. Only our treatment of array indices, which affected the documentation of the second lambda expression (cf.~Table~\ref{tab:fivesummaries}), prompted an equal number of “somewhat readable and understandable” ratings. As with the previous questions, the differences between participants confident in reading lambda expressions and those who indicated that they were not confident appear negligible.
\begin{figure}
\centering
\begin{tikzpicture}
\begin{axis}[
    y = 0.2cm,
    ylabel = participants,
    xlabel = rating on Likert scale,
    symbolic x coords={1, 2, 3, 4, 5},
    xtick=data,
    ymin = 0]
    \addplot[ybar,fill=white] coordinates {
    (1, 1)
    (2, 2)
    (3, 7)
    (4, 6)
    (5, 7)
    };
\end{axis}
\end{tikzpicture}
\caption{Survey responses to “How likely is it that you would recommend this tool to a friend or colleague?” 1: not likely at all; 5: extremely likely.}
\label{fig:recommendation}
\end{figure}
Participants also left positive feedback after trying the online version of our tool, e.g., “Very helpful in adapting to a new way of writing code”. Figure~\ref{fig:recommendation} shows the participants' responses to our survey question “How likely is it that you would recommend this tool to a friend or colleague?” on a 5-point Likert scale. The majority of participants were positive about issuing such a recommendation. Finally, responses as to how participants would like to have \textsc{LambdaDoc} implemented varied from GUI application (11) and website/web service (10) to Eclipse plugin (9, multiple answers possible). Similarly, participants could see themselves using \textsc{LambdaDoc} for documentation (13), testing (10), implementation (8), and maintenance (4).
\begin{tcolorbox}
\textbf{Summary:} When asked to document lambda expressions, most of the responses produced by our participants were inadequate. In contrast, the documentation produced by \textsc{LambdaDoc} was largely perceived to be complete, concise, and easy to read and understand.
\end{tcolorbox}
\section{Threats to Validity}
\label{sec:threats}
As with all empirical studies, there are a number of threats that may affect the validity of our results.
Threats to construct validity concern the suitability of our evaluation metrics. Following the work of Linares-V\'{a}squez et al.~\cite{Linares-Vasquez2016} and others, we evaluated the documentation generated by \textsc{LambdaDoc} in terms of its perceived completeness, conciseness, and expressiveness, using survey questions very similar to those used in previous work. Future work should explore other dimensions of the generated documentation, e.g., helpfulness. Our algorithm for detecting lambda expressions could potentially lead to inaccuracies. However, the findings from our first research question show that in a statistically representative sample, all lambda expressions were correctly identified. It is also possible that another group of researchers would have identified different kinds of documentation in the source code comments which accompany lambda expressions. However, our inter-rater agreement was almost perfect, increasing the confidence in our findings.
Threats to external validity affect the generalisability of our findings. We cannot claim that our findings generalise beyond the particular data set we have considered in this work. Our work may not generalise to other programming languages or other functional programming constructs. The number of study participants and the number of lambda expressions used in the evaluation of \textsc{LambdaDoc} are also necessarily limited. Asking different participants about their perceptions of \textsc{LambdaDoc} might have resulted in different findings. All lambda expressions used in the evaluation (cf.~Table~\ref{tab:fivesummaries}) were single-line expressions. While we found that single-line lambda expressions are in the majority on GitHub (cf.~Figure~\ref{fig:linesperlambda}), future work should investigate the perceived completeness, conciseness, and expressiveness of documentation generated for multi-line lambda expressions.
Threats to internal validity relate to errors or inaccuracies in our implementation. Our current implementation is unable to detect embedded lambda expressions if both arrows are on the same line. Apart from this issue, we have double-checked our source code and fixed all errors we found. Still, there could be additional errors which we did not notice.
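As an illustration of the limitation regarding embedded lambda expressions, the following sketch shows a nested lambda expression in which both arrows appear on the same line, together with a naive arrow counter. We assume here that “embedded” refers to one lambda expression nested inside another; the class \texttt{NestedLambdaExample} and the method \texttt{countArrows} are hypothetical and are not part of the \textsc{LambdaDoc} implementation.

```java
import java.util.function.Function;

public class NestedLambdaExample {

    // Curried addition: the outer lambda returns an inner lambda,
    // so two arrows appear on a single source line.
    public static final Function<Integer, Function<Integer, Integer>> ADD = x -> y -> x + y;

    /**
     * Counts non-overlapping occurrences of "->" in a line of source text.
     * Naive by design: it does not skip string literals or comments.
     */
    public static int countArrows(String line) {
        int count = 0;
        int from = 0;
        while ((from = line.indexOf("->", from)) != -1) {
            count++;
            from += 2; // continue searching after the arrow just found
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(ADD.apply(2).apply(3));
        System.out.println(countArrows("x -> y -> x + y"));
    }
}
```

A line-based detector that reports at most one lambda expression per line would count this line once, although it contains two lambda expressions.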
\section{Related Work}
\label{sec:related}
After the introduction of lambda expressions, prior work started to investigate the use of lambda expressions and its impact.
Uesbeck et al.~found that using lambda expressions in C++ has a negative impact on the programming speed of inexperienced users~\cite{Uesbeck2016}.
To encourage developers to adopt new language features, Khatchadourian and Masuhara submitted pull requests introducing language features to open source projects~\cite{Khatchadourian2018}.
Mazinanian et al.~investigated the adoption of lambda expressions in 241 Java open source projects and found that projects migrate to lambda expressions by converting classes to lambda expressions, replacing loops/conditionals with streams, and enhancing functionality by wrapping existing code into lambda expressions~\cite{Mazinanian2017}.
Complementing the prior work, in this work, we found that exception handling is a common purpose of using lambda expressions in Java and that lambda expressions are usually implicit, single-line, and have one parameter.
While software documentation makes it easier for developers to comprehend software artefacts, manually writing documentation is a tedious task for developers.
Several studies empirically investigate the essentials of software documentation.
De Souza et al.~showed that developers perceived that source code comments are the second most important software artefact in Agile software development~\cite{deSouza2005}.
Lin et al.~found that developers spent effort on maintaining API documentation, e.g., literal polishes~\cite{Lin2011}.
Fluri et al.~found that newly added code was rarely documented and a source code comment was often changed along with the associated source code~\cite{Fluri2007}.
Li et al.~reported that while developers need documentation to understand unit test cases, a large proportion of C\# projects on GitHub lacked comments for unit test cases~\cite{Li2016}.
Moreover, Ibrahim et al.~found that neglecting to update a comment increases the probability of having future defects in a software system~\cite{Ibrahim2012}.
In this paper, we empirically investigated the source code comments accompanying lambda expressions and found that only 6\% of the lambda expressions have corresponding source code comments.
% contexts of using automated source code summarization
Several researchers have developed approaches to automatically summarise Java source code.
For example,
Moreno et al.~developed an approach to summarise the information and details of Java classes~\cite{Moreno2013}.
McBurney et al.~analysed the method calls and leveraged the PageRank algorithm to generate a description of the behaviour of a Java method~\cite{McBurney2016}.
Ying and Robillard developed an approach for the automated summarisation of code fragments~\cite{Ying2013}.
Buse et al.~developed an approach to automatically summarise conditions of Java exceptions~\cite{Buse2008}.
Furthermore, automated documentation generation has been developed for other software artefacts.
Prior work developed approaches to automatically summarise unit test cases~\cite{Li2016} and test failures~\cite{Sai2011}.
% Li et al.~developed \texttt{UnitTestScribe} which combines static code analysis, backward slicing, and natural language process to generate a documentation for unit test cases~\cite{Li2016}.
To help developers understand database schemata when writing database-related code, Linares-V\'{a}squez et al.~developed an approach to analyse database schemata and SQL statements to automatically describe database usage at the source code method level~\cite{Linares-Vasquez2016}.
Racchetti et al.~proposed an approach to automatically generate documentation for Programmable Logic Controller (PLC) code~\cite{Racchetti2015}.
Hassan and Hill presented a technique towards automatically generating comments for Java statements suitable for novice programmers~\cite{Hassan2018}.
Moreover, prior studies proposed approaches to summarise individual code changes~\cite{Buse2010} and the software evolution based on code changes~\cite{Kim2013}.
Recently, Robillard et al.~outlined a research agenda for generating developer documentation on-demand~\cite{Robillard2017}.
% Buse et al.~analysed the symbolic execution and developed an approach to generate commit message for code changes~\cite{Buse2010}.
% Kim et al.~infers change rules that explain the evolution of software.
In this work, we are the first to develop an approach to automatically generate documentation for lambda expressions in Java.
% To be added
% \cite{Subramanian2014} Live