Customer and user demand for new products and services that meet high quality standards has increased in recent years. In that sense, production processes must be aligned with the organization and its development process in order to achieve this goal.
A software product line (SPL) is an approach that can address these needs, increasing productivity without sacrificing quality. An SPL is a set of software-intensive systems that share a common set of features and are developed for a specific market segment or domain using a defined process . Several benefits of SPL adoption are well known: shorter product development cycles, productivity increases of up to an order of magnitude, lower costs and a substantial improvement in product quality [2, 3].
Domain analysis is, to the best of our knowledge, the most important stage in the SPL development process. Here, feature models (FMs) are the domain analysis artifact used to describe all identified features. Furthermore, they are the de facto standard for managing variability .
The aim of this paper is to synthesize the current state of research reported in the literature regarding the application domain, underlying model, origin, degree of empirical validation and quality of the feature modeling tools used in SPL. The motivation is to check for improvement trends in the field, covering the period between 2000 and 2019. In particular, the empirical validation of tools has been repeatedly pointed out as an important deficiency of the field. We also include an initial quality assessment of the feature modeling tools that we find.
We think this study may be of interest to both academic researchers and industry professionals who wish to obtain an updated view of feature modeling tools, including their shortcomings and strengths in terms of selected quality criteria. With this knowledge, they can better assess the potential benefits and risks associated with adopting each feature modeling tool. Furthermore, it could be of interest to researchers looking for research gaps that motivate additional studies on feature modeling tools for SPLs. In addition, we see this study as a continuation of our previous work on different aspects of FMs and the modeling languages used in SPLs [5, 6, 7].
Therefore, this technical report presents the protocol definition for a systematic mapping study (SMS) that we will conduct to identify and assess the set of relevant papers on feature modeling tools.
2 Research method
This study will be carried out according to the SMS methodology described by , which aims to “identify all research related to a specific topic rather than addressing the specific questions that conventional SLRs address”. Similarly,  indicate that “a systematic mapping is a method to build a classification scheme and structure a Software Engineering (SE) field of interest. The analysis of results focuses on frequencies of publications for categories within the scheme. Thereby, the coverage of the research field can be determined”. In this study we will search for existing research related to feature modeling tools in the context of SPLs, and we will classify and analyze it according to predefined criteria.
Next, in Section 2.1 we define the SMS protocol. Then, in Section 2.2 we describe the study selection and in Section 2.3 we define the preliminary data extraction protocol. Finally, in Section 2.4, we briefly describe the tool support used for our SMS. The whole process followed for the SMS is shown in Figure 1, adapted from .
2.1 Protocol Definition (Planning)
In this section we present the main steps performed in the protocol definition for this SMS.
2.1.1 Aim and Need
The aim of this SMS is twofold. On the one hand, we will characterize feature modelling tools for SPLs in terms of their application domain, underlying feature model, origin and degree of empirical validation. On the other hand, we will assess the “quality level” of the selected feature modelling tools.
Therefore, the importance of this study lies in the issues mentioned above, in addition to other aspects included in this study, namely origin of the papers, context of application, year of publication, publisher and target audience, among others.
We think that a clear picture of all these characteristics may help professionals reduce the risk associated with choosing a tool. Additionally, we aim to foster a discussion among the members of the community about the qualities that feature modelling tools for SPLs should have, in order to promote the creation and sharing of high-quality specifications.
2.1.2 Research Questions
|ID||Question||Aim and Classification Schema|
|RQ1||What is the feature modelling tool’s application domain?||To determine if the tool is multipurpose or has been developed/used in specific domains.|
|RQ2||What model underlies the selected feature modelling tool?||To determine what model each tool is linked to, e.g. FODA and its variants, cardinality based model or others.|
|RQ3||Where have feature modelling tools for SPLs been developed?||To identify the origin of the tools: academia, industry or joint.|
|RQ4||What is the degree of empirical validation of feature modelling tools in SPLs?||To examine how each selected tool was validated: with proofs of concept, through its use in industry, through case studies, through experiments, etc.|
2.1.3 Search String
From the RQs we obtained keywords, and from these keywords we considered synonyms. We then built the search string by applying the Population-Intervention-Comparison-Outcomes-Context criterion (PICOC ).
According to , population in SE should correspond to one of the following: (1) specific SE role, (2) a category of software engineer, (3) an application area or (4) an industry group. In our case, Software Product Lines was considered an application area.
An intervention in SE is defined as a methodology, tool, technology or procedure that addresses a specific issue , for example performing specific tasks such as requirements specification, system testing, or software cost estimation. In our case, the intervention is a tool, in particular for the Domain Engineering stage and the Feature Modelling step.
The comparison element is not applicable to our RQs, because they do not involve comparing the collected papers against any commonly used feature modelling tool or technique (the control condition).
The main outcomes of our RQs are the origin, underlying model, application domain, together with their level of validation in the software industry.
Last, the context represents the place where the comparison is done, for example academia, industry or both.
All defined terms were combined with the “AND” boolean operator, and the synonyms were joined with the “OR” operator to improve the completeness of the results. The terms, synonyms, final search string and search strategy are shown in Table 2.
|Terms||Feature, modelling, model software, family, product, lines, variability, tool|
|“Feature modelling”, “Feature model”, “Variability”, “Software product lines”, “Tool”, “Software family”, “Product family”|
|(“Feature modelling” OR “Feature model” OR “Variability”) AND (“Software product lines” OR “Product family” OR “Software family” OR SPL) AND (“Tool”)|
|The string was entered sequentially into each data source, adapting it accordingly. Variations in spelling (e.g. modelling vs. modeling) were also accounted for.|
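The assembly rule described above (synonyms OR-ed within each concept, concepts AND-ed together) can be sketched as follows; this is an illustrative reconstruction using plain ASCII quotes, not the actual script used for the searches.

```python
# Concept groups from Table 2: synonyms within a group are OR-ed,
# and the groups themselves are AND-ed to form the final string.
concepts = [
    ['"Feature modelling"', '"Feature model"', '"Variability"'],
    ['"Software product lines"', '"Product family"', '"Software family"', 'SPL'],
    ['"Tool"'],
]

search_string = " AND ".join(
    "(" + " OR ".join(group) + ")" for group in concepts
)
print(search_string)
```

The resulting string matches the one shown in Table 2 and is then adapted to the syntax of each data source (including spelling variants such as modelling vs. modeling).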
2.1.4 Inclusion and Exclusion Criteria
In this study we defined both inclusion and exclusion criteria. By checking these criteria we decided whether an article was finally included in the SMS, based on its content.
In particular, and following the guidelines of , grey literature (i.e. technical reports, white papers and work in progress) was excluded.
These criteria are defined in Table 3.
|Inclusion criteria||Papers that address the topic of feature modelling tools for SPLs, from any of the following perspectives:
|Exclusion criteria||Papers that, even if they discuss proposals and tools for SPL and variability modelling, do not center specifically on feature modelling tools for SPLs:
2.1.5 Protocol Validation
The protocol validation was performed along with the definition of each step of the protocol. This validation was based on the criteria defined by , allowing us to identify concretely and objectively how we developed our mapping study. In Appendix A we detail the evaluation process for the SMS protocol.
The information presented in this paper corresponds to the final result (definition plus validation) of each step. According to the evaluation done to our systematic mapping study, we applied at least one action for each rubric criteria group established in the protocol phase .
Considering the ratio of the number of actions taken in our study to the total number of possible actions, the resulting value was 38% (10 of 26 items).
2.2 Primary Study Selection
We will compile as complete a list as possible of papers related to feature modelling tools and SPLs. The search covers publications from 2000 onward and was conducted between March and May 2019.
2.2.1 Search Process
We designed a search strategy consisting of an automatic search of electronic databases, possibly complemented by a snowballing approach to complete the search.
2.2.2 Pilot Selection
Once both the inclusion and exclusion criteria and the data sources had been defined, we performed a pilot selection and extraction to ensure the reliability of the protocol.
We will verify that all researchers involved in the selection of the primary studies apply and understand the inclusion/exclusion criteria in a similar way (inter-rater agreement), avoiding any potential bias.
This will be tested as follows: each researcher will individually decide on the inclusion/exclusion of a set of papers randomly chosen from those retrieved by this pilot selection. We will then perform a concordance test based on the Fleiss’ Kappa statistic as a means of validation . We consider a Kappa value of at least 0.75 to suggest that the criteria were clear enough for each researcher to apply the inclusion and exclusion criteria in a consistent way .
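The concordance test can be computed directly from the standard Fleiss' Kappa formula. The following is a minimal sketch; the example ratings table (3 raters classifying 4 papers as include/exclude) is illustrative and does not correspond to the actual pilot data.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a table where ratings[i][j] is the number of
    raters who assigned item i to category j (same rater count per item)."""
    n = len(ratings)            # number of items (papers)
    r = sum(ratings[0])         # raters per item
    k = len(ratings[0])         # number of categories
    # Mean observed agreement per item
    p_bar = sum((sum(c * c for c in row) - r) / (r * (r - 1))
                for row in ratings) / n
    # Expected chance agreement from overall category proportions
    p_e = sum((sum(row[j] for row in ratings) / (n * r)) ** 2
              for j in range(k))
    return (p_bar - p_e) / (1 - p_e)

# Hypothetical pilot: 3 raters, 4 papers, categories (include, exclude)
table = [[3, 0], [0, 3], [3, 0], [2, 1]]
kappa = fleiss_kappa(table)   # 0.625, below the 0.75 threshold
```

In this hypothetical case the researchers would need to refine the criteria and repeat the pilot, since the agreement falls short of the 0.75 threshold.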
2.3 Preliminary Data Extraction Protocol
Once both the search string and the inclusion/exclusion criteria had been tested, we launched the primary study retrieval and the data extraction phase. A summary of this phase can be seen in Table 4.
|Paper access||Access to each of the papers to be reviewed must be guaranteed.|
|Initial review of the paper||Read the title, abstract and keywords of each paper to decide the relevance to the SMS.|
|Review Report||Scan the whole paper and answer the following questions:
First, we will run the search string in the selected data sources mentioned in Section 2.2.1. This process is expected to return between approximately 1,000 and 1,500 results (according to pilot searches). After that, we will eliminate duplicates. Then, we will look through the title, abstract and keywords (if available) to get an initial impression of their thematic relevance (see Table 4). The papers that are not rejected in this step proceed to the next step.
Next, we will apply the format-related inclusion/exclusion criteria. We will discard papers not written in English, as well as grey literature. In addition, we will discard papers that present a different version of the same proposal; in that case, we will retain the most recent version of the proposal in the selection.
Last, we will divide that list among the researchers, and each researcher will apply the content-related inclusion/exclusion criteria defined in Table 3, obtaining the final list of selected papers.
This information will then be jointly reviewed to collaboratively agree on the final list of selected papers. The whole list, including a brief description of each selected paper, will be summarized in an appendix of the final paper reporting the results of this systematic mapping.
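The selection steps above (deduplication, format-related criteria, then content-related criteria) can be sketched as a simple filtering pipeline. The record fields and the relevance predicate below are illustrative assumptions, not the actual tooling used in this SMS.

```python
def select(candidates, is_relevant):
    """Deduplicate by normalized title, apply format-related criteria,
    then apply the content-related criteria via is_relevant."""
    seen, selected = set(), []
    for paper in candidates:
        key = paper["title"].strip().lower()
        if key in seen:                     # drop duplicate versions
            continue
        seen.add(key)
        if paper.get("language") != "en":   # format-related exclusion
            continue
        if paper.get("grey_literature"):    # exclude grey literature
            continue
        if is_relevant(paper):              # content-related criteria (Table 3)
            selected.append(paper)
    return selected

# Illustrative run with four hypothetical candidate records
papers = [
    {"title": "A Feature Tool", "language": "en", "grey_literature": False},
    {"title": "a feature tool ", "language": "en", "grey_literature": False},
    {"title": "Otra Herramienta", "language": "es", "grey_literature": False},
    {"title": "Tech Report", "language": "en", "grey_literature": True},
]
result = select(papers, lambda p: True)   # keeps only the first record
```

In the real process the content-related decision is made by the researchers, not a predicate, and disagreements are resolved jointly as described above.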
2.3.1 Preliminary Data Extraction and Assessment
For each selected paper (i.e. each paper that meets the inclusion criteria), we will read it and extract the relevant data in order to answer the established RQs. Figures 3 and 4 show an example of the data extraction form that will be used to compile the details about each paper and the tool it reports.
The extracted data for each paper and the corresponding assessment strategy will be as follows: (i) Title, authors, year; (ii) Reason why the paper was initially included; (iii) Type of publication: journal (SCI-JCR quartile (Journal Citation Reports, http://thomsonreuters.com/journal-citation-reports/) or other) or conference proceedings (CORE ranking (Computing, Research and Education, http://core.edu.au/index.php/) or other) and the corresponding editor; (iv) Type of experience reported; (v) Results; (vi) Community to which the paper was directed; and (vii) Tools and programming languages used. The details are shown in Table 5.
|Initial reading||The abstract, introduction, related work, conclusions and references should be read to collect background information about:
|Detailed reading||The body of the article should be read in order to:
According to the breakdown of each defined RQ, the details and the categorization type that will be used to classify the selected papers are shown in Table 6. A categorization is defined as open (partial) if it does not cover all the possibilities, and therefore more categories could be added. Conversely, a closed (complete) classification schema covers the whole set of possibilities for that criterion.
|RQ1||To establish where the feature modelling tool was used, we established domain categories, and each paper was assigned to a category according to the domain where the tool was used.||Open|
|RQ2||To study the evidence about the underlying model that each tool is linked to, we examined the information provided by each paper and assigned it to one of the defined categories.||Open|
|RQ3||To establish whether the feature modelling tool was developed from a need of the researchers or satisfied a deficiency detected by the industry, we created three categories, and each paper was assigned to a category according to its origin.||Closed|
|RQ4||To quantify how the results provided were validated, we established seven validation categories, and each paper was assigned to a category according to the type of validation.||Open|
2.4 SMS Tool Support
In order to facilitate finding, selecting, documenting and analyzing the information gathered, the following support tools were used: Dropbox (http://www.dropbox.com) as a shared repository of resources ; the details of using this tool are shown in Figure 5.
Mendeley (http://www.mendeley.com), Desktop and Web, for storing, reading and annotating reviews of selected papers, as well as for the automatic creation of the .bib files used to manage the bibliographic references ; the details of using these tools are shown in Figure 6.
Publish or Perish (http://www.harzing.com/resources/publish-or-perish) for the initial validation of the search string and automatic spreadsheet creation ; the details of using this tool are shown in Figure 7.
3 Threats to validity
Despite the care taken in the definition of our SMS, secondary studies suffer from some well-known limitations and threats to the validity that we discuss in the following paragraphs, together with how these were addressed to minimize their impact on the execution of this protocol.
Bias - searching papers. Possible bias when searching for papers. It is difficult for us to guarantee that all relevant primary studies will be selected in our SMS, mainly due to the defined search process. We will mitigate this threat by following the main references in the chosen primary studies to make sure they are also present in our list of candidate papers.
Bias - relevant papers. Possible bias in excluding relevant papers. We will mitigate this threat through a pilot study in which a high level of inter-rater agreement will be found in order to validate the inclusion/exclusion criteria among the researchers. Also, the decisions of including/excluding the papers will be jointly taken by more than one researcher, thus avoiding individual bias.
Limitations - data extraction. Limitations on data extraction from the selected papers. There could be some difficulty in extracting the relevant information for certain items. For example, some papers may not provide explicit information that directly answers our RQs, such as the modelling tool’s application domain or its underlying model.
We mitigated this threat by talking with experts in SMSs and SLRs, who gave us feedback and helped us validate our defined protocol.
We have followed the guidelines for planning an SMS according to Petersen . As all authors adhered to these guidelines when building the protocol presented in this document, we believe that the conducting phase of the SMS will be repeatable. Finally, the threats to validity have been identified and mitigated as far as possible.
Samuel Sepúlveda would like to thank Dr. Pedro Rossel and Ms.(c) Alonso Bobadilla for their useful advice and technical support.
-  P. Clements and L. Northrop, Software Product Lines: Practices and Patterns, 3rd ed. Addison-Wesley Professional, Aug. 2001.
-  F. Ahmed and L. F. Capretz, “Best practices of rup® in software product line development,” in Proceedings of the International Conference on Computer and Communication Engineering (ICCCE 2008). IEEE, 2008, pp. 1363–1366.
-  J. D. McGregor, D. Muthig, K. Yoshimura, and P. Jensen, “Guest editors’ introduction: Successful software product line practices,” IEEE Software, vol. 27, no. 3, pp. 16–21, May/June 2010.
-  P. Collet, “Domain Specific Languages for Managing Feature Models: Advances and Challenges,” in Proceedings of the 6th International Symposium on Leveraging Applications of Formal Methods, Verification and Validation (ISoLA 2014), ser. Lecture Notes in Computer Science, T. Margaria and B. Steffen, Eds., vol. 8802. Springer, Oct. 2014, pp. 273–288.
-  S. Sepúlveda, C. Cares, and C. Cachero, “Towards a Unified Feature Metamodel: a Systematic Comparison of Feature Languages,” in Proceedings of the 7th Iberian Conference on Information Systems and Technologies (CISTI 2012). IEEE Computer Society, Jun. 2012, pp. 1–7.
-  ——, “Feature Modeling Languages: Denotations and Semantic Differences,” in Proceedings of the 7th Iberian Conference on Information Systems and Technologies (CISTI 2012). IEEE Computer Society, Jun. 2012, pp. 1–6.
-  S. Sepúlveda, A. Cravero, and C. Cachero, “Requirements modeling languages for software product lines: A systematic literature review,” Information and Software Technology, vol. 69, pp. 16–36, Jan. 2016.
-  B. A. Kitchenham, D. Budgen, and O. P. Brereton, “The value of mapping studies: a participant observer case study,” in Proceedings of the 14th international conference on Evaluation and Assessment in Software Engineering. BCS Learning & Development Ltd., Apr. 2010, pp. 25–33.
-  K. Petersen, R. Feldt, S. Mujtaba, and M. Mattsson, “Systematic mapping studies in software engineering,” in Proceedings of the 12th International Conference on Evaluation and Assessment in Software Engineering (EASE’08). BCS Learning & Development Ltd., Jun. 2008, pp. 1–10.
-  B. Kitchenham and S. Charters, “Guidelines for performing Systematic Literature Reviews in Software Engineering,” Keele University and Durham University, Tech. Rep. EBSE 2007-01, Jul. 2007.
-  M. Unterkalmsteiner, T. Gorschek, A. M. Islam, C. K. Cheng, R. B. Permadi, and R. Feldt, “Evaluation and measurement of software process improvement - a systematic literature review,” IEEE Transactions on Software Engineering, vol. 38, no. 2, pp. 398–424, March-April 2012.
-  M. Petticrew and H. Roberts, Systematic Reviews in the Social Sciences: A Practical Guide. John Wiley & Sons, Dec. 2008.
-  K. Petersen, S. Vakkalanka, and L. Kuzniarz, “Guidelines for conducting systematic mapping studies in software engineering: An update,” Information and Software Technology, vol. 64, pp. 1–18, Aug. 2015.
-  P. Brereton, B. A. Kitchenham, D. Budgen, M. Turner, and M. Khalil, “Lessons from applying the systematic literature review process within the software engineering domain,” Journal of Systems and Software, vol. 80, no. 4, pp. 571–583, Apr. 2007.
-  K. Gwet, “Inter-rater Reliability: Dependency on Trait Prevalence and Marginal Homogeneity,” Statistical Methods for Inter-Rater Reliability Assessment Series, vol. 2, pp. 1–9, May 2002.
-  J. L. Fleiss, Statistical methods for rates and proportions. John Wiley & Sons, 1981.
-  I. Drago, M. Mellia, M. M. Munafò, A. Sperotto, R. Sadre, and A. Pras, “Inside dropbox: Understanding personal cloud storage services,” in Proceedings of the 2012 ACM SIGCOMM Internet Measurement Conference. ACM, Nov. 2012, pp. 481–494.
-  V. Vaidhyanathan, M. Moore, K. A. Loper, J. Van Schaik, and D. Goolabsingh, “Making bibliographic researchers more efficient: Tools for organizing and downloading pdfs, part 1: icyte, mendeley desktop, papers, pdf stacks, pubget paperplane, wizfolio, and zotero,” Journal of Electronic Resources in Medical Libraries, vol. 9, no. 1, pp. 47–55, 2012.
-  A.-W. Harzing, “Publish or perish, version 3,” Available at www.harzing.com/pop.htm, 2007.
Appendix A Evaluation of the SMS process
Here we include a self-evaluation of the work that will be done according to . Table 7 shows the activities considered when conducting an SMS. Activities that will be performed are marked with a check-mark (✓).
|Need for map||Motivate the need and relevance||✓|
|Define objectives and questions||✓|
|Consult with target audience to define questions||•|
|Study identification||Choose search strategy|
|Conduct database search||✓|
|Develop the search|
|Consult librarians or experts||•|
|Iteratively try to find more relevant papers||•|
|Keywords from known papers||•|
|Use standards, encyclopedias, and thesaurus||•|
|Evaluate the search|
|Test-set of known papers||•|
|Expert evaluates result||•|
|Search web pages of key authors||•|
|Inclusion and Exclusion criteria|
|Identify objective criteria for decision||✓|
|Add additional reviewer, resolve disagreements between them when needed||•|
|Data extraction and classification||Extraction process|
|Identify objective criteria for decision||✓|
|Obscuring information that could bias||•|
|Add additional reviewer, resolve disagreements between them when needed||•|
|Validity discussion||Validity discussion/limitations provided||✓|
We used the evaluation rubric suggested by Petersen . Tables 8 to 12 show the rubric criteria. The scores that we expect to obtain by executing this protocol are highlighted. These scores will be contrasted with the actual results when the SMS is executed and its results reported.
|No description||The study is not motivated and the goal is not stated||0|
|Partial evaluation||Motivations and questions are provided||1|
|Full evaluation||Motivations and questions are provided, and have been defined in correspondence with the target audience||2|
|No description||Only one type of search has been conducted||0|
|Minimal evaluation||Two search strategies have been used||1|
|Full evaluation||All three search strategies have been used||2|
|No description||No actions have been reported to improve the reliability of the search and inclusion/exclusion criteria||0|
|Minimal evaluation||At least one action has been taken to improve the reliability of the search or the reliability of the inclusion/exclusion criteria||1|
|Partial evaluation||At least one action has been taken to improve the reliability of the search and the inclusion/exclusion criteria||2|
|Full evaluation||All actions identified have been taken||3|
|No description||No actions have been reported to improve on the extraction process or enable comparability between studies through the use of existing classifications||0|
|Minimal evaluation||At least one action has been taken to increase the reliability of the extraction process||1|
|Partial evaluation||At least one action has been taken to increase the reliability of the extraction process, and research type and method have been classified||2|
|Full evaluation||All actions identified have been taken||3|
|No description||No threats or limitations are described||0|
|Full evaluation||Threats and limitations are described||1|