Technical Debt Prioritization: State of the Art. A Systematic Literature Review

04/29/2019 ∙ by Valentina Lenarduzzi, et al. ∙ Chalmers University of Technology ∙ Tampere Universities ∙ UNIVERSITETET I OSLO

Background. Software companies need to manage and refactor Technical Debt issues. Therefore, it is necessary to understand if and when refactoring Technical Debt should be prioritized with respect to developing features or fixing bugs. Objective. The goal of this study is to investigate the existing body of knowledge in software engineering to understand what Technical Debt prioritization approaches have been proposed in research and industry. Method. We conducted a Systematic Literature Review among 384 unique papers published until 2018, following a consolidated methodology applied in Software Engineering. We included 38 primary studies. Results. Different approaches have been proposed for Technical Debt prioritization, all having different goals and optimizing on different criteria. The proposed measures capture only a small part of the plethora of factors used to prioritize Technical Debt qualitatively in practice. We report an impact map of such factors. However, there is a lack of empirically validated tools. Conclusion. We observed that Technical Debt prioritization research is preliminary and there is no consensus on what the important factors are and how to measure them. Consequently, we cannot consider current research conclusive, and in this paper we outline different directions for necessary future investigations.


1 Introduction

Technical debt (TD) is a metaphor introduced by Ward Cunningham Cunningham (1992), to represent sub-optimal design or implementation solutions that give a benefit in the short term but make changes more costly or even impossible in the medium-long term Avgeriou et al. (2016).

Software companies need to manage such suboptimal solutions. The presence of TD is inevitable Martini et al. (2015) and even desirable under some circumstances Besker et al. (2018), due to a number of causes that can often be related to unpredictable business or environmental forces internal or external to the organization.

However, each TD item, exactly like a financial debt, has an interest attached, that is, an extra cost or negative impact generated by the presence of a sub-optimal solution Li et al. (2015). When such interest becomes very costly, it can lead to disruptive events, such as development crises Martini et al. (2015). The current best practice employed by software companies is to keep TD at bay by avoiding it when the consequences are known, or by refactoring or rewriting code and other artifacts in order to get rid of the accumulated sub-optimal solutions and their negative impact.

However, companies cannot afford to avoid or repay all the TD, which is generated continuously and may be unknown Martini et al. (2015). The main business goal of companies is to continuously deliver value to their customers and to maintain their products. Thus, refactoring TD usually competes with developing new features and fixing defects, and these activities are often prioritized over repaying TD Martini et al. (2015). It is therefore of utmost importance to understand when refactoring TD becomes worth postponing a feature or a bug fix. In other words, it is important to understand how to prioritize TD with respect to features and bugs.

In addition, recent studies show how different projects and even different types of TD might be associated with different costs of refactoring (principal) and negative impact (interest) Besker et al. (2018). This means that some TD can be more dangerous than others Seaman et al. (2012); Martini and Bosch (2016), and it is therefore important to understand how to prioritize TD with respect to other TD.

However, there is no overall study reporting the current state of the art and practice related to how to prioritize Technical Debt. Our goal in this paper is to survey the existing body of knowledge in software engineering to understand what approaches have been proposed in research and industry to prioritize TD.

We therefore performed a Systematic Literature Review (SLR) on the prioritization of Technical Debt, in order to understand how Technical Debt is prioritized in software organizations and what research approaches have been proposed.

The main contribution of this paper is a report on the state of the art concerning approaches, factors, measures and tools used in practice or proposed in research to prioritize Technical Debt.

The paper is structured as follows: In Section 2, we describe the background of this review. In Section 3, we outline the research methodology adopted in this study. Section 4 and Section 5 present and discuss the obtained results. Finally, Section 6 identifies the threats to validity and Section 7 draws the conclusion.

2 Background

In this Section, we explain the meaning of Technical Debt in order to avoid confusion or misunderstandings, and we report on the previously published systematic reviews.

2.1 Technical Debt

The concept of technical debt was introduced for the first time in 1992 by Cunningham as ”The debt incurred through the speeding up of software project development which results in a number of deficiencies ending up in high maintenance overheads” Cunningham (1992). In 2013 McConnell McConnell (2013) refined the definition of technical debt as ”A design or construction approach that’s expedient in the short term but that creates a technical context in which the same work will cost more to do later than it would cost to do now (including increased cost over time)”. In 2016 Avgeriou et al. Avgeriou et al. (2016) defined it as ”A collection of design or implementation constructs that are expedient in the short term, but set up a technical context that can make future changes more costly or impossible. Technical debt presents an actual or contingent liability whose impact is limited to internal system qualities, primarily maintainability and evolvability”.

Li et al. Li et al. (2015) conducted a systematic mapping study to understand the concept of Technical Debt and to provide an overview of the current state of research on managing Technical Debt. From the 94 selected studies, they proposed a classification of 10 types of Technical Debt at different levels, as reported in Table 1. Since this classification derives from a recent secondary study and is, to our knowledge, the most complete one available in the literature, we used it in the Search Strategy process (Section 3.2) to define the search terms.

TD Type Definition
Requirements TD ”refers to the distance between the optimal requirements specification and the actual system implementation, under domain assumptions and constraints”
Architectural TD ”is caused by architecture decisions that make compromises in some internal quality aspects, such as maintainability”
Design TD ”refers to technical shortcuts that are taken in detailed design”
Code TD ”is the poorly written code that violates best coding practices or coding rules. Examples include code duplication and over-complex code”
Test TD ”refers to shortcuts taken in testing. An example is lack of tests (e.g., unit tests, integration tests, and acceptance tests)”
Build TD ”refers to flaws in a software system, in its build system, or in its build process that make the build overly complex and difficult”
Documentation TD ”refers to insufficient, incomplete, or outdated documentation in any aspect of software development. Examples include out-of-date architecture documentation and lack of code comments”
Infrastructure TD ”refers to a sub-optimal configuration of development-related processes, technologies, supporting tools, etc. Such a sub-optimal configuration negatively affects the team’s ability to produce a quality product”
Versioning TD ”refers to the problems in source code versioning, such as unnecessary code forks”
Defect TD ”refers to defects, bugs, or failures found in software systems”
Table 1: Technical Debt definition Li et al. (2015)

2.2 Previous SLR’s

In this Section, we briefly report on the previous systematic reviews (Systematic Mapping Studies and Systematic Literature Reviews) available in the bibliographic sources, and summarize their main goals in Table 2. We present the studies in chronological order to show how research on Technical Debt has evolved. The first systematic review was published in 2012 Tom et al. (2013) and the latest ones, to the best of our knowledge, in 2018 Besker et al. (2018), Rios et al. (2018).

Tom et al. Tom et al. (2013) used an exploratory case study technique that involves a multivocal literature review, supplemented by interviews with software practitioners and academics, in order to establish the boundaries of the technical debt phenomenon. As a result, they created a theoretical framework that provides a holistic view of technical debt, comprising a set of technical debt dimensions, attributes, precedents and outcomes. The framework provides a useful approach to understand the overall phenomenon of technical debt for practical purposes.

Li et al. Li et al. (2015) investigated Technical Debt management (TDM), providing a classification of the Technical Debt concept and outlining the current state of research on TDM. They considered publications between 1992 and 2013, selecting 94 studies. The results showed a need for empirical studies with high-quality evidence on the TDM process, for the application of TDM approaches in industrial contexts, and for tools to manage the different TD types during the TDM process.

Ampatzoglou et al. Ampatzoglou et al. (2015) analyzed research efforts on Technical Debt, focusing on its financial aspect and the underlying software engineering concepts. They considered publications until 2015, selecting 69 studies. The results provided a glossary of terms and a classification scheme for financial approaches that can be applied to manage TD. Moreover, they found that a clear mapping between financial and software engineering concepts is lacking.

Ribeiro et al. Ribeiro et al. (2016) evaluated the appropriate time to pay a Technical Debt item and how to apply decision-making criteria in order to balance the short-term benefits against long-term costs. They considered publications until 2016, selecting 38 studies. They identified 14 decision-making criteria that can be used by development teams to prioritize the payment of TD items, and a list of types of debt related to the criteria.

Alves et al. Alves et al. (2016) investigated what strategies have been proposed to identify and manage Technical Debt (TD) in software projects, considering publications between 2010 and 2014 and selecting 100 studies. They proposed an initial taxonomy of TD types and provided a list of indicators to identify TD and of management strategies. Moreover, they analyzed the current state of TD research, highlighting possible research gaps. The results showed a growing interest of researchers in the TD area. They identified a lack of new indicator proposals and of management strategies and tools to control TD, as well as a lack of empirical studies to validate the proposed strategies.

Fernández-Sánchez et al. Fernández-Sánchez et al. (2017) identified the elements needed to manage Technical Debt, considering publications until 2017 and selecting 69 studies. They did not provide a general overview of the TD phenomenon or of the activities to manage TD. The elements were classified in three groups (basic decision-making factors, cost estimation techniques, practices and techniques for decision-making) and grouped based on stakeholders’ points of view (engineering, engineering management, and business-organizational management).

Behutiye et al. Behutiye et al. (2017) analyzed the state of the art of Technical Debt, and its causes, consequences, and management strategies in the context of agile software development (ASD). They considered publications until 2017, selecting 38 studies and finding potential research areas for further investigation. The study highlighted a positive interest in TD and ASD and provided some potential categories that can easily lead to TD, such as ”Focus on quick delivery” and ”architectural and design issues”.

Besker et al. Besker et al. (2018) investigated Architectural Technical Debt (ATD), synthesizing and compiling research efforts in order to create new knowledge with a specific interest in ATD. They considered publications between 2005 and 2016, selecting 43 studies. The results showed a lack of guidelines on how to manage ATD successfully in practice and of an overall process in which these activities are fully integrated.

Rios et al. Rios et al. (2018) performed a tertiary study based on a set of five research questions, evaluating 13 secondary studies dated from 2012 to March 2018. They evolved a taxonomy of TD types, identified a list of situations in which debt items can be found in software projects, and organized a map representing the state of the art of activities, strategies and tools to support TD management. Their results can help to identify points that still require further investigation in TD research. For example, they identified management activities that do not have any type of support tool.

ID Year Goal
Tom et al. (2013) 2012 Understanding the nature of TD
Li et al. (2015) 2015 TD management and TD classification
Ampatzoglou et al. (2015) 2015 Financial approaches to manage TD
Ribeiro et al. (2016) 2016 TD payment prioritization
Alves et al. (2016) 2016 TD management strategies, TD taxonomy
Fernández-Sánchez et al. (2017) 2017 TD management elements
Behutiye et al. (2017) 2017 TD in Agile development
Besker et al. (2018) 2018 Managing architectural TD
Rios et al. (2018) 2018 TD types, management strategies
Table 2: Previous SLR’s

3 Technical Debt Prioritization

In order to understand the state of the art and practice of Technical Debt prioritization, we conducted a systematic literature review based on the guidelines defined by Kitchenham et al. Kitchenham and Charters (2007), Kitchenham and Brereton (2013). We also applied the ”snowballing” process defined by Wohlin Wohlin (????).

In this Section, we describe the goal and the research questions (Section 3.1) and report the search strategy (Section 3.2). Moreover, we describe the quality assessment performed for each included paper (Section 3.3) and outline the data extraction and the analysis of the corresponding data (Section 3.4).

3.1 Goal and Research Questions

The study goal is to investigate the existing body of knowledge in software engineering to understand how Technical Debt is prioritized in software organizations and what research approaches have been proposed.

Based on our goal, we defined the following research questions (RQs):

RQ1 Which different types of TD have been investigated?
RQ2 Which prioritization approaches have been proposed?
RQ2.1 Are papers prioritizing TD vs TD or TD vs Features?
RQ2.2 Is the prioritization based on a one-shot activity or on a continuous process?
RQ3 How are different types of TD evaluated?
RQ4 How are TD principal and TD interest evaluated?
RQ5 Which characteristics and measures have been considered when prioritizing TD?
RQ6 Which tools have been used to prioritize TD?

In order to satisfy our goal, we first investigated which TD types are most investigated by researchers and whether there are gaps on which research effort needs to concentrate in the future (RQ1). As TD types, we adopted the list proposed by Li et al. Li et al. (2015) reported in Table 1.

The second research question targets how the investigated research papers address the prioritization of TD: which approaches are proposed (RQ2), whether the prioritization focuses only on different TD items or also includes prioritizing TD items against, e.g., implementing new features (RQ2.1), and how the prioritization process is described in terms of its periodicity (RQ2.2).

Based on the results of RQ1 and RQ2, we characterize how the different TD types are evaluated, highlighting the measures and information (RQ3).

Moreover, we aimed at understanding how the main TD components, principal and interest, are evaluated and which measures are considered (RQ4).

Based on the previous RQs, we aimed at identifying a set of characteristics and measures considered useful during the TD prioritization activities (RQ5).

Finally, we aimed at providing a list of the existing tools used to evaluate TD, in order to depict the current situation in terms of the number and maturity of the tools (RQ6).

3.2 Search Strategy

The search strategy involves outlining the most relevant bibliographic sources and search terms, and defining the inclusion and exclusion criteria and the selection process that are relevant for the inclusion decision. The search strategy is depicted in Figure 1.

Figure 1: The Search and Selection Process

Searching terms. In our search string, we included all the terms related to Technical Debt proposed by Li et al. Li et al. (2015) and reported in Table 1 (Section 2).

The search string contains the following search terms:
("technical debt") OR ("design debt") OR ("architect* debt") OR ("test* debt") OR ("implem* debt") OR ("docum* debt") OR ("requirement debt") OR ("code debt") OR ("infrastructure debt") OR ("versioning debt") OR ("defect debt") OR ("build debt")

We used the asterisk character (*) for the second term group, in order to capture the possible term variations such as plurals and verb conjugations. To increase the likelihood of finding publications addressing TD prioritization, we applied the search string both for title and abstract.
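The matching behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the interpretation of the asterisk as "any further word characters" is an assumption, since each digital library implements its own wildcard semantics.

```python
import re

# Search terms as listed in the search string; "*" marks a wildcard suffix.
SEARCH_TERMS = [
    "technical debt", "design debt", "architect* debt", "test* debt",
    "implem* debt", "docum* debt", "requirement debt", "code debt",
    "infrastructure debt", "versioning debt", "defect debt", "build debt",
]


def term_to_regex(term: str) -> str:
    """Turn one search term into a regex fragment.

    ASSUMPTION: a trailing "*" matches zero or more word characters.
    """
    parts = []
    for word in term.split():
        stem = re.escape(word.rstrip("*"))
        parts.append(stem + (r"\w*" if word.endswith("*") else ""))
    return r"\b" + r"\s+".join(parts) + r"\b"


# The terms are OR-ed together, mirroring the boolean structure of the string.
PATTERN = re.compile("|".join(term_to_regex(t) for t in SEARCH_TERMS), re.IGNORECASE)


def matches_search_string(title: str, abstract: str) -> bool:
    """Return True if the title or the abstract contains at least one term."""
    return bool(PATTERN.search(title) or PATTERN.search(abstract))
```

For instance, a title containing "architectural debt" is caught by the "architect* debt" term, while a paper that never uses a "... debt" phrase in its title or abstract is not retrieved.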

Bibliographic sources. We selected the list of relevant bibliographic sources following the suggestions of Kitchenham and Charters Kitchenham and Charters (2007), since these sources are recognized as the most representative in the software engineering domain and used in many reviews. The list includes: ACM digital Library, IEEEXplore Digital Library, Science Direct, Scopus, Google Scholar, Citeseer library, Inspec, Springer link. Moreover, we performed a manual hand search on the most important conferences and workshops on Technical Debt, such as the International Conference on Managing Technical Debt (MTD).

Inclusion and exclusion criteria. We defined inclusion and exclusion criteria to be applied to title and abstract (T/A) or to full text (F) or in both cases (All), as reported in Table 3.

Criteria Assessment Criteria Step
Inclusion Papers that prioritize TD issues All
Papers that report the criteria for removing, refactoring, or remediating TD issues with respect to any aspect (financial, maintenance, performance, readability, …) All
Papers that compare TD issues All
Papers that empirically validated/elicited the results F
Exclusion Papers not fully written in English T/A
Papers not peer-reviewed (i.e. blog, forum …)
Duplicate paper (only consider the most recent version) T/A
Position papers and work-plan (i.e. paper that does not report results) T/A
Publications where the full paper is not possible to locate (i.e. if the used database does not have access to the full text of the publication) T/A
Publications that only mention prioritization of TD in an introductory statement and do not fully or partly focus on it All
Only the latest version of the papers (eg. journal papers that extend conference papers will be excluded if they are referred to the same dataset) All
Table 3: Inclusion and exclusion criteria

Search and selection process. The search was conducted from March 2018 until December 2018, including all the publications available until this period. The application of the search terms returned 384 unique papers.

Testing Inclusion and Exclusion Criteria applicability: Before applying the inclusion and exclusion criteria, we tested their applicability Kitchenham and Brereton (2013) to a subset of 10 papers (assigned to all the authors) randomly selected from the retrieved ones.

Applying inclusion and exclusion criteria to title and abstract: We applied the refined criteria to the remaining 374 papers. Each paper was read by two authors; in case of disagreement, a third author was involved in the discussion to resolve it. For 29 papers we involved the third author. Out of 384 initial papers, we included 108 by title and abstract.

Full reading: We read in full the 108 papers included by title and abstract, applying the criteria defined in Table 3 and assigning each paper to two authors. We involved a third author for 6 papers to reach a final decision. Based on this step, we selected 43 papers as possibly relevant contributions.

Snowballing: We performed the snowballing process Wohlin (????), considering all the references presented in the retrieved papers (backward) and evaluating all the papers that reference the retrieved ones (forward). We applied the same process as for the retrieved papers. The snowballing search was conducted from August 2018 to December 2018. We identified 11 potential papers, of which only 1 was included in the final set of publications.

Based on the search and selection process, we retrieved 44 papers for the review, as reported in Table 5.

3.3 Quality Assessment

Before proceeding to the review, we checked whether the quality of each selected paper was sufficient to support our goal. We performed this step according to the protocol proposed by Dybå and Dingsøyr Dybå and Dingsøyr (2008). To evaluate the selected papers, we prepared a checklist (Table 4) with a set of specific questions. We ranked each answer by assigning a score on a five-point Likert scale (0=poor, 4=excellent). A paper satisfied the Quality Assessment criteria if it reached a rating higher than or equal to 2.

QA Quality Assessment Criteria (QA)
QA1 Is the paper based on research (or is it merely a ”lessons learned” report based on expert opinion)?
QA2 Is there a clear statement of the aims of the research?
QA3 Is there an adequate description of the context in which the research was carried out?
QA4 Was the research design appropriate to address the aims of the research?
QA5 Was the recruitment strategy appropriate to the aims of the research?
QA6 Was there a control group with which to compare treatments?
QA7 Was the data collected in a way that addressed the research issue?
QA8 Was the data analysis sufficiently rigorous?
QA9 Has the relationship between researcher and participants been considered to an adequate degree?
QA10 Is there a clear statement of findings?
QA11 Is the study of value for research or practice?
Response Scale (all questions): Excellent = 4, Very Good = 3, Good = 2, Fair = 1, Poor = 0
Table 4: Quality Assessment Criteria
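
The checklist and threshold can be illustrated with a short Python sketch. The text does not state how the eleven per-question scores are combined into a single rating, so the arithmetic mean used here is an assumption made only for illustration, and the function names are hypothetical.

```python
from statistics import mean

QA_THRESHOLD = 2    # a paper passes with a rating higher than or equal to 2
NUM_QUESTIONS = 11  # QA1 to QA11 in Table 4


def overall_rating(scores: list[int]) -> float:
    """Combine the per-question scores into one rating.

    ASSUMPTION: the overall rating is the arithmetic mean of the 11 scores;
    the aggregation rule is not specified in the text.
    """
    if len(scores) != NUM_QUESTIONS:
        raise ValueError(f"expected {NUM_QUESTIONS} scores, got {len(scores)}")
    if any(s not in range(5) for s in scores):
        raise ValueError("each score must be an integer between 0 and 4")
    return mean(scores)


def passes_quality_assessment(scores: list[int]) -> bool:
    """Return True if the paper reaches the quality threshold."""
    return overall_rating(scores) >= QA_THRESHOLD
```

Under this assumption, a paper rated "Good" on every question passes exactly at the threshold, while a paper with a few excellent answers and many poor ones does not.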

Among the 44 papers included in the review from the search and selection process, only 38 passed the Quality Assessment criteria, as reported in Table 5.

Step # papers
Retrieval from bibliographic sources (unique papers) 384
Reading by title and abstract 276 rejected
Full reading 65 rejected
Backward and forward snowballing 1
Papers identified 44
Quality assessment 6 rejected
PS’s 38
Table 5: Search and selection and quality assessment criteria results
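
The counts in Table 5 can be cross-checked with simple arithmetic. The sketch below uses only the numbers reported in the table; the variable names are chosen for illustration.

```python
# Counts taken from Table 5.
retrieved = 384                   # unique papers from the bibliographic sources
rejected_title_abstract = 276     # rejected when reading title and abstract
rejected_full_reading = 65        # rejected after full reading
added_by_snowballing = 1          # backward and forward snowballing
rejected_quality_assessment = 6   # did not pass the quality assessment

after_title_abstract = retrieved - rejected_title_abstract
after_full_reading = after_title_abstract - rejected_full_reading
papers_identified = after_full_reading + added_by_snowballing
primary_studies = papers_identified - rejected_quality_assessment

print(after_title_abstract, after_full_reading, papers_identified, primary_studies)
# prints: 108 43 44 38
```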

In Table 6, we list the 38 PS’s included in the review. The detailed references of all the PS’s are reported in Appendix A.

id Title Authors Year
1 An empirical model of technical debt and interest Nugroho, A. et al. 2011
2 Investigating the impact of design debt on software quality Zazworka, N. et al. 2011
3 Prioritizing design debt investment opportunities Zazworka, N. et al. 2011
4 Estimating the principal of an application’s technical debt Curtis, B. et al. 2012
5 Investigating the impact of code smells debt on quality code evaluation Arcelli Fontana, F. et al. 2012
6 Using technical debt data in decision making: Potential decision approaches Seaman, C. et al. 2012
7 Defining the decision factors for managing defects: A technical debt perspective Snipes, W. et al. 2012
8 A formal approach to technical debt decision making Schmid, K. 2013
9 Challenges to and Solutions for Refactoring Adoption: An Industrial Perspective Sharma, T. et al. 2015
10 Investigating Architectural Technical Debt accumulation and refactoring over time: A multiple-case study Martini, A. et al. 2015
11 On the use of time series and search based software engineering for refactoring recommendation Wang, H. et al. 2015
12 Towards Prioritizing Architecture Technical Debt: Information Needs of Architects and Product Owners Martini, A. and Bosch, J. 2015
13 Validating and prioritizing quality rules for managing technical debt: An industrial case study Falessi, D. and Voegele, A. 2015
14 Developing processes to increase technical debt visibility and manageability – An action research study in industry Yli-Huumo, J. et al. 2016
16 How do software development teams manage technical debt? – An empirical study Yli-Huumo, J. et al. 2016
17 Identifying and quantifying architectural debt Xiao, L. et al. 2016
18 JSpIRIT: A flexible tool for the analysis of code smells Vidal, S. et al. 2016
19 Minimizing refactoring effort through prioritization of classes based on historical, architectural and code smell information Choudhary, A. and Singh, P. 2016
20 Pragmatic approach for managing technical debt in legacy software project Gupta, R.K. et al. 2016
21 Technical debt prioritization using predictive analytics Codabux, Z. and Williams, B.J. 2016
22 Technical Debt Management with Genetic Algorithms Vathsavayi, S.H. and Systa, K. 2016
23 A Heuristic for Estimating the Impact of Lingering Defects: Can Debt Analogy Be Used as a Metric? Akbarinasaji, S. et al. 2017
24 A strategy based on multiple decision criteria to support technical debt management Ribeiro, L.F. et al. 2017
25 An empirical assessment of technical debt practices in industry Codabux, Z. et al. 2017
26 Assessing code smell interest probability: A case study Charalampidou, S. et al. 2017
27 Impact of architectural technical debt on daily software development work - A survey of software practitioners Besker, T. et al. 2017
28 Investigating the identification of technical debt through code comment analysis de Freitas Farias, M.A. et al. 2017
29 Lessons learned from the ProDebt research project on planning technical debt strategically Ciolkowski, M. et al. 2017
30 Looking for Peace of Mind? Manage Your (Technical) Debt: An Exploratory Field Study Ghanbari, H. et al. 2017
31 Revealing social debt with the CAFFEA framework: An antidote to architectural debt Martini, A., Bosch, J. 2017
32 Technical debt interest assessment: From issues to project Martini, A. et al. 2017
33 The magnificent seven: Towards a systematic estimation of technical debt interest Martini, A., Bosch, J. 2017
34 The pricey bill of Technical Debt - When and by whom will it be paid? Besker, T. et al. 2017
35 A semi-automated framework for the identification and estimation of Architectural Technical Debt: A comparative case-study on the modularization of a software component Martini, A. et al. 2018
36 Early evaluation of technical debt impact on maintainability Conejero, J.M. et al. 2018
37 Technical Debt tracking: Current state of practice: A survey and multiple case study in 15 large organizations Martini, A. et al. 2018
38 Identifying and Prioritizing Architectural Debt Through Architectural Smells: A Case Study in a Large Software Company Martini, A. et al. 2018
Table 6: The Selected Papers

3.4 Data Extraction

We extracted data from the 38 primary studies (PS’s) that satisfied the Quality Assessment criteria. The context of each PS is explained in terms of: Context Data, Process Data and Outcome Data as reported in Table 7.

Context Data are necessary to outline the context of each PS in terms of the type of TD evaluated, according to the list proposed by Li et al. (2015). We also extracted data regarding the projects considered in the study, such as the number of projects, the project size (LOC or KLOC) and the programming languages. Moreover, we collected information about the process phase in which the Technical Debt is evaluated.

Process Data describe the process adopted to evaluate and prioritize TD issues. We collected data on the type of process (single activity or continuous process, proactive or reactive), the analysis type (qualitative, quantitative, or mixed evaluation approach), and the statistical methods used for the analysis. We also retrieved information about the frameworks and tools adopted to evaluate and prioritize TD issues. These data are exclusively based on what is reported in the papers, without any kind of personal interpretation.

Outcome Data identify the criteria for removing, refactoring, or remediating TD issues. Moreover, we extracted the measures and factors used to assess the prioritization of a TD issue, and those that the studies suggest not to consider during the prioritization process.

Category Type
Context Data Technical Debt type (according to Li et al. (2015))
Analyzed project (projects #, size and programming languages)
Process phase (i.e. maintainability, changeability, etc.)
Process Data analysis type (qualitative, quantitative, or mixed evaluation approach)
frameworks and tools adopted
statistical methods used for the analysis
process type (single activity or a continuous process, proactive or reactive)
Outcome Data criteria for removing, refactoring, or remediating TD issues
measures and factors used to assess the prioritization (or not) of a TD issue
Table 7: Data Extraction
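
The extraction form in Table 7 could be represented as a simple record type, for example when collecting the data in a script rather than a spreadsheet. The following Python sketch is only one possible encoding: all class and field names are hypothetical, and the values in the usage example are placeholders, not data from any primary study.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class AnalysisType(Enum):
    QUALITATIVE = "qualitative"
    QUANTITATIVE = "quantitative"
    MIXED = "mixed"


@dataclass
class ExtractionRecord:
    """Data extracted for one primary study, mirroring the rows of Table 7."""
    ps_id: int
    # Context data
    td_type: str                                  # one of the types in Table 1
    num_projects: Optional[int] = None
    project_size_loc: Optional[int] = None
    programming_languages: list[str] = field(default_factory=list)
    process_phase: Optional[str] = None
    # Process data
    analysis_type: Optional[AnalysisType] = None
    frameworks_and_tools: list[str] = field(default_factory=list)
    statistical_methods: list[str] = field(default_factory=list)
    continuous_process: Optional[bool] = None     # False means single activity
    proactive: Optional[bool] = None              # False means reactive
    # Outcome data
    remediation_criteria: list[str] = field(default_factory=list)
    prioritization_factors: list[str] = field(default_factory=list)
    factors_not_to_consider: list[str] = field(default_factory=list)


# Placeholder record (hypothetical values, not taken from a real study).
example = ExtractionRecord(ps_id=0, td_type="Code TD",
                           analysis_type=AnalysisType.MIXED)
```

Fields default to None or an empty list so that a record can express "not reported in the paper" without any interpretation being added.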

3.5 Replicability

In order to allow other researchers to replicate and extend our work, we prepared a replication package for this study with the complete results, available at http://www.taibi.it/raw-data/JSS_TD_2019.zip (raw data will be moved to a permanent repository (Mendeley Data) in case of acceptance).

4 Results

4.1 Overview of the Primary Studies

Based on the adopted selection process, we identified 38 Primary Studies (PS’s), as listed in Table 6. Their distribution per year is illustrated in Figure 2.

The first three relevant papers on TD prioritization were published in 2011. Between 2012 and 2014, only three papers were published.

From 2015 onward, the number of publications increased (5 papers in 2015), reaching 10 and 12 papers in 2016 and 2017, respectively.

In 2018 we found only three papers and this number was expected since the study was conducted in the middle of the year.

The selected PS’s are published in 21 different sources, including 6 journals and 15 conferences and workshops. Specifically, the journal publication sources are: (2 papers) Information and Software Technology (IST), (2 papers) Journal of Systems and Software (JSS), (2 papers) IEEE Software, (1 paper) Empirical Software Engineering Journal (EMSE), (1 paper) Journal of Software: Evolution and Process (JSEP), (1 paper) Science of Computer Programming.

Regarding conferences and workshops: (7 papers) Workshop on Managing Technical Debt (MTD), (4 papers) Euromicro Conference on Software Engineering and Advanced Applications (SEAA), (3 papers) International Conference on Agile Software Development (XP), (2 papers) International Conference on Product-Focused Software Process Improvement (PROFES), (2 papers) International Conference on Software Engineering (ICSE), (1 paper) International Conference on Management of Digital Eco Systems (MEDES), (1 paper) International Conference on Services Computing (SCCC), (1 paper) International Workshop on Quantitative Approaches to Software Quality (QuASoQ), (1 paper) International Workshop on Emerging Trends in Software Metrics (WETSoM), (1 paper) International Conference on Enterprise Information Systems (ICEIS), (1 paper) International Symposium on Empirical Software Engineering and Measurement (ESEM), (1 paper) International Conference on Software Architecture Workshop (ICSAW), (1 paper) International Conference on Software Maintenance and Evolution (ICSME) and (1 paper) International Conference on Quality of Software Architectures (QoSA).

Figure 2: Paper distribution per year (x-axis: Year, 2011 to 2018; y-axis: Number of Papers)

4.1.1 Context Data

28 PS’s (75.67%) conducted case studies to investigate technical debt issues by analyzing different sets of projects. 24 out of 28 PS’s report the findings for each analyzed project in terms of number of projects, project size and programming language.

Regarding the number of projects analyzed, the majority of the PS’s considered fewer than 7 each, mostly one project. We identified three papers that considered a large number of projects: 4 with 700 projects, 1 with 44 projects and 5 with 12 projects. Only 11 PS’s report on the project programming language, with Java, C# and C++ being the most common ones.

The remaining papers investigated technical debt issues based on surveys among different practitioners.

Technical Debt issues are mainly (48.64%) investigated with a focus on the maintainability process. The remaining PS’s took into account different process phases, such as Defectively or Changeability.

4.2 RQ1. Which different types of TD have been investigated?

Considering the TD types reported in Table 1, the most frequently considered TD types in the PS’s were Code Debt (38%), Architectural Debt (24%) and Design Debt (10%). Moreover, some PS’s (24%) did not address a specific TD type, but evaluated TD in general.

Figure 3: TD types

4.3 RQ2. Which prioritization approaches have been proposed?

Technical Debt prioritization is considered one of the most important activities when managing TD. The TD prioritization process is used to define the ordering and/or scheduling of planned refactoring initiatives based on the priority of each identified TD item, that is, on the individual item’s impact on the software. Several different prioritization approaches have been proposed by researchers in the reviewed publications, and a few methods for prioritizing TD have been developed. However, there is no unified approach to how the prioritization process of TD should be carried out, nor is there a consensus on which aspects to focus on when performing it. The selection of the prioritization approach is currently context-dependent in most organizations 21.

The prioritization approaches suggested in the reviewed publications mainly focus on a) improving the software quality, b) the decrease in software practitioners’ productivity, c) the effect on the correctness of the software, d) a cost-benefit analysis (CBA) comparing various TD items with respect to low cost and high payoff, or e) a combination of several different approaches.

Studies using internal software quality as a prioritization approach commonly rely on a quality assessment of the software in order to identify the TD items that cause, e.g., the highest maintenance costs 1, 2, 13, 28, 26, 4, 31, 35, 19, together with factors such as the remaining product life, debt severity and its impact on future development activities, and current business-related constraints 3, 9.

Xiao et al. 17 suggest an approach for architectural TD that both locates TD items and ranks and prioritizes them. Their approach returns the TD items that consume the largest maintenance effort and therefore deserve more attention and higher priority for refactoring.

Other reviewed publications also take the decrease in software practitioners’ productivity into consideration when prioritizing TD, since software suffering from, e.g., architectural TD slows down development through the introduction of rework 2, 3.

The effect that TD has on the correctness of the software is also described as an approach for evaluating the different candidate TD items for prioritization 2. More specifically, Fontana, Ferme and Spinelli 5 describe that the prioritization of refactoring of code smells representing design debt can be evaluated by studying the correlations between the smells and change or fault proneness, with the goal of prioritizing “the most dangerous smell and hence the smell which represents the worst technical debt”. When prioritizing defect debt specifically, Akbarinasaji et al. 23 focus their approach on the debt items’ severity (using the categories critical, major, normal, minor) and the duration of bug fixing time.

Codabux et al. 21 used a Bayesian Approach to build a prediction model to determine the “TD proneness” of each TD item using a classification scheme according to the TD proneness probability where the individual items’ risk is assessed.

Other researchers, such as 3, 6, use a cost-benefit analysis when prioritizing the different TD items, focusing on which refactoring activities should be performed first because they are likely to be inexpensive to implement yet have a significant effect, and which refactorings should be postponed due to high cost and low payoff. The purpose of this approach is to make a lucrative investment in the software; the output of the analysis is a list of TD items ordered by the profitability of the possible refactoring activities 3.
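As an illustration only, a cost-benefit ordering of this kind can be sketched in a few lines of Python. The item names, the cost and benefit figures, and the use of a simple benefit-to-cost ratio as the profitability measure are all assumptions made for this sketch; the reviewed studies do not prescribe a specific formula.

```python
from dataclasses import dataclass


@dataclass
class TDItem:
    name: str
    refactoring_cost: float  # hypothetical estimate, e.g. person-hours (must be > 0)
    expected_benefit: float  # hypothetical estimate of the saving, same unit

    @property
    def payoff_ratio(self) -> float:
        # Benefit obtained per unit of refactoring cost (assumed profitability measure).
        return self.expected_benefit / self.refactoring_cost


def prioritize_by_cba(items: list[TDItem]) -> list[TDItem]:
    """Order TD items from most to least profitable to refactor."""
    return sorted(items, key=lambda item: item.payoff_ratio, reverse=True)


# Hypothetical backlog of TD items.
backlog = [
    TDItem("god class in billing", refactoring_cost=40.0, expected_benefit=60.0),
    TDItem("duplicated validation", refactoring_cost=5.0, expected_benefit=20.0),
    TDItem("cyclic module dependency", refactoring_cost=80.0, expected_benefit=40.0),
]
ranked = prioritize_by_cba(backlog)
for item in ranked:
    print(f"{item.name}: ratio {item.payoff_ratio:.2f}")
```

In this sketch, inexpensive items with a high payoff move to the top of the list, while expensive items with a low payoff end up at the bottom and become candidates for postponement.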

This strategy is echoed by Martini et al. 32, who state that “if the interest is (or is going to be) high, the debt is worth being paid. On the contrary, if the interest is not enough to justify the cost of refactoring, there is no reason to ‘waste’ resources to refactor the system.” However, Martini et al. 32 also stress the importance of not basing prioritization decisions only on the separate assessment of single TD items, but of also understanding the overall impact that TD items have on the whole project, focusing on the overall project goals and evaluating the information holistically. Using this approach, Martini et al. 32 also include factors such as the portion of the code affected by the TD, the project size, the roadmap, the positive impact of the TD, the existence of an alternative and the cultural attitude of the team when prioritizing refactoring activities of TD.

By borrowing prioritization approaches from other disciplines, such as finance and psychology, Seaman et al. 6 include techniques such as the Analytic Hierarchy Process (AHP), the portfolio method, and the options approach. The AHP approach involves building a criteria hierarchy, assigning weights and scales to the criteria, and finally performing a series of pairwise comparisons between the alternatives against the various criteria. The goal of using the portfolio approach is to select the assets that maximize the return on investment or minimize the investment risk.
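To make the weighting step of AHP more concrete, the following minimal sketch derives criteria weights from a pairwise comparison matrix. It uses the row geometric mean method, a common approximation of the AHP priority vector, rather than the full eigenvector computation. The three criteria and the comparison values are hypothetical and are not taken from Seaman et al.

```python
import math


def ahp_weights(pairwise: list[list[float]]) -> list[float]:
    """Approximate AHP priority weights with the row geometric mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical criteria for comparing TD items.
criteria = ["maintenance cost", "customer impact", "refactoring effort"]

# pairwise[i][j] states how much more important criterion i is than criterion j
# on a 1-9 scale; the matrix is reciprocal: pairwise[j][i] == 1 / pairwise[i][j].
pairwise = [
    [1.0, 1 / 3, 3.0],
    [3.0, 1.0, 5.0],
    [1 / 3, 1 / 5, 1.0],
]

weights = ahp_weights(pairwise)
for name, weight in zip(criteria, weights):
    print(f"{name}: {weight:.3f}")
```

The same procedure can then be repeated per criterion to compare the alternatives (here, the TD items) against each other, and the resulting scores combined using the criteria weights.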

Codabux et al. 25 stress the importance of adopting a broader perspective on the prioritization process, focusing on the liability of TD: decision-makers need to think beyond the cost associated with fixing the debt and include estimates of the possible future costs resulting from the decision to ship. The additional liability costs to be considered during prioritization are, e.g., the cost of responding to support requests, costs associated with catastrophic failures, and potential litigation costs where service level agreements are violated because of unmanageable debt.

Ribeiro et al. 24 present a decision model based on multiple criteria that combines different prioritization approaches and can be used during different project phases. Their model focuses on criteria such as the severity, the impact the TD items have from a customer perspective, the interest cost of TD, and the project’s properties, such as its lifetime and its possibility of evolution.

Yet another prioritization process that includes different perspectives is the approach described by Ciolkowski et al. 29, which combines the overall software quality with productivity improvement from a future-oriented perspective, using a proactive methodology.

Gupta et al. 20 use a two-level approach when prioritizing TD. First, the TD items are assessed according to their importance and urgency; in a second step, the TD items’ impact on business value and the required effort are assessed.

The TD prioritization approach of Guo et al. 15 gives customer expectations the top priority, followed by the availability of development resources, the interest of the TD items, the current status of the debt-infected modules and the impact of the debt on other features. By studying how software practitioners prioritize TD items in practice, Yli-Huumo et al. 14, 16 conclude that the prioritization approach commonly focuses on scalability, business value, use of a feature, and customer effect.

Further, Snipes et al. 7 suggest a prioritization approach of TD which includes a combination of factors such as severity, the existence of a workaround, urgency of the refactoring required by customers, refactoring effort, the risk of the proposed refactoring, and scope of testing required.

Schmid 8 distinguishes between potential and effective TD, where potential TD is any type of sub-optimal software system, while effective TD comprises issues in the software system that make further development of that system more difficult. This prioritization approach considers aspects such as evolution cost, refactoring cost, and the probability that the predicted evolution path will be realized.
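One possible way to combine these three aspects is an expected-value calculation, sketched below. This is an assumption made for illustration and not the formula proposed by Schmid: the function name, the parameters and the figures are hypothetical.

```python
def expected_net_benefit(
    probability: float,
    evolution_cost_with_debt: float,
    evolution_cost_without_debt: float,
    refactoring_cost: float,
) -> float:
    """Expected saving on a predicted evolution step, minus the refactoring cost.

    Assumed model: the saving only materializes if the predicted evolution
    path is actually realized, which happens with the given probability.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    saving_if_realized = evolution_cost_with_debt - evolution_cost_without_debt
    return probability * saving_if_realized - refactoring_cost


# Same costs, different likelihood that the evolution path is realized.
likely = expected_net_benefit(0.8, 100.0, 40.0, 30.0)
unlikely = expected_net_benefit(0.2, 100.0, 40.0, 30.0)
print(f"likely evolution path: {likely:.1f}")
print(f"unlikely evolution path: {unlikely:.1f}")
```

Under these assumptions, the same TD item is worth refactoring when the evolution path is likely (positive result) and not worth refactoring when it is unlikely (negative result), which reflects the distinction between potential and effective TD.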

Martini and Bosch 33 propose a tool called AnaConDebt to assist during the prioritization process of TD. Their tool assesses the severity of the interest for different TD items, where the calculation of the interest is based on an assessment of seven different factors and their growth. The assessed factors are 1) reduced development speed, 2) bugs related to the TD item, 3) other qualities compromised, 4) other extra-costs, 5) frequency of the issue, 6) spread in the system, and 7) users affected. Vidal et al. 18 also propose a tool, called JSpIRIT, to specifically prioritize source-code-related TD, where the TD items are evaluated according to their importance based on different prioritization criteria. The tool calculates a ranking for a set of code smells according to their importance, and it can be instantiated to prioritize TD items by different criteria. Examples of such criteria are the relevance of the kind of code smell, the history of the system, or different software metrics, among others. Additionally, the developer can use external information to improve the prioritization.
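The idea of assessing several interest factors together with their growth can be sketched as follows. The seven factor names are the ones listed above for AnaConDebt; the integer scale and the aggregation rule (a plain sum of current score and expected growth) are assumptions made for this sketch and do not describe how the tool itself computes interest.

```python
# Factor names as listed in the text; scale and aggregation are hypothetical.
FACTORS = (
    "reduced development speed",
    "bugs related to the TD item",
    "other qualities compromised",
    "other extra-costs",
    "frequency of the issue",
    "spread in the system",
    "users affected",
)


def interest_score(current: dict[str, int], growth: dict[str, int]) -> int:
    """Sum, over the seven factors, of the current score plus its expected growth."""
    missing = [f for f in FACTORS if f not in current or f not in growth]
    if missing:
        raise KeyError(f"missing factor assessments: {missing}")
    return sum(current[f] + growth[f] for f in FACTORS)


# Hypothetical assessment of one TD item: every factor is currently low (1),
# and only the spread in the system is expected to grow.
item_now = {factor: 1 for factor in FACTORS}
item_growth = {factor: 0 for factor in FACTORS}
item_growth["spread in the system"] = 2

score = interest_score(item_now, item_growth)
print(f"interest score: {score}")
```

Scores computed this way for several TD items could then be sorted to obtain a ranking; in practice, the weights of the individual factors would likely need to be calibrated for the specific context.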

4.4 RQ2.1. Are papers prioritizing TD vs TD or TD vs Features?

This research question addresses whether the prioritization process of TD mainly focuses on the prioritization among different TD items, or whether the TD items are described as competing with the implementation of new features.

Budget, resources and available time are important factors in a software project, especially during the prioritization process, since spending time and effort on refactoring activities commonly implies that less time can be spent on, e.g., implementing new features. This is one of the main reasons why software companies do not always spend additional budget and effort on the refactoring of TD, since they commonly have a strong focus on delivering customer-visible features 18.

Ciolkowski et al. 29 describe this situation as follows: “The challenge for project managers is to find a balance when using the given budget and schedule, either by reducing technical debt or by adding technical features. This balance is needed to keep time to market for current product releases short and future maintenance costs at an acceptable level.” Further, Martini, Bosch and Chaudron 10 describe that refactoring initiatives of TD are usually given low priority compared to the implementation of new features, and that TD that is not directly related to the implementation of new features is often postponed.

Vathsavayi and Systä 22 echo this notion, stating that “Deciding on whether to spend resources for developing new features or fixing the debt is a challenging task.” The researchers highlight that software teams need to prioritize new features, bug fixes and refactoring of TD within the same prioritization process.

However, even if the balance between implementing new features and refactoring TD is described as important 31, the papers investigated in this study commonly focus their prioritization approaches on a prioritization among different TD items, with the goal of deciding which item should be refactored first. None of the described prioritization approaches explicitly addresses how the prioritization between implementing new features and spending time and effort on the refactoring of TD should be carried out.

4.5 RQ2.2. Are the prioritization based on a one-shot activity or on a continuous process?

Just as important as prioritizing TD refactoring activities in a project is describing a management strategy for the prioritization process.

Some of the reviewed publications highlight the prioritization process of TD as being a continuous, integrated and iterative process 16, 22, while others stress the importance of prioritizing refactoring of TD within each sprint 15. Choudhary et al. 19 illustrate the prioritization process as being an integral part of the continuous development process by saying that “ideally software companies try to incorporate refactoring practices as an integral part of their development and maintenance processes” 9.

However, most of the reviewed publications in this study do not give any specific recommendation on how often or in what way the prioritization of TD should be carried out.

4.6 RQ3. How different types of TD are evaluated?

In accordance with the results shown for RQ1, we focused only on the two most considered TD types: code and architectural. For the other TD types, we do not have enough information to provide an answer.

Code TD is generally investigated from the point of view of its impact on one or more software qualities 13, 18, 19, 26. Maintainability 4, 5, 11 and maintenance effort 1, 2, 11, 19 are the ones most considered by the PS’s. Code Debt evaluation and prioritization is mostly based on code smells 2, 5, 11, 18, 19, 26.

Other metrics are also considered, such as the time 4, 23 or cost 1 to fix a violation, and quality rules 13.

Factors related to subjective evaluation, such as customer feedback 23 or developers’ comments in the code 28, are evaluated less often.

The approaches mainly involve models that reduce TD by removing or refactoring code smells or other metrics 11, 18. These approaches look at the impact of code smells on other ones 5, make a comparison with classes without smells 2, 26, or rank the code rules 13 by the criticality perceived by developers.

Architectural TD is generally investigated by taking into account the role of architectural smells 17, 19, 20 or complex architectural design 17, 27 that negatively impact software quality 17, 19, 20.

Architectural TD is evaluated by measuring the extra maintenance effort for bug fixing 17 or by analyzing the bug proneness 17 of the code. Another approach combines three different perspectives, namely historical data of the projects, architectural design, and severity of the class, to prioritize the refactoring activities 19.

Complex architectural design is used to identify high interest in terms of wasted time related to architectural TD 27, in combination with other metrics such as the number of files and the percentage of complex functions and files 35.

Another approach identifies dependencies and social gaps across the architecture organization in order to define architectural TD 31.

4.7 RQ4. How TD principal and TD interest are evaluated?

Four PS’s considered only Interest (13, 17, 27, 34), while six other PS’s considered both Principal and Interest during the prioritization process (1, 10, 13, 15, 23, 35).

Principal. Principal is calculated as the cost 1, 10 or time 1, 4 to fix technical quality issues 1 or quality rule violations 13. Other factors are also considered, such as page rank or customer feedback 23.

Interest. Interest is calculated as the extra cost spent on maintenance due to technical quality issues 1, 10, 17, 35 or as wasted time related to different activities (management or refactoring) 27, 34.

Principal is compared with Interest, discarding any item for which the benefit does not outweigh the cost 15. Among the factors considered, customer expectations have the top priority, followed by the availability of development resources, the interest of the technical debt items, the current status of the debt-infected modules and the impact of the debt on other features 15.

4.8 RQ5. Which characteristics and measures have been considered when prioritizing TD?

In Table 8 we present an “Impact Map”, which highlights that there is a plethora of factors related to the impact (interest) of Technical Debt to be considered for prioritization, and that they vary widely across studies and projects. In total, we count 53 unique factors.

A few of the factors might overlap, although they are calculated differently in different papers. For example, “number of bugs” and “ROI (calculated on number of bugs)” are obviously overlapping factors, although using the sheer number of bugs or the cost of their impact as indicators might give very different results when prioritizing. In other cases, a generic concept of “interest” or “cost” has been used, although such values have probably been implicitly calculated by the researchers or practitioners by taking into consideration some of the other 52 factors explicitly mentioned in the other papers. However, given the reported information, there is no way to perform such a mapping: thus, we report a generic factor, for example “risk”, as different from all the other specific ones.

Some of the factors have been grouped in categories. The majority of the papers focus on the impact of TD on maintainability (12). Some papers focus on productivity (7), evolvability (5) and other system qualities (6), while 5 papers take into consideration the customer perspective.

Only a few papers take into consideration other factors, such as business- (3), social- (3), project- (3) and other uncategorized factors (6). In most of these cases (including the customer aspect), the identified factors have been reported in only one or two papers. This highlights either their specificity for a particular context or a lack of focus on these factors in the literature. Both in 10 and 24, the authors conducted a survey with practitioners to understand which of these factors are the most important for developers, architects and product owners. In most cases, customer and business factors have been considered the most important ones. However, only a few papers have addressed such factors when prioritizing TD, so we can conclude that these factors have been overlooked in the literature.

In quite a few studies (8), the interest (impact) of Technical Debt has been identified and assessed as generic interest, interest likelihood, risk, severity or as customizable by the practitioners. 6 papers present factors that have not been categorized specifically in the previously mentioned categories and that represent an impact of TD that spans multiple categories or represents a specific aspect not related to such categories.

Eight other papers assume that the TD impact is associated with the (co-)occurrence of instances of different issues (e.g. code smells) that are considered sub-optimal (“quantity of debt” in the table). However, the measures used change from paper to paper according to the tools used, and the impact of the individual issues is assumed to be the same or has been arbitrarily assigned. Very few papers (4) use an estimate or a measure of the cost of refactoring (principal) in contrast to the impact of TD (interest). This is in contrast with the theoretical approach (Chatzigeorgiou et al. (2015), Martini and Bosch (2016), 8), according to which TD needs to be prioritized by taking into consideration both the cost of refactoring and the impact.

| Category | Factors | PS’s ID | # |
|---|---|---|---|
| business | competitive advantage | 10 | 3 |
| | lead time | 10 | |
| | attractiveness for the market | 10 | |
| | penalties | 10 | |
| | feature usage | 16 | |
| | business value | 16 | |
| | ROI (calculated per bug) | 20 | |
| customer | satisfaction | 12 | 5 |
| | long-term satisfaction | 10 | |
| | specific customer value | 10 | |
| | customer expectations | 13 | |
| | customer effect | 16, 24 | |
| evolution | time of impact on evolution (short or long-term) | 8 | 5 |
| | risk of critical impact on evolution (possible crisis) | 8 | |
| | impact on other features | 13, 24 | |
| | impact on upcoming features | 22, 24, 32 | |
| maintenance | modifiability | 2, 18, 26, 28 | 12 |
| | number of bugs | 2, 10, 11, 17, 20, 23, 28, 32, 33, 38 | |
| | maintenance cost | 10, 17, 35 | |
| productivity | % wasted time (effort) | 27, 32, 33, 34, 35, 38 | 7 |
| | number of developers working on TD | 35 | |
| | wasted development hours | 35 | |
| | generic effort | 24 | |
| | coding output/effort | 29 | |
| project factors | availability of resources | 13 | 3 |
| | project size and complexity | 32 | |
| | postponement of bugs | 23 | |
| quality debt | # of issues or their co-occurrence | 9, 16, 28, 29, 25, 32, 35, 36 | 8 |
| social factors | developers’ morale | 30 | 3 |
| | social debt | 31 | |
| | positive impact of TD | 32 | |
| | team culture | 32 | |
| system qualities | robustness | 4 | 6 |
| | performance efficiency | 2, 4, 12 | |
| | security | 4 | |
| | transferability | 4 | |
| | scalability | 16 | |
| | generic qualities | 32, 33, 38 | |
| other factors | contagious debt | 10 | 6 |
| | existence of TD solution (alternative) | 32 | |
| | spread of impact in the system | 32, 33, 38 | |
| | number of users affected | 32, 33, 38 | |
| | frequency of negative impact | 32, 33, 38 | |
| | kind of smell | 18, 24 | |
| | history of the system | 18 | |
| | compromise architecture | 18 | |
| | future cost | 22 | |
| | user perception | 24 | |
| not specified | risk | 10, 25 | 8 |
| | interest likelihood | 13, 22 | |
| | interest | 13, 24 | |
| | severity | 24, 38 | |
| | customizable | 18, 24, 25, 32, 33, 38 | |

Table 8: Impact Map: Factors and measures related to the interest of TD considered when prioritizing (RQ5)

4.9 RQ6. Which tools have been used to prioritize TD?

As reported in Table 9, only 14 of the papers analyzed in this SLR provide stakeholders with tools that support TD evaluation and prioritization. Among them, 11 give some indication about the tools used, while 3 do not mention the name of the tool. The other studies used ad hoc tools developed by the authors themselves for their specific purposes.

| Tool Name | Tool link | Paper id. |
|---|---|---|
| AnaConDebt | https://anacondebt.com | 32, 33 |
| CAFFEA | not available | 31 |
| CAST | https://www.castsoftware.com | 4 |
| Coverity | http://www.coverity.com | 20 |
| FindBugs | http://findbugs.sourceforge.net/ | 20 |
| FxCop | not available | 20 |
| iPlasma | http://loose.upt.ro/iplasma/index.html | 5 |
| JSpIRIT | https://sites.google.com/site/santiagoavidal/projects/jspirit | 18 |
| SciTools Understand | https://scitools.com/ | 21 |
| SonarQube | https://www.sonarqube.org/ | 30 |
| ARCAN Arcelli Fontana et al. (2016), Fontana et al. (2017) | http://essere.disco.unimib.it/wiki/arcan | 38 |

Table 9: Tools used when prioritizing TD (RQ6)

SonarQube and CAST Curtis et al. (2012) are commercial tools commonly used to analyze code compliance against a set of rules. They recommend customizing the out-of-the-box set of rules.

AnaConDebt Martini (2018) is a management tool based on a TD-enhanced backlog. The backlog allows the creation of TD Items and performs TD-specific operations on the created items.

The CAFFEA framework Martini and Bosch (2016) identifies the organizational roles to which architectural responsibilities are allocated, defining the team members and how the responsibilities are shared among them. The framework has been proven to help in managing Architectural Debt.

iPlasma Marinescu et al. (2005) is an integrated environment for quality analysis of object-oriented software systems. The tool provides support for all the analysis phases: from model extraction up to high-level metrics-based analysis.

The ARCAN tool exploits graph databases to perform graph queries, which allow higher scalability in the detection and management of a large number of different kinds of dependencies.

5 Discussion

In this Section, we discuss the results obtained, outlining some implications for researchers and practitioners working in the TD domain.

Although the TD domain is relatively young compared to other domains such as software testing or software quality, significant contributions were published in the last ten years and researchers are becoming more and more active (Figure 2).

Among the ten TD types proposed in 2015 by Li et al. (2015) (Table 1), Code Debt and Architectural Debt are the most considered by researchers (RQ1) in the context of TD prioritization.

In the study by Li et al. (2015), Code Debt was the most investigated TD type, followed by Test Debt. However, other types of TD also received significant attention. Unlike in Li et al. (2015), our work shows that Code and Architectural Debt are by far the most investigated types of debt when considering prioritization.

This could be due to their ease of measurement, mainly based on extensions of previous research in other domains, or to the fact that they are considered (specifically ATD) the most harmful and expensive to manage in the software. For example, architectural and code patterns have been investigated for more than twenty years, even if they were not considered as “debt”.

In software affected by TD, the only significantly effective way to reduce it is to refactor. This fact stresses the importance of continuously and iteratively prioritizing the identified refactoring tasks, and thereby highlights the importance of using an appropriate TD prioritization process. Through this study, we have identified several different approaches for prioritizing TD (RQ2, RQ2.1, and RQ2.2). However, there is no unified approach for this activity, nor is there a consensus on which aspects to focus on when performing the prioritization process of TD.

It is clear from the findings that the prioritization process of TD refactoring can be carried out using different approaches, all having different goals and optimizing on different criteria.

This study has identified five main approaches, aiming to a) improve software qualities, especially maintainability and evolvability, b) increase software practitioners’ productivity, c) reduce the fault proneness of the software, d) compare various TD items using cost-benefit analysis (CBA) to understand the convenience of refactoring, and e) combine several different approaches.

This result is of value to both academics and practitioners, illustrating that it is important to first identify the goals of the TD prioritization, and thereafter to implement a corresponding TD prioritization approach targeting the identified and specified goals.

One interesting finding is that the investigated papers commonly only compare different TD items during the prioritization process, and more rarely compare the need for implementing a new feature with refactorings of TD.

The two most considered types of TD (code and architectural debt) (RQ3) are mainly evaluated by means of architectural or code-level anti-patterns (architectural smells, code smells, or code violations). Moreover, their harmfulness is mainly related to the influence they have on some external quality (e.g., the impact of a specific code smell on the maintenance effort). However, their influence is still not clear, since the vast majority of studies do not agree on their harmfulness. Other types of TD should be investigated in the future. We believe that code debt is the most investigated because of the ease of access to the data by means of mining software repository studies, while other types of debt require other types of studies, including case studies involving developers. We recommend that practitioners consider the measures identified in this RQ, but complement them with expert judgement to understand which architectural smells, code smells, or code violations to consider.

Considering the two main components of TD (RQ4), only a limited number of papers proposed how to evaluate principal and interest. Interest is mainly calculated as extra cost or as time wasted to fix TD issues. The reason could be that TD interest is not easy to calculate without access to empirical data from companies. Researchers should design and perform studies to understand the actual interest of existing TD issues.

Regarding the characteristics and measures considered during the prioritization process (RQ5), the results so far imply that prioritizing TD is an activity that requires a holistic view of several factors. The systematic assessment of TD requires a large amount of information, which might change from case to case, and in most cases Technical Debt is prioritized without following a standardized approach. Also, the known measures, used in a few papers, capture only a small part of the factors that are used to prioritize TD (proxy for maintenance costs or productivity). Using only such measures to prioritize TD without considering the full picture of the relevant factors (risks and costs) might consequently result in a partial and thus biased prioritization, which in turn could lead to poor business decisions. On the other hand, some of the factors have been reported in a single study conducted in a specific context and might not be relevant in other prioritization cases.

More studies are necessary in order to acquire better evidence on factors that have been overlooked (for example, those related to customers, business, social, and project aspects). In addition, we need to better understand which factors should be considered in different contexts, and which additional measures should be considered when prioritizing TD. Finally, although a few holistic approaches have been reported (Martini and Bosch (2016), 24, 33), there is a need for a better defined framework and a standardized approach to assess TD.

Tool support for prioritization activities is very fragmented (RQ6), highlighting the lack of a solid, widely used and validated set of tools specific to TD prioritization. Current tools mainly identify TD issues and, in some cases, propose an estimate of the time needed to fix them. However, to the best of our knowledge, no tool calculates the interest due to the postponement of the activities.

Results can be useful for researchers and practitioners. Researchers should focus on the other types of TD, also considering the TD types that have been less investigated in the last few years. They can also evaluate approaches, factors, and measures, and how to prioritize them. Moreover, since the available tools are not mature, research activities can focus on the empirical validation of existing tools, confirming the usefulness of each measure proposed by each tool.

Practitioners can benefit from our results by applying our impact map to explore and anticipate what kind of impact might occur because of TD. Moreover, they should be careful when selecting tools, considering more than one tool rather than applying only one.

6 Threats to Validity

The results of an SLR may be subject to validity threats, mainly concerning the correctness and completeness of the survey. In this Section, we discuss these threats. We structured this Section as proposed by Wohlin et al. (2012), including construct, internal, external and conclusion validity threats.

6.1 Construct validity

Construct validity is related to the generalization of the result to the concept or theory behind the study execution Wohlin et al. (2012). In our case, the threats are related to the potentially subjective analysis of the selected studies. As recommended by Kitchenham’s guidelines Kitchenham and Charters (2007), data extraction was performed independently by two or more researchers and, in case of discrepancies, a third author was involved in the discussion to resolve the disagreements. Moreover, the quality of each selected paper was checked according to the protocol proposed by Dybå and Dingsøyr (2008).

6.2 Internal validity

Internal validity threats are related to possible wrong conclusions about causal relationships between treatment and outcome Wohlin et al. (2012). In the case of secondary studies, internal validity represents how well the findings of the review reflect those reported in the literature. In order to address these threats, we carefully followed the tactics proposed by Kitchenham and Charters (2007).

6.3 External validity

External validity threats are related to the ability to generalize the results Wohlin et al. (2012). In secondary studies, external validity depends on the validity of the selected studies. If the selected studies are not externally valid, neither will the synthesis of their content be. In our work, we were not able to evaluate the external validity of all the included studies.

6.4 Conclusion validity

Conclusion validity threats are related to the reliability of the conclusions drawn from the results Wohlin et al. (2012). In our case, threats are related to the potential non-inclusion of some studies. In order to mitigate this threat, we carefully applied the search strategy, performing the search in eight digital libraries in conjunction with the snowballing process Wohlin (2014), considering all the references presented in the retrieved papers and evaluating all the papers that reference the retrieved ones, which resulted in one additional relevant paper. We applied a broad search string, which resulted in a large set of articles but enabled us to include more possible results. We defined inclusion and exclusion criteria and applied them first to title and abstract and then to the full text. However, we did not rely exclusively on titles and abstracts to establish whether a work reported evidence on Technical Debt prioritization: before accepting a paper based on its title and abstract, we browsed the full text.

7 Conclusion

Software companies need to manage and refactor Technical Debt issues, since their presence is sometimes inevitable due to a number of causes that can often be related to unpredictable business or environmental forces internal or external to the organization. Moreover, some TD can be more dangerous than others.

Therefore, it is necessary to understand when refactoring TD should be prioritized with respect to features and bugs, or to other TD.

We conducted an SLR in order to investigate the existing body of knowledge in software engineering to understand how Technical Debt is prioritized in software organizations and what research approaches have been proposed.

The SLR process has been carried out by following two rigorous approaches. We included scientific articles indexed by the most important bibliographic sources and selected by a rigorous process. We considered articles published before December 2018. Our work is based on 38 selected studies, which include data on the state of the art concerning approaches, factors, measures, and tools used in practice or proposed in research to prioritize Technical Debt.

The results of our review show that Code and Architectural Debt are by far the most investigated types of debt when considering prioritization, and there is scant evidence about the other TD types such as Test Debt and Requirement Debt. The prioritization of TD refactoring can be carried out using different approaches, all having different goals and optimizing for different criteria. However, the identified measures, used in only a few papers, capture only a small part of the factors that are used to prioritize TD.

There is a lack of empirical evidence on measuring principal and interest. Moreover, our results highlighted the lack of a solid, validated, and widely used set of tools specific to TD prioritization.

In practice, we found that there is a plethora of aspects that need to be considered when prioritizing TD. We report an impact map of such factors, which can be used as a comprehensive reference of which interest might be paid by an organization and how it should be considered. Such a map can also be used to follow up with further research.

Future work should focus on the investigation of TD types that are less investigated. Moreover, we are planning to investigate how to systematically evaluate and measure the principal and interest of TD of different types. We also aim at developing a framework to support decision making related to the prioritization of Technical Debt.

References

  • Cunningham (1992) W. Cunningham, The wycash portfolio management system, SIGPLAN OOPS Mess. 4 (1992) 29–30.
  • Avgeriou et al. (2016) P. Avgeriou, P. Kruchten, I. Ozkaya, C. Seaman, Managing Technical Debt in Software Engineering (Dagstuhl Seminar 16162), Dagstuhl Reports 6 (2016) 110–138.
  • Martini et al. (2015) A. Martini, J. Bosch, M. Chaudron, Investigating architectural technical debt accumulation and refactoring over time: A multiple-case study, Information and Software Technology 67 (2015) 237–253.
  • Besker et al. (2018) T. Besker, A. Martini, R. E. Lokuge, K. Blincoe, J. Bosch, Embracing technical debt, from a startup company perspective, in: 2018 IEEE International Conference on Software Maintenance and Evolution (ICSME), pp. 415–425.
  • Li et al. (2015) Z. Li, P. Avgeriou, P. Liang, A systematic mapping study on technical debt and its management, Journal of Systems and Software 101 (2015) 193–220.
  • Besker et al. (2018) T. Besker, A. Martini, J. Bosch, Technical debt cripples software developer productivity: A longitudinal study on developers’ daily software development work, in: Proceedings of the 2018 International Conference on Technical Debt, TechDebt ’18, pp. 105–114.
  • Seaman et al. (2012) C. B. Seaman, Y. Guo, C. Izurieta, Y. Cai, N. Zazworka, F. Shull, A. Vetro, Using technical debt data in decision making: Potential decision approaches, 2012 Third International Workshop on Managing Technical Debt (MTD) (2012) 45–48.
  • Martini and Bosch (2016) A. Martini, J. Bosch, An empirically developed method to aid decisions on architectural technical debt refactoring: Anacondebt, in: Proceedings of the 38th International Conference on Software Engineering Companion, ICSE ’16, pp. 31–40.
  • McConnell (2013) S. McConnell, Managing technical debt, http://www.sei.cmu.edu/community/td2013/program/upload/techncaldebt-icse.pdf (2013).
  • Avgeriou et al. (2016) P. Avgeriou, P. Kruchten, R. L. Nord, I. Ozkaya, C. Seaman, Reducing friction in software development, IEEE Softw. 33 (2016) 66–73.
  • Tom et al. (2013) E. Tom, A. Aurum, R. Vidgen, An exploration of technical debt, Journal of Systems and Software 86 (2013) 1498–1516.
  • Besker et al. (2018) T. Besker, A. Martini, J. Bosch, Managing architectural technical debt: A unified model and systematic literature review, Journal of Systems and Software 135 (2018) 1–16.
  • Rios et al. (2018) N. Rios, M. G. de Mendonça Neto, R. O. Spínola, A tertiary study on technical debt: Types, management strategies, research trends, and base information for practitioners, Information and Software Technology 102 (2018) 117–145.
  • Ampatzoglou et al. (2015) A. Ampatzoglou, A. Ampatzoglou, A. Chatzigeorgiou, P. Avgeriou, The financial aspect of managing technical debt: A systematic literature review, Information and Software Technology 64 (2015) 52–73.
  • Ribeiro et al. (2016) L. F. Ribeiro, M. A. d. F. Farias, M. Mendonça, R. O. Spínola, Decision criteria for the payment of technical debt in software projects: A systematic mapping study, in: 18th International Conference on Enterprise Information Systems, ICEIS 2016, Portugal, pp. 572–579.
  • Alves et al. (2016) N. S. Alves, T. S. Mendes, M. G. de Mendonça, R. O. Spínola, F. Shull, C. Seaman, Identification and management of technical debt: A systematic mapping study, Information and Software Technology 70 (2016) 100–121.
  • Fernández-Sánchez et al. (2017) C. Fernández-Sánchez, J. Garbajosa, A. Yagüe, J. Perez, Identification and analysis of the elements required to manage technical debt by means of a systematic mapping study, Journal of Systems and Software 124 (2017) 22–38.
  • Behutiye et al. (2017) W. N. Behutiye, P. Rodríguez, M. Oivo, A. Tosun, Analyzing the concept of technical debt in the context of agile software development: A systematic literature review, Information and Software Technology 82 (2017) 139–158.
  • Kitchenham and Charters (2007) B. Kitchenham, S. Charters, Guidelines for performing systematic literature reviews in software engineering, 2007.
  • Kitchenham and Brereton (2013) B. Kitchenham, P. Brereton, A systematic review of systematic review process research in software engineering, Information & Software Technology 55 (2013) 2049–2075.
  • Wohlin (2014) C. Wohlin, Guidelines for snowballing in systematic literature studies and a replication in software engineering, in: EASE 2014.
  • Dybå and Dingsøyr (2008) T. Dybå, T. Dingsøyr, Empirical studies of agile software development: A systematic review, Inf. Softw. Technol. 50 (2008) 833–859.
  • Chatzigeorgiou et al. (2015) A. Chatzigeorgiou, A. Ampatzoglou, A. Ampatzoglou, T. Amanatidis, Estimating the breaking point for technical debt, in: 2015 IEEE 7th International Workshop on Managing Technical Debt (MTD), pp. 53–56.
  • Martini and Bosch (2016) A. Martini, J. Bosch, A multiple case study of continuous architecting in large agile companies: Current gaps and the caffea framework, in: 2016 13th Working IEEE/IFIP Conference on Software Architecture (WICSA), pp. 1–10.
  • Arcelli Fontana et al. (2016) F. Arcelli Fontana, I. Pigazzini, R. Roveda, M. Zanoni, Automatic detection of instability architectural smells, in: Proc. 32nd Intern. Conf. on Software Maintenance and Evolution (ICSME 2016), IEEE, Raleigh, North Carolina, USA, 2016.
  • Fontana et al. (2017) F. A. Fontana, I. Pigazzini, R. Roveda, D. Tamburri, M. Zanoni, E. D. Nitto, Arcan: A tool for architectural smells detection, in: 2017 IEEE International Conference on Software Architecture Workshops (ICSAW), pp. 282–285.
  • Curtis et al. (2012) B. Curtis, J. Sappidi, A. Szynkarski, Estimating the principal of an application’s technical debt, IEEE Software 29 (2012) 34–42.
  • Martini (2018) A. Martini, Anacondebt: A tool to assess and track technical debt, in: Proceedings of the 2018 International Conference on Technical Debt, TechDebt ’18, pp. 55–56.
  • Marinescu et al. (2005) C. Marinescu, R. Marinescu, P. Florin Mihancea, D. Ratiu, R. Wettel, iplasma: An integrated platform for quality assessment of object-oriented design, pp. 77–80.
  • Wohlin et al. (2012) C. Wohlin, P. Runeson, M. Höst, M. C. Ohlsson, B. Regnell, Experimentation in Software Engineering, Springer, 2012.

Appendix A: The Selected Papers

  1. A. Nugroho, J. Visser, and T. Kuipers. An empirical model of technical debt and interest. 2nd Workshop on Managing Technical Debt (MTD ’11). pp. 1-8, 2011.

  2. N. Zazworka, M. A. Shaw, F. Shull, and C. Seaman. Investigating the impact of design debt on software quality. 2nd Workshop on Managing Technical Debt (MTD ’11). pp. 17-23, 2011.

  3. N. Zazworka, C. Seaman, and F. Shull. Prioritizing design debt investment opportunities. 2nd Workshop on Managing Technical Debt (MTD ’11). pp. 39-42, 2011.

  4. B. Curtis, J. Sappidi and A. Szynkarski. Estimating the Principal of an Application’s Technical Debt. IEEE Software, vol. 29, no. 6, pp. 34-42, 2012.

  5. F. Arcelli Fontana, V. Ferme, and S. Spinelli. Investigating the impact of code smells debt on quality code evaluation. Third International Workshop on Managing Technical Debt (MTD ’12). pp. 15-22, 2012.

  6. C. Seaman et al. Using technical debt data in decision making: Potential decision approaches. Third International Workshop on Managing Technical Debt (MTD’12), pp. 45-48, 2012.

  7. W. Snipes, B. Robinson, Y. Guo, and C. Seaman. Defining the decision factors for managing defects: a technical debt perspective. Third International Workshop on Managing Technical Debt (MTD ’12), pp. 54-60, 2012.

  8. K. Schmid. A formal approach to technical debt decision making. 9th international ACM Sigsoft conference on Quality of software architectures (QoSA ’13), pp. 153-162, 2013.

  9. T. Sharma, G. Suryanarayana and G. Samarthyam. Challenges to and Solutions for Refactoring Adoption: An Industrial Perspective. IEEE Software, vol. 32, no. 6, pp. 44-51, 2015.

  10. A. Martini, J. Bosch, M. Chaudron. Investigating Architectural Technical Debt accumulation and refactoring over time: A multiple-case study. Information and Software Technology, Volume 67, pp. 237-253, 2015.

  11. H. Wang, M. Kessentini, W. Grosky, and H. Meddeb. On the use of time series and search based software engineering for refactoring recommendation. 7th International Conference on Management of computational and collective intElligence in Digital EcoSystems (MEDES ’15). pp. 35-42, 2015.

  12. A. Martini and J. Bosch. Towards Prioritizing Architecture Technical Debt: Information Needs of Architects and Product Owners. 41st Euromicro Conference on Software Engineering and Advanced Applications. pp. 422-429, 2015.

  13. D. Falessi and A. Voegele. Validating and prioritizing quality rules for managing technical debt: An industrial case study. 7th International Workshop on Managing Technical Debt (MTD). pp. 41-48, 2015.

  14. J. Yli-Huumo, A. Maglyas, K. Smolander, J. Haller and H. Törnroos. Developing Processes to Increase Technical Debt Visibility and Manageability - An Action Research Study in Industry. Product-Focused Software Process Improvement. pp. 368-378, 2016.

  15. Y. Guo, R. Oliveira Spínola, and C. Seaman. Exploring the costs of technical debt management - a case study. Empirical Software Engineering, Volume 21(1), pp. 159-182, 2016.

  16. J. Yli-Huumo, A. Maglyas, and K. Smolander. How do software development teams manage technical debt? - An empirical study. Journal of Systems and Software, Vol. 120, C, pp. 195-218, 2016.

  17. L. Xiao, Y. Cai, R. Kazman, R. Mo, and Q. Feng. Identifying and quantifying architectural debt. 38th International Conference on Software Engineering (ICSE ’16), pp. 488-498, 2016.

  18. S. Vidal, H. Vazquez, J. A. Diaz-Pace, C. Marcos, A. Garcia and W. Oizumi. JSpIRIT: a flexible tool for the analysis of code smells. 34th International Conference of the Chilean Computer Science Society (SCCC), pp. 1-6, 2015.

  19. A. Choudhary and P. Singh. Minimizing Refactoring Effort through Prioritization of Classes based on Historical, Architectural and Code Smell Information. QuASoQ/TDA@APSEC, 2016.

  20. R.K. Gupta, P. Manikreddy, S. Naik, and K. Arya. Pragmatic Approach for Managing Technical Debt in Legacy Software Project. 9th India Software Engineering Conference (ISEC ’16), pp. 170-176, 2016.

  21. Z. Codabux and B. J. Williams. Technical debt prioritization using predictive analytics. 38th International Conference on Software Engineering Companion (ICSE ’16), pp. 704-706, 2016.

  22. S. H. Vathsavayi and K. Systä. Technical Debt Management with Genetic Algorithms. 42nd Euromicro Conference on Software Engineering and Advanced Applications (SEAA), Limassol, pp. 50-53, 2016.

  23. S. Akbarinasaji, A. Bener and A. Neal. A Heuristic for Estimating the Impact of Lingering Defects: Can Debt Analogy Be Used as a Metric?. 8th Workshop on Emerging Trends in Software Metrics (WETSoM), pp. 36-42, 2017.

  24. L. F. Ribeiro, N. S. R. Alves, M. G. d. M. Neto and R. O. Spínola. A Strategy Based on Multiple Decision Criteria to Support Technical Debt Management. 43rd Euromicro Conference on Software Engineering and Advanced Applications (SEAA), pp. 334-341, 2017.

  25. Z. Codabux, B. Williams, G. Bradshaw and M. Cantor. An empirical assessment of technical debt practices in industry. Journal of Software: Evolution and Process. Vol. 29, 2017.

  26. S. Charalampidou, A. Ampatzoglou, A. Chatzigeorgiou, and P. Avgeriou. Assessing code smell interest probability: a case study. XP2017 Scientific Workshops (XP ’17), Article 5, 8 pages, 2017.

  27. T. Besker, A. Martini and J. Bosch. Impact of Architectural Technical Debt on Daily Software Development Work — A Survey of Software Practitioners. 43rd Euromicro Conference on Software Engineering and Advanced Applications (SEAA), pp. 278-287, 2017.

  28. M. Farias, J. Amâncio Santos, M. Kalinowski, M. Mendonça and R. Spínola. Investigating the Identification of Technical Debt Through Code Comment Analysis. Lecture Notes in Business Information Processing. pp. 284-309, 2017.

  29. M. Ciolkowski, L. Guzmán, A. Trendowicz and F. Salfner. Lessons Learned from the ProDebt Research Project on Planning Technical Debt Strategically. International Conference on Product-Focused Software Process Improvement. pp. 523-534, 2017.

  30. H. Ghanbari, T. Besker, A. Martini and J. Bosch. Looking for Peace of Mind? Manage Your (Technical) Debt: An Exploratory Field Study. International Symposium on Empirical Software Engineering and Measurement (ESEM), pp. 384-393, 2017.

  31. A. Martini and J. Bosch. Revealing Social Debt with the CAFFEA Framework: An Antidote to Architectural Debt. International Conference on Software Architecture Workshops (ICSAW), pp. 179-181, 2017.

  32. A. Martini, S. Vajda, J. Vasa, A. Jones, M. Abdelrazek, J. Grundy and J. Bosch. Technical debt interest assessment: from issues to project. XP2017 Scientific Workshops. pp. 1-6, 2017.

  33. A. Martini and J. Bosch. The magnificent seven: towards a systematic estimation of technical debt interest. XP2017 Scientific Workshops (XP ’17), Article 7, 5 pages, 2017.

  34. T. Besker, A. Martini and J. Bosch. The Pricey Bill of Technical Debt: When and by Whom will it be Paid?. International Conference on Software Maintenance and Evolution (ICSME), pp. 13-23, 2017.

  35. A. Martini, E. Sikander, and N. Madlani. A semi-automated framework for the identification and estimation of Architectural Technical Debt. Information and Software Technology, Vol. 93, C, pp. 264-279, 2018.

  36. J. M. Conejero, R. Rodríguez-Echeverría, J. Hernández, P. J. Clemente, C. Ortiz-Caraballo, E. Jurado, F. Sánchez-Figueroa. Early evaluation of technical debt impact on maintainability. Journal of Systems and Software. Vol. 142, pp. 92-114, 2018.

  37. A. Martini, T. Besker, J. Bosch. Technical Debt tracking: Current state of practice: A survey and multiple case study in 15 large organizations. Science of Computer Programming. Vol. 163, pp. 42-61, 2018.

  38. A. Martini, F. Arcelli Fontana, A. Biaggi, R. Roveda. Identifying and Prioritizing Architectural Debt Through Architectural Smells: A Case Study in a Large Software Company. 12th European Conference on Software Architecture (ECSA), pp. 24-28, 2018.