Employment of Multiple Algorithms for Optimal Path-based Test Selection Strategy

02/22/2018 ∙ by Miroslav Bures, et al. ∙ Czech Technical University in Prague

Executing various sequences of system functions in a system under test represents one of the primary techniques in software testing. The natural way to create effective, consistent and efficient test sequences is to model the system under test and employ an algorithm to generate the tests that satisfy a defined test coverage criterion. Several criteria of test set optimality can be defined. In addition, to optimize the test set from an economic viewpoint, the priorities of the various parts of the system model under test must be defined. Using this prioritization, the test cases exercise the high priority parts of the system under test more intensely than those with low priority. Evidence from the literature and our observations confirm that finding a universal algorithm that produces an optimal test set for all test coverage and test set optimality criteria is a challenging task. Moreover, for different individual problem instances, different algorithms provide optimal results. In this paper, we present a path-based strategy to perform optimal test selection. The strategy first employs a set of current algorithms to generate test sets; then, it assesses the optimality of each test set by the selected criteria, and finally, chooses the optimal test set. The experimental results confirm the validity and usefulness of this strategy. For individual instances of 50 system under test models, different algorithms provided optimal results; these results varied by the required test coverage level, the size of the priority parts of the model, and the selected test set optimality criteria.


I Introduction

The natural way to construct a test case is to chain a sequence of specific calls to various functions of the system under test (SUT). Whether a method flow is designed, API calls are sequenced for an integration test, or a test scenario is prepared for a manual business end-to-end test, following a systematic approach that generates consistent and effective test sequences is essential. The field of model-based testing provides a solution for this issue: we first model a particular SUT process or workflow in a suitable notation and then use an appropriate algorithm to generate the flows (i.e., path-based test cases).

To generate path-based test cases systematically and consistently, a SUT model based on a directed graph is used [1]. Several algorithms have been presented (e.g., [2, 3, 4, 5, 6, 7]) to solve this problem. However, based on both evidence from the literature and our experiments while developing new algorithms to solve this problem, it is a challenging task to find a universal algorithm that can generate an optimal test set for all instances. Not only do individual problem instances (particular SUT models) differ, but different test set optimality criteria can also be formulated [3, 1, 8]. A significant open question here is whether a universal algorithm that satisfies multiple optimality criteria can be created. This task is complicated and must consider the different test coverage criteria that have been defined, which span a range from All Node Coverage to All Path Coverage, and individual algorithms differ in their ability to produce test sets that satisfy these different criteria [7].

The complexity of the problem increases when individual parts of the SUT model should be tested at different priority levels. For instance, consider a complex workflow in an information system that must be covered by path-based test scenarios. Only selected parts of the workflow require coverage by high-intensity test scenarios, while for the remaining parts, lightweight tests are sufficient to optimize the test set and reduce the testing costs. The priorities can be captured by the test requirements [1, 3] or by defining edge weights in the model [9]. However, to reflect the priorities captured by edge weights when generating test cases, alternative strategies must be defined, because the current algorithms provide near-optimal results only for non-prioritized SUT models and can be suboptimal when solving this type of problem for prioritized SUT models.

To address the issues described above, in this study, we employ an approach based on combining current algorithms, including both our own work in this area [9] and selected algorithms previously published in the literature [3, 1]. The strategy, which includes these algorithms, relies on input from the tester as follows. The tester first creates a SUT model, defines the priority parts of the model, and specifies the test coverage criteria. Then, the tester selects the test set optimality criteria from a set of options (details are provided in Section III-B3). The strategy uses all the algorithms to generate different test sets based on the SUT model. Then, based on the test set optimality criteria, the best test set is selected and provided to the test analyst. We implemented this test case generation strategy in the latest version of the experimental Oxygen Model-based Testing platform (http://still.felk.cvut.cz/oxygen/) developed by the STILL group. In this paper, we present the details of this strategy and its results for 50 SUT models using the Edge Coverage and Edge-Pair Coverage criteria and 16 different test set optimality criteria (including an optimality function and a sequence-selection strategy composed of additional test set optimality indicators). These data can also be used to compare the test sets produced by the algorithms.

The paper is organized as follows. Section II defines the problem; then, it provides an overview of the test coverage criteria used to determine the intensity of the test set, and finally, it discusses possible test set optimality criteria. Section III provides the details of the process for selecting an optimal test set based on the optimality criteria. Section IV presents the experimental method and the acquired data. Section V discusses the results and Section VI analyzes possible threats to validity. Section VII summarizes the relevant related work. Finally, Section VIII concludes this paper.

II Problem Definition

As mentioned previously, the strategy presented in this paper takes a SUT model as input. Here, the SUT process is modeled as a directed graph G = (N, E), where N is a set of nodes, N ≠ ∅, and E is a set of edges; E is a subset of N × N. In the model, we define one start node n_s ∈ N. The set N_e ⊂ N contains the end nodes of the graph, and N_e ≠ ∅ [1].

The SUT functions and decision points are mapped to G depending on the level of abstraction. In addition, the SUT layer for which we prepare the test cases plays an essential role in the modeling. As an example, we can perform data-flow testing at the code level or design an end-to-end (E2E) high-level business-process test set. More information about this topic appears in Section V.

A test case t is a sequence of nodes n_1, n_2, ..., n_k of G with a sequence of edges e_1, e_2, ..., e_(k-1), where e_i = (n_i, n_(i+1)), e_i ∈ E and n_i ∈ N. The test case starts with the start node (n_1 = n_s) and ends with an end node (n_k ∈ N_e). We can denote the test case either as a sequence of nodes n_1, n_2, ..., n_k or as a sequence of edges e_1, e_2, ..., e_(k-1). The test set T is a set of test cases.

To determine the required test coverage, we define a set of test requirements R. Generally, a test requirement r ∈ R is a path in G that must be a sub-path of at least one test case t ∈ T. The test requirements can be used either (1) to define the general intensity of the test cases or (2) to express which parts of the SUT model are considered as priorities to be covered by the test cases.
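To make the basic model concrete, the following minimal Python sketch captures the directed-graph SUT model, test cases as node sequences, and the check that a test requirement is a sub-path of at least one test case. The class and function names (SutGraph, is_subpath, satisfies_requirements) are illustrative assumptions, not part of the Oxygen platform.

```python
# A minimal sketch of the basic SUT model: a directed graph G = (N, E) with one
# start node n_s and a set of end nodes N_e, test cases as node sequences, and
# a check that every test requirement is toured by at least one test case.
from dataclasses import dataclass
from typing import List, Set, Tuple

Node = str
Edge = Tuple[Node, Node]
TestCase = List[Node]          # n_1 = n_s, n_k in N_e
TestSet = List[TestCase]

@dataclass
class SutGraph:
    nodes: Set[Node]
    edges: Set[Edge]           # subset of N x N
    start: Node                # n_s
    ends: Set[Node]            # N_e

def is_subpath(requirement: List[Node], test_case: TestCase) -> bool:
    """True if the test requirement (a path) appears as a contiguous sub-path."""
    k = len(requirement)
    return any(test_case[i:i + k] == requirement
               for i in range(len(test_case) - k + 1))

def satisfies_requirements(test_set: TestSet, requirements: List[List[Node]]) -> bool:
    """Each test requirement must be a sub-path of at least one test case."""
    return all(any(is_subpath(r, t) for t in test_set) for r in requirements)

# Example: a small workflow graph and a test set covering two requirements.
g = SutGraph(nodes={"s", "a", "b", "e"},
             edges={("s", "a"), ("a", "b"), ("a", "e"), ("b", "e")},
             start="s", ends={"e"})
T = [["s", "a", "b", "e"], ["s", "a", "e"]]
print(satisfies_requirements(T, [["a", "b"], ["a", "e"]]))  # True
```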

The fact that the test requirements can be used either to determine the overall intensity of the test set or to express which parts of the SUT model should be tested at a higher priority leads us to adopt an alternative definition of the SUT model. This definition supports the formulation of algorithms that allow determining the testing intensity and expressing the priorities in parallel [9]. Moreover, it uses a multigraph instead of a graph as a SUT model, which gives test analysts more flexibility when modeling SUT processes (this issue is discussed further in Section V). In addition, using more priority levels is natural in the software development process [10]; using test requirements for prioritization results in algorithms being able to work with two priority levels only, which could restrict the development of further and possibly more effective algorithms.

Our alternative definition of the SUT model is as follows. We model a SUT process as a weighted multigraph M = (N, E, s, t), where N is a set of nodes, N ≠ ∅, and E is a set of edges. Here, s: E → N assigns each edge to its source node and t: E → N assigns each edge to its target node. We define one start node n_s ∈ N. The set N_e ⊂ N contains the end nodes of the multigraph, N_e ≠ ∅. For each edge e (resp. node n), a priority p(e) (resp. p(n)) is defined; when the priority is not defined for an element, the default value is used. E_h ⊆ E is a set of high-priority edges, E_m ⊆ E is a set of medium-priority edges, and E_l ⊆ E is a set of low-priority edges, where E_h, E_m and E_l are mutually disjoint and E_h ∪ E_m ∪ E_l = E.
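The alternative model can be sketched in the same way; the following illustration (with assumed names such as SutMultigraph and Priority) shows a directed multigraph with explicit edge identities, source/target mappings, and per-edge priorities with a default value.

```python
# A sketch of the alternative SUT model: a directed multigraph in which each
# edge has its own identity (so parallel edges are allowed), source/target
# mappings, and a priority per edge. Field and class names are assumptions.
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Set

class Priority(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3

@dataclass
class SutMultigraph:
    nodes: Set[str]
    edges: Set[str]                    # edge identifiers; parallel edges allowed
    source: Dict[str, str]             # s: E -> N
    target: Dict[str, str]             # t: E -> N
    start: str                         # n_s
    ends: Set[str]                     # N_e
    priority: Dict[str, Priority]      # default used when an edge is omitted
    default_priority: Priority = Priority.LOW

    def edge_priority(self, e: str) -> Priority:
        return self.priority.get(e, self.default_priority)

    def edges_with_priority_at_least(self, level: Priority) -> Set[str]:
        # E_h for level = HIGH, E_h plus E_m for level = MEDIUM, and so on.
        return {e for e in self.edges if self.edge_priority(e).value >= level.value}
```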

Priority reflects the importance of testing the edge. The test analyst determines the priority based on a risk prioritization technique [11] or a technique that combines risk assessment with information regarding the internal complexity of the SUT or the presence of defects in previous SUT versions [10]. To determine the intensity of the test set T, test coverage criteria are used.

II-A Test Coverage Criteria

Several different test coverage criteria have been defined for G. For instance, All Edge Coverage (or Edge Coverage) requires each edge e ∈ E to be present in the test set T at least once. Alternatively, All Node Coverage requires each node n ∈ N to be present in the test set T at least once. To satisfy the Edge-Pair Coverage criterion, the test set T must contain each possible pair of adjacent edges in G [1].

All Paths Coverage (or Complete Path Coverage) requires that all possible paths in G, starting from n_s and ending at any node of N_e, be present in the test set T. Such a test set can contain considerable redundancy. To reduce this redundancy, the Prime Path Coverage criterion is used. To satisfy the Prime Path Coverage criterion, each reachable prime path in G must be a sub-path of a test case t ∈ T. A path p from n_i to n_j is prime if (1) p is simple, and (2) p is not a sub-path of any other simple path in G. A path p is simple when no node is present more than once in p (i.e., p does not contain any loops); the only exception is n_i and n_j, which can be identical (in other words, p itself can be a loop) [1].

Sorted by the intensity of the test cases, All Node Coverage is the weakest option, followed by All Edge Coverage, Edge-Pair Coverage and Prime Path Coverage. All Paths Coverage lies at the other end of the spectrum [1], because it implies the most intense test cases. However, due to the high number of test case steps, this option is not practicable in most software development projects. The same problem can also affect Prime Path Coverage: for many routine process-testing tasks, this level of test coverage can be too extensive.

Test coverage criteria can also be specified by the Test Depth Level (TDL) [12]. TDL = 1 when each edge e ∈ E appears at least once in at least one test case t ∈ T. TDL = n when the test set satisfies the following condition: for each decision point n_d ∈ N, P_d is a set of all possible paths in G starting with an edge incoming to the decision point n_d, followed by a sequence of edges outgoing from the node n_d; the test cases in the test set T then contain all paths from P_d. When TDL = 1, the criterion is equivalent to All Edge Coverage, and when TDL = 2, it is equivalent to Edge-Pair Coverage. All the coverage criteria in this section are defined in the same way for G as well as for the multigraph M.
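As a brief illustration of how a test set can be checked against the two coverage levels used most often in this paper, the following sketch verifies All Edge Coverage (equivalent to TDL = 1) and Edge-Pair Coverage (equivalent to TDL = 2) for test cases expressed as node sequences. Feasibility of individual edge pairs is not considered here, and the helper names are assumptions.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str]

def edges_of(test_case: List[str]) -> List[Edge]:
    # Consecutive node pairs of a test case form its edge sequence.
    return list(zip(test_case, test_case[1:]))

def satisfies_edge_coverage(test_set: List[List[str]], all_edges: Set[Edge]) -> bool:
    # All Edge Coverage (TDL = 1): every edge of the model appears at least once.
    covered = {e for t in test_set for e in edges_of(t)}
    return all_edges <= covered

def satisfies_edge_pair_coverage(test_set: List[List[str]], all_edges: Set[Edge]) -> bool:
    # Edge-Pair Coverage (TDL = 2): every pair of adjacent edges appears at
    # least once; reachability of individual pairs is ignored in this sketch.
    required = {(e1, e2) for e1 in all_edges for e2 in all_edges if e1[1] == e2[0]}
    covered = set()
    for t in test_set:
        es = edges_of(t)
        covered.update(zip(es, es[1:]))
    return required <= covered
```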

To determine the testing priority in selected parts of the SUT processes, we define a Priority Level (PL) for M, with the values high, medium and low. PL = high when each edge e ∈ E_h is present at least once in at least one test case t ∈ T. Further, PL = medium when each edge e ∈ E_h ∪ E_m is present at least once in at least one test case t ∈ T. When a test set T satisfies All Edge Coverage, it also satisfies PL = low.

In the test case generation strategies we evolve, the PL can be combined with another test coverage criterion, such as the TDL [9] or Prime Path Coverage (refer to Section III-B4). In these cases, the PL reduces the test coverage given by the TDL or by Prime Path Coverage to only those parts of the model that are defined as the priority. This allows optimizing the test cases to exercise only the priority parts of the SUT processes or workflows.
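A corresponding check of the PL criterion is straightforward: every edge whose priority is at or above the requested level must appear in at least one test case. The following sketch assumes test cases given as edge-identifier sequences on the multigraph model and an integer encoding of the priorities; both are illustrative choices.

```python
from typing import Dict, List

def satisfies_priority_level(test_set: List[List[str]],
                             edge_priority: Dict[str, int],
                             level: int) -> bool:
    # PL is satisfied when every edge with priority >= level is exercised by
    # at least one test case (test cases given as edge-identifier sequences).
    required = {e for e, p in edge_priority.items() if p >= level}
    exercised = {e for t in test_set for e in t}
    return required <= exercised
```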

II-B Test Set Optimality Criteria

Various optimality criteria for a test set T have been discussed in the literature (e.g., [3, 8]). Table I lists the optimality criteria used in this paper as defined for the SUT model M. Some of these criteria can also be defined for G, which is captured in Table I in the column “Applicable to.”

Optimality criterion (description) | Applicable to
Number of test cases in the test set | G and M
Total number of edges in the test cases of a test set (edges can repeat) | G and M
Total number of high-priority edges in the test cases of a test set (edges can repeat) | M
Total number of high- and medium-priority edges in the test cases of a test set (edges can repeat) | M
Total number of unique edges in the test cases of a test set | G and M
Total number of unique high-priority edges in the test cases of a test set | M
Total number of unique high- and medium-priority edges in the test cases of a test set | M
Total number of nodes in the test cases of a test set (nodes can repeat) | G and M
Total number of unique nodes in the test cases of a test set | G and M
Ratio of unique edges to all edges contained in the test cases of a test set; a lower value means a more optimal test set, because fewer unique edges are present in the test cases, and these unique edges can represent extra costs for the preparation of the detailed test scenarios | G and M
Ratio of high-priority edges to all edges in the test cases of a test set; a higher value means a more optimal test set, because fewer edges without priority (and thus not necessary to test) are present in the test cases | M
Ratio of high- and medium-priority edges to all edges in the test cases of a test set; a higher value means a more optimal test set, because fewer edges without these priorities (and thus not necessary to test) are present in the test cases | M
The same as the ratio of high-priority edges to all edges, only unique edges are taken into account | M
The same as the ratio of high- and medium-priority edges to all edges, only unique edges are taken into account | M
TABLE I: Test set optimality criteria

Individual test set optimality criteria can be combined. In this paper, we explore two possible methods: combining the optimality criteria into a formula and evaluating the test sets using a sequence of criteria. The following section provides more detail on the selection of the optimal test set and on how these methods can be used.
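As an illustration of how such indicators can be evaluated, the following sketch computes several of the indicators paraphrased from Table I (number of test cases, total and unique edges, and priority-edge ratios) for a test set given as edge-identifier sequences. The indicator names and the integer priority encoding are assumptions made for the example, not the Oxygen implementation.

```python
from typing import Dict, List

def optimality_indicators(test_set: List[List[str]],
                          edge_priority: Dict[str, int],
                          high: int = 3) -> Dict[str, float]:
    # Test cases are edge-identifier sequences; edges may repeat across cases.
    all_edges = [e for t in test_set for e in t]
    unique_edges = set(all_edges)
    high_edges = [e for e in all_edges if edge_priority.get(e, 1) >= high]
    total = len(all_edges)
    return {
        "test_cases": len(test_set),
        "edges_total": total,
        "high_priority_edges_total": len(high_edges),
        "edges_unique": len(unique_edges),
        "unique_edge_ratio": len(unique_edges) / total if total else 0.0,
        "high_priority_edge_ratio": len(high_edges) / total if total else 0.0,
    }
```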

III Selecting an Optimal Test Set

To obtain an optimal test set T for a SUT model M and given test coverage and test set optimality criteria, we conduct a sequence of three main steps. First, we select a set of algorithms and their suitable input parameters according to M, the test coverage criteria and the test set optimality criteria. Then, we run the selected algorithms to produce the candidate test sets. Finally, we analyze these test sets and select the test set that has the best value of the optimality criteria. The inputs to this process are as follows:

  1. SUT model M

  2. Test coverage criteria from the following options:

    1. Test intensity from the following options: Edge Coverage, Edge-Pair Coverage, TDL = n (where n > 2, because TDL = 1 is equivalent to Edge Coverage and TDL = 2 is equivalent to Edge-Pair Coverage), and Prime Path Coverage.

    2. Coverage of the priority parts by Priority Level (PL) as defined in Section II-A.

  3. Test set optimality criterion from the options defined in Section II-B and Table I.

The output of the process is an optimal test set T that satisfies the test coverage criteria.
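A minimal sketch of this three-step process, assuming the concrete algorithms are available as callables that map a SUT model to a test set, could look as follows; the algorithm registry and the scoring callback are placeholders, not the Oxygen implementation.

```python
from typing import Callable, Dict, List, Tuple

TestSet = List[List[str]]
Algorithm = Callable[[object], TestSet]   # maps a SUT model to a test set
Score = Callable[[TestSet], float]        # lower value = more optimal test set

def select_optimal_test_set(model: object,
                            algorithms: Dict[str, Algorithm],
                            score: Score) -> Tuple[str, TestSet]:
    # Steps 1-2: run every selected algorithm on the model.
    candidates = {name: alg(model) for name, alg in algorithms.items()}
    # Step 3: keep the test set with the best (lowest) score.
    best_name = min(candidates, key=lambda name: score(candidates[name]))
    return best_name, candidates[best_name]
```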

III-A Included Algorithms

In the described strategy, we use the algorithms listed in Table II.

Code | Name | SUT model | Reference
PCT | Process Cycle Test | G | Koomen et al. [12], Bures [13]
PPT | Prioritized Process Test | M | Bures et al. [9]
BF | Brute Force Solution | G, set of test requirements R | Li, Li and Offutt [3]
SC | Set-Covering Based Solution | G, set of test requirements R | Li, Li and Offutt [3]
PG | Matching-Based Prefix Graph Solution | G, set of test requirements R | Li, Li and Offutt [3]
RSC | Set-Covering Based Solution with Test Set Reduction | M for the whole algorithm, G for its SC part | Specified in Section III-B4
TABLE II: Algorithms used to generate the test sets

We used the Oxygen Model-based Testing experimental platform (http://still.felk.cvut.cz/oxygen/, formerly PCTgen) [13] to implement the proposed strategy. Our research team implemented the Process Cycle Test (PCT) and Prioritized Process Test (PPT) algorithms. We also implemented the Brute Force Solution (BF) algorithm based on the pseudocode published by Li et al. [3]. The implementations of the Set-Covering Based Solution (SC) and Matching-Based Prefix Graph Solution (PG) algorithms are based on the source code by Ammann and Offutt [14]. The Set-Covering Based Solution with Test Set Reduction (RSC) consists of the Set-Covering Based Solution part and our implementation of the test set reduction part (further specified in Section III-B4).

The role of the PCT algorithm is only to provide information on how many test cases and test steps are in a test set for a particular SUT model when the SUT model parts are not prioritized.

III-B Test Set Generation Process

The strategy to determine the optimal test set by the selected test set optimality criterion consists of five main steps, which are summarized in Algorithm 1.

Algorithm 1: The main process of the proposed test case generation strategy

Figure 1 depicts the overall process. The process inputs are marked in blue while the process outputs are marked in green.

Fig. 1: Main steps of the proposed test case generation strategy

In the Oxygen platform, the selected optimal test set is presented to the user, as well as the test sets produced by all the employed algorithms. For each of these test sets, Oxygen also provides the values of the optimality criteria. In the following subsections, we explain the individual steps in more detail.

III-B1 Conversion of M to G and R

For the BF, SC, PG and RSC algorithms, we need to convert the multigraph M to a graph G and a set of test requirements R. The multigraph M is equivalent to the graph G when (1) the edge priorities and node priorities are not considered, and (2) there are no parallel edges in M. This conversion implies that no parallel edges can be used when creating M, which can restrict the modeling possibilities that using a multigraph as a SUT process abstraction makes possible. However, this restriction can be resolved without losing the applicability of the proposed strategy by modeling the parallel edges as graph nodes.

A set of test requirements R is created by a method specified in Table III; a sketch of the two conversion methods follows the table.

Test coverage (test intensity) | Method of creation of R
Edge Coverage (TDL = 1), atomic conversion | R is a set of all adjacent node pairs (s(e), t(e)), for each e in E_h (PL = high) or for each e in E_h ∪ E_m (PL = medium)
Edge Coverage (TDL = 1), sequence conversion | R is a set of paths in G composed of consecutive edges e in E_h (PL = high) or e in E_h ∪ E_m (PL = medium)
Edge-Pair Coverage (TDL = 2) | A set P contains all possible pairs of adjacent edges of G; then, R is a set of all paths formed by these pairs. This process is not influenced by PL.
TDL = n, n > 2 | A set P contains all possible paths of G consisting of n adjacent edges; then, R is a set of all such paths. This process is not influenced by PL.
Prime Path Coverage | R is a set of all possible prime paths in G (applies to BF, SC and PG). This process is not influenced by PL.
TABLE III: Method of creation of the test requirements R from M
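Under one plausible reading of Table III, the two conversion modes for the Edge Coverage case can be sketched as follows: the atomic conversion emits one single-edge test requirement per priority edge, while the sequence conversion chains runs of consecutive priority edges into longer path requirements. This is an illustration of the idea only, not the exact procedure implemented in the Oxygen platform.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]

def atomic_conversion(priority_edges: Set[Edge]) -> List[List[str]]:
    # Each priority edge becomes its own single-edge test requirement.
    return [[u, v] for (u, v) in sorted(priority_edges)]

def sequence_conversion(priority_edges: Set[Edge]) -> List[List[str]]:
    # Greedily extend each requirement while exactly one unused priority edge
    # continues the path; otherwise start a new requirement.
    out_map: Dict[str, List[Edge]] = {}
    for u, v in priority_edges:
        out_map.setdefault(u, []).append((u, v))
    used: Set[Edge] = set()
    requirements: List[List[str]] = []
    for edge in sorted(priority_edges):
        if edge in used:
            continue
        u, v = edge
        path = [u, v]
        used.add(edge)
        while len(out_map.get(path[-1], [])) == 1 and out_map[path[-1]][0] not in used:
            nxt = out_map[path[-1]][0]
            used.add(nxt)
            path.append(nxt[1])
        requirements.append(path)
    return requirements
```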

III-B2 Algorithm Selection

Table IV specifies the process of selecting the algorithms by the specified test coverage criteria (the algorithm selection configuration depicted in Fig. 1).

Test coverage (test intensity) | Reduced by PL (coverage of priority parts of M) | Not reduced by PL
Edge Coverage (TDL = 1) | PPT, RSC, BF, SC, PG | PPT, RSC, BF, SC, PG, PCT
Edge-Pair Coverage (TDL = 2) | PPT, RSC | PPT, RSC, PCT, BF, SC, PG
TDL = n, n > 2 | PPT | PPT, PCT, BF, SC, PG
Prime Path Coverage | RSC | RSC, BF, SC, PG
TABLE IV: Algorithm selection configuration

For the Edge Coverage case (TDL = 1), the BF, SC and PG algorithms reflect the edge priorities in M via a set of test requirements R generated from M (refer to Table III). Both conversion types, the atomic and the sequence conversion, are used for each of these algorithms. In contrast, PPT and RSC work directly with the edge priorities in M ([9] and Section III-B4).

For the Edge-Pair Coverage case (TDL = 2), the PPT and RSC algorithms are the comparable candidates when the PL criterion reduces the test set. The RSC satisfies the Edge-Pair Coverage criterion, because the test set produced by this algorithm satisfies the Prime Path Coverage criterion [1]. The PPT algorithm is designed to satisfy TDL = n, where n is the length of the longest path in M (excluding the loops) [9]. Thus, it also satisfies the Edge-Pair Coverage criterion, which is equivalent to TDL = 2.

III-B3 Selection of the Best Test Set

After the selected algorithms have produced the test sets, our strategy selects the test set that has the best value of the selected optimality criteria. The test analyst can select from the following options:

  1. Selection by a single optimality criterion: A specific optimality criterion is specified on the input. Then, the test set that has the best value according to the specified optimality criterion is selected. For the criteria based on counts of test cases, edges and nodes and for the ratio of unique edges, the test set with the lowest value is considered optimal; for the ratios based on priority edges, the test set with the highest value is considered optimal. However, a test set could be optimal according to one criterion but be strongly suboptimal according to another criterion. In such situations, a test set with a slightly worse value of the first criterion but with the other criteria closer to the optimum would be a better choice. In these situations, we use the optimality function explained below.

  2. Selection by the optimality function: The optimality function selects the best test set using several concurrent optimality criteria. It is defined as a weighted combination of three selected optimality criteria; the weight constants determine the influence of each specific optimality criterion on the result.

  3. Sequence selection: In this approach, a sequence of optimality criteria c1, c2, ..., ck is specified on the input. When multiple test sets have the same best value of c1, the selection is then based on the criterion c2. If multiple test sets still have the same best value of c2, the selection of the final test set is based on c3, and so forth. A sketch of the optimality function and the sequence selection is given after this list.
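The two combined selection modes can be sketched as follows, with the weights and criterion callbacks left as placeholders; whether a criterion is minimized or maximized is handled here by letting each callback return a value for which lower is better (maximized criteria can simply be negated).

```python
from typing import Callable, Dict, List, Sequence

TestSet = List[List[str]]
Criterion = Callable[[TestSet], float]       # lower value = more optimal here

def optimality_function(weights: Sequence[float],
                        criteria: Sequence[Criterion]) -> Criterion:
    # A weighted combination of several criteria, returned as a single criterion.
    return lambda ts: sum(w * c(ts) for w, c in zip(weights, criteria))

def sequence_selection(candidates: Dict[str, TestSet],
                       criteria: Sequence[Criterion]) -> str:
    # Lexicographic tie-breaking: keep only the candidates that are best for
    # the current criterion, then move to the next criterion if ties remain.
    names = list(candidates)
    for c in criteria:
        best = min(c(candidates[n]) for n in names)
        names = [n for n in names if c(candidates[n]) == best]
        if len(names) == 1:
            break
    return names[0]            # first of the remaining tied candidates
```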

III-B4 RSC Algorithm

The pseudocode of the Set-Covering Based Solution with Test Set Reduction (RSC) is specified in Algorithm 2.

Algorithm 2: Set-Covering Based Solution with Test Set Reduction (RSC)

The principle underlying the RSC is to first employ the SC algorithm to generate a test set satisfying the Prime Path Coverage criterion. Then, from this test set, the test cases that cover the maximal number of priority edges that must be present in the test cases are selected to build the reduced test set incrementally.
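A sketch of this reduction step, paraphrasing the idea rather than reproducing the published pseudocode, greedily adds the test case that covers the most not-yet-covered priority edges until all priority edges that must be tested are covered.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str]

def edges_of(test_case: List[str]) -> Set[Edge]:
    return set(zip(test_case, test_case[1:]))

def reduce_to_priority_edges(prime_path_test_set: List[List[str]],
                             priority_edges: Set[Edge]) -> List[List[str]]:
    # Start from the prime-path-covering test set produced by SC and keep
    # adding the test case that covers the most remaining priority edges.
    remaining = set(priority_edges)
    reduced: List[List[str]] = []
    pool = list(prime_path_test_set)
    while remaining and pool:
        best = max(pool, key=lambda t: len(edges_of(t) & remaining))
        if not edges_of(best) & remaining:
            break                         # no remaining test case helps anymore
        reduced.append(best)
        remaining -= edges_of(best)
        pool.remove(best)
    return reduced
```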

IV Experiments

In this section, we describe the experiments performed to demonstrate the functionality of the proposed test set selection strategy. The provided data can also be used to compare the test sets produced by the individual algorithms under the various optimality criteria.

IV-A Experimental Method and Set-up

In the experiments, we execute the algorithms for the following configurations of the test coverage criteria:

  1. Edge Coverage: reduced by PL = high and by PL = medium. In this experiment, we compared PPT, RSC, BF, SC and PG. For BF, SC and PG, the priorities in M were converted to R by both the atomic and the sequence conversion (see Table III).

  2. Edge-Pair Coverage: reduced by PL = high and by PL = medium. In this experiment, we compared PPT and RSC.

The Prime Path Coverage criterion was not involved in the experiments, as its reduction by PL is possible only with the RSC algorithm; hence, no alternative algorithm was available for comparison. The same applies to TDL = n with n > 2, for which the PPT algorithm is the only option that reduces the test cases by PL.

Regarding the problem instances, we used 50 SUT models specified as M. To ensure the objective comparability of all the algorithms (and the convertibility of M to G and R), the graphs did not contain parallel edges (the SUTs were modeled so that parallel edges were not needed). The models were created in the user interface of the Oxygen platform [13]. The properties of these models are summarized in Table V; they include the numbers of nodes, edges and priority edges, the number of loops in the model, and the sizes of the sets of test requirements created from M by the atomic and the sequence conversion (see Table III) for both priority levels.

ID ID
1 22 30 8 8 1 6 7 26 51 67 15 7 0 9 12
2 21 30 8 3 4 5 6 27 28 39 8 4 1 5 8
3 41 54 10 10 4 9 11 28 21 22 5 2 0 2 3
4 29 46 11 4 11 9 9 29 29 37 10 6 0 4 7
5 29 45 17 6 6 15 19 30 9 11 1 4 0 1 2
6 21 27 9 6 0 8 13 31 10 13 1 4 0 1 2
7 45 64 18 14 7 15 30 32 25 27 3 5 0 3 5
8 19 30 10 6 6 9 16 33 11 15 1 2 0 1 3
9 25 38 11 9 9 8 13 34 13 19 3 4 0 2 6
10 52 78 9 7 3 6 10 35 10 15 4 2 4 4 4
11 48 69 10 8 3 4 8 36 8 10 1 3 3 1 4
12 47 68 9 11 1 5 8 37 8 11 3 2 3 3 3
13 23 26 9 6 2 5 5 38 7 12 2 3 5 2 3
14 8 10 1 3 2 1 4 39 8 11 3 2 2 2 4
15 24 31 10 4 0 2 3 40 7 9 2 2 0 2 3
16 26 37 10 3 2 3 4 41 9 11 2 3 0 2 4
17 27 36 6 7 3 6 10 42 11 14 2 2 2 2 4
18 20 26 1 8 2 1 2 43 22 27 5 7 0 4 9
19 28 34 1 2 0 1 3 44 26 38 6 3 3 6 7
20 9 8 3 2 0 3 4 45 29 45 8 4 4 7 11
21 8 10 2 2 2 2 3 46 35 48 5 9 4 5 11
22 34 47 13 5 0 7 9 47 40 54 8 6 0 8 11
23 35 49 8 3 0 7 10 48 50 74 13 6 6 13 17
24 37 55 16 5 2 9 13 49 21 27 12 6 0 2 3
25 41 59 15 6 0 10 13 50 22 23 8 2 0 2 3
TABLE V: Problem instances used in the experiments

We compared all the options of test set optimality criteria introduced in Section III-B3: a set of single optimality criteria, an optimality function and sequence selection.

The test set selection process described in this paper is implemented as part of the development branch of the Oxygen platform. All the test set optimality criteria discussed in this paper are calculated automatically from the produced test cases and provided in a CSV-formatted report. In the report, the test set selected by the particular optimality criteria is also presented, including the algorithm that generated this test set.

IV-B Experimental Results

In this section, we present a performance comparison of the PPT, RSC, BF, SC, and PG algorithms using all the test set optimality criteria discussed in Section II-B. For the comparison that appears in Section IV-B1, we used the data from Step 3 of Algorithm 1. Then, in Section IV-B2, we provide the results of each specific algorithm selection, which are the output of Algorithm 1.

IV-B1 Comparison of Individual Algorithms

In Table VI, we present the averaged values of the optimality criteria of the test sets produced for the individual SUT models used in the experiments (introduced previously in Table V). Table VI presents the numbers for Edge Coverage and PL = high. The atomic and sequence conversions of the test requirements (see Table III) are denoted by “atom” and “seq” in the table.

Algorithm / creation method
Value of optimality criterion - average for all PPT RSC BF atom BF seq SC atom SC seq PG atom PG seq
2.88 3.20 5.04 4.40 5.10 4.36 4.20 4.26
23.40 32.62 36.10 32.68 36.92 32.74 31.30 32.12
8.92 11.22 12.24 11.34 13.12 11.64 11.04 11.46
11.64 15.50 16.64 15.30 17.42 15.60 14.74 15.34
17.08 18.90 19.64 18.76 18.86 18.02 17.96 17.96
7.12 7.12 7.12 7.12 7.12 7.12 7.12 7.12
8.86 9.28 9.30 9.20 9.22 9.18 9.14 9.16
20.52 29.42 31.06 28.28 31.82 28.38 27.10 27.86
15.52 17.28 17.88 17.06 17.18 16.40 16.32 16.36
0.50 0.57 0.56 0.54 0.54 0.52 0.52 0.52
0.41 0.38 0.36 0.39 0.38 0.40 0.40 0.40
0.53 0.52 0.49 0.51 0.49 0.52 0.52 0.52
0.36 0.28 0.25 0.29 0.25 0.30 0.30 0.30
0.45 0.38 0.33 0.38 0.33 0.38 0.39 0.39
TABLE VI: Results of the algorithms for Edge Coverage and

Figures 2 and 3 provide a visual comparison, using these averaged values to compare the individual algorithms.

Fig. 2: Algorithm comparison for Edge Coverage and PL = high
Fig. 3: Algorithm comparison for Edge Coverage and PL = high

Table VII lists the averaged values of the optimality criteria of the test sets produced for the individual SUT models for the Edge Coverage criterion and PL = medium.

Algorithm / creation method
Value of optimality criterion - average for all PPT RSC BF atom BF seq SC atom SC seq PG atom PG seq
4.38 4.76 7.66 6.80 7.60 6.52 5.96 6.20
35.34 46.04 54.18 50.46 54.20 48.94 44.84 47.28
10.56 13.26 15.40 15.06 16.32 15.30 13.48 14.86
17.22 21.54 24.44 24.24 25.38 24.28 21.38 23.48
22.60 24.06 24.72 23.60 23.98 22.76 22.80 22.68
7.12 7.12 7.12 7.12 7.12 7.12 7.12 7.12
12.08 12.08 12.08 12.08 12.08 12.08 12.08 12.08
30.96 41.28 46.52 43.66 46.60 42.42 38.88 41.08
20.80 22.28 22.84 21.80 22.16 21.04 21.06 20.96
0.69 0.74 0.75 0.72 0.74 0.70 0.70 0.69
0.30 0.29 0.28 0.31 0.29 0.32 0.31 0.32
0.53 0.51 0.47 0.53 0.48 0.54 0.52 0.54
0.22 0.18 0.14 0.16 0.14 0.17 0.18 0.17
0.42 0.33 0.26 0.31 0.25 0.33 0.34 0.33
TABLE VII: Results of the algorithms for Edge Coverage and

To better compare the algorithms using the values presented in Table VII, Figures 4 and 5 show a graphical summary.

Fig. 4: Algorithm comparison for Edge Coverage and PL = medium
Fig. 5: Algorithm comparison for Edge Coverage and PL = medium

Table VIII shows a comparison of the PPT and RSC algorithms for the Edge-Pair Coverage criterion with PL = high and PL = medium. Here, the only relevant algorithms are PPT and RSC, because only these algorithms allow test set reduction by PL and can satisfy the Edge-Pair Coverage criterion before the test set reduction by PL.

Value of optimality criterion - average for all models | PPT, RSC (PL = high) | PPT, RSC (PL = medium)
5.28 3.20 7.78 4.76
43.74 32.62 62.88 46.04
15.50 11.22 18.32 13.26
20.72 15.50 29.18 21.54
23.04 18.90 28.00 24.06
7.12 7.12 7.12 7.12
9.92 9.28 12.08 12.08
38.46 29.42 55.10 41.28
21.46 17.28 26.30 22.28
0.65 0.57 0.86 0.74
0.36 0.38 0.27 0.29
0.49 0.52 0.47 0.51
0.20 0.28 0.12 0.18
0.29 0.38 0.21 0.33
TABLE VIII: Results of the algorithms for Edge-Pair Coverage

A comparison of the individual algorithms for the Edge-Pair Coverage criterion and PL = high is depicted in Figures 6 and 7.

Fig. 6: Algorithm comparison for Edge-Pair Coverage and PL = high
Fig. 7: Algorithm comparison for Edge-Pair Coverage and PL = high

Figures 8 and 9 depict this comparison for PL = medium.

Fig. 8: Algorithm comparison for Edge-Pair Coverage and PL = medium
Fig. 9: Algorithm comparison for Edge-Pair Coverage and PL = medium

We analyze the data in Section V.

IV-B2 Algorithm Selection Results

In this section, we present the results related to the functionality of the proposed test set selection strategy. We start with detailed data that demonstrate the test set selection strategy. Table IV-B2 summarizes the execution of the test set selection strategy for the Edge Coverage criterion and PL = high. For each of the problem instances and optimality criteria, we list the algorithm that produced the optimal test set based on the selected criteria. Table IV-B2 presents the results of the first twelve problem instances as an example.

OPT denotes the optimality function (refer to Section III-B3). In all the experiments described in this paper, the optimality function was configured with the same fixed weight parameters.

SEQ denotes the sequence selection (refer to Section III-B3). In all of the experiments, the same fixed sequence of three optimality criteria was adopted.

In Table IV-B2, we present the name of the algorithm (or algorithms) that produced the optimal test set based on the particular criterion. For the single optimality criteria, the algorithm name is followed by the value of the optimality criterion (in brackets). When multiple algorithms provide an optimal test set for a particular criterion, more algorithms are listed. Cases in which more than three algorithms provided an optimal test set for a particular criterion are denoted as n(v), where n denotes the number of algorithms and v denotes the value of the optimality criterion.

Atomic and sequence conversions of the test requirements (refer to Table III) are denoted by “a” and “s” postfixes, respectively, in italics following the name of the algorithm. ID denotes the ID of the problem instance.

Results of the test set selection strategy for the individual problem instances for Edge Coverage and PL = high

Optimality criterion
ID OPT SEQ
1 RSC(4) PPT(4) PPT(18) RSC(8) PPT(8) PPT(17) PPT(14) PPT(14) PPT(0.57) 4(0.46) RSC(0.90) PPT PPT
2 RSC(3) PPT(3) PPT(22) PG(8) PPT(8) PPT(22) PPT(19) PPT(19) PPT(0.73) PPT(0.36) PPT(0.50) PPT PPT
3 RSC(6) PPT(6) SCa(67) SCa(13) 4(35) SCa(60) 4(34) 4(0.65) RSC(0.22) PGa(0.41) SCa PPT
4 PPT(2) PPT(31) PPT(13) PPT(26) PPT(29) PPT(25) PPT(0.56) PPT(0.42) PPT(0.55) PPT PPT
5 PPT(6) PPT(35) PPT(19) PGa(30) PPT(30) PPT(29) PGa(26) PPT(26) PGa(0.67) PPT(0.67) PPT(0.54) PPT(0.66) PPT PPT
6 RSC(4) PPT(4) RSC(28) PPT(28) RSC(12) PPT(12) 4(19) RSC(24) PPT(24) 4(17) 4(0.70) RSC(0.43) PPT(0.43) BFa(0.41) RSC RSC PPT
7 RSC(11) PPT(11) PPT(99) PPT(26) PPT(45) PPT(88) PGa(42) PGs(42) SCs(42) PPT(0.70) PPT(0.26) PPT(0.60) PPT PPT
8 PPT(3) PPT(37) PPT(14) PPT(21) PPT(34) PPT(20) PPT(0.70) BFs(0.43) RSC(0.56) PPT PPT
9 PPT(2) PPT(31) PPT(15) PPT(22) PPT(29) PPT(21) PPT(0.58) PPT(0.48) PPT(0.65) PPT PPT
10 RSC(3) PPT(3) RSC(26) PPT(26) RSC(10) PPT(10) RSC(25) PPT(25) RSC(23) PPT(23) RSC(22) PPT(22) RSC(0.32) PPT(0.32) PGs(0.44) SCs(0.44) PGs(0.51) SCs(0.51) RSC PPT RSC PPT
11 RSC(3) PPT(3) RSC(22) PPT(22) PPT(12) RSC(19) RSC(19) PPT(19) RSC(16) RSC(0.28) RSC(0.59) RSC(0.59) PPT(0.59) RSC RSC
12 RSC(3) PPT(3) RSC(27) RSC(12) PPT(12) 4(22) RSC(24) 4(20) 4(0.32) RSC(0.44) PGa(0.63) PGs(0.63) SCs(0.63) RSC RSC

Figure 10 shows the overall statistics for the test set selection strategy for the Edge Coverage criterion and PL = high. The x-axis reflects the optimality criteria, and the y-axis presents the individual algorithms. The bubble size represents the number of problem instances for which an algorithm produced an optimal test set using a specific criterion. The maximum bubble size is 50 (the number of problem instances). For the criterion counting unique high-priority edges, all the algorithms achieved the maximum bubble size in all the cases.

Fig. 10: Overall statistics of the test set selection strategy for Edge Coverage and PL = high

Using the same schema, Figure 11 shows the overall statistical results of the test set selection strategy for the Edge Coverage criterion and PL = medium. In this case, all the employed algorithms provided test sets having the same values of the criteria counting unique high-priority and unique high- and medium-priority edges.

Fig. 11: Overall statistics of the test set selection strategy for Edge Coverage and PL = medium

The same system is used in Figure 12, which depicts the overall statistics of the test set selection strategy for Edge-Pair Coverage and PL = high, and in Figure 13, which presents the statistics for Edge-Pair Coverage and PL = medium.

Fig. 12: Overall statistics of the test set selection strategy for Edge-Pair Coverage and PL = high
Fig. 13: Overall statistics of the test set selection strategy for Edge-Pair Coverage and PL = medium

V Discussion

From the data presented in Section IV-B, several conclusions can be made.

Starting with a comparison of the algorithms using the average values of the optimality criteria computed for the 50 problem instances (Table V), the results differ significantly based on the test coverage level. For Edge Coverage, the PPT algorithm provides the best results in terms of average statistics. The difference between the average value for PPT and the other algorithms is most significant for the number of test cases and the total numbers of edges and nodes in the test set. This result can be observed for both PL = high (Table VI, Figure 2) and PL = medium (Table VII, Figure 4). For PL = high, the difference in the number of test cases between PPT and RSC is 10%, and it is greater than 31% between PPT and each of the BF, SC and PG algorithms. The difference in the total number of edges between PPT and all the other algorithms is greater than 25%. For the total number of nodes, this difference is greater than 24%. For PL = medium, the differences are slightly lower in general. For the number of test cases, the difference between PPT and RSC is 8%, and it is greater than 27% between PPT and each of the BF, SC and PG algorithms. The difference between PPT and all the other algorithms is greater than 21% for the total number of edges and greater than 20% for the total number of nodes.

Additionally, for the optimality criteria based on unique priority edges, namely the ratios of unique high-priority and of unique high- and medium-priority edges, the average values differ significantly in favor of the PPT algorithm. Higher values of these ratios mean test sets closer to the optimum. For PL = high, the value for PPT is higher than for all the other algorithms by 18% for the ratio of unique high-priority edges and by 15% for the ratio of unique high- and medium-priority edges.

For the rest of the optimality criteria, the differences are not as significant; however, similar results are still present in the data.

Generally, the RSC algorithm yields results relatively similar to those of the BF, PG and SC algorithms; however, exceptions can be found. For PL = high (Table VI, Figure 2), the RSC algorithm is outperformed by the PG and SC algorithms for several of the optimality criteria. At this priority level, the RSC algorithm does not outperform any of the other algorithms.

The situation changes for PL = medium (Table VII, Figure 4), which, in practical terms, means that the algorithms process more priority edges. From the data, RSC exhibits better performance in this case. It is outperformed by the PG and SC algorithms only for two of the optimality criteria. In contrast, the RSC outperforms BF, PG and SC for other criteria that can be considered important indicators of test set optimality.

For the total number of test cases, the RSC yields better results than do the BF, PG and SC algorithms for both PL = high and PL = medium. However, the number of test cases by itself is probably an insufficient indicator of test set optimality; the total number of test steps (e.g., the total number of edges or nodes in the test cases) is a more reliable metric.

Some other conclusions can be drawn from the data for the Edge Coverage criterion. For PL = high, the differences in the results for the atomic and the sequence conversion of the test requirements (refer to Table III) are more significant for the BF and SC algorithms; however, the differences are not so significant for PG for the majority of the test set optimality criteria. A similar trend can be found for PL = medium, although for some of the test set optimality criteria, the differences caused by the atomic and sequence conversions are lower for the BF and SC algorithms, whereas this difference is higher for the PG algorithm compared to PL = high.

Regarding the Edge-Pair Coverage criterion, the situation changes: the RSC algorithm outperforms the PPT algorithm on all the test set optimality criteria, both for PL = high (Table VIII, Figure 6) and for PL = medium (Table VIII, Figure 8).

In some cases, the results are relatively similar, for instance, the counts of unique priority edges for PL = high and for PL = medium. However, significant differences can be observed for the rest of the test set optimality criteria. For instance, when PL = high, the difference in the number of test cases is 39%, the difference in the total number of edges is 25% and the difference in the total number of nodes is 24%. When PL = medium, the difference in the number of test cases is 39%, the difference in the total number of edges is 27% and the difference in the total number of nodes is 25%. These results show that the RSC algorithm is a better candidate for the Edge-Pair Coverage criterion than is PPT.

Regarding the number of unique high-priority edges when PL = high, for both Edge Coverage and Edge-Pair Coverage, all the employed algorithms created test sets that had the same value of this criterion. This is a correct result and occurs because of the principle behind the algorithms, all of which must cover every high-priority edge. The same applies to the numbers of unique high-priority and unique high- and medium-priority edges when PL = medium.

Regarding the test set selection strategy (the second part of the results), other facts can be observed from the data. The most important finding is that for various problem instances and different optimality criteria, different algorithms provide the optimal test set. This can be observed for Edge Coverage with PL = high (see Figure 10 and Table IV-B2), for Edge Coverage with PL = medium (see Figure 11), for Edge-Pair Coverage with PL = high (see Figure 12) and for Edge-Pair Coverage with PL = medium (see Figure 13). This effect is well documented by the sample of the detailed data provided in Table IV-B2.

For certain test set optimality criteria, algorithms that provide the optimal solution for all or for the majority of the problem instances can be identified. For instance, this is true in the case of the Edge Coverage criterion (PL = high and PL = medium), where the PPT algorithm is optimal for some of the optimality criteria in the majority of the instances. However, for other optimality criteria, a single algorithm that clearly outperforms the other algorithms cannot be identified. For PL = medium, this effect is even more obvious and relates to the fact that the algorithms reflect more priority edges at this level. Moreover, when PL = medium, the data show that no single algorithm clearly outperforms the others for several of the test set optimality criteria. For one of these criteria, for instance, PPT provided the optimal test set for 31 problem instances, RSC for 16 instances, BF with the atomic conversion of test requirements for 11 instances, BF with the sequence conversion for 22 instances, SC with the atomic conversion for 14 instances, SC with the sequence conversion for 30 instances, PG with the atomic conversion for 27 instances and PG with the sequence conversion for 33 instances. On a given problem instance, more than one algorithm can provide an optimal result.

Generally, the results for the Edge Coverage criterion (Figures 10 and 11) correlate with the findings presented when the algorithms were compared by the average values of the optimality criteria (Figures 2 and 4).

For Edge-Pair Coverage, the analysis is simpler, because only two algorithms, PPT and RSC, are comparable at this test coverage level. When PL = high, the RSC outperforms PPT on most of the test set optimality criteria; however, no clear “winner” can be identified for two of the criteria. When considering the first of them, PPT provided the optimal test set for 25 problem instances, RSC provided the optimal test set for 33 problem instances, and both algorithms provided the optimal test set for 8 problem instances. Considering the second, PPT provided the optimal test set for 27 problem instances, RSC provided the optimal test set for 33 problem instances, and both algorithms provided the optimal test set for 10 problem instances.

For several of the optimality criteria, when PL = high, RSC outperformed PPT in 39 out of 50 problem instances, while both algorithms provided the same result for 6 problem instances.

For PL = medium, the RSC outperformed PPT on most of the test set optimality criteria; at this priority level, RSC yields the better results. This also applies to the two criteria previously discussed for PL = high. Considering the first of them when PL = medium, PPT provided the optimal test set for 18 problem instances, RSC provided the optimal test set for 36 problem instances, and both algorithms provided the optimal test set for 4 problem instances. Considering the second, PPT provided the optimal test set for 19 problem instances, RSC provided the optimal test set for 36 problem instances, and both algorithms provided the optimal test set for 4 problem instances.

Generally, the results justify the concept proposed in this paper: in situations in which different algorithms provide optimal results for different problem instances (when considering a particular test set optimality criterion), employing more algorithms and then selecting the best set is a practical approach.

VI Threats to Validity

Several issues can be raised regarding the validity of the results; we discuss them in this section and describe the countermeasures that mitigate the effects of these issues.

The first concern that can be raised involves the generation of the set of test requirements R from M for the BF, SC and PG algorithms for the Edge Coverage criterion (TDL = 1), where PL is used for the test set reduction. The SUT models G and M differ in how they capture the priority parts of the SUT process; hence, different possibilities for the conversion between the edge priorities in M and the set of test requirements R can be discussed. To mitigate this issue, we employed and analyzed two different strategies for generating the set of test requirements from M, namely, the atomic and the sequence conversion methods specified in Section III.

Another issue relates to the topology of the SUT models. The BF, SC and PG algorithms use a directed graph as the SUT model [3]; consequently, a directed graph is also used for RSC, because RSC employs SC as its main part (refer to Algorithm 2). For PPT, a directed multigraph can be used as input. To mitigate this issue and to ensure the objective comparability of all the algorithms and the convertibility of M to G and R, we used only directed graphs in the experiments.

A related question arises at this point: Does this restriction limit the modeling possibilities when capturing the SUT structure? The answer is that, practically speaking, the modeling possibilities are not limited. Using a directed graph leads only to more extensive models. When parallel edges present in the conceptual SUT model (e.g., a UML activity diagram) are not allowed in its abstraction as captured by a directed graph, we instead use graph nodes to capture the parallel flows. This approach leads to more extensive graphs; however, it does not limit the algorithms or the overall solution.

Another question can be raised regarding the practical applicability of all the test set optimality criteria presented in Table I; many arguments can be made both for and against individual criteria. In this study, rather than tackling such discussions, we present the data for all the optimality criteria and let the readers decide.

The last issue to be raised regards the strength of the test cases, which are reduced by the PL concept to cover only the priority parts of the SUT processes. In these defined priority parts, the test coverage and the strength of the test cases are guaranteed. However, this is not guaranteed for the non-priority parts, due to the principle of the PL criteria. This fact does not invalidate the algorithms, the experimental data, or the conclusions drawn from these data.

VII Related Work

In the majority of the current path-based techniques, a SUT abstraction is based on a directed graph [1]. To capture the priority of specific parts of the SUT process or determine the test coverage level, test requirements are used [1, 7]. To assess the optimality of a path-based test set, a number of criteria can be discussed [3, 1, 8]. These criteria are usually based on the number of nodes, the number of edges, the number of paths or the coverage of the test requirements.

To generate path-based test cases, a number of algorithms have been proposed [2, 3, 4, 5, 6, 7, 15], such as the Brute Force algorithm, the Set-Covering Based Solution, or the Matching-Based Prefix Graph Solution [3]. Additionally, genetic algorithms have been employed to generate the prime paths [6] or to generate basis test paths [16]. Other nature-inspired algorithms have also been proposed, for example, ant colony optimization algorithms [5, 17], the firefly algorithm [18] and algorithms inspired by microbiology [4].

Test set optimization based on prioritization is considered an essential area to be explored, and various alternative approaches can be identified here. As an example, clustering based on a neural network was examined in [19], fuzzy clustering possibilities were explored in [20], and the firefly optimization algorithm was utilized in [21]. These approaches also use the internal structure of a SUT as the input to the process.

The path-based testing technique itself is generally applicable and can be employed for various types of testing, for instance, the composition of end-to-end business scenarios [13], the composition of scenarios for integration tests, or path-based testing focusing on the code level of the SUT [22, 8]. On the code level, path-based testing overlaps with the data-flow technique, which focuses on verifying the data consistency of the SUT code [23, 24, 25, 26]. In this area, control-flow graphs are employed as the SUT abstraction [22].

Alternative approaches to the current test requirement concept have been formulated [9]; these capture the priorities by the weights of the graph edges. This approach was inspired by the need for more priority levels, which are commonly used in software engineering and management practice [10, 11]. Another motivation for this approach regards certain limitations of the test requirements concept: in a number of the algorithms, the test requirements can practically be used either to specify the SUT priority parts or to determine the test intensity, but not both at once. As an alternative, the PPT algorithm was formulated, which combines variable test coverage with SUT part prioritization [9].

Regarding the use of a combination of algorithms to determine the optimal test set, significantly less work exists. Some work utilizing this idea exists in the area of combinatorial interaction testing, in which different approaches are combined to obtain the optimal test set [27, 28]. Considering the experimental results presented in this paper, this stream can be considered prospective for the path-based testing domain as well.

VIII Conclusion

In this paper, we proposed a strategy that employs a set of currently available algorithms and one new algorithm to find an optimal set of path-based test cases for a SUT model based on a directed graph with priority parts. The priority is captured as edge weights; for some of the algorithms, it is converted to test requirements. The optimality of the test set is determined by an optimality criterion selected by the user from fourteen indicators of test set optimality, by an optimality function that can be parameterized, or by the sequence selection method specified in this paper. The experimental results from running this strategy on 50 problem instances justify the proposed approach. For the various problem instances and different optimality criteria, different algorithms provide the optimal test set; this outcome was observed for all four combinations of test coverage and priority level criteria used in the experiments.

Of the exercised algorithms, the PPT provided the best results for the Edge Coverage criterion. However, for certain sets of problem instances and certain test set optimality criteria, the PPT is outperformed by other algorithms (i.e., RSC, SC, and PG) and, in certain instances, by BF. For the Edge-Pair Coverage criterion, where the PPT and the RSC were the only comparable candidates for solving the problem (combining Edge-Pair Coverage with prioritization of particular SUT model parts), the RSC outperformed the PPT on the majority of the optimality criteria. However, for specific optimality criteria, the dominance of the RSC algorithm was weak, and for a significant proportion of the problem instances, the PPT provided better results.

The proposed test set selection strategy is not a substitute for the development of new, prospective algorithms to solve the path-based test case generation problem. When using this strategy, the quality of the overall result depends on the quality of the algorithms employed. If new algorithms that provide better results for particular problem instances are developed, this strategy will provide better results in the future as well.

Acknowledgments

This research is conducted as a part of the project TACR TH02010296 Quality Assurance System for the Internet of Things Technology.

References

  • [1] P. Ammann and J. Offutt, Introduction to software testing.   Cambridge University Press, 2016.
  • [2] A. Dwarakanath and A. Jankiti, “Minimum number of test paths for prime path and other structural coverage criteria,” in IFIP International Conference on Testing Software and Systems.   Springer, 2014, pp. 63–79.
  • [3] N. Li, F. Li, and J. Offutt, “Better algorithms to minimize the cost of test paths,” in Software Testing, Verification and Validation (ICST), 2012 IEEE Fifth International Conference on.   IEEE, 2012, pp. 280–289.
  • [4] V. Arora, R. Bhatia, and M. Singh, “Synthesizing test scenarios in uml activity diagram using a bio-inspired approach,” Computer Languages, Systems & Structures, 2017.
  • [5] F. Sayyari and S. Emadi, “Automated generation of software testing path based on ant colony,” in Technology, Communication and Knowledge (ICTCK), 2015 International Congress on.   IEEE, 2015, pp. 435–440.
  • [6] B. Hoseini and S. Jalili, “Automatic test path generation from sequence diagram using genetic algorithm,” in Telecommunications (IST), 2014 7th International Symposium on.   IEEE, 2014, pp. 106–111.
  • [7] M. Shirole and R. Kumar, “Uml behavioral model based test case generation: a survey,” ACM SIGSOFT Software Engineering Notes, vol. 38, no. 4, pp. 1–13, 2013.
  • [8] N. Li, U. Praphamontripong, and J. Offutt, “An experimental comparison of four unit test criteria: Mutation, edge-pair, all-uses and prime path coverage,” in Software Testing, Verification and Validation Workshops, 2009. ICSTW’09. International Conference on.   IEEE, 2009, pp. 220–229.
  • [9] M. Bures, T. Cerny, and M. Klima, “Prioritized process test: More efficiency in testing of business processes and workflows,” in International Conference on Information Science and Applications.   Springer, 2017, pp. 585–593.
  • [10] P. Achimugu, A. Selamat, R. Ibrahim, and M. N. Mahrin, “A systematic literature review of software requirements prioritization research,” Information and software technology, vol. 56, no. 6, pp. 568–585, 2014.
  • [11] L. van der Aalst, E. Roodenrijs, J. Vink, and R. Baarda, TMap NEXT: business driven test management.   Uitgeverij kleine Uil, 2013.
  • [12] T. Koomen, B. Broekman, L. van der Aalst, and M. Vroon, TMap next: for result-driven testing.   Uitgeverij kleine Uil, 2013.
  • [13] M. Bures, “Pctgen: automated generation of test cases for application workflows,” in New Contributions in Information Systems and Technologies.   Springer, 2015, pp. 789–794.
  • [14] P. Ammann and J. Offutt. (2017) Graph coverage web application. [Online]. Available: http://cs.gmu.edu:8080/offutt/coverage/GraphCoverage
  • [15] S. Anand, E. K. Burke, T. Y. Chen, J. Clark, M. B. Cohen, W. Grieskamp, M. Harman, M. J. Harrold, P. Mcminn et al., “An orchestrated survey of methodologies for automated software test case generation,” Journal of Systems and Software, vol. 86, no. 8, pp. 1978–2001, 2013.
  • [16] A. S. Ghiduk, “Automatic generation of basis test paths using variable length genetic algorithm,” Information Processing Letters, vol. 114, no. 6, pp. 304–316, 2014.
  • [17] P. R. Srivastava, N. Jose, S. Barade, and D. Ghosh, “Optimized test sequence generation from usage models using ant colony optimization,” International Journal of Software Engineering & Applications, vol. 2, no. 2, pp. 14–28, 2010.
  • [18] P. R. Srivatsava, B. Mallikarjun, and X.-S. Yang, “Optimal test sequence generation using firefly algorithm,” Swarm and Evolutionary Computation, vol. 8, pp. 44–53, 2013.
  • [19] N. Gökçe, M. Eminov, and F. Belli, “Coverage-based, prioritized testing using neural network clustering,” in International Symposium on Computer and Information Sciences.   Springer, 2006, pp. 1060–1071.
  • [20] F. Belli, M. Eminov, and N. Gökçe, “Coverage-oriented, prioritized testing–a fuzzy clustering approach and case study,” in Latin-American Symposium on Dependable Computing.   Springer, 2007, pp. 95–110.
  • [21] V. Panthi and D. Mohapatra, “Generating prioritized test sequences using firefly optimization technique,” in Computational Intelligence in Data Mining-Volume 2.   Springer, 2015, pp. 627–635.
  • [22] J. Yan and J. Zhang, “An efficient method to generate feasible paths for basis path testing,” Information Processing Letters, vol. 107, no. 3-4, pp. 87–92, 2008.
  • [23] M. L. Chaim and R. P. A. De Araujo, “An efficient bitwise algorithm for intra-procedural data-flow testing coverage,” Information Processing Letters, vol. 113, no. 8, pp. 293–300, 2013.
  • [24] G. Denaro, M. Pezzè, and M. Vivanti, “On the right objectives of data flow testing,” in Software Testing, Verification and Validation (ICST), 2014 IEEE Seventh International Conference on.   IEEE, 2014, pp. 71–80.
  • [25] G. Denaro, A. Margara, M. Pezze, and M. Vivanti, “Dynamic data flow testing of object oriented systems,” in Proceedings of the 37th International Conference on Software Engineering-Volume 1.   IEEE Press, 2015, pp. 947–958.
  • [26] T. Su, K. Wu, W. Miao, G. Pu, J. He, Y. Chen, and Z. Su, “A survey on data-flow testing,” ACM Computing Surveys (CSUR), vol. 50, no. 1, p. 5, 2017.
  • [27] K. Z. Zamli, B. Y. Alkazemi, and G. Kendall, “A tabu search hyper-heuristic strategy for t-way test suite generation,” Applied Soft Computing, vol. 44, pp. 57–74, 2016.
  • [28] K. Z. Zamli, F. Din, G. Kendall, and B. S. Ahmed, “An experimental study of hyper-heuristic selection and acceptance mechanism for combinatorial t-way test suite generation,” Information Sciences, vol. 399, pp. 121–153, 2017.