On the Use of Quality Diversity Algorithms for The Traveling Thief Problem

by Adel Nikfarjam, et al.
The University of Adelaide

In real-world optimisation, it is common to face several sub-problems interacting and forming the main problem. There is an inter-dependency between the sub-problems, making it impossible to solve such a problem by focusing on only one component. The traveling thief problem (TTP) belongs to this category and is formed by the integration of the traveling salesperson problem (TSP) and the knapsack problem (KP). In this paper, we investigate the inter-dependency of the TSP and the KP by means of quality diversity (QD) approaches. QD algorithms provide a powerful tool not only to obtain high-quality solutions but also to illustrate the distribution of high-performing solutions in the behavioural space. We introduce a MAP-Elites-based evolutionary algorithm using well-known TSP and KP search operators, taking the TSP and KP scores as the behavioural descriptor. Afterwards, we conduct comprehensive experimental studies that show the usefulness of the QD approach applied to the TTP. First, we provide insights regarding high-quality TTP solutions in the TSP/KP behavioural space. Second, we show that better solutions for the TTP can be obtained by using our QD approach and that it can improve the best-known solution for a wide range of TTP instances used for benchmarking in the literature.




1 Introduction

In many real-world optimisation problems, several NP-hard problems interact with each other. Such optimisation problems are complex due to the inter-dependencies between the sub-problems. The inter-dependencies make each sub-problem affect the quality and even the feasibility of solutions of the others. This complicates the decision-making process Bonyadi et al. (2019). Vehicle routing problems, the traveling thief problem, and patient admission problems are examples of multi-component optimisation problems.

The TTP was introduced in 2013 by Bonyadi et al. (2013). It is the combination of the classical TSP and the KP, both of which are well-known, well-studied combinatorial problems. In a nutshell, the TTP integrates the TSP and the KP so that the traveling cost between two cities depends not only on the distance between the cities but also on the weight of the items collected so far. In recent years, several solution approaches have been proposed for the TTP. These include algorithms based on co-evolutionary strategies Bonyadi et al. (2014); Yafrani and Ahiod (2015), local search heuristics Polyakovskiy et al. (2014); Maity and Das (2020), simulated annealing Yafrani and Ahiod (2018), and swarm intelligence approaches Wagner (2016); Zouari et al. (2019). Furthermore, exact methods based on dynamic programming have been introduced in Wu et al. (2017), but they are limited to solving only small instances.

In multi-component optimisation problems such as the TTP, it is beneficial to provide decision-makers with a diverse set of high-quality solutions differing in terms of the scores in the sub-problems. Such a set of solutions provides decision-makers with invaluable information about the inter-dependency of the sub-problems. It also enables them to involve their interests and choose between different alternatives. Computing a diverse set of solutions has recently gained increasing attention in evolutionary computation literature. Traditionally, these works are dominated by research on multi-modal optimisation, which involves diversity preservation techniques such as niching. In this context, solution diversity is seen as a means to explore niches in the fitness landscape, which correspond to regions of local optima.

In contrast, evolutionary diversity optimisation (EDO) aims to explicitly maximise the structural diversity of the solutions, usually subject to quality constraints. In EDO approaches, some structural features are defined, and a diversity measure is used to determine the diversity of a set of solutions. EDO was first introduced by Ulrich and Thiele (2011) in the continuous domain. Afterwards, the concept has been used to generate a diverse set of images and benchmark instances for the TSP Chagas and Wagner (2020); Gao et al. (2021). The star-discrepancy measure Neumann et al. (2018a) and indicators from evolutionary multi-objective optimisation Neumann et al. (2019) have been used as diversity measures for the same problems. More recently, researchers used EDO for evolving a diverse set of high-quality solutions for combinatorial optimisation problems. Distance-based measures and entropy have been used in Do et al. (2020); Nikfarjam et al. (2021b) for generating diverse sets of TSP tours. Nikfarjam et al. (2021a) studied the scenario in which the optimal solution is unknown. In addition, the minimum spanning tree problem Bossek and Neumann (2021), the knapsack problem Bossek et al. (2021), and the optimisation of monotone submodular functions Neumann et al. (2021) have been studied in this context.

QD is another well-studied paradigm. QD focuses on exploring niches in the behavioural space and seeks a set of high-quality solutions that differ in terms of a few user-defined features of interest. Having been provided with such a set of solutions, the users are able to choose the high-quality solution suiting their interests the most. QD has emerged from the concept of novelty search, where algorithms aim to find new behaviours without considering fitness Lehman and Stanley (2011). Cully and Mouret (2013) introduced a mechanism to only keep the best-performing solutions while seeking new behaviours. Concurrently, Clune et al. (2013) proposed a simple algorithm to plot the distribution of high-quality solutions over a feature/behavioural space. Interestingly, the proposed algorithm, named MAP-Elites, efficiently evolves behavioural repertoires. Pugh et al. (2015, 2016) formulated the concept of computing a diverse set of high-quality solutions differing in features or behaviours and named it QD. The paradigm has been widely applied to the areas of robotics Rakicevic et al. (2021); Zardini et al. (2021); Cully (2020) and games Steckel and Schrum (2021); Fontaine et al. (2020, 2021) as well as other continuous problems such as urban design Galanos et al. (2021). We refer the interested readers to the review paper of Chatzilygeroudis et al. (2020). To the best of our knowledge, QD algorithms have not previously been applied to a combinatorial optimisation problem, and we provide the first study on this subject.

We employ the concept of QD for solving the TTP. By this means, we scrutinise the distribution of high-performing TTP solutions in the behavioural space of the TSP and the KP and compute very high-quality solutions. We introduce a bi-level MAP-Elites-based evolutionary algorithm called BMBEA. The algorithm generates new solutions in a two-stage procedure. First, it generates new high-quality TSP tours from old ones by the well-established EAX crossover operator Nagata and Kobayashi (2013) for the TSP. Second, it utilises dynamic programming (or alternatively a simple evolutionary algorithm) to compute an optimal (or near-optimal) packing list for the given TSP tour. Having generated a new solution, BMBEA applies a MAP-Elites-based survival selection to achieve a diverse set of high-quality TTP solutions. To achieve diversity, MAP-Elites is applied with respect to the two-dimensional space given by the TSP and KP quality of the TTP solutions. We conduct a comprehensive experimental investigation to analyse and visualise the distribution of high-quality TTP solutions for different TTP instances. Furthermore, we show the capability of BMBEA to generate high-performing TTP solutions. The algorithm achieves very high TTP values and improves the best-known TTP solution for several benchmark instances.

The remainder of the paper is structured as follows. In Section 2, we formally define the TTP. We introduce the MAP-Elites-based approach for the TTP and the BMBEA algorithm in Section 3. In Section 4, we examine the high-quality TTP solutions in terms of their TSP and KP scores and report on our results using BMBEA for solving the TTP. Finally, we finish with some concluding remarks.

2 The Traveling Thief Problem

The traveling thief problem (TTP) is formed by the integration of the traveling salesperson problem (TSP) and the knapsack problem (KP). The TSP can be defined on a complete directed graph $G = (V, E)$, where $V$ is a set of nodes (cities) of size $n$ and $E$ is a set of pairwise edges between the nodes. There is a non-negative distance $d(u, v)$ associated with each edge $(u, v) \in E$. The goal is to find a permutation (tour) $x = (x_1, \dots, x_n)$ of the nodes that minimises the following cost function:

$$f(x) = d(x_n, x_1) + \sum_{i=1}^{n-1} d(x_i, x_{i+1}).$$

The KP is defined on a set of items $I$, where $|I| = m$. Each item $j$ has a profit $p_j$ and a weight $w_j$. In the KP, the objective is to find a selection of items $y = (y_1, \dots, y_m)$ (where $y_j$ is equal to $1$ if item $j$ is picked and to $0$ otherwise) that maximises the profit subject to the weight of the selected items not exceeding the capacity of the knapsack ($W$). Formally, the goal is to maximise

$$g(y) = \sum_{j=1}^{m} p_j y_j \quad \text{subject to} \quad \sum_{j=1}^{m} w_j y_j \le W.$$

The TTP is defined on the same graph as the TSP and on a set of items $I$ that are scattered equally over the cities. Formally, every city $i$ except the first one contains a set of items $M_i$ (a subset of $I$). As in the KP, each item $k$ located in city $i$ is associated with a profit $p_{ik}$ and a weight $w_{ik}$. To ease the presentation, we do not use the double subscripts for the profits and weights in the following but refer directly to the items at one particular city when required.

The thief should visit all the cities exactly once, pick some items into the knapsack, and return to the first city. A rent $R$ should be paid for the knapsack per time unit. The thief's speed decreases as the knapsack gets heavier, so the travel time depends non-linearly on the collected weight. In the TTP, we aim to find a solution $(x, y)$ consisting of a tour $x$ and a KP solution $y$ (called a packing list in the context of the TTP) that maximises

$$z(x, y) = g(y) - R \left( \frac{d(x_n, x_1)}{\nu_{max} - \nu W_{x_n}} + \sum_{i=1}^{n-1} \frac{d(x_i, x_{i+1})}{\nu_{max} - \nu W_{x_i}} \right) \quad \text{subject to} \quad \sum_{j=1}^{m} w_j y_j \le W.$$

Here, $\nu_{max}$ and $\nu_{min}$ are the maximal and minimal traveling speed, $\nu = \frac{\nu_{max} - \nu_{min}}{W}$ is a constant, and $W_{x_i}$ is the cumulative weight of the items collected from the start of the tour up to city $x_i$.

In this study, $z(x, y)$ serves as the fitness function, and $f(x)$ and $g(y)$ serve as the behavioural descriptor (BD). Generally, the fitness function indicates how well a solution solves the given problem, while the BD shows how it solves the problem and behaves in terms of the features. In this case, the BD presents the length of the tour ($f(x)$) and the value of the items collected ($g(y)$), whereas the fitness function returns the overall profit ($z(x, y)$). Here, we aim to compute a diverse set of high-quality solutions differing in the BD. By this means, we can look into the distribution of high-performing TTP solutions over the 2D space of TSP and KP.
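To make the roles of the fitness function and the behavioural descriptor concrete, the following minimal Python sketch evaluates a TTP solution under the standard TTP formulation. The function name `evaluate_ttp`, the argument layout (a distance matrix and one city index per item), and the toy numbers mentioned afterwards are illustrative assumptions and are not taken from the paper.

```python
def evaluate_ttp(tour, packing, dist, item_city, profit, weight,
                 capacity, rent, v_max, v_min):
    """Return (f, g, z): tour length, collected profit, and TTP score.

    Hypothetical layout (an assumption of this sketch): dist is a square
    matrix, item_city[k] is the city holding item k, packing[k] is 0 or 1.
    """
    nu = (v_max - v_min) / capacity          # speed lost per unit of weight
    picked_at = {}                           # weight collected in each city
    for k, taken in enumerate(packing):
        if taken:
            city = item_city[k]
            picked_at[city] = picked_at.get(city, 0) + weight[k]
    if sum(picked_at.values()) > capacity:
        raise ValueError("packing list exceeds the knapsack capacity")

    f = 0.0      # TSP score: plain tour length
    time = 0.0   # travel time, which grows as the knapsack fills up
    load = 0     # cumulative weight carried when leaving the current city
    n = len(tour)
    for i in range(n):
        here, nxt = tour[i], tour[(i + 1) % n]
        load += picked_at.get(here, 0)
        f += dist[here][nxt]
        time += dist[here][nxt] / (v_max - nu * load)

    g = sum(p for p, taken in zip(profit, packing) if taken)  # KP score
    return f, g, g - rent * time
```

For three cities at unit distance from each other, a single item of profit 10 and weight 2 in the last city, a capacity of 2, a rent of 1, and speeds between 0.1 and 1, picking the item gives a tour length of 3, a profit of 10, and a TTP score of about -2, because the last leg is travelled at the minimal speed.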

3 Bi-level MAP-Elites-based Evolutionary Algorithm

MAP-Elites is an evolutionary computation approach where solutions compete with each other to survive. However, competition is restricted to solutions with similar BD values in order to maintain diversity. This requires a hyperparameter that defines similarity, that is, the tolerance of acceptable differences between two descriptors. In MAP-Elites algorithms, the BD space is discretised into a grid, where each cell is associated with one BD type. This means that each solution belongs to at most one cell in the behavioural space (the map). MAP-Elites algorithms typically keep only the best solution in each cell. When a solution is generated, it is assessed and potentially added to the cell associated with its BD. If the cell is empty, the solution occupies the cell; otherwise, the better of the two solutions is kept in the cell. The map aids in understanding and visualising the distribution of high-quality TTP solutions. For instance, it shows how far we should move away from the optimal TSP tour and the optimal KP solution to generate high-performing TTP solutions.

Figure 1: The representation of an empty map. There are $\delta_1 \times \delta_2$ cells within the map.

Generally, the behavioural space can be extremely large. Thus, it is rational to limit the map to a promising part of the space; otherwise, either the number or the size of the cells increases severely, and as a result the performance and efficiency of the algorithm are undermined. As mentioned, a TTP solution consists of a tour and a packing list that belong to the TSP and the KP components of the problem. Although solving each sub-problem separately does not necessarily result in a high-quality TTP solution, a TTP solution should score fairly well in both features in order to gain high profits. Thus, we focus on solutions within $\alpha_1$ and $\alpha_2$ percent gaps to the optimal TSP value ($f^*$) and the optimal KP value ($g^*$), respectively. The values of $\alpha_1$ and $\alpha_2$ are set based on initial experimental investigations. Figure 1 depicts an empty map. There are $\delta_1 \times \delta_2$ cells. Cell $(i, j)$, $1 \le i \le \delta_1$, $1 \le j \le \delta_2$, contains the best found solution whose TSP score lies in the $i$-th of $\delta_1$ equal-width intervals covering the range from $f^*$ to $(1 + \alpha_1/100) f^*$ and whose KP score lies in the $j$-th of $\delta_2$ equal-width intervals covering the range from $(1 - \alpha_2/100) g^*$ to $g^*$; this pair of intervals is the corresponding BD. The cell $(1, \delta_2)$ consists of TTP solutions with TSP and KP values closest to the optimums. We use EAX Nagata and Kobayashi (2013) and dynamic programming (DP) Toth (1980) to compute $f^*$ and $g^*$ for the TTP instances.

1:  Find the optimal/near-optimal values of the TSP and the KP by the algorithms in Nagata and Kobayashi (2013); Toth (1980), respectively.
2:  Generate an empty map and populate it with the initialising procedure.
3:  while the termination criterion is not met do
4:     Generate an offspring and calculate its TSP and KP scores.
5:     if the TSP and KP scores are within the allowed gaps to the optimal TSP and KP values then
6:        Find the cell corresponding to the TSP and KP scores.
7:        if the cell is empty then
8:           Store the offspring in the cell.
9:        else
10:           Compare the offspring and the individual occupying the cell and store the better individual in terms of TTP score in the cell.
Algorithm 1 The MAP-Elites-based algorithm

Algorithm 1 describes the BMBEA. The initialising procedure and the operators to generate a new TTP solution will be discussed later. Having generated an empty map, we populate it with an initialising procedure. After generating an offspring, we calculate its TSP score and KP score. If the TSP and the KP scores are within the allowed gaps to the optimal TSP and KP values, respectively, we find the cell corresponding to those scores; otherwise, the offspring is discarded. If the corresponding cell is empty, the offspring is kept in the cell; otherwise, we compare the offspring and the individual in the cell and keep the individual with the highest TTP score. We repeat steps 3 to 10 until a termination criterion is met.
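The survival selection in steps 5 to 10 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `cell_of` and `try_insert`, the dictionary-based archive, gaps given in percent, and equally wide cells are all choices made for this sketch.

```python
def cell_of(f, g, f_opt, g_opt, alpha1, alpha2, delta1, delta2):
    """Map a (TSP score, KP score) pair to a 1-based cell, or None if outside.

    Assumptions of this sketch: alpha1/alpha2 are gaps in percent, the map
    covers [f_opt, (1 + alpha1/100) f_opt] x [(1 - alpha2/100) g_opt, g_opt],
    and both ranges are split into equally wide cells.
    """
    f_span = f_opt * alpha1 / 100.0
    g_span = g_opt * alpha2 / 100.0
    g_low = g_opt - g_span
    if not (f_opt <= f <= f_opt + f_span and g_low <= g <= g_opt):
        return None
    # min(...) keeps scores lying exactly on the upper border in the last cell
    i = min(int((f - f_opt) / f_span * delta1), delta1 - 1) + 1
    j = min(int((g - g_low) / g_span * delta2), delta2 - 1) + 1
    return i, j


def try_insert(archive, solution, f, g, z, **grid):
    """Survival selection of Algorithm 1: keep the best TTP score per cell."""
    cell = cell_of(f, g, **grid)
    if cell is None:
        return False                      # outside the map: discard
    if cell not in archive or z > archive[cell][0]:
        archive[cell] = (z, solution)     # empty cell, or better TTP score
        return True
    return False
```

With this indexing, a solution whose TSP and KP scores equal the reference optima lands in cell (1, delta2), which matches the description of the map given in the text. Ties in the TTP score keep the incumbent, which is one of several reasonable choices.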

Evolutionary algorithms require some operators to generate new solutions (offspring) from old ones (parents); BMBEA is no exception. One can see the generation of TTP solutions as a bi-level process. First, new tours are generated by mutation or crossover; then, we compute a suitable packing list for each new tour to obtain a complete TTP solution.

3.1 Search Operators for TSP

We consider the EAX crossover Nagata and Kobayashi (2013) to generate new TSP tours. EAX is a high-performing TSP crossover known as one of the state-of-the-art operators for solving the TSP. EAX has several variants; we incorporate EAX-1AB due to its simplicity and efficiency. EAX-1AB consists of four steps, which are depicted in Figure 2.

  • Selection: Selecting two parents uniformly at random (Fig 2.1).

  • AB-cycle: Generating one AB-cycle from the two parents by alternately choosing edges from the first and second parents until a cycle is formed (Fig 2.2).

  • Intermediate Solution: Copying all edges of the first parent to the offspring; then removing the AB-cycle's edges that belong to the first parent from the offspring, and adding the other edges of the AB-cycle to it (Fig 2.3).

  • The Complete Tour: Connecting all sub-tours of the intermediate solution to form a complete tour (Fig 2.4).

Figure 2: The representation of the steps to implement EAX.

We refer the interested readers to Nagata and Kobayashi (2013) for more details about the process of generating a new tour by EAX. Alternatively, 2-OPT can be used to generate the TSP tours. 2-OPT is a random neighbourhood search, where two elements of a permutation are selected uniformly at random. These two elements are swapped and the elements in between are put in reverse order, which amounts to reversing the whole segment between the two selected positions.
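The 2-OPT move described above can be sketched as follows. The helper names `reverse_segment` and `two_opt_move` are hypothetical, and the tour is assumed to be a Python list of city indices.

```python
import random


def reverse_segment(tour, i, j):
    """Return a copy of tour with positions i..j (inclusive) reversed."""
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]


def two_opt_move(tour, rng=random):
    """Random 2-OPT neighbour: pick two positions, reverse what lies between."""
    i, j = sorted(rng.sample(range(len(tour)), 2))
    return reverse_segment(tour, i, j)
```

For example, reversing positions 1 to 3 of the tour `[0, 1, 2, 3, 4]` yields `[0, 3, 2, 1, 4]`: the two selected cities swap places and the city between them stays in the middle of the reversed segment.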

3.2 Search Operators for KP

In the second phase, we optimise the packing list to match the TSP tour and form a good TTP solution. To this end, inner algorithms are required to optimise the packing list. When the tour is fixed and only the packing list is optimised, the problem is referred to as Packing While Traveling (PWT) Polyakovskiy et al. (2014) in the literature. Neumann et al. (2018b) introduced a DP algorithm to solve the PWT problem to optimality.

3.2.1 Dynamic Programming

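DP is a classical approach for solving the KP. Here, we employ the DP introduced in Neumann et al. (2018b) to solve the PWT problem. The DP uses a table $\beta$ with one row per item ($m$ rows) and one column per knapsack weight $j$ up to the capacity $W$. In the DP, items are processed in the order in which their corresponding nodes appear in the tour. For example, $I_i$ is processed sooner than $I_k$ if the node to which $I_i$ belongs is visited sooner than the node of $I_k$. If two items belong to the same node, they are processed according to their indices. The entry $\beta_{i,j}$ represents the maximal profit that the thief can obtain among all combinations of the items processed up to and including $I_i$ whose total weight is exactly equal to $j$. If no combination leads to the weight $j$, $\beta_{i,j}$ is set to $-\infty$.

Let $B(\emptyset)$ denote the profit of the empty set, which corresponds to the traveling cost with an empty knapsack. Moreover, we denote the profit by $B(I_i)$ when only item $I_i$ is collected. Thus, for the first item $I_a$ in the aforementioned order (the first row of the table $\beta$), with weight $w_a$, we have:

$$\beta_{a,j} = \begin{cases} B(\emptyset) & \text{if } j = 0, \\ B(I_a) & \text{if } j = w_a, \\ -\infty & \text{otherwise.} \end{cases}$$

For the rest of the table, let $I_k$ denote the predecessor of $I_i$. Each entry $\beta_{i,j}$ can be computed as $\max\{\beta_{k,j}, T\}$, where

$$T = \beta_{k,\, j - w_i} + p_i - R \sum_{l=q}^{n} d_l \left( \frac{1}{\nu_{max} - \nu j} - \frac{1}{\nu_{max} - \nu (j - w_i)} \right).$$

Here, $p_i$ and $w_i$ are the profit and weight of $I_i$, $q$ is the position in the tour of the city holding $I_i$, $d_l$ is the distance from the $l$-th city of the tour to its successor (the last city being followed by the first), and $T = -\infty$ if $j < w_i$. The sum accounts for the additional travel time on the remaining part of the tour when the knapsack weight rises from $j - w_i$ to $j$.

The maximum over $j$ of the entries in the row of the last processed item is reported as the optimal profit that the thief can gain from the given tour. Although DP can provide us with the optimal packing list for a given tour, the run-time is quite long. Considering that we compute the packing list in the second level of a bi-level optimisation, it can affect the time efficiency of the BMBEA. Therefore, we propose an EA here as an alternative. The interested readers are referred to Neumann et al. (2018b), which analysed the run-time of the DP.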
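The recurrence can be illustrated with a compact Python sketch that keeps only one row of the table at a time. It is a simplified reading of the DP idea under stated assumptions, not the implementation of Neumann et al. (2018b): the function name `pwt_dp`, the input layout, the integer weights, and starting from a virtual row for the empty packing list (instead of a dedicated first-item row) are choices made here for brevity.

```python
from math import inf


def pwt_dp(dist, items, capacity, rent, v_max, v_min):
    """Best TTP score over all packing lists for a fixed tour.

    dist[l] is the distance from the l-th city of the tour to its successor
    (the last city returns to the first). items is a list of
    (tour_position, profit, weight) triples sorted by tour position, with
    integer weights. This layout is an assumption of the sketch.
    """
    nu = (v_max - v_min) / capacity
    n = len(dist)
    # remaining[q]: distance still to travel when leaving tour position q
    remaining = [0.0] * (n + 1)
    for l in range(n - 1, -1, -1):
        remaining[l] = remaining[l + 1] + dist[l]

    # row[j]: best score with total weight exactly j (-inf if unreachable);
    # the starting row describes the empty packing list
    row = [-inf] * (capacity + 1)
    row[0] = -rent * remaining[0] / v_max
    for pos, profit, weight in items:
        new_row = row[:]                       # option 1: skip the item
        for j in range(weight, capacity + 1):
            if row[j - weight] == -inf:
                continue
            # extra travel time from this city onwards when the load
            # rises from j - weight to j
            extra = remaining[pos] * (1.0 / (v_max - nu * j)
                                      - 1.0 / (v_max - nu * (j - weight)))
            new_row[j] = max(new_row[j],
                             row[j - weight] + profit - rent * extra)
        row = new_row
    return max(row)
```

The sketch returns only the optimal score; recovering the packing list itself would require storing the decisions made for each table entry. Its running time grows with the number of items times the capacity, which is consistent with the observation that the DP becomes slow on larger instances.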

3.2.2 (1+1) Evolutionary Algorithm

The (1+1) EA is a well-known simple EA that converges fast since it only keeps the best-found solution. First, the new tour generated by the TSP operators inherits its parent's packing list. Next, a new packing list is generated by mutation. If the new packing list results in a higher TTP score, it replaces the old one. We continue these steps until a termination criterion is met. For mutation, the bit-flip operator is used, where each bit is flipped independently with a given mutation rate.

The mutation can result in packing lists violating the knapsack's capacity. We incorporate a repair function into the (1+1) EA to avoid the violation. After the offspring is mutated, the repair function fixes the offspring's violation. The repair function removes collected items uniformly at random, one by one, until the packing list complies with the capacity constraint.
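A minimal Python sketch of this inner loop is given below. The names `repair` and `one_plus_one_ea`, the `score` callback (standing in for the TTP score of the packing list on the fixed tour), and the default mutation rate of one over the number of items (the customary choice for the (1+1) EA, assumed here) are illustrative and not taken from the paper.

```python
import random


def repair(packing, weights, capacity, rng):
    """Drop picked items uniformly at random until the capacity is respected."""
    packing = list(packing)
    picked = [k for k, bit in enumerate(packing) if bit]
    load = sum(weights[k] for k in picked)
    while load > capacity:
        k = picked.pop(rng.randrange(len(picked)))
        packing[k] = 0
        load -= weights[k]
    return packing


def one_plus_one_ea(packing, weights, capacity, score, iterations,
                    rng=None, rate=None):
    """(1+1) EA on a packing list; score(packing) is to be maximised."""
    rng = rng or random.Random()
    m = len(packing)
    rate = rate if rate is not None else 1.0 / m   # assumed default rate
    best = repair(packing, weights, capacity, rng)  # inherited packing list
    best_score = score(best)
    for _ in range(iterations):
        # bit-flip mutation: every bit flips independently with prob. rate
        child = [1 - bit if rng.random() < rate else bit for bit in best]
        child = repair(child, weights, capacity, rng)
        child_score = score(child)
        if child_score > best_score:                # keep strict improvements
            best, best_score = child, child_score
    return best, best_score
```

In BMBEA the `score` callback would evaluate the packing list together with the fixed tour; for a quick experiment, any function of the packing list, such as the plain sum of collected profits, can be plugged in.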

3.3 Initialisation

One may notice that populating the map with random solutions is unlikely to succeed. This is because the map only accepts TTP individuals with fairly good TSP and KP scores. Therefore, a heuristic approach is required to populate the map initially. Since we use the EAX-based algorithm in Nagata and Kobayashi (2013) to find the optimal/near-optimal TSP value, and since that algorithm is population-based, we can use the tours of the algorithm's final population. Having extracted the tours, we can compute a good-quality packing list for each tour by one of the KP operators mentioned in Section 3.2. This results in TTP solutions with high TSP and KP scores and lets us populate the map at the beginning of the BMBEA.

No. Original Name No. Original Name
1 eil51_n50_bounded-strongly-corr_01 18 a280_n279_uncorr_01
2 eil51_n150_bounded-strongly-corr_01 19 rat575_n574_bounded-strongly-corr_01
3 eil51_n250_bounded-strongly-corr_01 20 rat575_n574_uncorr-similar-weights_01
4 eil51_n50_uncorr-similar-weights_01 21 rat575_n574_uncorr_01
5 eil51_n150_uncorr-similar-weights_01 22 dsj1000_n999_bounded-strongly-corr_02
6 eil51_n250_uncorr-similar-weights_01 23 dsj1000_n999_uncorr-similar-weights_06
7 eil51_n50_uncorr_01 24 dsj1000_n999_uncorr_04
8 eil51_n150_uncorr_01 25 u2152_n2151_bounded-strongly-corr_01
9 eil51_n250_uncorr_01 26 u2152_n2151_uncorr-similar-weights_01
10 pr152_n151_bounded-strongly-corr_01 27 u2152_n2151_uncorr_01
11 pr152_n453_bounded-strongly-corr_01 28 fnl4461_n4460_bounded-strongly-corr_01
12 pr152_n151_uncorr-similar-weights_01 29 fnl4461_n4460_uncorr-similar-weights_01
13 pr152_n453_uncorr-similar-weights_01 30 fnl4461_n4460_uncorr_01
14 pr152_n151_uncorr_01 31 dsj1000_n999_uncorr_02
15 pr152_n453_uncorr_01 32 dsj1000_n999_uncorr_03
16 a280_n279_bounded-strongly-corr_01 33 dsj1000_n999_uncorr-similar-weights_03
17 a280_n279_uncorr-similar-weights_01 34 dsj1000_n999_uncorr-similar-weights_04
Table 1: The names of the TTP instances used in the paper.
Figure 3: The distribution of TTP solutions of the four competitors over the behaviour space on instances eil51_n250_bounded-strongly-corr_01 (top), pr152_n453_bounded-strongly-corr_01 (middle), and a280_n279_bounded-strongly-corr_01 (bottom). The cells are coloured based on the average TTP scores of the solutions in the cell over ten independent runs.
Figure 4: The frequency of cells housing a TTP solution over 10 independent runs on instances eil51_n250_bounded-strongly-corr_01 (top), pr152_n453_bounded-strongly-corr_01 (middle), and a280_n279_bounded-strongly-corr_01 (bottom).

4 Experimental Investigation

In this section, we use the BMBEA to compute a set of solutions for several TTP instances; then, we plot the map to illuminate the distribution of the solutions over the space of the TSP and KP scores. Moreover, we comprehensively compare different search operators and their effects on the distributions and the final maps. We consider EAX and 2-OPT for generating tours and the DP and the EA for computing the packing lists. Combining these operators, we obtain four different operator settings. As the termination criterion, the algorithms are stopped after 10000 iterations, where an iteration refers to one pass of the main loop of the BMBEA. We used the TTP instances developed in Polyakovskiy et al. (2014). Table 1 presents the names of the instances that we used in the paper. Please note that we use the first instance of each sub-group except for dsj1000. The renting price ($R$) is set to zero in the first instances of that sub-group, which turns these TTP instances into a KP.

4.1 Analysis of the maps

This section visualises and scrutinises the final map obtained from the BMBEA using different search operators, namely EAX, 2-OPT, DP, and EA. Figure 3 visualises the final maps obtained from the four competitors in instances 3, 12, and 16. The TSP value increases when we move along the horizontal axis, while moving along the vertical axis results in a rise in the KP score. Since the TSP is a minimisation and the KP is a maximisation problem, the cell (1,20) consists of the solution with a BD closest to the optimal TSP and KP values. The maps' cells are coloured based on the average TTP score of the solutions within the cells over 10 independent runs; the hotter the colour, the higher the TTP score. Moreover, the cells with the best average score are coloured in black. As we can observe, the west part of the maps tends to contain better TTP solutions. In 8 out of 9 cases, the best solutions are located in a BD of and . Moreover, the figure shows that the maps obtained from BMBEAs using EAX have more hot-coloured cells than the ones using 2-OPT, which shows the consistency of EAX in generating high-quality solutions. Turning to the comparison between DP and EA, the latter can populate a larger part of the map.

Figure 4 illustrates the frequency of cells containing a solution in ten independent runs. The instances are the same as in Figure 3. Here, a hotter colour indicates a higher frequency. The figure shows that the algorithms cannot populate the cells close to the optimal KP value. This is because the algorithms compute the packing list as the second level of a bi-level optimisation procedure, so the KP values are constrained by the given tour. More interestingly, most of the cells corresponding to KP values close to the lower end of the KP range also remain empty for the same reason, especially when DP is used. Moreover, one may notice that most of the red-coloured cells in Figure 3 are coloured red here as well. This illustrates a proportional relationship between the quality of solutions and the frequency. Furthermore, the cells associated with low TSP values (left side of the maps) are more likely to be empty compared to the other side. As the TSP value increases, so does the number of tours attaining such a TSP value. This results in a more diverse set of tours and eventually a more diverse set of packing lists and a broader range of KP scores.

4.2 Best found TTP Solutions

We now compare the search operators, EAX, 2-OPT, DP, and EA, in terms of the best found TTP solution. We consider instances in a range of 51 to 280 cities and 50 to 453 items from Polyakovskiy et al. (2014). Table 2 shows the average and the best TTP solutions, and the average CPU time in ten independent runs for the four competitors and the best-known TTP values. Note that the best-known values are obtained from Chagas and Wagner (2020), which are the best values among the results of their two proposed algorithms and 21 algorithms analysed in Wagner et al. (2018). The results indicate that EAX outperforms 2-OPT in terms of TTP score in most cases. The observations are confirmed by a Kruskal-Wallis test with Bonferroni correction. Turning to the comparison of the KP operators, EA yields very decent objective values and can compete with DP, which results in the optimal packing list. On the other hand, the run times of EA are significantly shorter; for example, EAX-EA finishes the 10000 iterations in 263 seconds on average on instance 3, whereas EAX-DP needs about 19895 seconds. In general, an increase in the size of instances severely affects the run time of the BMBEA using DP, while the run time of the algorithm employing EA remains in a reasonable range.

Instance | EAX-DP (1) Average | EAX-DP (1) Best | EAX-DP (1) CPU time (s) | EAX-EA (2) Average | EAX-EA (2) Best | EAX-EA (2) CPU time (s) | Best-known value
01 | 4462.2 | 4465 | 83.2 | 4446.7 | 4459.9 | 36.8 | 4269.4
02 | 8289.9 | 8293.8 | 2305 | 8032 | 8195.6 | 128.2 | 7532
03 | 13672.1 | 13672.1 | 19895.2 | 13388.5 | 13648.2 | 263.4 | 12804
04 | 1600.3 | 1607.5 | 36.8 | 1594.1 | 1603.7 | 35.4 | 1448.5
05 | 4804.4 | 4836.6 | 265.3 | 4760.4 | 4809.7 | 119.8 | 4365
06 | 6834.5 | 6854.4 | 876.1 | 6710.6 | 6841 | 251.9 | 6359
07 | 3204.7 | 3227.1 | 42.9 | 3120.4 | 3223.4 | 35.6 | 2851.1
08 | 7854.2 | 7854.2 | 434.5 | 7848.4 | 7854.2 | 120.8 | 7037
09 | 13644.8 | 13644.8 | 2357.4 | 13638 | 13644.8 | 235.6 | 12478
10 | 11150.1 | 11150.2 | 3580.8 | 11048.4 | 11133.1 | 123.2 | 11117.4
11 | 22995.8 | 25564.1 | 211942.3 | 25010 | 25398.5 | 555.5 | 25664.4
12 | 3555.8 | 3649.9 | 249.9 | 3484.5 | 3556 | 116.7 | 3791.9
13 | 13441.9 | 13589.6 | 5195.8 | 13187.3 | 13369.3 | 539.1 | 13556.9
14 | 5416.9 | 5434.1 | 370.6 | 5415 | 5415 | 108.4 | 5615
15 | 20506.8 | 20506.8 | 15524.3 | 20501.7 | 20506.8 | 496.5 | 20705.8
16 | 18662.8 | 18703.1 | 40888.5 | 18396.8 | 18491.7 | 274.9 | 18470
17 | 9392.6 | 9514.8 | 1436.9 | 9268.4 | 9369.6 | 248.9 | 9434
18 | 19785.2 | 19889.3 | 3817.4 | 19710.7 | 19791.3 | 230.8 | 19889.8

Instance | 2-OPT-DP (3) Average | 2-OPT-DP (3) Best | 2-OPT-DP (3) CPU time (s) | 2-OPT-EA (4) Average | 2-OPT-EA (4) Best | 2-OPT-EA (4) CPU time (s) | Best-known value
01 | 4423.1 | 4449.4 | 66.8 | 4182.7 | 4356.8 | 30.5 | 4269.4
02 | 8132.5 | 8253.9 | 2197.1 | 7413.6 | 7632.8 | 118.6 | 7532
03 | 13424.1 | 13598.9 | 18558.1 | 11902.5 | 12339.3 | 247.6 | 12804
04 | 1575 | 1594.5 | 17.7 | 1527.5 | 1568.7 | 26.7 | 1448.5
05 | 4712.8 | 4753.5 | 230.3 | 4420.5 | 4498.3 | 104.1 | 4365
06 | 6722.2 | 6810.2 | 892.3 | 6106.4 | 6282 | 234.1 | 6359
07 | 3071.3 | 3140.5 | 25.3 | 2903.2 | 3090.3 | 29 | 2851.1
08 | 7759.6 | 7828.2 | 392 | 7379.6 | 7696.3 | 108.8 | 7037
09 | 13534.9 | 13594.9 | 2124.2 | 12690.3 | 13100.9 | 229.7 | 12478
10 | 11153.2 | 11231.7 | 3029.1 | 10308.7 | 10456.3 | 109.4 | 11117.4
11 | 10227.7 | 26154.3 | 79529.6 | 22334.6 | 22850.3 | 554.6 | 25664.4
12 | 3547.9 | 3640.1 | 212.5 | 3247.7 | 3464.7 | 99.6 | 3791.9
13 | 13236.9 | 13539.2 | 4483.5 | 12265.9 | 12605.8 | 505.1 | 13556.9
14 | 5243.3 | 5376.1 | 322.4 | 5029.2 | 5218.1 | 91.7 | 5615
15 | 19945.5 | 20468.7 | 13069.5 | 19057.7 | 19747.6 | 470.5 | 20705.8
16 | 18467.5 | 18564.1 | 35636.2 | 17158.5 | 17397.2 | 246.2 | 18470
17 | 9204.4 | 9435.7 | 1349.9 | 8658 | 8801.6 | 229.5 | 9434
18 | 19382.7 | 19590.1 | 3411.6 | 18611.1 | 19027.2 | 214 | 19889.8
Table 2: Comparison of the search operators in terms of the TTP score and CPU time on the small size instances. The numbers in parentheses identify the variants compared in the Kruskal-Wallis statistical test with Bonferroni correction.

More interestingly, Table 2 also indicates that all variants of BMBEA result in very decent TTP scores. In most cases, such as instances 1, 2, 3, 4, 5, 6, 16, and 17, the introduced algorithms beat the best-known TTP scores. In instance 2, the TTP score is improved by about 10 percent; computed from the best EAX-DP values in Table 2, the corresponding figures are about 5, 7, 11, and 11 percent for instances 1, 3, 4, and 5, respectively.

Instance | EAX-EA (1) Average | EAX-EA (1) Best | 2-OPT-EA (2) Average | 2-OPT-EA (2) Best | Best-known value
19 | 34746.5 | 35427.4 | 31421.7 | 32054.6 | 32993.1
20 | 20513.6 | 20834.3 | 19173.2 | 19388.6 | 19379.7
21 | 37895.6 | 38299.6 | 36093.9 | 36448.2 | 35015.2
22 | 1657.2 | 2575.5 | -2576.8 | -1374.9 | 893.4
23 | 52561.1 | 54884 | 49531.3 | 50297.4 | 51303.4
24 | 29229.2 | 31594.9 | 26465.4 | 26959.1 | 28304
25 | 109653.6 | 110510 | 100572.4 | 102396.9 | 105908.1
26 | 73922.7 | 75388.1 | 71037.8 | 72235.6 | 72308.7
27 | 116130.6 | 119997.2 | 110629.5 | 113817.5 | 108236.1
28 | 262758.6 | 264654.6 | 244346.7 | 245597 | 263040.2
29 | 132208.5 | 132430 | 127773 | 128588.1 | 131486.2
30 | 238596.9 | 241263.1 | 234120.5 | 235595.3 | 233343
Table 3: Performance of the MAP-Elites-based approach in terms of the TTP score. The notations are in line with Table 2.

Since the DP is not time-efficient in larger instances, we consider the EA for computing the packing list. Table 3 shows the results on 12 instances from 575 to 4461 cities and 574 to 4460 items. Here, the number of iterations used as the termination criterion is increased. As one can observe, EAX dominates 2-OPT in these instances. Moreover, the algorithm using EAX improved the best found solution in 10 out of 12 instances. For example, the TTP score is significantly increased from 893 to 2576 in instance 22. As one can notice, the TTP score of the algorithm using 2-OPT is negative in this instance. Polyakovskiy et al. (2014) aimed to make the instances balanced on both sides of the TSP and the KP, but the TSP sub-problem is more dominating in some of the instances of the dsj1000 sub-group. The traveling cost is high in these particular instances, and the items do not compensate for the high cost. With the TSP sub-problem being more dominating, it is not surprising that EAX outperforms 2-OPT. Moreover, in four other instances in the dsj1000 sub-group, the TSP part of the instances dominates the KP sub-problem even more, such that the best-known values are negative. It is likely that the global optimum is located on the negative side of the search space in these four instances. We investigate these four instances separately from the others due to the dominance of the TSP sub-problem over the KP.

In.   EAX-EA (1)                      2-OPT-EA (2)                    Best-known
      Average     Stat   Best         Average     Stat   Best         value
31    -48537.3           -47907.1     -51704.2           -50721.7     -49149.9
32    -6517.3            -4516.8      -9741.1            -8965.4      -7714.6
33    -62735             -61254.4     -66077.1           -64351.6     -61709.1
34    -23022.7           -21641       -26962.2           -25647.6     -19215.2
Table 4: Performance of the MAP-Elite based approach on the unbalanced instances. The notations are in line with Table 2.

This means that the high-quality TTP solutions are closer to the TSP optimal value and farther away from the KP optimal value. The current values of the two map parameters are set for the balanced instances. Thus, we need to reset them in order to populate the map. Based on initial experimental investigations, we set them to 2 and 60 percent, respectively. Table 4 summarises the results on the four instances. EAX, as expected, outperforms 2-OPT in all four cases. More importantly, the EAX-based algorithm improved the TTP values for instances 31, 32, and 33 by 1.2, 41, and 0.7 percent, respectively.
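As a rough illustration of how such percentages can be derived from Table 4, the sketch below computes the gain of a score over the best-known value relative to the magnitude of the best-known value. This definition of relative improvement is an assumption, since negative scores make a plain ratio ambiguous. The helper name and the dictionary layout are hypothetical, and the numbers are copied from Table 4.

```python
def relative_improvement(new_score, best_known):
    """Gain of new_score over best_known, in percent of |best_known|."""
    return 100.0 * (new_score - best_known) / abs(best_known)


# Instance -> (EAX-EA average, EAX-EA best, best-known value), copied from Table 4.
table4 = {
    31: (-48537.3, -47907.1, -49149.9),
    32: (-6517.3, -4516.8, -7714.6),
    33: (-62735.0, -61254.4, -61709.1),
    34: (-23022.7, -21641.0, -19215.2),
}

for instance, (average, best, known) in table4.items():
    print(
        instance,
        f"average: {relative_improvement(average, known):+.1f}%",
        f"best: {relative_improvement(best, known):+.1f}%",
    )
```

A positive value indicates that the score exceeds the best-known value, and a negative value indicates that it falls short of it.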

5 Conclusion

In this study, we incorporated the concept of QD into solving the TTP. To the best of our knowledge, this is the first time that the QD concept has been used for solving a combinatorial problem. The behavioural descriptor for our approach is defined on the TSP and the KP scores of a TTP solution. Having defined a 2D MAP-Elite, we introduced the BMBEA algorithm to generate high-quality TTP solutions. BMBEA uses EAX crossover to create new tours. Afterwards, the algorithm computes a high-quality packing list by dynamic programming or the (1+1) EA. By visualising the map obtained from BMBEA, we observed the distribution of high-performing TTP solutions over the behavioural space of TSP and KP. Moreover, we conducted a comprehensive experimental comparison involving four different search operators for BMBEA. The results showed that BMBEA using EAX and the EA performs well in terms of solution quality and CPU time. In addition, BMBEA improved the best-known TTP scores on several instances.
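The survival rule of a 2D map of this kind can be sketched as follows: each cell covers a range of TSP and KP scores and keeps only the solution with the highest TTP score that falls into it. This is a generic MAP-Elites archive and not the BMBEA implementation. The class name, the score ranges, and the number of cells per axis are hypothetical.

```python
class MapArchive:
    """2D grid archive keeping one elite (highest TTP score) per (TSP, KP) cell."""

    def __init__(self, tsp_range, kp_range, cells_per_axis):
        self.tsp_lo, self.tsp_hi = tsp_range
        self.kp_lo, self.kp_hi = kp_range
        self.n = cells_per_axis
        self.cells = {}  # (tsp_index, kp_index) -> (ttp_score, solution)

    def _index(self, value, lo, hi):
        if not lo <= value <= hi:
            return None  # the descriptor lies outside the map
        # The upper bound is mapped into the last cell.
        return min(int((value - lo) / (hi - lo) * self.n), self.n - 1)

    def try_insert(self, solution, tsp_score, kp_score, ttp_score):
        """Store the solution if its cell is empty or it beats the current elite."""
        i = self._index(tsp_score, self.tsp_lo, self.tsp_hi)
        j = self._index(kp_score, self.kp_lo, self.kp_hi)
        if i is None or j is None:
            return False
        current = self.cells.get((i, j))
        if current is None or ttp_score > current[0]:
            self.cells[(i, j)] = (ttp_score, solution)
            return True
        return False

    def best(self):
        """Return the (ttp_score, solution) pair with the highest TTP score."""
        return max(self.cells.values(), key=lambda entry: entry[0])


# Hypothetical usage with made-up score ranges and solutions labelled by strings.
archive = MapArchive(tsp_range=(100.0, 110.0), kp_range=(80.0, 100.0), cells_per_axis=10)
archive.try_insert("a", tsp_score=101.5, kp_score=95.5, ttp_score=5.0)
archive.try_insert("b", tsp_score=101.7, kp_score=95.2, ttp_score=7.0)  # same cell, better
print(archive.best())
```

Plotting the TTP score stored in each occupied cell gives the kind of map used to inspect the distribution of high-performing solutions over the behavioural space.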

It would be interesting to incorporate more complex MAP-Elite approaches such as CVT-MAP-Elites (Vassiliades et al., 2018) into the introduced algorithm. Such an approach can discretise the behavioural space more intelligently. Moreover, several multi-component combinatorial optimisation problems can be found in the literature where QD would be highly beneficial for understanding the inter-dependencies of the components and the distribution of solutions in the behavioural space.
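The discretisation idea behind CVT-MAP-Elites can be sketched as follows: sample the behavioural space uniformly, run Lloyd's algorithm (k-means) on the samples to approximate a centroidal Voronoi tessellation, and assign each descriptor to its nearest centroid instead of a grid cell. The sketch assumes descriptors normalised to the unit square. The function names, the sample size, and the iteration count are illustrative choices and are not part of the introduced algorithm.

```python
import random


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: (point[0] - centroids[k][0]) ** 2 + (point[1] - centroids[k][1]) ** 2,
    )


def cvt_centroids(k, n_samples=2000, iterations=20, seed=0):
    """Approximate a CVT of the unit square with Lloyd's algorithm on random samples."""
    rng = random.Random(seed)
    samples = [(rng.random(), rng.random()) for _ in range(n_samples)]
    centroids = rng.sample(samples, k)
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in samples:
            groups[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its samples; keep it if its group is empty.
        centroids = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids


centroids = cvt_centroids(k=16)
cell = nearest((0.3, 0.7), centroids)  # niche index of a normalised (TSP, KP) descriptor
print(cell)
```

In such a scheme the number of niches is chosen directly through k and does not grow with the resolution per axis.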


This work was supported by the Australian Research Council through grants DP190103894 and FT200100536, and by the South Australian Government through the Research Consortium "Unlocking Complex Resources through Lean Processing".


  • Bonyadi et al. (2013) Mohammad Reza Bonyadi, Zbigniew Michalewicz, and Luigi Barone. 2013. The travelling thief problem: The first step in the transition from theoretical problems to realistic problems. In IEEE Congress on Evolutionary Computation. IEEE, 1037–1044.
  • Bonyadi et al. (2014) Mohammad Reza Bonyadi, Zbigniew Michalewicz, Michal Roman Przybylek, and Adam Wierzbicki. 2014. Socially inspired algorithms for the travelling thief problem. In GECCO. ACM, 421–428.
  • Bonyadi et al. (2019) Mohammad Reza Bonyadi, Zbigniew Michalewicz, Markus Wagner, and Frank Neumann. 2019. Evolutionary Computation for Multicomponent Problems: Opportunities and Future Directions. In Optimization in Industry. Springer, 13–30.
  • Bossek et al. (2021) Jakob Bossek, Aneta Neumann, and Frank Neumann. 2021. Breeding diverse packings for the knapsack problem by means of diversity-tailored evolutionary algorithms. In GECCO. ACM, 556–564.
  • Bossek and Neumann (2021) Jakob Bossek and Frank Neumann. 2021. Evolutionary diversity optimization and the minimum spanning tree problem. In GECCO. ACM, 198–206.
  • Chagas and Wagner (2020) Jonatas B. C. Chagas and Markus Wagner. 2020. A weighted-sum method for solving the bi-objective traveling thief problem. CoRR abs/2011.05081 (2020).
  • Chatzilygeroudis et al. (2020) Konstantinos I. Chatzilygeroudis, Antoine Cully, Vassilis Vassiliades, and Jean-Baptiste Mouret. 2020. Quality-Diversity Optimization: a novel branch of stochastic optimization. CoRR abs/2012.04322 (2020).
  • Clune et al. (2013) Jeff Clune, Jean-Baptiste Mouret, and Hod Lipson. 2013. Summary of "the evolutionary origins of modularity". In GECCO (Companion). ACM, 23–24.
  • Cully (2020) Antoine Cully. 2020. Multi-Emitter MAP-Elites: Improving quality, diversity and convergence speed with heterogeneous sets of emitters. CoRR abs/2007.05352 (2020).
  • Cully and Mouret (2013) Antoine Cully and Jean-Baptiste Mouret. 2013. Behavioral repertoire learning in robotics. In GECCO. ACM, 175–182.
  • Do et al. (2020) Anh Viet Do, Jakob Bossek, Aneta Neumann, and Frank Neumann. 2020. Evolving diverse sets of tours for the travelling salesperson problem. In GECCO. ACM, 681–689.
  • Fontaine et al. (2021) Matthew C. Fontaine, Ruilin Liu, Ahmed Khalifa, Jignesh Modi, Julian Togelius, Amy K. Hoover, and Stefanos Nikolaidis. 2021. Illuminating Mario Scenes in the Latent Space of a Generative Adversarial Network. In AAAI. AAAI Press, 5922–5930.
  • Fontaine et al. (2020) Matthew C. Fontaine, Julian Togelius, Stefanos Nikolaidis, and Amy K. Hoover. 2020. Covariance matrix adaptation for the rapid illumination of behavior space. In GECCO. ACM, 94–102.
  • Galanos et al. (2021) Theodoros Galanos, Antonios Liapis, Georgios N. Yannakakis, and Reinhard Koenig. 2021. ARCH-Elites: quality-diversity for urban design. In GECCO Companion. ACM, 313–314.
  • Gao et al. (2021) Wanru Gao, Samadhi Nallaperuma, and Frank Neumann. 2021. Feature-Based Diversity Optimization for Problem Instance Classification. Evol. Comput. 29, 1 (2021), 107–128.
  • Lehman and Stanley (2011) Joel Lehman and Kenneth O. Stanley. 2011. Abandoning Objectives: Evolution Through the Search for Novelty Alone. Evol. Comput. 19, 2 (2011), 189–223.
  • Maity and Das (2020) Alenrex Maity and Swagatam Das. 2020. Efficient hybrid local search heuristics for solving the travelling thief problem. Appl. Soft Comput. 93 (2020), 106284.
  • Nagata and Kobayashi (2013) Yuichi Nagata and Shigenobu Kobayashi. 2013. A Powerful Genetic Algorithm Using Edge Assembly Crossover for the Traveling Salesman Problem. INFORMS J. Comput. 25, 2 (2013), 346–363.
  • Neumann et al. (2021) Aneta Neumann, Jakob Bossek, and Frank Neumann. 2021. Diversifying greedy sampling and evolutionary diversity optimisation for constrained monotone submodular functions. In GECCO. ACM, 261–269.
  • Neumann et al. (2018a) Aneta Neumann, Wanru Gao, Carola Doerr, Frank Neumann, and Markus Wagner. 2018a. Discrepancy-based evolutionary diversity optimization. In GECCO. ACM, 991–998.
  • Neumann et al. (2019) Aneta Neumann, Wanru Gao, Markus Wagner, and Frank Neumann. 2019. Evolutionary diversity optimization using multi-objective indicators. In GECCO. ACM, 837–845.
  • Neumann et al. (2018b) Frank Neumann, Sergey Polyakovskiy, Martin Skutella, Leen Stougie, and Junhua Wu. 2018b. A Fully Polynomial Time Approximation Scheme for Packing While Traveling. In ALGOCLOUD (Lecture Notes in Computer Science, Vol. 11409). Springer, 59–72.
  • Nikfarjam et al. (2021a) Adel Nikfarjam, Jakob Bossek, Aneta Neumann, and Frank Neumann. 2021a. Computing diverse sets of high quality TSP tours by EAX-based evolutionary diversity optimisation. In FOGA. ACM, 9:1–9:11.
  • Nikfarjam et al. (2021b) Adel Nikfarjam, Jakob Bossek, Aneta Neumann, and Frank Neumann. 2021b. Entropy-based evolutionary diversity optimisation for the traveling salesperson problem. In GECCO. ACM, 600–608.
  • Polyakovskiy et al. (2014) Sergey Polyakovskiy, Mohammad Reza Bonyadi, Markus Wagner, Zbigniew Michalewicz, and Frank Neumann. 2014. A comprehensive benchmark set and heuristics for the traveling thief problem. In GECCO. ACM, 477–484.
  • Pugh et al. (2016) Justin K. Pugh, Lisa B. Soros, and Kenneth O. Stanley. 2016. Quality Diversity: A New Frontier for Evolutionary Computation. Frontiers Robotics AI 3 (2016), 40.
  • Pugh et al. (2015) Justin K. Pugh, Lisa B. Soros, Paul A. Szerlip, and Kenneth O. Stanley. 2015. Confronting the Challenge of Quality Diversity. In GECCO. ACM, 967–974.
  • Rakicevic et al. (2021) Nemanja Rakicevic, Antoine Cully, and Petar Kormushev. 2021. Policy manifold search: exploring the manifold hypothesis for diversity-based neuroevolution. In GECCO. ACM, 901–909.
  • Steckel and Schrum (2021) Kirby Steckel and Jacob Schrum. 2021. Illuminating the space of beatable lode runner levels produced by various generative adversarial networks. In GECCO Companion. ACM, 111–112.
  • Toth (1980) Paolo Toth. 1980. Dynamic programming algorithms for the Zero-One Knapsack Problem. Computing 25, 1 (1980), 29–45.
  • Ulrich and Thiele (2011) Tamara Ulrich and Lothar Thiele. 2011. Maximizing population diversity in single-objective optimization. In GECCO. ACM, 641–648.
  • Vassiliades et al. (2018) Vassilis Vassiliades, Konstantinos I. Chatzilygeroudis, and Jean-Baptiste Mouret. 2018. Using Centroidal Voronoi Tessellations to Scale Up the Multidimensional Archive of Phenotypic Elites Algorithm. IEEE Trans. Evol. Comput. 22, 4 (2018), 623–630.
  • Wagner (2016) Markus Wagner. 2016. Stealing Items More Efficiently with Ants: A Swarm Intelligence Approach to the Travelling Thief Problem. In ANTS Conference (Lecture Notes in Computer Science, Vol. 9882). Springer, 273–281.
  • Wagner et al. (2018) Markus Wagner, Marius Lindauer, Mustafa Misir, Samadhi Nallaperuma, and Frank Hutter. 2018. A case study of algorithm selection for the traveling thief problem. J. Heuristics 24, 3 (2018), 295–320.
  • Wu et al. (2017) Junhua Wu, Markus Wagner, Sergey Polyakovskiy, and Frank Neumann. 2017. Exact Approaches for the Travelling Thief Problem. In SEAL (Lecture Notes in Computer Science, Vol. 10593). Springer, 110–121.
  • Yafrani and Ahiod (2015) Mohamed El Yafrani and Belaïd Ahiod. 2015. Cosolver2B: An efficient local search heuristic for the Travelling Thief Problem. In AICCSA. IEEE Computer Society, 1–5.
  • Yafrani and Ahiod (2018) Mohamed El Yafrani and Belaïd Ahiod. 2018. Efficiently solving the Traveling Thief Problem using hill climbing and simulated annealing. Inf. Sci. 432 (2018), 231–244.
  • Zardini et al. (2021) Enrico Zardini, Davide Zappetti, Davide Zambrano, Giovanni Iacca, and Dario Floreano. 2021. Seeking quality diversity in evolutionary co-design of morphology and control of soft tensegrity modular robots. In GECCO. ACM, 189–197.
  • Zouari et al. (2019) Wiem Zouari, Inès Alaya, and Moncef Tagina. 2019. A new hybrid ant colony algorithms for the traveling thief problem. In GECCO (Companion). ACM, 95–96.