1 Introduction
The travelling thief problem (TTP) [5] is a bi-component problem in which two well-known hard combinatorial optimisation problems, namely the travelling salesperson problem (TSP) and the 0-1 knapsack problem (KP), are interrelated. Hence, tackling each component individually is unlikely to lead to a globally optimal solution. The TTP is an artificial benchmark problem that models features of complex real-world applications emerging in the areas of planning, scheduling and routing. For example, Stolk et al. [25] exemplify a delivery problem that consists of a routing part for the vehicle(s) and a packing part of the goods onto the vehicle(s).
Thus far, many approaches have been proposed for the TTP [23, 6, 18, 17, 19, 13, 26, 28, 11, 10, 32, 16, 12, 15]. However, to the best of our knowledge, all of them focus on existing heuristic approaches (such as local search, simulated annealing, tabu search, genetic algorithms, memetic algorithms, swarm intelligence, etc.) that incorporate either well-studied operators of the TSP and KP or slight variations of such operators. Heuristic approaches or operators that take advantage of the existing exact algorithms for the TTP [21, 31] are still lacking. Moreover, very few investigations have addressed multi-objective formulations of the TTP, the exceptions being Blank et al. [4] and Yafrani et al. [33].
In this paper, we consider a bi-objective version of the TTP, where the goal is to minimise the weight and maximise the overall benefit of a solution. We present a hybrid approach for the bi-objective TTP that uses the dynamic programming approach for the underlying packing while travelling (PWT) problem as a subroutine. The evolutionary component of our approach constructs a tour for the TTP. This tour is then fed into the dynamic programming algorithm to compute a trade-off front for the bi-objective problem. Here the tour is kept fixed, and the resulting packing solutions are Pareto optimal owing to the capability of the dynamic programming. A key aspect of the algorithm is to take advantage of the different fronts belonging to different tours for the TTP component, as the global Pareto front presumably contains segments from different fronts. Meanwhile, when the evolutionary approach evolves the tours and the current overall Pareto front consists of different tours (together with their packing plans), a challenge is to select tours for mutation and crossover that lead to promising new tours. Such tours should result in new Pareto-optimal solutions for the overall bi-objective TTP when the dynamic programming is run on them. In short, the selection mechanism should encourage the synergy of the two sub-approaches. We introduce a novel indicator-based evolutionary algorithm (IBEA [34]) that contains a series of customised indicators and parent selections to achieve this goal. Our results show that this approach solves the problem well, and that its by-product, the total reward of the single-objective TTP, beats the state-of-the-art approach in most cases.
The remainder of the paper first states the bi-objective version of the TTP mathematically in Section 2. Then, Section 3 covers the prerequisites required for our approach, which is later introduced in Section 4. Section 5 describes the computational setup and analyses the computational experiments. Finally, Section 6 draws conclusions.
2 The Travelling Thief Problem
The standard single-objective TTP [23] involves $n$ cities, $m$ items, and a thief who must make a tour visiting each of the cities exactly once. The cities form a set of nodes $V = \{1, \ldots, n\}$ in a complete graph $G = (V, E)$, where $E$ is a set of edges representing all possible connections between the cities. Every edge $(i, j) \in E$ is assigned a known distance $d_{ij}$. Every node $i \in V$ but the first one relates to a unique set of items $M_i = \{1, \ldots, m_i\}$, $\sum_{i=2}^{n} m_i = m$, stored in the corresponding city. Each item $k$ positioned in node $i$ is associated with an integer profit $p_{ik}$ and an integer weight $w_{ik}$. The thief starts and ends the tour in the first node and can collect any of the items located in the intermediate nodes $2, \ldots, n$. Items may only be selected as long as their total weight does not exceed the knapsack's capacity $C$. Furthermore, the thief pays a rent rate $R$ for each time unit of travelling. Selection of an item contributes its profit to a total reward, but produces a transportation cost relative to its weight. As the weight of each added item slows down the thief, the transportation cost increases. This cost is therefore deducted from the reward. When the knapsack is empty, the thief can achieve a maximal velocity $\upsilon_{\max}$. When it is full, the thief can only move with a minimal velocity $\upsilon_{\min} > 0$. The actual velocity when moving along the edge $(i, j)$ depends on the total weight of items chosen in the cities preceding $j$. The problem asks to determine a combination of a tour and a subset of items that maximises the difference between the total profit of selected items and the overall transportation cost.
Let an integer-valued vector $\pi = (\pi_1, \ldots, \pi_n)$, $\pi \in V^n$, represent a tour such that $\pi_i = j$ iff $j$ is the $i$th visited node of the tour. Clearly, $\pi_i \neq \pi_j$ for any $i, j \in V$, $i \neq j$. Next, let a binary decision vector $\rho = (\rho_{21}, \ldots, \rho_{n m_n})$, $\rho \in \{0, 1\}^m$, encode a packing plan of the problem such that $\rho_{ik} = 1$ iff item $k$ in node $i$ is chosen, and $\rho_{ik} = 0$ otherwise. Then, with $m_1 = 0$ for the first node, $W_{\pi_i} = \sum_{j=1}^{i} \sum_{k=1}^{m_{\pi_j}} w_{\pi_j k} \rho_{\pi_j k}$ is the total weight of items sequentially selected in the nodes from $\pi_1$ to $\pi_i$, and $\upsilon_{\pi_i} = \upsilon_{\max} - \nu W_{\pi_i}$, $\nu = (\upsilon_{\max} - \upsilon_{\min}) / C$, is the real velocity of the thief quitting the $i$th node. In summary, the objective function of the TTP has the following form:
\[ f(\pi, \rho) = \sum_{i=2}^{n} \sum_{k=1}^{m_i} p_{ik} \rho_{ik} - R \left( \frac{d_{\pi_n \pi_1}}{\upsilon_{\pi_n}} + \sum_{i=1}^{n-1} \frac{d_{\pi_i \pi_{i+1}}}{\upsilon_{\pi_i}} \right) \tag{1} \]
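The evaluation of a tour together with a packing plan can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions: the function name `ttp_reward`, the dictionary-based data layout and the 0-based city indices are hypothetical choices made here, not part of the benchmark or of any published implementation.

```python
from typing import Dict, List, Sequence, Tuple


def ttp_reward(
    tour: Sequence[int],
    dist: Sequence[Sequence[float]],
    items: Dict[int, List[Tuple[int, int]]],  # city -> [(profit, weight), ...]
    plan: Dict[Tuple[int, int], int],         # (city, item index) -> 0 or 1
    capacity: float,
    rent: float,
    v_max: float,
    v_min: float,
) -> Tuple[float, float]:
    """Return (reward, total weight) of a tour and a packing plan."""
    nu = (v_max - v_min) / capacity  # velocity lost per unit of weight
    profit = 0.0
    weight = 0.0
    travel_time = 0.0
    for pos, city in enumerate(tour):
        # Collect the chosen items of this city before leaving it.
        for k, (p, w) in enumerate(items.get(city, [])):
            if plan.get((city, k), 0):
                profit += p
                weight += w
        if weight > capacity:
            raise ValueError("packing plan exceeds the knapsack capacity")
        # Travel to the next city (back to the start after the last one).
        nxt = tour[(pos + 1) % len(tour)]
        travel_time += dist[city][nxt] / (v_max - nu * weight)
    return profit - rent * travel_time, weight
```

Returning the total weight alongside the reward is a convenience for the bi-objective setting, where both values are needed for every solution.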
Here, we extend the standard formulation of the TTP by introducing an additional objective function. The new version, named BOTTP for short, becomes a bi-objective optimisation problem, where the total accumulated weight
\[ \varphi(\rho) = \sum_{i=2}^{n} \sum_{k=1}^{m_i} w_{ik} \rho_{ik} \tag{2} \]
yields the second criterion. Such an extension appears natural for the TTP, as one may either need to maximise the reward for a given weight of collected items, or determine the least weight subject to bounds imposed on the reward. Note that even if the tour $\pi$ is fixed, (1) is a non-monotone submodular function [24], which implies possible deterioration of the reward as the number of selected items, and therefore their total weight, increases. We formulate the BOTTP as follows:
\[ \max_{\pi, \rho} f(\pi, \rho), \qquad \min_{\rho} \varphi(\rho), \qquad \text{s.t. } \varphi(\rho) \le C. \]
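As a bi-objective optimisation problem, BOTTP asks for a set of Pareto-optimal solutions, where a feasible solution cannot be improved in the second objective without degrading the quality of the first one, and vice versa. In other words, with $f$ denoting the reward (1) and $\varphi$ the total weight (2), the goal is to find the set of all non-dominated feasible solutions $X \subseteq \Pi \times P$ such that for any solution $(\pi, \rho) \in X$ there is no solution $(\pi', \rho') \in \Pi \times P$ such that either $f(\pi', \rho') > f(\pi, \rho) \wedge \varphi(\rho') \le \varphi(\rho)$ or $f(\pi', \rho') \ge f(\pi, \rho) \wedge \varphi(\rho') < \varphi(\rho)$ holds, where $\Pi$ is the set of feasible tours and $P$ is the set of feasible packing plans.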
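The dominance relation can be made concrete with a small Python sketch that filters a list of objective vectors down to its non-dominated subset. The names `dominates` and `pareto_front` and the `(reward, weight)` tuple layout are illustrative assumptions; the reward is maximised and the weight is minimised.

```python
from math import inf
from typing import Iterable, List, Tuple

Point = Tuple[float, float]  # (reward, weight)


def dominates(a: Point, b: Point) -> bool:
    """True if a is at least as good as b in both objectives and differs from b."""
    return a[0] >= b[0] and a[1] <= b[1] and a != b


def pareto_front(points: Iterable[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by increasing weight."""
    front: List[Point] = []
    best_reward = -inf
    # Scan by increasing weight; for equal weights, the higher reward comes first.
    for reward, weight in sorted(set(points), key=lambda p: (p[1], -p[0])):
        if reward > best_reward:  # strictly better than every lighter point
            front.append((reward, weight))
            best_reward = reward
    return front
```

Sorting first keeps the filter at O(k log k) for k points, which matters when many fronts are merged repeatedly.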
3 Prerequisites
The packing while travelling problem (PWT) is a special case of the TTP, which maximises the total reward for a specific tour [24]. Thus, an optimal solution of the PWT defines a subset of items producing the maximal gain. This yields a non-linear knapsack problem, which can be efficiently solved via the dynamic programming (DP) approach proposed by Neumann et al. [21]. Most importantly, we find that the DP yields not just a single optimal packing plan, but a set of plans in which no two plans dominate each other. We call the set of corresponding objective vectors a DP front. In Section 4, we design our hybrid algorithm that takes advantage of the features of a DP front.
For self-sufficiency of the paper, in Section 3.1, we first briefly explain the DP and how we adapt it to obtain a DP front. Section 3.2 then discusses several algorithms to obtain tours that are later utilised by the DP to create multiple DP fronts and to initialise the population for our hybrid evolutionary approach.
3.1 Dynamic Programming for the PWT
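The DP for the PWT is based on the scheme traditionally used for the classical 0-1 knapsack problem. It processes items in the lexicographic order in which they appear along a given tour $\pi$; that is, item $l$ in node $\pi_a$ strictly precedes item $k$ in node $\pi_b$ if either $a < b$, or $a = b$ and $l < k$ holds. Its table $\beta$ has one column per item and one row per weight $j \in \{0, \ldots, C\}$, where entry $\beta(i, k, j)$ represents the maximal reward that can be achieved by examining all combinations of the items up to and including item $k$ in node $i$ that lead to a total weight equal to $j$. Without loss of generality, let $\pi = (1, \ldots, n)$, let $\nu = (\upsilon_{\max} - \upsilon_{\min}) / C$, and let $D_i = \sum_{l=i}^{n-1} d_{l, l+1} + d_{n1}$ be the distance that remains to be travelled when leaving node $i$. The base case of the DP with respect to the first item $k$, according to the precedence order, positioned in node $i$ is as follows:
\[ \beta(i, k, j) = \begin{cases} -R \, \dfrac{D_1}{\upsilon_{\max}}, & j = 0, \\ p_{ik} - R \left( \dfrac{D_1 - D_i}{\upsilon_{\max}} + \dfrac{D_i}{\upsilon_{\max} - \nu w_{ik}} \right), & j = w_{ik}, \\ -\infty, & \text{otherwise.} \end{cases} \]
Here, the first case relates to the empty packing when the thief collects no items at all while travelling along $\pi$, and the second computes the reward when only item $k$ is chosen. Where a combination yielding weight $j$ doesn't exist, $\beta(i, k, j) = -\infty$. For the general case, let item $k'$ in node $i'$ be the predecessor of item $k$ in node $i$ with regard to the precedence order, and let $\beta(i, k, \cdot)$ denote the column containing all the entries $\beta(i, k, j)$ for $j \in \{0, \ldots, C\}$. Then, based on $\beta(i', k', \cdot)$, one can obtain $\beta(i, k, \cdot)$ by computing each entry as
\[ \beta(i, k, j) = \max \left\{ \beta(i', k', j), \; \beta(i', k', j - w_{ik}) + p_{ik} - R \, D_i \left( \frac{1}{\upsilon_{\max} - \nu j} - \frac{1}{\upsilon_{\max} - \nu (j - w_{ik})} \right) \right\}, \]
where the second term applies only if $j \ge w_{ik}$.
In order to reduce the search space, in each column the cells dominated by other cells are eliminated, i.e. if $\beta(i, k, j_1) \ge \beta(i, k, j_2)$ and $j_1 < j_2$, then $\beta(i, k, j_2) = -\infty$. An optimal solution derived by the DP corresponds to the maximal reward stored in the last column of $\beta$. That is, $\max_{j} \beta(s, t, j)$ is the value of an optimal solution, where item $t$ in node $s$ is the last item according to the precedence order.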
The last column of the DP table can be considered as a complete set of non-dominated packing plans $P^*_\pi \subseteq P_\pi$, where $P_\pi$ is the set of all feasible packing plans for a given tour $\pi$. The packing plans in $P^*_\pi$ are non-dominated owing to the exclusion of any dominated solutions during the solution construction process.
Definition 1
Letting $\omega_\pi$ and $\Omega_\pi$ be the sets of objective vectors corresponding to $P^*_\pi$ and $P_\pi$, respectively, $\omega_\pi$ is the Pareto front of $\Omega_\pi$. We therefore call $\omega_\pi$ a DP front for the given tour $\pi$.
A DP front $\omega_\pi$ for a tour $\pi$ is a complete non-dominated set, as it contains all non-dominated objective vectors in $\Omega_\pi$. We take advantage of this completeness to generate the spread of solutions in our bi-objective approach in Section 4.
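To illustrate how a DP front can be obtained, the following Python sketch processes the items in tour order and keeps, for every reachable weight, the best reward together with the chosen items. It is a simplified sketch rather than the implementation of [21]: the name `pwt_dp_front`, the sparse dictionary used for a column, the 0-based city indices and the start from an empty column (instead of a base case for the first item) are assumptions made for brevity. It requires integer weights and a positive minimal velocity.

```python
from math import inf
from typing import Dict, List, Sequence, Tuple

Plan = Tuple[Tuple[int, int], ...]  # chosen (city, item index) pairs


def pwt_dp_front(
    tour: Sequence[int],
    dist: Sequence[Sequence[float]],
    items: Dict[int, List[Tuple[int, int]]],  # city -> [(profit, weight), ...]
    capacity: int,
    rent: float,
    v_max: float,
    v_min: float,
) -> List[Tuple[float, int, Plan]]:
    """Return the DP front of a fixed tour as (reward, weight, plan) triples."""
    nu = (v_max - v_min) / capacity
    n = len(tour)
    # remaining[pos]: distance still to travel when leaving the city at position pos.
    remaining = [0.0] * (n + 1)
    for pos in range(n - 1, -1, -1):
        remaining[pos] = remaining[pos + 1] + dist[tour[pos]][tour[(pos + 1) % n]]

    # Before any item is examined: empty knapsack, whole tour at maximal velocity.
    column: Dict[int, Tuple[float, Plan]] = {0: (-rent * remaining[0] / v_max, ())}
    for pos, city in enumerate(tour):
        for k, (profit, weight) in enumerate(items.get(city, [])):
            candidates = dict(column)  # option 1: skip the item
            for j, (reward, chosen) in column.items():
                j_new = j + weight
                if j_new > capacity:
                    continue
                # Extra travel time from this city onwards caused by the added weight.
                slowdown = remaining[pos] * (
                    1.0 / (v_max - nu * j_new) - 1.0 / (v_max - nu * j)
                )
                value = reward + profit - rent * slowdown
                if j_new not in candidates or value > candidates[j_new][0]:
                    candidates[j_new] = (value, chosen + ((city, k),))
            # Drop cells dominated by a lighter cell with at least the same reward.
            column, best = {}, -inf
            for j in sorted(candidates):
                if candidates[j][0] > best:
                    column[j] = candidates[j]
                    best = candidates[j][0]
    return [(reward, j, chosen) for j, (reward, chosen) in sorted(column.items())]
```

The returned list is sorted by weight, and because of the pruning step its rewards are strictly increasing, so the last entry carries the optimal single-objective reward for the tour.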
3.2 Generation of Multiple DP Fronts
As the DP produces a single DP front for a single given tour, we can generate multiple TSP tours to obtain a set of DP fronts. In practice, various algorithms are capable of producing superior tours for the TSP, and therefore many approaches to the TTP use this capability to succeed. High-performing TTP algorithms are commonly two-stage heuristic approaches, like those proposed by Polyakovskiy et al. [23], Faulkner et al. [13], and El Yafrani and Ahiod [10]. Specifically, their first step generates a near-optimal TSP tour and the second step completes the solution by selecting a subset of items. Most of the approaches utilise the Chained Lin-Kernighan heuristic [3], because it is able to provide very tight upper bounds for TSP instances in a short time. The knapsack component is then often handled via constructive heuristics or evolutionary approaches. However, the TTP is essentially structured in such a way that both of its components are almost equally important within the problem. Although near-optimal TSP solutions can give good solutions to the TTP, most of them are far from being optimal [31]. This is the reason for our first experimental study here, where we investigate the impact of several TSP algorithms on TTP solutions. Note that owing to the DP we are able to solve the knapsack part to optimality, which contributes to the validity of our findings.
We analysed five algorithms for the TSP: the Inver-over heuristic (INV) [27], the exact solver Concorde (CON) [2], the ant-colony-based approach (ACO) [9], the Chained Lin-Kernighan heuristic (LKH) [3] and its latest implementation (LKH2) [14]. We ran each algorithm times on every instance of the eil76 series of the TTP benchmark suite [23]. We computed (capped due to practical reasons) distinct tours by INV, by CON, by both ACO and LKH, and by LKH2. The lengths of the tours generated by INV are narrowly distributed around the average of with the standard deviation being . By contrast, every other algorithm generates tours having the identical tour length of , which beats INV.
We then applied the DP to every tour produced by each of the algorithms. Figure 1 depicts the resulting rewards on some sample TTP instances, where each box with whiskers reports the distribution of the rewards for a certain instance and the corresponding algorithm. The central mark of each box indicates the median of the rewards, and the bottom and top edges of the box indicate the 25th and 75th percentiles, respectively. The whiskers extend to the most extreme rewards that are not considered outliers, and outliers are plotted individually as plus signs. From the plot, we may observe that the tours generated by CON, ACO, LKH and LKH2 have similar distributions of rewards. By contrast, the boxes of INV seem to be more extreme on both sides. This means that the distribution of rewards via INV is more diverse and that its best rewards outperform those of the others. In other words, though the Inver-over heuristic may lose against modern TSP approaches, it performs better in the role of a generator of varied tours for the TTP. It may therefore act as a seeding algorithm for the population of an evolutionary algorithm.
In Figure 2, we visualise the collection of the DP fronts produced by the DP on the TTP instance eil76_n75_uncorr_01 [23]. The corresponding tours are the tours generated by the Inver-over heuristic. Actually, the plot depicts fronts since the DP was applied to each tour and to its reverse.
Definition 2
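Given DP fronts $\omega_{\pi_1}, \ldots, \omega_{\pi_k}$, let $\Phi$ denote the union of the fronts, $\Phi = \bigcup_{i=1}^{k} \omega_{\pi_i}$. Then the subset $S_\Phi \subseteq \Phi$ that is the Pareto front of $\Phi$ is called the surface of $\Phi$.
The surface is formed by the union of all superior points resulting from the different DP fronts in $\Phi$. It is further used to guide the evolutionary process in our approach.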
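The surface of a collection of DP fronts can be sketched in Python as follows. The function name `surface`, the dictionary keyed by a tour identifier and the choice to keep only the first owner of a point shared by several fronts are assumptions made for this illustration.

```python
from math import inf
from typing import Dict, Hashable, List, Tuple

Point = Tuple[float, float]  # (reward, weight)


def surface(fronts: Dict[Hashable, List[Point]]) -> List[Tuple[float, float, Hashable]]:
    """Return the surface of the union of fronts as (reward, weight, tour id) triples."""
    tagged = [(r, w, tour_id) for tour_id, front in fronts.items() for r, w in front]
    # Scan by increasing weight; for equal weights, the higher reward comes first.
    tagged.sort(key=lambda x: (x[1], -x[0]))
    result, best = [], -inf
    for r, w, tour_id in tagged:
        if r > best:  # not dominated by any lighter point of any front
            result.append((r, w, tour_id))
            best = r
    return result
```

Keeping the tour identifier with every surface point shows directly which tours contribute segments to the surface, which is the information the indicators of Section 4.1 rely on.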
4 A Hybrid Evolutionary Approach
Multi-objective optimisation algorithms guided by evolutionary mechanisms explore the decision space iteratively in order to determine a set of Pareto-optimal solutions. Indeed, many of them may act myopically, as they sample the space searching for individual solutions without a clear view of the whole picture in terms of the other solutions and their number. Therefore, achieving strong diversity in exploring the space plays an important role in evolutionary algorithm design. In this paper, we discuss one way to overcome potential issues related to diversity and propose a hybrid approach where evolutionary techniques and dynamic programming find synergy in their combination.
One of the challenges of multi-objective optimisation is to keep a wide spread of solutions, which has to be guaranteed by strong diversity. Modern approaches normally incorporate additional processes to tackle this, such as the density estimation and/or crowdedness-comparison operator in SPEA2 [35] and NSGA-II [8]. In our approach, the DP is incorporated as a subroutine capable of producing at once a series of possible decisions with regard to a given tour. Thus, when a tour is specified, the DP guarantees that the corresponding front will be built without missing any of its points owing to the completeness of the DP front, which in turn guarantees a good spread of solutions.
On the other hand, due to the typically observed non-dominance of single DP fronts, the global Pareto front of the BOTTP may be formed either by a single DP front or by the combination of segments from different top DP fronts. In Figure 2, we may observe that the DP fronts are all intertwined, including the ones at the surface of the collection of fronts. This seems to indicate that the Pareto-optimal set of solutions is more likely to be the result of multiple TSP tours and their DP fronts. We would like our evolutionary mechanism to take advantage of this and to keep the top DP fronts so as to improve the population further. To achieve this, and to overcome the drawback of existing multi-objective evolutionary optimisation algorithms that focus on individual solutions, we design our hybrid IBEA with particular indicators and selection mechanisms that orchestrate the improvement of the Pareto front, guided by the information in the DP fronts of the most promising TSP tours.
Our hybrid approach reduces the search space to some extent by decomposing the problem and thus transforming it. Evolutionary optimisation approaches traditionally depend on the choice of solution encoding (i.e. chromosome). Our approach treats a single TSP tour as an individual. Thus, a set of tours yields a population. Indeed, it operates on a reduced set of variables (implying shorter chromosomes), thus decreasing memory consumption and the number of internally needed sorting operations, comparisons and search operations.
Algorithm 1 sketches the whole approach, which we adapted from the original IBEA introduced by Zitzler and Künzli [34]. It accepts two control parameters: the size of the population and a limit on the number of iterations, which defines its termination criterion. In order to utilise the information within the DP fronts to guide the evolution of individual tours, we design new indicators that are computed on the DP fronts instead of directly on the individuals. Our specific selection mechanisms then filter the individuals according to the indicator values in order to find the tours with better DP fronts.
The rest of this section first introduces the indicator functions we apply to TSP solutions. Next, it details a parent selection mechanism to mate existing individuals from the population. It ends with a discussion of mutation and crossover operators guiding the search.
4.1 Design of Indicators
The designs of our indicators are based on the idea of measuring how much each DP front contributes to the surface of the union of fronts corresponding to the population. Let $\Phi$ be the union of the DP fronts of all tours in the population, and let $S_\Phi$ be its surface as introduced in Definition 2, that is, the union of all best segments from the different DP fronts in $\Phi$. Given a DP front $\omega_\pi$ for a tour $\pi$ and a measurement function $M$ of a front, we use the following formula to calculate the indicator $I(\omega_\pi)$:
\[ I(\omega_\pi) = \frac{M(S_\Phi) - M(S_{\Phi \setminus \omega_\pi})}{M(S_\Phi)} \tag{3} \]
This formula measures how much we could lose (expressed as a value from 0 to 1) if we did not include the segments of the front $\omega_\pi$ in the surface $S_\Phi$, i.e. if the surface were $S_{\Phi \setminus \omega_\pi}$. In the following, we study two types of measurement functions, Surface Contribution (SC) and Hypervolume (HV), and hence two corresponding indicators: the Loss of Surface Contribution (LSC) and the Loss of Hypervolume (LHV).
Loss of Surface Contribution. Our first measurement function is Surface Contribution (SC), which is a novel and direct measure. Given the union $\Phi$ of a set of fronts, a front $\omega \subseteq \Phi$ and the surface $S_\Phi$, $SC(\omega, S_\Phi)$ counts the number of objective vectors that $\omega$ contributes to $S_\Phi$, as defined by:
\[ SC(\omega, S_\Phi) = | \omega \cap S_\Phi | \tag{4} \]
Using SC (4) to replace the function $M$ in (3), we have the formula of LSC as follows:
\[ LSC(\omega_\pi) = \frac{SC(S_\Phi, S_\Phi) - SC(S_{\Phi \setminus \omega_\pi}, S_\Phi)}{SC(S_\Phi, S_\Phi)} \]
Loss of Hypervolume. In multi-objective optimisation, the hypervolume indicator is a traditional indicator of the quality of a set of objective vectors [36]. In the bi-criteria case, when a front is given as a set of points in two-dimensional space, its value is computed as a sum of areas of rectangular regions.
The reference point for our problem is chosen such that only the range of non-negative objective values is taken into account. For a DP front whose objective vectors are sorted so that both objective values increase, HV sums the areas of the rectangles spanned between consecutive vectors and the reference point.
Putting HV back into (3), we have the loss of hypervolume computed as
\[ LHV(\omega_\pi) = \frac{HV(S_\Phi) - HV(S_{\Phi \setminus \omega_\pi})}{HV(S_\Phi)} \]
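Both indicators follow the same pattern: measure the surface with and without a front, and report the relative loss. The Python sketch below illustrates this under explicit assumptions: objective vectors are `(reward, weight)` tuples, the hypervolume uses a reference point with reward 0 and weight equal to the knapsack capacity, and all function names (`pareto`, `hypervolume`, `loss_indicator`, `lsc`, `lhv`) are hypothetical. The exact reference point and tie handling of the original study may differ.

```python
from math import inf
from typing import Callable, Dict, Hashable, Iterable, List, Tuple

Point = Tuple[float, float]  # (reward, weight)


def pareto(points: Iterable[Point]) -> List[Point]:
    """Non-dominated points (maximise reward, minimise weight), sorted by weight."""
    out, best = [], -inf
    for r, w in sorted(set(points), key=lambda p: (p[1], -p[0])):
        if r > best:
            out.append((r, w))
            best = r
    return out


def hypervolume(front: Iterable[Point], capacity: float) -> float:
    """Area dominated by the front w.r.t. the assumed reference point (0, capacity)."""
    area, prev_reward = 0.0, 0.0
    for r, w in pareto(front):
        if r > prev_reward:  # points with non-positive reward add no area
            area += (r - prev_reward) * (capacity - w)
            prev_reward = r
    return area


def loss_indicator(
    fronts: Dict[Hashable, List[Point]],
    tour_id: Hashable,
    measure: Callable[[List[Point]], float],
) -> float:
    """Relative loss of the measure when the front of tour_id is left out."""
    full = measure(pareto(p for f in fronts.values() for p in f))
    if full == 0:
        return 0.0
    rest = measure(pareto(p for t, f in fronts.items() if t != tour_id for p in f))
    return (full - rest) / full


def lsc(fronts: Dict[Hashable, List[Point]], tour_id: Hashable) -> float:
    """Loss of surface contribution: share of surface points lost without the front."""
    full_surface = set(pareto(p for f in fronts.values() for p in f))
    return loss_indicator(fronts, tour_id, lambda s: len(full_surface & set(s)))


def lhv(fronts: Dict[Hashable, List[Point]], tour_id: Hashable, capacity: float) -> float:
    """Loss of hypervolume: share of the surface hypervolume lost without the front."""
    return loss_indicator(fronts, tour_id, lambda s: hypervolume(s, capacity))
```

With two fronts `a = [(1, 0), (5, 4)]` and `b = [(2, 1), (4, 4), (7, 6)]` and capacity 10, both fronts own half of the four surface points, yet `b` accounts for a larger share of the hypervolume, which shows how the two indicators can rank the same tours differently.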
4.2 Parent Selection Mechanisms
With the individuals in the population being measured by the defined indicators, we can study strategies that efficiently select good individuals. We consider five parent selection schemes owing to their popularity or previous theoretical findings. For comparison, we introduce two simple arbitrary selections as well as a traditional policy as a baseline. In this study, we expect to find a well-performing combination of indicator and selection that encourages the synergy of the DP and the evolutionary approach.
Rank-based Selection (RBS). In the rank-based selection policy, individuals are first ranked with respect to the value of an indicator. The selection policy is then based on a specific distribution law affecting the choice of a parent. Here, we study three schemes introduced by Osuna et al. [22], namely exponential (EXP), inverse quadratic (IQ) and harmonic (HAR), and make them a part of our hybrid approach. Given a population of size $\mu$, the probability of selecting the $i$th ranked individual according to EXP, IQ and HAR is, respectively,
\[ \frac{2^{-i}}{\sum_{j=1}^{\mu} 2^{-j}}, \qquad \frac{i^{-2}}{\sum_{j=1}^{\mu} j^{-2}}, \qquad \frac{i^{-1}}{\sum_{j=1}^{\mu} j^{-1}}. \tag{5} \]
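The three rank-based schemes can be sketched in Python as follows, assuming that the selection weight of rank $i$ is $2^{-i}$ for EXP, $i^{-2}$ for IQ and $i^{-1}$ for HAR, each normalised over the population. The function names and the string labels for the schemes are hypothetical, and the population is assumed to be sorted from best to worst indicator value.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def rank_probabilities(mu: int, scheme: str) -> List[float]:
    """Selection probabilities for ranks 1..mu under the given scheme."""
    if scheme == "EXP":
        weights = [2.0 ** -i for i in range(1, mu + 1)]
    elif scheme == "IQ":
        weights = [float(i) ** -2 for i in range(1, mu + 1)]
    elif scheme == "HAR":
        weights = [1.0 / i for i in range(1, mu + 1)]
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    total = sum(weights)
    return [w / total for w in weights]


def select_parent(ranked_population: Sequence[T], scheme: str, rng=random) -> T:
    """Draw one parent from a population sorted from best to worst."""
    probs = rank_probabilities(len(ranked_population), scheme)
    return rng.choices(ranked_population, weights=probs, k=1)[0]
```

The harmonic weights decay most slowly with the rank, so low-ranked tours keep a noticeable chance of being selected, whereas the exponential weights concentrate almost all probability on the first few ranks.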
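Fitness-Proportionate Selection (FPS). This rule rates an individual $\pi$ according to the indicator $I(\omega_\pi)$ of its DP front $\omega_\pi$. The probability of selecting $\pi$ from the population $\mathcal{P}$ has the following form:
\[ \frac{I(\omega_\pi)}{\sum_{\pi' \in \mathcal{P}} I(\omega_{\pi'})} \tag{6} \]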
Tournament Selection (TS). This policy applies the tournament selection [20], but employs indicators discussed in Section 4.1 to rank individuals.
Arbitrary Selection (AS). Here, we consider two different rules: the best arbitrary selection (BST) and another one, which we call extreme (EXT). The former ranks the individuals of a population in accordance with the value of an indicator and selects the best half of the population. The latter proceeds similarly selecting of the best and of the worst individuals.
Uniformly-at-random Selection (UAR). This traditional policy selects a parent from a population uniformly at random, that is, each of the $\mu$ individuals is chosen with probability $1/\mu$.
4.3 Mutation and Crossover Operators
In our approach, we adopt a multi-point crossover operator that has already proved its efficiency for the TTP in [10]. As an (unoptimised) rule, we perform the crossover operation on a tour with probability. It is always followed by the mutation procedure, which either applies the classical 2-OPT mutation [7] or reinserts a node at another location. Both the node and the location are selected uniformly at random. We name these two operators 2OPT and JUMP, respectively.
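The two mutation operators can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, tours are plain lists, and the first city is kept fixed on the assumption that the thief always starts there. Tours need at least three cities.

```python
import random
from typing import List


def two_opt(tour: List[int], rng=random) -> List[int]:
    """2-OPT mutation: reverse a random segment, keeping the first city in place."""
    i, j = sorted(rng.sample(range(1, len(tour)), 2))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]


def jump(tour: List[int], rng=random) -> List[int]:
    """JUMP mutation: remove a random city and reinsert it at a random position."""
    src = rng.randrange(1, len(tour))
    city = tour[src]
    rest = tour[:src] + tour[src + 1:]
    dst = rng.randrange(1, len(rest) + 1)  # may equal src, leaving the tour unchanged
    return rest[:dst] + [city] + rest[dst:]
```

Both functions return a new list, so a parent tour stays intact in the population while its offspring is evaluated by the DP.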
5 Computational Experiments
5.1 Computational Set Up
We examine the IBEA presented in Algorithm 1 by going through each of the two indicators and the eight parent selections, resulting in a total of 16 settings. For example, FPS on LHV means the combination of the FPS selection and the LHV indicator.
From the original set of TTP instances, we use three different types, namely bounded-strongly-correlated (Bounded), uncorrelated (Uncorrelated) and uncorrelated-with-similar-weights (SimilarWeights), selected from three instance series: eil51, eil76 and eil101 in the TTP benchmark [23]. We run our approach 30 times on each selected instance. Each run lasts 20,000 generations with a population of size 50.
Due to the significant computing cost, our experiments run on the supercomputer at our university, which consists of 5568 Intel(R) Xeon(R) 2.30GHz CPU cores and 12TB of memory. Overall, the experiments consumed around 170,000 CPU-hours.
5.2 Results and Analysis
To compare the outcomes of the different approaches based on the final populations of tours, we calculate the hypervolumes for the surface of the resulting non-dominated solutions. We also store the corresponding total reward in order to compare with the results of the state-of-the-art single-objective approach, MA2B [10] (see the comparison in [29]).
However, due to the varied mean values and unknown global optima of different TTP instances, it is hard to analyse and compare results across instances. Nevertheless, such a comparison is desirable because it may provide a more general view of our algorithm. We design a statistical comparison to overcome this as follows. Firstly, we choose the uniformly-at-random (UAR) selection as the baseline, which creates two baseline settings, namely UAR on LHV and UAR on LSC. Secondly, we conduct Welch's t-test [30] between the results of the other selections and the baselines for the two indicators, respectively.
The results of the t-test are probability values (p-values), each of which measures the likelihood that one selection performs the same as the corresponding baseline. For example, we have the p-value being in the case of comparing the hypervolume of the FPS and the UAR on LHV. This means that the probability of the FPS performing identically to the UAR on LHV (as expressed by having the same means) is less than . In fact, the former performs much better than the latter on average. In order to improve the readability, we use the logarithm of the p-value in our plots. Thus, the measure of the FPS on LHV in our little example is (i.e. ). In short, the larger the logarithmic p-value is, the better the selection is against the UAR.
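For readers unfamiliar with Welch's t-test, the following Python sketch computes its test statistic and the Welch-Satterthwaite degrees of freedom for two samples with possibly unequal variances, using only the standard library. The function name `welch_t` is a hypothetical choice, and the sketch stops short of the p-value, which additionally requires the cumulative distribution function of Student's t-distribution.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(sample_a: Sequence[float], sample_b: Sequence[float]) -> Tuple[float, float]:
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means (unbiased sample variances).
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df
```

In practice, a library routine such as SciPy's `scipy.stats.ttest_ind` called with `equal_var=False` returns the statistic together with the p-value, to which a logarithm can then be applied.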
Figure 3 depicts the overall results of Welch's t-test, in which we categorise our results into three types of bars according to the three types of TTP instances: Bounded, Uncorrelated and SimilarWeights. Each bar in the plots represents the mean of the logarithmic p-values of several instances in this category, for example eil51_n50_bounded-strongly-corr_01.ttp, eil76_n75_bounded-strongly-corr_01.ttp and eil101_n100_bounded-strongly-corr_01.ttp. From it we may observe distinguishable patterns between the selections running on the LHV and the LSC, respectively. For example, the three rank-based selection (RBS) schemes generally perform better on LHV than on LSC, among which the HAR is the best. According to the definitions, the HAR is the least aggressive scheme among the three, with a fat tail and a relatively small probability of selecting the best few individuals [22]. This seems to imply that the LHV benefits more from the diversity of candidates. By contrast, the AS-BST performs best on LSC, which might imply that the LSC relies more on a few outstanding individuals for the approximation, as the AS-BST only focuses on the best ones.
In terms of different types of TTP instances, we may observe that the IBEA performs best on the uncorrelated instances in all of the settings, while being worst on the strongly bounded ones in most of the settings. This to some extent supports the conjecture that strongly bounded TTP instances are the (relatively) hard ones and uncorrelated instances are the easy ones [23].
With regard to the choice of the parent selections, besides the RBS-HAR and the AS-BST, which perform best on LHV and LSC respectively, we would like to recommend the FPS as well. This selection seems to be the safest choice, as it performs consistently well across different settings.
Table 1: Total rewards obtained by MA2B, FPS on LHV and FPS on LSC.

MA2B
Instance      Type             Mean       Max        SD
eil51_n50     Uncorrelated     2805.000   2855       27.814
eil51_n50     SimilarWeights   1416.348   1460       47.906
eil51_n50     Bounded          4057.652   4105       25.841
eil76_n75     Uncorrelated     5275.067   5423       78.138
eil76_n75     SimilarWeights   1398.867   1502       55.448
eil76_n75     Bounded          3849.067   4109       139.742
eil101_n100   Uncorrelated     3339.600   3789       388.360
eil101_n100   SimilarWeights   2215.500   2483       235.905
eil101_n100   Bounded          4949.000   5137       139.285

FPS LHV
Instance      Type             Mean       Max        SD
eil51_n50     Uncorrelated     2828.728   2854.543   15.357
eil51_n50     SimilarWeights   1413.044   1459.953   17.780
eil51_n50     Bounded          4229.149   4230.997   10.118
eil76_n75     Uncorrelated     5445.624   5514.666   58.992
eil76_n75     SimilarWeights   1477.680   1513.404   24.494
eil76_n75     Bounded          4042.449   4108.760   38.805
eil101_n100   Uncorrelated     3620.844   3943.425   222.815
eil101_n100   SimilarWeights   2431.907   2482.462   52.265
eil101_n100   Bounded          5094.246   5233.513   65.267

FPS LSC
Instance      Type             Mean       Max        SD
eil51_n50     Uncorrelated     2810.509   2832.496   18.076
eil51_n50     SimilarWeights   1426.135   1459.953   21.990
eil51_n50     Bounded          4231.299   4241.199   1.881
eil76_n75     Uncorrelated     5392.575   5514.666   73.029
eil76_n75     SimilarWeights   1474.803   1513.404   21.346
eil76_n75     Bounded          4054.815   4102.167   21.440
eil101_n100   Uncorrelated     3664.369   3846.172   124.994
eil101_n100   SimilarWeights   2436.374   2482.462   49.731
eil101_n100   Bounded          5067.070   5233.513   55.587
Overall, we may observe from Figure 3 that the figures of the hypervolume generally agree with those of the total reward. This suggests that optimising the bi-objective TTP brings good results for the single-objective TTP as well. Table 1 presents the total rewards we get by optimising the BOTTP, in comparison with the state-of-the-art algorithm for the single-objective TTP, namely MA2B [10]. We run the MA2B with time limits identical to those of our approach. The results show that in the majority of the test cases, our approach performs better.
6 Conclusion
In this paper, we investigated a new bi-objective travelling thief problem which optimises both the total reward and the total weight. We proposed a hybrid indicator-based evolutionary algorithm (IBEA) that utilises the exact dynamic programming algorithm for the underlying PWT problem as a subroutine to evolve the individuals. This approach guarantees the spread of solutions without introducing additional spread mechanisms. We furthermore designed and studied novel indicators and selection schemes that take advantage of the information in the Pareto fronts generated by the exact approach for evolving solutions towards global Pareto optimality. Our results show that this approach solves the problem well, because its by-products, which are the results for the single-objective travelling thief problem, beat the state-of-the-art single-objective approach.
Acknowledgements
The authors were supported by Australian Research Council grants DP130104395, DP140103400, and DE160100850.
References
 [1]
 Applegate et al. [2005] David Applegate, Ribert Bixby, Vasek Chvatal, and William Cook. 2005. Concorde tsp solver, 2006. See: http://www.math.uwaterloo.ca/tsp/concorde.html (2005).
 Applegate et al. [2003] David Applegate, William J. Cook, and André Rohe. 2003. Chained LinKernighan for Large Traveling Salesman Problems. INFORMS Journal on Computing 15, 1 (2003), 82–92. https://doi.org/10.1287/ijoc.15.1.82.15157
 Blank et al. [2017] Julian Blank, Kalyanmoy Deb, and Sanaz Mostaghim. 2017. Solving the Biobjective Traveling Thief Problem with Multiobjective Evolutionary Algorithms. Springer International Publishing, Cham, 46–60. https://doi.org/10.1007/9783319541570_4
 Bonyadi et al. [2013] Mohammad Reza Bonyadi, Zbigniew Michalewicz, and Luigi Barone. 2013. The travelling thief problem: The first step in the transition from theoretical problems to realistic problems. In Proceedings of the IEEE Congress on Evolutionary Computation, CEC 2013, Cancun, Mexico, June 2023, 2013. 1037–1044.
 Bonyadi et al. [2014] Mohammad Reza Bonyadi, Zbigniew Michalewicz, Michal Roman Przybylek, and Adam Wierzbicki. 2014. Socially Inspired Algorithms for the Travelling Thief Problem. In Proceedings of the 2014 Annual Conference on Genetic and Evolutionary Computation (GECCO ’14). ACM, 421–428.
 Croes [1958] Georges A Croes. 1958. A Method for Solving TravelingSalesman Problems. Operations Research 6, 6 (1958), 791–812. https://doi.org/10.1287/opre.6.6.791
 Deb et al. [2002] Kalyanmoy Deb, Samir Agrawal, Amrit Pratap, and T. Meyarivan. 2002. A fast and elitist multiobjective genetic algorithm: NSGAII. IEEE Trans. Evolutionary Computation 6, 2 (2002), 182–197. https://doi.org/10.1109/4235.996017
 Dorigo and Stützle [2004] Marco Dorigo and Thomas Stützle. 2004. Ant colony optimization. MIT Press.
 El Yafrani and Ahiod [2016] Mohamed El Yafrani and Belaïd Ahiod. 2016. Populationbased vs. Singlesolution Heuristics for the Travelling Thief Problem. In Proceedings of the Genetic and Evolutionary Computation Conference 2016 (GECCO ’16). ACM, 317–324.
 El Yafrani and Ahiod [2017] Mohamed El Yafrani and Belaïd Ahiod. 2017. A local search based approach for solving the Travelling Thief Problem: The pros and cons. Applied Soft Computing 52, Supplement C (2017), 795 – 804. https://doi.org/10.1016/j.asoc.2016.09.047
 El Yafrani et al. [2017] Mohamed El Yafrani, Marcella Martins, Markus Wagner, Belaïd Ahiod, Myriam Delgado, and Ricardo Lüders. 2017. A hyperheuristic approach based on lowlevel heuristics for the travelling thief problem. Genetic Programming and Evolvable Machines (15 7 2017). https://doi.org/10.1007/s107100179308x
 Faulkner et al. [2015] Hayden Faulkner, Sergey Polyakovskiy, Tom Schultz, and Markus Wagner. 2015. Approximate Approaches to the Traveling Thief Problem. In Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation (GECCO ’15). ACM, New York, NY, USA, 385–392. https://doi.org/10.1145/2739480.2754716
 Helsgaun [2000] Keld Helsgaun. 2000. An effective implementation of the LinKernighan traveling salesman heuristic. European Journal of Operational Research 126, 1 (2000), 106–130. https://doi.org/10.1016/S03772217(99)002842
 Lourenço et al. [2016] Nuno Lourenço, Francisco B. Pereira, and Ernesto Costa. 2016. An Evolutionary Approach to the Full Optimization of the Traveling Thief Problem. Springer International Publishing, Cham, 34–45. https://doi.org/10.1007/9783319306988_3
 Martins et al. [2017] Marcella S. R. Martins, Mohamed El Yafrani, Myriam R. B. S. Delgado, Markus Wagner, Belaïd Ahiod, and Ricardo Lüders. 2017. HSEDA: A Heuristic Selection Approach Based on Estimation of Distribution Algorithm for the Travelling Thief Problem. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO ’17). ACM, New York, NY, USA, 361–368.
 Mei et al. [2015] Yi Mei, Xiaodong Li, Flora Salim, and Xin Yao. 2015. Heuristic evolution with Genetic Programming for Traveling Thief Problem. In IEEE Congress on Evolutionary Computation, CEC 2015, Sendai, Japan, May 25-28, 2015. 2753–2760. https://doi.org/10.1109/CEC.2015.7257230
 Mei et al. [2014] Yi Mei, Xiaodong Li, and Xin Yao. 2014. Improving Efficiency of Heuristics for the Large Scale Traveling Thief Problem. In Simulated Evolution and Learning - 10th International Conference, SEAL 2014, Dunedin, New Zealand, December 15-18, 2014. Proceedings. 631–643. https://doi.org/10.1007/978-3-319-13563-2_53
 Mei et al. [2016] Yi Mei, Xiaodong Li, and Xin Yao. 2016. On investigation of interdependence between sub-problems of the Travelling Thief Problem. Soft Comput. 20, 1 (2016), 157–172. https://doi.org/10.1007/s00500-014-1487-2
 Miller and Goldberg [1995] Brad L. Miller and David E. Goldberg. 1995. Genetic Algorithms, Tournament Selection, and the Effects of Noise. Complex Systems 9, 3 (1995). http://www.complexsystems.com/abstracts/v09_i03_a02.html
 Neumann et al. [2017] Frank Neumann, Sergey Polyakovskiy, Martin Skutella, Leen Stougie, and Junhua Wu. 2017. A Fully Polynomial Time Approximation Scheme for Packing While Traveling. CoRR abs/1702.05217 (2017). arXiv:1702.05217 http://arxiv.org/abs/1702.05217
 Osuna et al. [2017] Edgar Covantes Osuna, Wanru Gao, Frank Neumann, and Dirk Sudholt. 2017. Speeding up evolutionary multi-objective optimisation through diversity-based parent selection. In Proceedings of the Genetic and Evolutionary Computation Conference, GECCO 2017, Berlin, Germany, July 15-19, 2017. 553–560.
 Polyakovskiy et al. [2014] Sergey Polyakovskiy, Mohammad Reza Bonyadi, Markus Wagner, Zbigniew Michalewicz, and Frank Neumann. 2014. A comprehensive benchmark set and heuristics for the traveling thief problem. In Genetic and Evolutionary Computation Conference, GECCO ’14, Vancouver, BC, Canada, July 12-16, 2014. 477–484.
 Polyakovskiy and Neumann [2017] Sergey Polyakovskiy and Frank Neumann. 2017. The Packing While Traveling Problem. European Journal of Operational Research 258, 2 (2017), 424–439. https://doi.org/10.1016/j.ejor.2016.09.035
 Stolk et al. [2013] Jacob Stolk, Isaac Mann, Arvind Mohais, and Zbigniew Michalewicz. 2013. Combining vehicle routing and packing for optimal delivery schedules of water tanks. OR Insight 26, 3 (2013), 167–190. https://doi.org/10.1057/ori.2013.1
 Strzeżek et al. [2015] Anna Strzeżek, Ludwik Trammer, and Marcin Sydow. 2015. DiverGene: Experiments on controlling population diversity in genetic algorithm with a dispersion operator. In 2015 Federated Conference on Computer Science and Information Systems (FedCSIS). 155–162. https://doi.org/10.15439/2015F411
 Tao and Michalewicz [1998] Guo Tao and Zbigniew Michalewicz. 1998. Inver-over Operator for the TSP. In Parallel Problem Solving from Nature - PPSN V, 5th International Conference, Amsterdam, The Netherlands, September 27-30, 1998, Proceedings. 803–812.
 Wagner [2016] Markus Wagner. 2016. Stealing Items More Efficiently with Ants: A Swarm Intelligence Approach to the Travelling Thief Problem. In Swarm Intelligence: 10th International Conference, ANTS 2016, Brussels, Belgium, September 7-9, 2016, Proceedings, Marco Dorigo, Mauro Birattari, Xiaodong Li, Manuel López-Ibáñez, Kazuhiro Ohkura, Carlo Pinciroli, and Thomas Stützle (Eds.). Springer, 273–281.
 Wagner et al. [2017] Markus Wagner, Marius Lindauer, Mustafa Mısır, Samadhi Nallaperuma, and Frank Hutter. 2017. A case study of algorithm selection for the traveling thief problem. Journal of Heuristics (07 Apr 2017). https://doi.org/10.1007/s10732-017-9328-y

 Welch [1947] Bernard L Welch. 1947. The Generalization of ‘Student’s’ Problem when Several Different Population Variances are Involved. Biometrika 34, 1/2 (1947), 28–35. http://www.jstor.org/stable/2332510
 Wu et al. [2017] Junhua Wu, Markus Wagner, Sergey Polyakovskiy, and Frank Neumann. 2017. Exact Approaches for the Travelling Thief Problem. In Simulated Evolution and Learning - 11th International Conference, SEAL 2017, Shenzhen, China, November 10-13, 2017, Proceedings. 110–121.
 Yafrani and Ahiod [2018] Mohamed El Yafrani and Belaïd Ahiod. 2018. Efficiently solving the Traveling Thief Problem using hill climbing and simulated annealing. Information Sciences 432 (2018), 231–244. https://doi.org/10.1016/j.ins.2017.12.011
 Yafrani et al. [2017] Mohamed El Yafrani, Shelvin Chand, Aneta Neumann, Belaïd Ahiod, and Markus Wagner. 2017. Multi-objectiveness in the Single-objective Traveling Thief Problem. In Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO ’17). ACM, New York, NY, USA, 107–108. https://doi.org/10.1145/3067695.3076010
 Zitzler and Künzli [2004] Eckart Zitzler and Simon Künzli. 2004. Indicator-Based Selection in Multiobjective Search. In Parallel Problem Solving from Nature - PPSN VIII, 8th International Conference, Birmingham, UK, September 18-22, 2004, Proceedings. 832–842.
 Zitzler et al. [2001] Eckart Zitzler, Marco Laumanns, and Lothar Thiele. 2001. SPEA2: Improving the strength Pareto evolutionary algorithm. TIK-report 103 (2001).
 Zitzler and Thiele [1998] Eckart Zitzler and Lothar Thiele. 1998. Multiobjective Optimization Using Evolutionary Algorithms - A Comparative Case Study. In Parallel Problem Solving from Nature - PPSN V, 5th International Conference, Amsterdam, The Netherlands, September 27-30, 1998, Proceedings. 292–304.