In this paper, we investigate the application of AI methods to an industrial problem, using the example of optimizing commissioning tasks in a high-bay storage. Our goal is not to improve on the state of the art for this task, but rather to shed light on the problem from an AI engineering point of view. From this point of view, we first have to translate the problem adequately in order to apply methods of artificial intelligence; we then search for established software implementations of these methods and evaluate them on the given task of high-bay storage commissioning.
The commissioning problem, or order picking problem, is the following: We are given a high-bay storage where goods are stored in slots arranged on a two-dimensional wall. An order comprises a finite set of places on that wall that need to be visited to pick up the goods, and we wish to do this as quickly as possible. Assuming a tapping point where the collection device starts and ends its job, we can interpret this as an instance of the TSP: Given a set of locations in the plane, we ask for the shortest closed tour that visits all points.
What we essentially ask for is the optimal order in which we visit the locations x_1, …, x_n. Furthermore, the way we measure distances between pairs of locations is relevant. To sum up, we consider locations x_1, …, x_n in a metric space (X, d) with a metric d, encode a tour as a permutation π of {1, …, n}, and ask for a tour that minimizes the tour length
$$L(\pi) = d(x_{\pi(n)}, x_{\pi(1)}) + \sum_{i=1}^{n-1} d(x_{\pi(i)}, x_{\pi(i+1)}).$$
In this paper, we may interchangeably represent a permutation π as the sequence (π(1), …, π(n)) when it better fits the formal setting.
The metric d models the movement characteristics of the collection device. For instance, if vertical and horizontal motions cannot happen simultaneously at the high-bay storage, then the Manhattan metric is a reasonable model. If the speeds in the horizontal and vertical directions differ, then weights in the metric definition can accommodate this circumstance. For the remainder of this paper, we simply assume that d is the Euclidean metric and X is, therefore, the Euclidean plane. This leads to a case where many optimization algorithms can be applied. In this paper, we discuss in particular the Genetic Algorithm and the Hill Climbing Algorithm as well as improvements to these algorithms. As there are already libraries with standard implementations of these algorithms, we do not implement them from scratch; rather, we use a library called mlrose as a base and improve the algorithms stated above on top of this library.
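To make these modeling choices concrete, the tour length under the candidate metrics can be sketched in a few lines of Python. This is our own minimal illustration; the function names and the weight parameters are ours, not part of any library:

```python
import math

def manhattan(p, q, wx=1.0, wy=1.0):
    # Weighted L1 metric: models a device that cannot move both axes at
    # once; wx and wy account for different horizontal/vertical speeds.
    return wx * abs(p[0] - q[0]) + wy * abs(p[1] - q[1])

def euclidean(p, q):
    # Euclidean metric, the model assumed in the remainder of the paper.
    return math.hypot(p[0] - q[0], p[1] - q[1])

def tour_length(points, perm, metric=euclidean):
    # Length L(pi) of the closed tour that visits points in the order
    # given by perm, including the closing edge back to the start.
    n = len(perm)
    return sum(metric(points[perm[i]], points[perm[(i + 1) % n]])
               for i in range(n))
```

For the four corners of a unit square, the perimeter order yields a tour length of 4 under both metrics, while any crossing order is longer.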
I-B Theoretical background and related work
The TSP is a classical problem in operations research and algorithm theory. The list of 21 NP-complete problems by Karp  contains the related (undirected) Hamiltonian cycle problem: Given an undirected graph, is there a Hamiltonian cycle? This problem can be considered a special case of (the decision version of) TSP after choosing the edge weights appropriately, and hence (the decision version of) TSP is NP-complete, too.
The Euclidean plane adds additional structure to the TSP. Although TSP in the Euclidean plane is still NP-complete, the additional structure allows for polynomial-time approximation algorithms: Mitchell  and Arora  independently presented a polynomial-time approximation scheme (PTAS) for TSP in the Euclidean plane, for which they received the Gödel prize. The algorithm by Christofides  is a 3/2-approximation for TSP in metric spaces. While the Christofides algorithm is reasonably simple to implement, it has a time complexity of O(n^3).
The Christofides algorithm is closely related to the minimum spanning tree (MST). The Euclidean MST is actually a subgraph of the Delaunay triangulation, which can be computed in O(n log n) time and possesses only O(n) edges. For Euclidean TSP heuristics based on the Delaunay triangulation, we refer to .
The TSP can be phrased as a problem of AI. Following the notation of Russell and Norvig , we deal with a fully observable, single-agent, deterministic, sequential, static, discrete, and known environment. Like many NP-complete problems, the TSP has attracted extensive AI research, which essentially spans the full spectrum of the AI landscape, such as logic-based methods, e.g., through SMT solvers, machine learning, or various, often biologically inspired, optimization techniques. In this paper, we focus on the latter.
There are various meta-heuristic algorithms for approaching the global minimum in a multidimensional space, and many approaches and improvements have been made to get closer to the global minimum without letting the computation time escalate. Mazidi et al.  combined ant colony optimization and the GA to solve a navigation routing problem: they used the ant colony algorithm to generate the initial population and the GA for learning and evolving every single node. This not only got closer to the global minimum; the computation time also decreased slightly in comparison to other algorithms like particle swarm optimization or simulated annealing. Sanchez et al.  solved the TSP using a GA on a GPU, parallelizing the mutation of different nodes across different threads and CUDA blocks, which resulted in a significant improvement in computation time. Urrutia et al.  used a stack data structure and dynamic programming to solve the TSP; their results show that optimal solutions are obtained. Reinforcement learning (RL) and supervised learning have also been applied to the TSP. With RL, several variants were proposed, like a hybrid algorithm combining a GA with multi-agent RL  as well as deep RL . With supervised learning, variants like recurrent neural networks and, more recently, the transformer architecture  were proposed to solve the TSP.
A recent and comprehensive overview of the TSP using AI is given by Osaba et al. , covering the GA. This work highlights the most notable GA crossover variants.
The HCA is comprehensively described in Russell and Norvig, including several modifications to overcome issues like plateaus or ridges . Additional extensions, like Simulated Annealing, Tabu Search, the Greedy Randomized Adaptive Search Procedure, Variable Neighborhood Search, and Iterated Local Search, help to overcome the local-optima problem .
To the best of the author’s knowledge, the proposed contributions have not been covered in literature so far.
II Solving TSP using mlrose
The Python package mlrose provides a set of randomized optimization and search algorithms applicable to a range of different optimization problems. This paper deals with the TSP; therefore, the steps necessary to solve this problem using mlrose will be discussed. Fig. 1 shows an overview of how to solve the TSP using mlrose.
First, a fitness function object has to be defined, which is to be maximized. In the case of the TSP, the fitness function is given by the negated tour length, i.e., maximizing the fitness is equivalent to minimizing the cost expressed by the tour length L(π) of a tour π. The TravellingSales class implemented by mlrose provides a fitness function object for that purpose. An object of this class is initialized with an ordered list of elements. The evaluate method of the TravellingSales class receives a state vector as an input parameter and returns the cost of the given state. The state vector is represented by a vector of integers in the range 0 to n-1, specifying the order in which the elements are visited. Once a fitness function object is created, it can be used as an input to an optimization problem object. This object contains the relevant information about the optimization problem to be solved; the TSPOpt class implemented by mlrose is used for this purpose. When initializing an object of this class, the maximize parameter determines whether it is a maximization or minimization problem. Finally, an optimization algorithm is selected and used to solve the problem . The optimization algorithms are problem-agnostic, while the optimization problem object and the fitness function are problem-specific. In the following sections, the randomized optimization algorithms HCA and GA are introduced, weak points of the mlrose implementation are pointed out, and modifications are carried out to increase the performance of these algorithms.
The precalculated data set att48 from TSPLIB  is used to measure the performance of the standard and the modified algorithms. This data set contains 48 cities in a coordinate system with a minimal tour length of (unit-less).
III-A General basics
The GA is an optimization and search procedure inspired by the maxim “survival of the fittest” in natural evolution. A candidate solution (individual) is encoded by a string over some alphabet (genetic code). Individuals are modified by two genetic operators: (i) random alteration (mutation) of single individuals and (ii) recombination of two parents (crossover) to form offspring. Given a set of individuals (population), a selection mechanism based on a fitness function, together with the two genetic operators, produces a sequence of populations (generations). The genetic operators promote exploration of the search space, while the selection mechanism attempts to promote the survival of fit individuals over generations. The GA as implemented in mlrose terminates after no progress has been made for a certain number of generations or after a predefined maximum number of generations.
For the GA to work well, it is paramount that a reasonable genetic representation of individuals is used. In particular, the crossover operator needs to have the property that the recombination of two fit parents produces fit offspring again; otherwise, the genetic structure of fit individuals would not survive over generations and the GA easily degenerates to a randomized search. Furthermore, the efficiency of the GA depends on the initial population, the selection, and the recombination strategy. The mutation rate and the size of the population are hyperparameters.
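The general scheme described above can be sketched as a short skeleton in Python. This is our own illustration, not the mlrose source; in particular, truncation selection, elitism, and the parameter names are our choices:

```python
import random

def genetic_algorithm(fitness, init_individual, crossover, mutate,
                      pop_size=40, generations=50, mutation_rate=0.1,
                      seed=0):
    # Generic GA skeleton: selection by fitness, recombination of two
    # parents, and occasional random mutation, repeated per generation.
    rng = random.Random(seed)
    population = [init_individual(rng) for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:max(2, pop_size // 2)]  # truncation selection
        offspring = [best]  # elitism: the best individual always survives
        while len(offspring) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            child = crossover(p1, p2, rng)
            if rng.random() < mutation_rate:
                child = mutate(child, rng)
            offspring.append(child)
        population = offspring
        best = max(best, max(population, key=fitness), key=fitness)
    return best
```

The fitness function, the genetic representation, and both genetic operators are passed in as parameters, mirroring the observation above that their choice decides whether the GA works well or degenerates to random search.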
III-B Implementation in mlrose
mlrose uses the state vector representation from section II directly as the genetic representation, i.e., an individual is encoded as a permutation sequence of the integers 0, …, n-1, which are indices of the locations to be visited. Recombination of a first parent p and a second parent q works as follows: the sequence is split at a random position, the prefix of p is taken, and the missing locations in the genetic string are taken from q in the order in which they appear in q.
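A sketch of this recombination strategy in Python (our own reading of it; the function and variable names are ours):

```python
import random

def one_point_ordered_crossover(p, q, split=None, rng=random):
    # Prefix of the first parent up to a random split position, then the
    # missing locations in the order in which they occur in the second.
    if split is None:
        split = rng.randrange(1, len(p))
    prefix = list(p[:split])
    missing = set(p) - set(prefix)
    return prefix + [city for city in q if city in missing]
```

For example, for p = [0, 1, 2, 3, 4], q = [4, 3, 2, 1, 0], and a split at position 2, the offspring is [0, 1, 4, 3, 2]; by construction, every offspring is again a valid permutation.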
Note that the TSP has the symmetry property that a solution candidate π and its reversed counterpart π^rev can be considered to be the same solution. Not only is L(π) = L(π^rev), but in some sense the structure of the solution is the same. The reason behind this is that the pairwise distances between locations in the Euclidean plane (or adequate metric spaces) are invariant with respect to reflection.
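This reversal symmetry is easy to verify exhaustively on a small instance (our own check; the coordinates are arbitrary):

```python
import itertools
import math

def tour_length(points, perm):
    # Closed tour length, including the edge back to the start.
    n = len(perm)
    return sum(math.dist(points[perm[i]], points[perm[(i + 1) % n]])
               for i in range(n))

points = [(0, 0), (3, 1), (2, 4), (5, 5), (1, 2)]
# Every tour and its reversal have identical length.
for perm in itertools.permutations(range(len(points))):
    rev = tuple(reversed(perm))
    assert abs(tour_length(points, perm) - tour_length(points, rev)) < 1e-9
```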
However, the recombination strategy does not take this symmetry property into account. This leads to the following problem: Consider the recombination of two parents, p and q, that are reasonably similar and fit, but whose directions of traversal are essentially opposite. Then the offspring first traverses the locations like p and then continues with q, which possesses the reversed direction; this likely destroys the fit solution structure displayed by p and q. That is, two fit parents produce unfit offspring, as illustrated by fig. 1(a).
As an illustrative extreme example, assume p is a globally optimal solution of the TSP and q = p^rev is its reversal, as in fig. 1(a). Then the offspring that results from a split in the middle of the genetic string traverses the first half of p and then the remaining locations in reversed order, which is typically far from globally optimal, see fig. 1(a). This recombination would only not hurt if the middle of the fit tour p, where the split point of the recombination is located, happened to be close to the start or end of p.
To mitigate the presented issue of the recombination strategy, we would like to have a natural notion of the direction of traversal of a tour, so we could figure out whether we need to reverse the parent q before recombining it with p. But since we lack an adequate mathematical notion, we factor out the two possibilities of tour traversal of q in a different way.
When recombining p and q, we actually consider two candidate offspring: offspring o_1 from p and q, and offspring o_2 from p and q^rev. See fig. 1(b) for an illustration of o_2 for the extreme example from before and fig. 1(a) for o_1. We then compare the fitness values of the two candidate offspring, i.e., we compare L(o_1) and L(o_2), and keep only the better one as the recombination result. Following our observation from the previous section, we expect that one offspring of two fit parents results from a direction-conforming recombination and the other does not. (Of course, it can still happen that the two parents are bad mates for other reasons, i.e., they can still be structurally insufficiently compatible.)
This way we turn the original recombination operator into a reversal-invariant recombination operator. Note that our proposed recombination operator is beneficial not only for the TSP, but generally for all problems with this reversal symmetry in the genetic encoding of individuals.
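The reversal-invariant operator can be sketched as follows. This is our own illustration, not the mlrose-based implementation; crossover re-implements the prefix-based recombination described in the text so that the snippet is self-contained:

```python
import math
import random

def tour_length(points, perm):
    # Closed tour length, including the edge back to the start.
    n = len(perm)
    return sum(math.dist(points[perm[i]], points[perm[(i + 1) % n]])
               for i in range(n))

def crossover(p, q, split):
    # Prefix of p, then the missing locations in the order given by q.
    prefix = list(p[:split])
    return prefix + [city for city in q if city not in prefix]

def reversal_invariant_crossover(points, p, q, split=None, rng=random):
    # Build one offspring from (p, q) and one from (p, reversed q),
    # then keep the one with the shorter tour, i.e., the fitter one.
    if split is None:
        split = rng.randrange(1, len(p))
    o1 = crossover(p, q, split)
    o2 = crossover(p, list(reversed(q)), split)
    return min(o1, o2, key=lambda o: tour_length(points, o))
```

For the extreme example from the text, with p optimal and q its reversal, the second candidate reproduces p itself, so the fit solution structure survives the recombination.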
An experiment was carried out over attempts to measure the performance of the modified GA in comparison with the implementation by mlrose. Both algorithms use a shared parameter configuration of a population size of and a maximum number of generations, if not stated otherwise. While we would set a small positive mutation rate in practice, the higher the mutation rate is, the closer both implementations converge to a random search. Hence, for the sake of comparison, we set the mutation rate to for our experiment. All results visually appear Gaussian distributed (not shown in the paper), and hence the notation of mean ± standard deviation is used in the following discussion.
The tour lengths of the modified and the original implementation are shown in fig. 3. The red dashed horizontal line marks the optimal solution at a tour length of . The results show that the modified GA () is closer to the optimum than the original algorithm by mlrose ().
The gain in performance by the modified algorithm comes at the expense of a higher computation time. The initial algorithm by mlrose () is faster than the modified GA ( ms). The modification of the GA results in a mean computation slowdown by a factor of about , compared to the original implementation by mlrose.
For a fair comparison of the computation times between these two implementations, a configuration was empirically determined in which both approaches perform comparably with respect to computation time. This evaluation uses the CPU clock time, based on an Intel Core i7-7700K (4.20 GHz). For this experiment, the population is set to , resulting in a computation time of ms, which is close to the result of the initial implementation by mlrose. The tour lengths for this configuration are shown in green in fig. 3 (); this configuration still performs significantly better than the original version with a larger population size of .
IV-A General basics
The HCA is a simple and widely known optimization technique, cf. . When phrased as a maximization (minimization, resp.) problem of a function over some domain, Hill Climbing moves stepwise upward (downward, resp.) along the steepest ascent (descent, resp.) until it reaches a local maximum (minimum, resp.). As mentioned in section II, for the TSP we minimize the tour length L, which equivalently means maximizing the fitness function given by the negated tour length. The domain is given by the transposition graph over the vertex set of permutations of {1, …, n}, which contains an edge between vertices π and π' iff we can turn π into π' via a single transposition, i.e., the tour π' results from the tour π by swapping two locations only. Figure 4 illustrates the transposition graph.
Given this formalization, we can speak of a neighborhood of π as the set of permutations adjacent to π within the transposition graph. Note that each permutation has n(n-1)/2 neighbors. Furthermore, the fitness can be considered to be a scalar field over the transposition graph, with respect to which we can apply the HCA along the graph's edges. The paths traced out by the HCA on the way to a local optimum within a transposition graph have been studied by Hernando et al. . They show, for instance, that the distance to the local optimum is often not monotonically decreasing for various optimization problems. (However, the TSP was not part of their study.)
In more detail, the HCA starts with a random tour π and calculates the cost L(π) to be minimized. Then it considers all neighbors of π and their costs. If the neighbor π' with minimum cost has a lower cost than π, then the algorithm moves to π' and repeats. Otherwise, π constitutes a local minimum and the HCA either terminates or restarts with a new random tour, as implemented in mlrose. After a given maximum number of restarts, the HCA returns the best permutation found in all runs.
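This loop, together with the transposition neighborhood, can be sketched in plain Python (our own illustration, not the mlrose source; the helper names are ours):

```python
import itertools
import math
import random

def tour_length(points, perm):
    # Closed tour length, including the edge back to the start.
    n = len(perm)
    return sum(math.dist(points[perm[i]], points[perm[(i + 1) % n]])
               for i in range(n))

def neighbors(perm):
    # All n(n-1)/2 transpositions of perm: the adjacency of the
    # transposition graph described above.
    for i, j in itertools.combinations(range(len(perm)), 2):
        nb = list(perm)
        nb[i], nb[j] = nb[j], nb[i]
        yield nb

def hill_climb(points, restarts=0, seed=0):
    # Steepest-descent Hill Climbing with random restarts.
    rng = random.Random(seed)
    best, best_len = None, float("inf")
    for _ in range(restarts + 1):
        tour = list(range(len(points)))
        rng.shuffle(tour)
        while True:
            cand = min(neighbors(tour), key=lambda t: tour_length(points, t))
            if tour_length(points, cand) >= tour_length(points, tour):
                break  # local minimum reached
            tour = cand
        if tour_length(points, tour) < best_len:
            best, best_len = tour, tour_length(points, tour)
    return best, best_len
```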
IV-B Implementation in mlrose
While the vanilla HCA is a simple optimization technique, various improvements are known in the literature to overcome different shortcomings; see  for an overview.
First of all, the HCA gets stuck in local optima. While this is no issue for convex (minimization) problems, the magnitude of this issue increases with the number of local optima and the respective sizes of their associated basins, i.e., the sets of points from which the HCA ends up in a given local optimum, see also . For the TSP this is an issue, and the mlrose implementation provides a restart mechanism to mitigate it. A loosely related issue is the presence of ridges, where local optima are arranged along a string, so to speak.
A second well-known issue for the HCA is the existence of plateaus, i.e., subregions of the domain where the fitness function is constant, such that the HCA has no uphill direction. A mitigation of this issue is to allow a certain number of sideways moves  to cross the plateau, for the lucky case that an uphill direction does indeed exist at the boundary of the plateau. No such mechanism is implemented in mlrose. On the other hand, for the TSP it is very unlikely that the best neighbor π' of a permutation π would have the same fitness (tour length) although π and π' differ by a transposition. That is, it is unlikely that the tour length stays the same after swapping two locations.
From the discussion in the previous section, we take away that local optima are the prominent issue for the HCA on the TSP in mlrose. In topography, we have the notion of the prominence of a mountain. Vaguely speaking, it tells how far we need to go down from a local maximum to a shoulder from which we can pursue an ascending path that exceeds this local maximum. A more precise notion is given by the persistence of the local maximum in the super-level set filtration in the field of persistent homology, see Huber .
Here, a natural measure for the prominence of a local maximum is the number of steps in the transposition graph. A prominence of k means that we have to admit k downward steps from a local maximum until we can pursue an ascending path that allows us to escape the local maximum.
In the mlrose implementation, the HCA gets stuck at local maxima with a prominence of only 1, and it resorts to a restart. Our modification simply allows for a single downward step from local maxima in order to overcome local maxima of prominence 1. If the following step would lead us back to the old local maximum, we terminate this run and apply a restart as in the original version. More generally, we simply keep a data structure of previously visited permutations to disallow cycles in the paths traced by the HCA.
This modification leads to another advantage: In the course of restarts, a series of Hill Climbing searches from randomly generated starting points is performed. If, after a restart, the algorithm reaches a state that was visited in a previous trial, the Hill Climbing implemented by mlrose will take the same path again and reach the very same local minimum again. The modified algorithm terminates once an already visited state is reached, which constitutes an early-out optimization.
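A sketch of the modified search combining both ideas, a set of visited states shared across restarts and a single admissible downward step (our own illustration; the actual mlrose-based implementation may differ in details):

```python
import itertools
import math
import random

def tour_length(points, perm):
    n = len(perm)
    return sum(math.dist(points[perm[i]], points[perm[(i + 1) % n]])
               for i in range(n))

def neighbors(perm):
    # All transpositions of perm (adjacency in the transposition graph).
    for i, j in itertools.combinations(range(len(perm)), 2):
        nb = list(perm)
        nb[i], nb[j] = nb[j], nb[i]
        yield nb

def hill_climb_modified(points, restarts=0, seed=0):
    rng = random.Random(seed)
    visited = set()          # states seen in any run (disallows cycles)
    best, best_len = None, float("inf")
    for _ in range(restarts + 1):
        tour = list(range(len(points)))
        rng.shuffle(tour)
        downward_spent = False
        while True:
            if tuple(tour) in visited:
                break        # early out: this path was explored before
            visited.add(tuple(tour))
            if tour_length(points, tour) < best_len:
                best, best_len = list(tour), tour_length(points, tour)
            cand = min((nb for nb in neighbors(tour)
                        if tuple(nb) not in visited),
                       key=lambda t: tour_length(points, t), default=None)
            if cand is None:
                break        # all neighbors already visited
            if tour_length(points, cand) >= tour_length(points, tour):
                if downward_spent:
                    break    # a second worsening step: give up this run
                downward_spent = True   # escape a prominence-1 optimum
            else:
                downward_spent = False
            tour = cand
    return best, best_len
```

Excluding visited states from the candidate neighbors also implements the rule from the text that a run must not step back to the old local maximum.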
An experiment over attempts was performed to measure the performance of the modified HCA in comparison with the HCA implementation by mlrose. The tour lengths of the modified algorithm and the HCA implemented by mlrose using and restarts are shown in fig. 5. The results again look visually rather Gaussian distributed (not shown in the paper), such that we use the notation of mean ± standard deviation in the following. The red dashed horizontal line again marks the optimal tour length of .
With restart, the modified algorithm performs best (), while the implementation by mlrose has a slightly higher overall tour length (), which is expected since the modified version effectively extends the exploration.
For further comparison, the restarts parameter of both algorithms (the standard HCA implementation and the modified one) is limited to to test the performance against the default configuration of mlrose. Here, the improvement of the modified version () (red) over the original version () (green) becomes more significant.
The modification influences the computing time, as the modified algorithm is slightly slower ( ms) than the HCA implemented by mlrose ( ms). The results for the experiments with restarts are similar: the modified algorithm has a higher computing time ( ms) than the HCA implementation by mlrose ( ms). Increasing the number of restarts from to gives a slowdown by a factor of , for the modified and the original implementations alike.
V Conclusion and final remarks
This paper was motivated by the industrial application of AI to the problem of optimizing commissioning tasks in a high-bay storage, which translates to the TSP.
We chose mlrose as an AI library that already provides optimization routines for the TSP and had a closer look at two optimization techniques, namely the GA and the HCA. Moderately exploiting the problem structure of the TSP, we propose two improvements, one for the GA and one for the HCA, which improved the mean TSP results by for the GA and for the HCA. The modifications we propose, however, have a somewhat generic character and are not only applicable to the TSP.
For the GA, a significant improvement in the computed tour length can be shown based on our reversal-invariant crossover operator. This gain comes at the cost of computation time. However, we show that even if we compensate for the additional computation time by reducing the population size, our modification still outperforms the original version of mlrose.
For the HCA, the goal of the experiment was to show that a problem-specific treatment is necessary for the TSP. The implementation provided by mlrose is a problem-agnostic vanilla implementation that does not offer any problem-specific optimizations. By tailoring the algorithm to the properties of the TSP, a clear improvement can be observed.
Finally, we would like to remark that to some extent our paper can be seen as a showcase that AI libraries should only carefully be applied as plug-and-play solutions to industrial problems, and that the specific problem structure of the industrial problem at hand likely provides means to improve the performance of the generic implementations. While the democratization through meta-learning facilities like AutoML relieves an application engineer from the tedious search for ML methods and their hyperparameters for a given problem, we believe that, in general, it does not make an understanding of the underlying methods obsolete.
- (2012) Analyzing the Performance of Mutation Operators to Solve the Travelling Salesman Problem. arXiv:1203.3099 [cs].
- (2017) β-Hill climbing: an exploratory local search. Neural Comput & Applic 28 (S1), pp. 153–168.
- A hybrid algorithm using a genetic algorithm and multiagent reinforcement learning heuristic to solve the traveling salesman problem. Neural Comput & Applic 30 (9), pp. 2935–2951.
- (1996) Polynomial time approximation schemes for Euclidean TSP and other geometric problems. In Proceedings of 37th Conference on Foundations of Computer Science, pp. 2–11.
- The Transformer Network for the Traveling Salesman Problem. arXiv:2103.03012 [cs].
- (1976) Worst-case analysis of a new heuristic for the travelling salesman problem. Technical Report 388, Graduate School of Industrial Administration, Carnegie Mellon University.
- (1995) Ant-Q: A Reinforcement Learning approach to the traveling salesman problem. In Machine Learning Proceedings 1995, pp. 252–260.
- (2019) mlrose: Machine Learning, Randomized Optimization and SEarch package for Python. https://github.com/gkhayes/mlrose
- Hill-Climbing Algorithm: Let's Go for a Walk Before Finding the Optimum. In 2018 IEEE Congress on Evolutionary Computation (CEC), Rio de Janeiro, pp. 1–7.
- Persistent homology in data science. In Proc. 3rd Int. Data Sci. Conf. (iDSC '20), Data Science – Analytics and Applications, Dornbirn, Austria (virtual).
- (2017) Genetic Algorithm for Traveling Salesman Problem with Modified Cycle Crossover Operator. Computational Intelligence and Neuroscience 2017, pp. 1–7.
- (1972) Reducibility among combinatorial problems. Complexity of Computer Computations 40, pp. 85–103.
- (2008) Good triangulations yield good tours. Computers & Operations Research 35, pp. 638–647.
- (2016) Meta-heuristic approach to CVRP problem: local search optimization based on GA and ant colony. Journal of Advances in Computer Research 7 (1), pp. 1–22.
- Applying Deep Learning and Reinforcement Learning to Traveling Salesman Problem. In 2018 International Conference on Computing, Electronics & Communications Engineering (iCCECE), Southend, United Kingdom, pp. 65–70.
- (1996) Guillotine subdivisions approximate polygonal subdivisions: a simple new method for the geometric k-MST problem. In Proceedings of the Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '96, pp. 402–408.
- (2020) Traveling salesman problem: a perspective review of recent research and new results with bio-inspired metaheuristics. In Nature-Inspired Computation and Swarm Intelligence, pp. 135–164.
- (1991) TSPLIB–a traveling salesman problem library. ORSA Journal on Computing 3 (4), pp. 376–384.
- (2019) A novel memetic genetic algorithm for solving traveling salesman problem based on multi-parent crossover technique. Decis. Mak. Appl. Manag. Eng. 2 (2).
- (2010) Artificial intelligence: a modern approach. Prentice Hall.
- (2015) Parallel genetic algorithms on a GPU to solve the travelling salesman problem. Revista en Ingeniería y Tecnología 8 (2).
- Solving the traveling salesman problem using a recurrent neural network. Numer. Analys. Appl. 8 (3), pp. 275–283.