1 Introduction
Workforce scheduling and routing problems (WSRP) [2] are challenging problems for organisations with staff working in areas including health care [1] and engineering [4]. Finding solutions is the responsibility of a planner within the organisation who will have an interest in the wider organisational policy decisions surrounding the solution. Such wider issues could include the implications of solutions with a lower environmental impact, the effects of switching to public transport, or the impact of changing the size of the workforce.
Multiobjective optimisation approaches are commonly used to find solutions, to WSRP instances, as they can provide a front of solutions that tradeoff objectives [12]. However, fronts may only comprise a small section of the total solution space, and are difficult to visualise if there are many dimensions. Thus, it can be difficult for a planner to understand the range of solutions, why solutions were produced, and in particular to know whether other compromise solutions might exist.
A class of algorithms known as illumination algorithms have recently been introduced by Mouret at al [6], with a number of variants following, e.g. [9, 7]. Fundamentally different to a traditional search algorithm, the approach provides a holistic view of how highperforming solutions are distributed throughout a solution space [6]. The method creates a map of highperforming solutions at each point in a space defined by dimensions of variation that are chosen by a user, according to characteristics of a solution that are of interest. The resulting map (a Multidimensional Archive of Phenotypic Elites ) enables the user to gain specific insight into how combinations of characteristics of solutions correlate with performance, hence providing insight as well as multiple potential solutions. As the approach encourages diversity, it has often been shown to more capable of fully exploring a searchspace, outperforming stateoftheart search algorithms given a singleobjective, and can be particularly helpful in overcoming deception [8]. We therefore hypothesise that an illumination algorithm might provide particular benefit to realworld problems such as WRSP, which contain multiple, and sometimes conflicting, objectives. However, in contrast to the majority of previous applications of (ME) MapElites however which fall mainly in the domain of design problems (e.g. designing robot morphology), WSRP is a repetitive problem, which requires solving new instances repeatedly and obtaining acceptable solutions in reasonable time. While investing effort into producing an archive of solutions can pay off in a design domain, it may prove prohibitive for repetitive problems. Therefore, in the context of a WSRP based on the city of London, using real geographical locations and real transport information, we consider the following questions:

How does the relative performance of ME compare to a standard Evolutionary Algorithm (EA) in terms of satisfying a single objectivefunction over a range of evaluation budgets?

Does MAPElites provide useful insights into problem characteristics from a realworld perspective through providing a range of diverse but hiqhquality solutions?
Using 10 realistic problem instances, we demonstrate that for a small fixed evaluation budget, MAPElites does not outperform an EA in terms of the objective function, but as the budget increases, it outperforms the EA on the majority of instances tested. Furthermore, even when it is outperformed by an EA in terms of the single objective, it can discover solutions that have better values for the individual characteristics. From a userperspective, it may therefore present an acceptable tradeoff between overall quality and insight.
2 Previous work
The Multidimensional Archive of Phenotypic Elites (MAPElites) was first introduced by Mouret et al [6] and as discussed in the introduction, provides a mechanism for illuminating search spaces by creating an archive of highperforming solutions mapped onto solution characteristics defined by the user. To date, the majority of applications of illumination algorithms have been to design problems [6, 15]. Another tranche of work focuses on behaviour evolution in robotics, for example Cully et al [3], who evolve a diverse set of behaviours for a single robot in a “preimplementation” simulation phase: these are then used in future when the robot is in operation to guide intelligent choice of behaviour given changing environmental conditions.
To the best of our knowledge, an illumination algorithm has never been used to solve repetitive problems, i.e. problems faced in the realworld where acceptable solutions to problems have to be discovered in short timeframes, often many times a day. Typically these types of problems are combinatorial optimisation problems, e.g. scheduling, routing and packing, that often utilise indirect genotypic representations as a result of having to deal with multiple constraints. This contrasts to much of the existing work using MAPElites which uses a direct representation of design parameters (although the use of MAPElites with an indirect representation was discussed in [10]).
3 Methodology
We consider a WSRP characterised by timewindows, multiple transport modes and service times, variations of this scenario include the scheduling of health care workers as well as those providing other services such as environmental health inspections.
We assume an organisation has to service a set of clients, who each require a single visit. Each of the visits must be allocated to an employee, such that all clients are serviced, and an unlimited number of employees are available. Each visit is located at , where represents a real UK postcode, has a service time and a timewindow in which it must commence described by , i.e. the earliest and latest time at which can start and finish. Visits are grouped into journeys, where each journey contains a subset of the visits and is allocated to an employee. Each journey starts and ends at the central office. Two modes of travel are available to employees: the first mode uses private transport (car), the second makes uses of available publictransport, encouraging more sustainable travel. The overall goal is to minimise the total distance travelled across all journeys completed and forms the objective function for the problem. However, in addition, discussions with endusers [13] highlights four characteristics of solutions that are of interest:

The total emissions incurred by all employees over all visits

The total employee cost the total cost (based on £/hour) of paying the workforce

The total travel cost the cost of all of the travel activities undertaken by the workforce

The % of employees using car travel
We develop an algorithm based on MapElites to minimise the distance objective through projecting solutions onto a 4dimensional map, with each axis representing one of the above characteristics. Solution quality is compared to an Evolutionary Algorithm that uses exactly the same distance function as an objective, and an identical representation, crossover and mutation operators.
Both the MapElites algorithm and the EA use an identical representation of the problem, previously described in [13]. The genotype defines a grandtour [5], i.e. a single permutation of all required visits. This is subsequently divided into individual feasible journeys using a decoder. The genotype also includes additional genes that denote the model of transport to be used for the visit, i.e. public or private.
The decoder converts the single grand tour into a set of journeys to be undertaken by an employee. It examines each visit in the grand tour in order. Initially, the first visit in the grand tour specified by the genotype is allocated to the first journey. The travel mode(car or public transport) associated with this visit in the genome is then allocated to the journey: this travel mode is then adopted for the entire journey (regardless of the information associated with a visit in the genome). The decoder then examines the next visit in the grand tour: this is added to the current journey if it is feasible. Feasibility requires that the employee arrives from the previous visit using the mode of transport allocated to the journey within the time window associated with the visit. Note that a travel mode cannot be switched during a journey. Subsequent visits are added using the journey mode until a hard constraint is violated, at which point the current journey is completed and a new journey initiated.
3.1 The MAPElites Algorithm
The implementation of MAPElites used in this paper is given in Algorithm 1 and is taken directly from [6]. The algorithm starts with an empty, Ndimensional map of elites: {solutions and their performances }. An initialisation method generates randomsolutions; subsequent solutions are generated from elites stored in the archive. Following random selection of a solution (or solutions) from the archive, the RandomVariation() method applies either crossover followed by mutation, or just mutation, depending on the experiment. All operators utilised are borrowed from [13]. The mutation operator moves a randomly selected entry in the grandtour to another randomly selected point in the tour. The crossover operator selects a random section of the tour from parent1 and copies it to the new solution. The missing elements in the child are copied from parent2 in the order that they appear in parent2. For each solution , a featuredescriptor is obtained by discretising the four features of interest associated with the solution (Section 3) into 20 bins; for 4 dimensions this gives a map containing cells. The upper and lower bounds required for discretisation are taken as the maximum and minimum values observed by [13] for each dimension during an extensive experimental investigation. A new solution is placed in the cell in the archive corresponding to if its fitness (calculated as total distance travelled) is better than the current solution stored, or the cell is currently empty.
3.2 The Evolutionary Algorithm
The EA uses exactly the same representation and operators as the MapElites algorithm. The EA uses a population size of 100, with 40 children being created each generation. Each child is created by cloning from one parent or crossover using two parent. Parents are selected using a tournament of size 2. A mutationrate of 0.7 is applied to each child. The children are added back into the population, replacing the loser of a tournament, providing the child represents an improvement over the loser. The parameters for the EA were derived from the authors’ previous experience with similar algorithms applied to the same problem instances.
3.3 Problem Instances
We use a set of problem instances based upon the city of London, divided into two problem sets, termed London (60 visits) and BigLondon (110 visits). These instances were first introduced in [13]. Each visit represents a real postcode within London. For each of the problem sets, 5 instances are produced in which the duration of each visit is fixed to 30 minutes. Visits are randomly allocated to one of timewindows, where . For , the timewindow has a duration of 8 hours, for , the timewindows are “9am1pm” and “1pm5pm” etc. These instances are labelled using the scheme <set> numTimeWindows, i.e. Lon1 refers to an instance in the London with one timewindow and Blon2 refers to an instance of the BigLondon problem with 2 time windows. The fifth instance represents a randomly chosen mixture of time windows based on 1,2,4 and 8 hours.
If a journey is undertaken by car, paths between visits and distance is calculated according to the real roadnetwork using the GraphHopper library^{1}^{1}1 https://graphhopper.com/. This relies on Open StreetMap data^{2}^{2}2https://openstreetmap.org/. Car emissions are calculated as 140 g/km based upon values presented in [11]. For journeys by publictransport, data is read from the Transport for London (TfL) API^{3}^{3}3 https://api.tfl.gov.uk/ which provides information including times, modes and routes of travel by bus and train. Public transport emissions factors are based upon those published by TfL [11].
3.4 Experimental Parameters
The function evaluation budget is fixed in all experiments. We test two values: 1 million evaluations and 5 million, in each case the first 1000 evaluations are used for initialisation.. Each treatment is repeated 10 times on each instance. The best objective (distance) value is recorded for both treatments in each run. We apply Vargha and Delaney’s statistic [14] to assess difference between the algorithms. This is regarded as a robust test when assessing randomised algorithms. The test returns a statistic, , that takes values between 0 and 1; a value of 0.5 indicates that the two algorithms are stochastically equivalent, while values closer to 0 or 1 indicate an increasingly large stochastic difference between the algorithms. One of the most attractive properties of the VarghaDelaney test is the simple interpretation of the
statistic: for results from two algorithms, A and B, then is simply the expected probability that algorithm A produces a superior value to algorithm B. We follow the standard interpretation that a value in the range
indicates a small effect, a medium effect and a large effect. In addition we use two metrics to further analyse MapElites that are now defacto in the literature:
Coverage represents the area of the featurespace covered by a single run of the algorithm, i.e. the number of cells filled. For a single run of algorithm , where is the total number of cells filled by combining all runs of any algorithm on the problem under consideration.

Precision is also defined as optin reliability: if a cell is filled in a specific run, then the cellprecision is calculated as the inverse of the performancevalue (distance) found in the that cell in that run, divided by the bestvalue ever obtained for cell in any run of any algorithm (as this is minimisation). Cellprecision is averaged over all filled cells in an archive to give a single precision value for a run.
From the perspective of a planner, this represents the choice of solutions available to them, while precision indicates whether a cell contains a solution that is likely to of potential use to the planner. The averaged precision for a run indicates the overall quality of the solutions produced.
London Problems  Big London Problems  

Lon1  Lon2  Lon4  Lon8  Lonrnd  Blon1  Blon2  Blon4  Blon8  Blonrnd  
ME(1M) vs EA(1M)  
ME(5M) vs EA(5M) 
4 Results
The first research question aims to compare the performance of MAPElites and EA algorithms under different evaluation budgets to determine whether MAPElites might be useful in producing a set of acceptable solutions quickly. Two values are tested : the first is relatively small with 1 million evaluations (as in [13]); the second increases this to 5 million.
Figures 3(a,b) show the objective fitness values achieved by ME and the EA under both budgets on each of the problem instances. Table 1 shows effect size and direction according to the VarghaDelaney metric.
We note firstly that for 1 million evaluations for both sets of problems, the EA outperforms MapElites: the median of the EA is lower than ME, and the effect size is large in each case. However, when the budget is increased to 5M, MapElites outperforms the EA on all of the smaller problems with a large effect size; it also outperforms the EA on two of the larger problems, although the effect size is small. In the remaining 3 cases, the EA still wins.
Note that the figures (a)a and (b)b only show performance in terms of distance and do not take into account the four characteristics which provide insight to the endusers. These values are given in table 2. Firstly we note that for the smaller Lon problems, the bestvalue for each characteristic is obtained from the MAPElites algorithm in call cases. This includes Lon8 in which the best objective value for a solution is obtained by the EA, but the solution has poorer values for each of the 4 characteristics than the best solution obtained by MAPElites. Examining the results for the larger BLon problem demonstrates that MAPElites, despite a suboptimal performance (w.r.t the objective function), can still find solutions that out perform the EA in terms of the individual characteristics. We also note that for the Lon problem ME performs equally well regardless of time window constraints.
Dist  StaffCost  TravelCost  CO2  CarUse  
Lon1  204.64 : 206.93  841 : 974.67  82.54 : 85.79  133.83 : 163.75  0 : 0.25 
Lon2  223.3 : 231.02  870.67 : 1014.67  89.71 : 103.04  148.94 : 192.85  0.06 : 0.33 
Lon4  225.37 : 244.09  904.33 : 1276  94.63 : 116.74  158.77 : 194.59  0.04 : 0.33 
Lon8  230.8 : 230.34  967.33 : 1376.67  103.5 : 140.1  159.07 : 240.54  0.04 : 0.35 
LonRnd  244.91 : 259.11  944 : 1140.33  99.48 : 107.4  155.17 : 216.53  0.04 : 0.33 
Blon1  698.48 : 619.15  1987 : 2182.33  222.63 : 207.17  527.27 : 506.02  0.04 : 0.25 
Blon2  729.21 : 644.07  2107.67 : 2385.67  244.54 : 243.55  584.99 : 581.16  0.07 : 0.32 
Blon4  708.25 : 722.53  2183.33 : 2545.67  267.85 : 272.34  584.19 : 637.26  0.08 : 0.33 
Blon8  688.94 : 658.52  2209 : 2772  272.22 : 311.52  586.81 : 637.5  0.08 : 0.38 
Blonrnd  730.3 : 666.29  2256 : 2717.67  251.31 : 263.1  580.16 : 602.47  0.09 : 0.36 
4.1 Coverage and Precision
The coverage metric evaluates the ability of an individual run of an algorithm to place individuals in each of the cells. Note that it is possible that some of the cells cannot be filled in because the characteristics of that instance do not allow a feasible solution in that area.
The coverage achieved is displayed in figures (a)a and (b)b. Observe that coverage of over 70% is common with MAPElites, but the EA gives very poor coverage as it converges to a single solution. In realworld terms, the EA leaves the user with little choice of solution and no insight into the problem.
Figures (c)c and (d)d show the precision achieved by MAP Elites and the EA. Note that the highest precision achieved by the EA outperforms ME, note that precision is calculated over only those cells that are filled. The EA allocates all of its evaluations to very few cells, and thus find good solutions for those cells. In contrast, MAPElites has to distribute the same budget of evaluations across a much larger number cells, making it hard to always find a highperforming solution in each cell. In addition,many of the lowprecision scores for ME occur when one run does not find as highperforming a solution in a cell as another run of MAPElites. Running ME for more evaluations would likely improve precision (without danger of convergence due to its propensity to enforce diversity).
4.2 Gaining Insight into the Problem Domain
Figure 15 plots the cells, and the elite solutions contained, for each 2dimensional pairing of the 4 dimensions. Although the archive could be drawn in 4dimensions, discussion with users suggested that presenting 2dimensional maps provides more insight. Within each plot, each cell that is occupied is coloured to represent the distance objective value of the elite solution  lowest (best) values being green, highest being red. Note that most of the cells have a solution within them. Where there is an area with no solutions it tends to be at a corner of the plot. For instance, there are a lack of solutions with low and high travel costs (figure (e)e) or high car use and low (figure (a)a). From a planning perspective, figure 15 indicates (1) combinations of objectives that have no feasible solutions, and (2) quality of feasible solutions.
MAPElites tends to cover a larger part of the solution space. A common trend is that the solutions that are better in terms of one or two of the four characteristics are not always solutions that exhibit the lowest distance objective. The map also quantifies tradeoffs in objective value: for example, the extent to which increased car use increases compared to options that utilise more public transport. Another insight to be gained is the effects of higher public transport use (i.e. low car use) and staff cost: staff costs rise as public transport usage increases (figures (c)c a). This is due to the longer journey times experienced with public transport leading to increased working hours for staff.
A planner with responsibility for determining policies regarding staff scheduling may make use of the diagrams in figure (c)c to understand what solutions are possible given a specific priority. For instance, if it is determined that reducing is a priority then they can determine what possible tradeoffs exist for low solutions. Where a balance is required (i.e. lowering but also keeping financial costs in check) MAPElites allows the planner to find compromise solutions that are not optimal in any single dimension, but may prove useful when meeting multiple organisational targets or aspirations.
5 Conclusions
In this paper we have applied MAPElites to a real world combinatorial optimisation problem domain — a workforce scheduling and routing problem. Unlike previous applications of MAPElites that have tended to concentrate on design problems, WSRP is an example of a repetitive problem, requiring an optimisation algorithm to find acceptable solutions in a short period of time. In addition to an acceptable solution however, a user also requires choice, in being able to select potential solutions based on additional criteria of relevance to a particular company.
With reference to the research questions in section 1, we note that MAPElites tends to require a larger evaluation budget to produce results that are comparable with a straightforward EA for the problems tested. However, for small problems, affording a larger evaluation budget to MapElites enables it to discover improved solutions, compared to the EA. For larger problems, although our results show that MAPElites cannot outperform the EA in terms of objective performance, it does find solutions that outperform the EA in terms of the individual characteristics. It is likely that running MAPElites for longer would continue to improve its performance, without risking convergence. The increased cputime required for such a budget may be easily obtained through the use of multicore desktop computers or cloud based resources in a practical setting. We also note that the illumination aspect of MAPElites may aid the ability of planners to understand the factors that lead to good solutions and subsequently influence policy planning/determine choices based on organisational values, and that this aspect is of considerable benefit. Illumination of the solutionspace also provides additional insight to planners, who can gain understanding into the influence of different factors on the overall cost of a solution.
Future work will focus on further exploration of the relationship between objective quality and function evaluations, to gain insight into the anytime performance of MapElites, for use in a realworld setting. The granularity of the archive clearly influences performance and should be investigated by depth. Finally, an additional comparison to multiobjective approaches is also worth pursing — while this may improve solution quality however it is unlikely to offer the same insight into the entire searchspace.
References
 [1] Braekers, K., Hartl, R.F., Parragh, S.N., Tricoire, F.: A biobjective home care scheduling problem: Analyzing the tradeoff between costs and client inconvenience. European Journal of Operational Research 248(2), 428 – 443 (2016)
 [2] CastilloSalazar, J.A., LandaSilva, D., Qu, R.: A survey on workforce scheduling and routing problems. In: Proceedings of the 9th international conference on the practice and theory of automated timetabling. pp. 283–302 (2012)
 [3] Cully, A., Clune, J., Tarapore, D., Mouret, J.B.: Robots that can adapt like animals. Nature 521(7553), 503 (2015)

[4]
Hart, E., Sim, K., Urquhart, N.: A realworld employee scheduling and routing application. In: Proceedings of the Companion Publication of the 2014 Annual Conference on Genetic and Evolutionary Computation. pp. 1239–1242. ACM (2014)
 [5] Laporte, G., Toth, P.: Vehicle routing: historical perspective and recent contributions. EURO Journal on Transportation and Logistics (2013)
 [6] Mouret, J., Clune, J.: Illuminating search spaces by mapping elites. CoRR (2015)
 [7] Pugh, J.K., Soros, L.B., Stanley, K.O.: Quality diversity: A new frontier for evolutionary computation. Frontiers in Robotics and AI 3, 40 (2016)
 [8] Pugh, J.K., Soros, L.B., Stanley, K.O.: Searching for quality diversity when diversity is unaligned with quality. In: International Conference on Parallel Problem Solving from Nature. pp. 880–889. Springer (2016)
 [9] Smith, D., Tokarchuk, L., Wiggins, G.: Rapid phenotypic landscape exploration through hierarchical spatial partitioning. In: Handl, J., Hart, E., Lewis, P.R., LópezIbáñez, M., Ochoa, G., Paechter, B. (eds.) Parallel Problem Solving from Nature – PPSN XIV. pp. 911–920. Springer International Publishing, Cham (2016)
 [10] Tarapore, D., Clune, J., Cully, A., Mouret, J.B.: How do different encodings influence the performance of the mapelites algorithm? In: Proceedings of the Genetic and Evolutionary Computation Conference 2016. pp. 173–180. GECCO ’16, ACM, New York, NY, USA (2016)
 [11] TFL: Travel in london: Key trends and developments. Tech. rep., Transport for London (2009)
 [12] Urquhart, N., Fonzone, A.: Evolving solution choice and decision support for a realworld optimisation problem. In: Proceedings of the Genetic and Evolutionary Computation Conference. pp. 1264–1271. GECCO ’17, ACM (2017)
 [13] Urquhart, N.B., Hart, E., Judson, A.: Multimodal employee routing with time windows in an urban environment. In: Proceedings of the Companion Publication of the 2015 Annual Conference on Genetic and Evolutionary Computation. pp. 1503–1504. ACM (2015)
 [14] Vargha, A., Delaney, H.D.: A critique and improvement of the cl common language effect size statistics of mcgraw and wong. Journal of Educational and Behavioral Statistics 25(2), 101–132 (2000)
 [15] Vassiliades, V., Chatzilygeroudis, K., Mouret, J.B.: Using centroidal voronoi tessellations to scale up the multidimensional archive of phenotypic elites algorithm pp. 1–1 (08 2017)
Comments
There are no comments yet.