Preparing for the Unexpected: Diversity Improves Planning Resilience in Evolutionary Algorithms

by   Thomas Gabor, et al.
Universität München

As automatic optimization techniques find their way into industrial applications, the behavior of many complex systems is determined by some form of planner picking the right actions to optimize a given objective function. In many cases, the mapping of plans to objective reward may change due to unforeseen events or circumstances in the real world. In those cases, the planner usually needs some additional effort to adjust to the changed situation and reach its previous level of performance. Whenever we still need to continue polling the planner even during re-planning, it oftentimes exhibits severely lacking performance. In order to improve the planner's resilience to unforeseen change, we argue that maintaining a certain level of diversity amongst the considered plans at all times should be added to the planner's objective. Effectively, we encourage the planner to keep alternative plans to its currently best solution. As an example case, we implement a diversity-aware genetic algorithm using two different metrics for diversity (differing in their generality) and show that the blow in performance due to unexpected change can be severely lessened in the average case. We also analyze the parameter settings necessary for these techniques in order to gain an intuition how they can be incorporated into larger frameworks or process models for software and systems engineering.



There are no comments yet.


page 1


Planning with Incomplete Information

Planning is a natural domain of application for frameworks of reasoning ...

Task Interaction in an HTN Planner

Hierarchical Task Network (HTN) planning uses task decomposition to plan...

A feasibility study of a hyperparameter tuning approach to automated inverse planning in radiotherapy

Radiotherapy inverse planning requires treatment planners to modify mult...

Constraining Influence Diagram Structure by Generative Planning: An Application to the Optimization of Oil Spill Response

This paper works through the optimization of a real world planning probl...

Query and Resource Optimizations: A Case for Breaking the Wall in Big Data Systems

Modern big data systems run on cloud environments where resources are sh...

Provenance-Based Assessment of Plans in Context

Many real-world planning domains involve diverse information sources, ex...

Workflow Complexity for Collaborative Interactions: Where are the Metrics? -- A Challenge

In this paper, we introduce the problem of denoting and deriving the com...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

I Introduction

As automatic optimization in various forms makes its way into industrial systems, there is a wide range of expectations about the upcoming capabilities of future “smart systems” [1, 2, 3, 4, 5]. For most of the current applications, the optimization part of the system takes place offline, i.e., not while the application is actually performing its main purpose: The product shipped to the customer is fixed after initial training and does not self-adapt (anymore). Instead, it may only gather data that is then used at the vendor’s side to either improve the product’s performance via software updates later on or assist in building the product’s successor. This, of course, misses out on interesting applications that may highly benefit from further optimization even while they are running. In this paper, we focus on the exemplary case of a layout configuration for the positioning of work stations inside a (smart) factory: Depending on the products that need to be build and depending on the current status of the machines involved, we may desire different workflows for the same product at different times during the factory’s life. For most current factories, however, the arrangement of workstations is planned far in advance and then fixed until human intervention.

One of the reasons for opting for offline adaptation is that the vendor usually has access to more computational power and that the employed adaptation process can benefit from connecting data input from a variety of customers. However, increasing computational resources and online connectivity mitigate these issues. A possibly more important aspect is the issue of consistent performance: An online planner, while theoretically able to react to sudden changes in its environment and/or objective, may take some time to reach good plans and during that time the solutions provided by the planner may be unsuitable.

Expected Change

The usefulness and importance of self-optimization at the customer’s side has already been claimed in the original vision of autonomic computing [6] and has been shown on many occasions since [3, 7, 8]

. In these cases, self-optimization usually refers to a process of specialization, i.e., the system is built with a large variety of possible use cases in mind and learns to work best for the few of these it actually faces on site. Intuitively, we may want to build a planner that works on factory layouts in general and that can then specialize on the specific needs of a single factory or a single situation (machine failure, e.g.) if necessary. We expect this approach to work iff every possible situation and every pair of follow-up situation is considered when evaluating a factory layout. As long as we know that machines might fail with a certain probability, we can take this into account and plan redundantly with respect to machine usage. This is what we call

expected change of the evaluation function.

Unexpected Change

Still, we may not want our self-optimizing planner to completely break on any deviation from the specified scenarios. We imagine that intelligent planners should invest a certain amount of effort to think about and prepare for “what ifs”, even when the respective scenarios have not been expected to happen during system design or training. This is further motivated by the fact that many industry applications require the adaptive component to produce a solution better than a certain quality threshold but do not benefit as much from the system finding configurations that are just slightly better beyond that threshold. Instead, that computational effort might be better put into finding alternative solutions that might not be just as good as the primary solution that was just found, but then again might be feasible even when the primary solution fails for some unexpected reason.

This argument falls in line with the claim of self-protection for autonomic systems [6]: Our system should not only be able to react and recover from negative external influences but also spend a reasonable effort on actively preparing for negative events. Via this self-protection property we aim to increase the overall resilience of the planning process and by extent the robustness of the system using our planner.

Scope of This Work

As the original contribution of this paper we identify that diversity in evolutionary algorithms, which we consider a primary example for a heuristic optimization algorithms in this paper, is of central importance for the algorithm’s reaction to change and that explicitly optimizing for diversity helps to prepare for changes, even when they cannot be foreseen by the optimization process in any way. We introduce means to formally define the phenomenon of unexpected change in relation to an online planner.

To this end, we first formally define the notions of change and unexpectedness that we used intuitively until now (Section II). We then immediately turn to an example of a smart factory domain in which unexpected change might occur and specify our experimental setup (Section III). We introduce our approach at maintaining diversity using two different diversity metrics (Section IV) and sum up the results of applying this approach in the previously defined experiment (Section V) before we discuss related work (Section VI) and conclude this paper (Section VII).

Ii Foundations

We assume that to realize modern challenges in industry, software products need to feature a certain degree of autonomy, i.e., they feature at least one component called planner capable of making decisions by providing a plan of actions which the system is supposed to perform to best fulfill its intended goal [8, 9]. This goal is encoded by providing the system with a fitness function that can be used to evaluate plans. A planner respecting a fitness function performs self-optimization.

We claim that for many real-world applications it is often not only important to eventually adapt to new circumstances but also to avoid causing any major damage to overall success while adapting. It follows that the planner needs to offer a suitable solution at all times, even directly after change in the environment. This property can be compared to the robustness of classical systems, i.e., the ability to withstand external changes without being steered away too far from good behavior [10]. Robustness can often be tested against a variety of well-defined external influences. However, not every influence a system will be exposed to can be foreseen.111When possible, endowing systems with means to perceive all possible influences and events might be highly beneficial to resilience. We work with the assumption that this is not always possible or feasible. The notion of resilience captures the system’s ability to withstand unanticipated changes [11].222It follows that we consider resilience a special instance of robustness: Robustness may include both anticipated and unanticipated change. Resilience focuses on the latter. One approach to prepare a system for unexpected circumstances is to make it adapt faster, so that its adaptive component finds a new plan of actions faster once the old one is invalidated. However, this approach is still purely reactive and we thus cannot prevent the immediate impact of change.

To increase system resilience, we thus might want the planner to become proactive towards possible changes that may occur to the environment and by extension the planner’s objective. In order to lessen the blow of unexpected changes, the planner thus needs to prepare for it before it actually occurs. Note that for the changes we are talking about in this section, we still assume that they are unexpected at design time. The planner therefore has no means of predicting when or what is going to happen. Still, we desire for a planner to be caught off-guard as seldom as possible. A planner that needs to re-plan less often would then be considered more resilient with respect to unexpected change. We claim that explicitly increasing planning resilience aids a system’s ability to self-protect and is thus a useful handle to explicitly expose to the developers of such a system.


Planners perform (usually stochastic) optimization on the system’s behavior by finding plans that (when executed) yield increasingly better results with respect to a specified objective. That objective is given via a fitness function , where is the domain of all possible plans and is the domain of environments said plans are to be executed in. For the purpose of this paper, we assume that we want to minimize the real-valued output of the fitness function. We can then describe a planner formally as a function from an environment to a plan with the following semantic:

Note that due to the possibly stochastic nature of the environment and in extent the evaluation of the fitness function , we compute the expected value of the application of . Further note that due to the stochastic nature of the planning methods considered in this paper, we may not actually return the single best result over the domain of all plans but when the stochastic optimization process works, we expect to yield a result somewhat close (described by ). To compute a reasonable value for , a given plan will usually be executed in a simulated version of . We call the process of repeatedly calling plan to execute the currently best solution online planning, which implies that we may call it for changing .

Changing Environments

We can write any occurrence of change in the environment as a function . Obviously, if we allow any arbitrary change to happen to the environment, we can construct arbitrarily “evil” environments and cause the planner to perform arbitrarily bad. But frankly, we do not care for a planner managing a smart grid’s power production to perform well when a meteor destroys Earth. What is much more realistic and thus much more desirable to prepare for, however, is changes that apply only to parts of the environment. Without looking into the data structure of the environment, we assume that these kinds of changes then only affect the fitness of some possible plans, but do not change the fitness landscape of the domain completely. We thus call a given change function within a given environment reasonable iff it fulfills the formula:

Here, described a small value used as a minimally discernible distance between fitness values. Likewise, the exact meaning of

is to be defined by the case. From this definition, it follows that a planner can prepare for a reasonable change by finding a good plan among the plans that are not affected by the reasonable change. When the change occurs, it can then provide a “quite good” plan immediately before it even begins to search for good plans among the changed parts of the domain. Thus, to increase planning resilience, we want our planner to not converge on and around the best optimum it has found so far, but to always keep an eye out for other local optima, even when they do not look as promising at the moment.

Note that this behavior can be likened to strategies developed to prevent premature convergence, a problem with metaheuristic search methods that occurs even in static domains [12, 13].


Even if a planner can prepare for a reasonable change by diversifying, there are often more efficient ways to prepare for expected change: Usually, we would include instances of expected change into the fitness function by simply evaluating the system in the changed environments as well. In that case, the planner can still fully converge on the predicted path of the environment and not spend computational resources on diversification. However, we claim that in most practical applications the future is not completely predictable and changes may happen that the planner cannot anticipate.

We define a change function to be called unexpected iff the planner is not prepared for the change induced, i.e., if the actions it would take in the unchanged environment differ from the actions it now has to take in the changed environment . Formally, this can be expressed as follows:

Again, an exact definition of would need to be derived from specific system requirements. Note that this is a purely extrinsic view on unexpectedness. We want to provide a black-box definition of unexpectedness that does not depend on the internal workings of the planner and is thus as general as possible. The intuition behind it is that if there was a way for the planner to know that and how the change is going to happen when looking at the environment , the plan generated via would already consider the consequences of said change and thus (to some extent) match the plan for .333Note that this argument is based on the fact that we defined plan in such a way that it tries to optimize for when possible. The result is that we can regard the definitions of “reasonable” and “unexpected” as upper and lower bounds on the amount of change introduced in the fitness landscape.

Iii Experiment

To test the validity of our claims about the importance of diversity for planning resilience, we build a model example in which we try to observe the effects of environmental changes as clearly as possible.


We imagine a smart factory scenario where a work piece carried by a mobile (robotic) agent needs to be processed by a setup of work stations. More specifically, we need to perform the 5 tasks in order on a given work piece as quickly as possible. In order to do so, our factory contains 25 work stations placed randomly on a grid structure. Each work station can only perform one of the tasks, so that our factory has 5 identical work stations to use for any specific task. Given a work piece starting at the top left corner of the grid, we need to determine the shortest route the work piece can travel for it to reach exactly one station of each task in the right order. See Figure 1 for a simplified illustration of this setup.

For each run of our experiment, we randomly generate an matrix of work station coordinates where each row in corresponds to a task and each column to an identification number for each of the available work stations for each task. Thus, in our experimental setup we fix and .

Genetic Algorithm

In order to find a short path that fulfills our requirements, we employ a genetic algorithm [12]

. Closely modeling our problem domain, we define the genome as a 5-dimensional vector

so that denotes which of the 5 identical work stations should be visited next in order to fulfill the -th task where denotes the task , denotes task , and so on. The environment provides a mapping from these to their respective positions on the grid, which is used by a distance function for the environment to compute the traveling distance between two work stations. We then define a function waycost to compute the overall length of a given path, summing the Manhattan444Obviously, real-world mobile transport robots are more likely to navigate in Euclidean space. However, we argue that this is not crucial for the results presented in this paper and choose the computationally simpler approach. distance between all its vertices:

Fig. 1: Illustration of the factory setup, simplified for only 3 tasks and 4 stations for each task. Coming from the starting point , the genetic algorithm needs to determine an as short as possible path that traverses a station for each task in order (see arrows).

For the standard genetic algorithm, this waycost function is already sufficient as a fitness function to evolve a shorter navigation path. It is important to note that while we closely restrict the search space to paths that cross each type of station exactly once (and in the right order), we do not aid the genetic algorithm by providing any notion of position in space or the closeness of different stations beyond what is encoded in the waycost function above.

For the genome defined above, we use the following evolutionary operators: Mutation chooses a value uniformly at random, then generates a new random value , assigning . Recombination happens through uniform crossover on the vectors of two individuals. Furthermore, for all experiments performed in this paper, we use a mutation rate of per individual to provide strong random input and a crossover rate of . That means that with a chance of per individual that individual is selected as a first mate for recombination. Two potential mates are then randomly selected from the population: the fitter one is used for as a partner for crossover. We further augment the search by randomly generating some new individuals from scratch each generation. This process (also called hyper-mutation [14]) happens with a chance of per individual in the population.

Random Change

The crucial point of this experimental setup is the occurrence of a random change of environmental circumstances. The present experimental setup is fixed to an evaluation time of generations as earlier experiments have shown our setup of an evolutionary algorithm can easily converge in under generations. We then define a function for unexpected change , which chooses factory stations at random and effectively disables them. This is implemented by repositioning them to an area far off the usual factory area by adding to their respective coordinates. This means that while the plans containing the removed stations are still theoretically feasible and can be assigned a valid waycost, the increase in waycost is so high that no plan containing any of the removed stations should be able to compete with plans contained within the actual factory area when it comes to evolutionary selection. From a random initial factory layout we generate two changed factory layouts by applying the randomized change function . Because we want to be able to compare the scale of fitness values before and after the unexpected change more easily, we start the evolutionary algorithm on the factory configuration that is already “missing” a few stations. After generations, we switch to factory configuration , which has stations disabled as well, but probably different ones.555It is important to note that this setup means that in many cases none of the stations that go bad during the switch are even included in the best path found by the genetic algorithm. In these cases, the evolutionary process does not have to adapt in any way. In order to analyze the cases when the removal of stations actually does make a huge difference, we need to execute the experiment multiple times. We chose this approach because it allows us use an unbiased change function as opposed to a change function that specifically targets the workstations actually used throughout the experiment. The realm of biased, even directly adversarial change functions is an interesting topic of future research.

Note that this change is reasonable for small (according to the definition above) because it only affects the fitness of a maximum of possible plans, i.e., those plans which include at least one of the “wrong” machines in or . Furthermore, the change is unexpected as the shakeup of the stations’ positioning is communicated to the evolutionary algorithm only via the change of the waycost function’s values in its fitness evaluation step and thus leaves the adaptation process without any chance of anticipating that event. Nonetheless, the individuals of the evolutionary process are constantly evaluated according to their fitness in the current state of affairs, thus forcing them to adapt to the new situation in order to keep up once reached levels of fitness values.

Iv Approach

We attempt to solve the problem described above using evolutionary algorithms. Evolutionary algorithms have already been applied successfully to many instances of online adaptation, i.e., problems with a changing fitness function [15, 16, 17]. They are an instance of metaheuristic search algorithms and work by emulating natural evolution.

Diversity in Genetic Algorithms

In the standard scenario, once the fitness function changes, previously good solutions can possibly be evaluated to have very bad fitness and are thus removed from the evolutionary process. However, if the genetic search has already converged to a local optimum, it can be very hard for the search process to break out of it, because when all known solutions lie very closely together in the solution space, there is no clear path along which the population must travel in order to improve. The problem of a genetic search getting stuck in a local optimum with little chance to reach the global optimum (or at least much better local ones) is called premature convergence [12]. It is known that the diversity among the members in the population has a strong impact on the evolutionary process’s likelihood to converge too early. The Diversity-Guided Evolutionary Algorithm (DGEA) observes a population’s diversity throughout the evolutionary process and takes action when it falls below a given threshold [18].

For online genetic algorithms, we show that maintaining a certain level of diversity throughout the population helps to react better to the change occurring in the environment. To this end, we apply two possible measurements for diversity, which we will both test for the above scenario. In either case, we transform the genetic algorithm’s fitness function to a multi-objective optimization problem [19, 13, 20] with a weighting parameter , yielding a fitness function depending on the individual to be evaluated , the environment , and the population as a whole:

It is important to note that in order to meaningfully define the diversity of one individual, we need to compare it to the rest of the population, causing us to introduce the population as an additional parameter to the fitness function.666

In general, we might want approximate this comparison by using a sample drawn from the population or another estimate instead. Likewise, we could consider computing diversity not only against the current generation of individuals but also against a selection of individuals from the past, using for example a “hall of fame” approach

[21]. The evaluation of such techniques is left for future research. The fitness function thus becomes a relative measure with respect to other individuals in the population. This makes it necessary to re-evaluate fitness in each generation even for unchanged individuals. However, since we assume changes in the environment and thus the fitness function may occur during the online execution of the genetic algorithm anyway, this model seems to fit our situation. We can now define two different diversity measures by providing a definition for the similaritycost function, which penalizes low diversity.

Domain-Distance Diversity

This can be thought of as the more standard approach to diversity in search and optimization problems. In fact, the authors of [22] show that many common diversity measurements are quite similar to this basic method: We define a simple distance measure between the individuals in the solution space. For a discrete, categorial problem like the one presented here, there is little alternative to just counting the specific differences in some way.

Note that we write to access the -th individual of the population and to represent the amount of individuals in a population. We subtract from the sum because the given individual is still part of the population and thus adds a cost of by matching itself perfectly. We thus maintain the (cosmetic) property that in a population of completely different individuals, the average similarity is .

While the implementation of this diversity measure looks pretty straightforward, it requires complete prior knowledge of the search space provided and and thus introduces further dependencies. For example, the above definition is unfit for continuous search spaces and while a continuous similaritycost function may easily be thought up, optimization problems consisting of a mix of discrete and continuous variables then require more weighting parameters to adequately combine the scales over which the respective similaritycost functions operate.

Genealogical Diversity

As a more different comparison we implemented a inheritance-based diversity estimate introduced in [13]. The aim of genealogical diversity is to utilize those parts of the domain knowledge that are already encoded in the setup of the genetic algorithm, i.e., the mutation and recombination function the human developer is required to code for the specific genome anyway. We can thus try to quantify the difference between two individuals by estimating the amount of evolution steps it took to develop these different instances of solution candidates. This yields a measure of “relatedness” between individuals not unlike genealogical trees in biology or human ancestry. If all individuals in a population are closely related (sibling or cousins, e.g.), we know that there can only be limited genetic difference between them and thus estimate a low diversity for the respective individuals with respect to that population.

However, instead of building and traversing a genealogical tree, the implementation of genealogical diversity used in [13] employs a technique inspired by the way genetic genealogical trees are constructed from the analysis from genomes in biological applications: For this approach, we first need to augment the individuals’ genome by a series of trash bits . For our experiment, has proven to be reasonable. However, we do not change the waycost fitness function, so that it does not recognize the additional data added to the genome. This leads to the trash bits not being subjected to selection pressure from the primary objective of the genetic algorithm.

As the trash bits are randomly initialized like the other variables in the genome, every individual of the first generation should most probably start out with a very different trash bitstring from anyone else’s, given that we choose the length of the trash bitstring sufficiently large. Without direct selection pressure, there is no incentive for individuals to adapt their trash bitstring in any specific way. However, the trash bits are still subjected to mutation and recombination, i.e., whenever a specific individual is chosen for mutation, a random mutation is performed on the trash bitstring as well and whenever a recombination operation is executed for two individuals, their trash bitstrings are likewise recombined. In our implementation at hand, we use one-bit flip for mutation and uniform crossover for recombination.

Using the definition of a comparison function as provided above, we can thus define the similaritycost function for genealogical diversity as follows:

Again, we subtract to ignore self-similarity when iterating over the population. It should be noted that when accessing the -th component of an individual inside the sum, we are protruding into the dimensions solely populated by trash bits, retrieving the -th trash bit of said individual.

In order to compute the similarity between two individuals, we now only consider the trash bits, for which we always have the same distance metric regardless of the actual problem domain of the original genetic algorithm. Domain logic is only used indirectly, as the measure we estimate can be regarded as the edit distance between two individuals using the genetic operators the evolutionary process is equipped with. However, since the trash bits are inherited by individuals from their parents and without direct selection pressure, they are not biased toward values resulting in higher fitness; yet, they are still a sufficient representation of the genealogy of an individual, as we show in the following section.

V Results

In order to evaluate the benefit of the presented approaches, we simulate the different behavior of genetic algorithms when using the presented diversity measures or no diversity measure at all. In order to achieve a meaningful result considering the highly probabilistic nature of the applied method to generate scenarios, we perform the evaluation on 1000 different scenarios. Figure 2 shows the top fitness achieved at a specific point in time by a single run averaged over all 1000 runs. By taking a look at the optimization process as a whole, it can be seen that a great deal of improvement compared to the random initialization is done during the first steps of evolution, giving an estimate of how good the achieved solutions are in relation to “just guessing”. In Figure 3 we show the respective diversity measurements from these runs.

Fig. 2: Best (i.e., lowest-valued) fitness for current generation averaged over 1000 runs. While the evolutionary algorithm without any recognition of diversity (black) shows a steep spike at the time of the environmental change (after 50 generations), genealogically (red) and the domain-dependent (blue) diverse genetic algorithms manage to mitigate the negative amplitude to varying extent.
Fig. 3: Diversity measures of the top individual (solid line) as well as the population average diversity (dotted line) per generation averaged over 1000 runs. We draw both genealogical (red) and domain-dependent (blue) diversity into the same figure as they are both normalized on , even though no direct translation is possible between their values. In both cases, the population’s average diversity shows a specific behavior following the unexpected change.

We can observe that the diversity-aware algorithms show a slower learning rate in the beginning, since they do not only optimize the plotted primary fitness function, but also the diversity function and thus cannot focus as well on better primary results. However, once the environmental change occurs, they are likewise better prepared for a change in fitness and react with a much smaller increase in waycost than the standard genetic algorithm. In a scenario like ours, where a smart factory needs to be able to efficiently dispatch new workpieces at all times, this can be a huge advantage. We observe that following the unexpected change, average diversity first increases as well-established “families” of similar individuals die out. Due to a new convergence process, diversity then drops until good solutions are found. Finally, diversity seems to reach a similar level as before the unexpected change. The “right” amount of diversity is naturally controlled by the parameter of the combined fitness function. For these experiments we found parameters for domain-dependent diversity and for genealogical diversity via systematic search.

The definition of “right”, however, depends on the problem domain. In most practical cases, we expect some (non-functional) requirements to be present, which specify the robustness properties we want to uphold. For now, these properties must then be verified via statistic testing. Deriving (statistical or hard) guarantees from a stochastic search process like an evolutionary algorithm is still an interesting topics of future work. Goven no further requirements for consistent quality of service, a reasonable setting for might achieve that the online planner does not perform worse than a random planner at any point in time, even at the moment of unexpected change.

Figures 4 and 5 show that systematic search, including the random population’s value before the evolutionary process starts: the fitness achieved by the domain-dependent and the genealogical genetic algorithm, respectively, strongly depends on the choice of parameter , i.e., how to distribute focus between the primary objective (small waycost) and the secondary objective (high diversity). Experiments have shown, that diversity-aware genetic algorithms can show a variety of behaviors for different . To provide an intuition about the effects various settings for have on the algorithm’s performance, we can see that higher values of generally cause the evolutionary search to produce less optimal results but to perform more stable when facing unexpected change. For the domain-dependent diversity, this phenomenon shows stronger with higher -values showing almost no impact of the unexpected change but relatively bad results in general. The approach of genealogical diversity seems to be a bit more robust to the setting of in that it still clearly shows a tendency to optimize over time.

We chose to showcase genealogical diversity specifically because it works on a rather domain-independent level and introduces only few parameters. Furthermore, it is rather robust with respect to the choice of said parameters. For the length of the used bitstring , Figure 6 shows that on all but the smallest values for the genetic algorithm performs most similarly. Especially rather large values for (that still take up very little memory) do not show any deterioration in the planner’s behavior, which means that the choice for that parameter can be made rather comfortably.

Fig. 4: Top fitness for current generation averaged over only 100 runs each, plotted for using domain-dependent diversity. The darker the color of the line, the higher is the depicted value.
Fig. 5: Top fitness for current generation averaged over only 100 runs each, plotted for using genealogical diversity. The darker the color of the line, the higher is the depicted value.
Fig. 6: Top fitness for current generation averaged over 100 runs each, plotted for using genealogical diversity. The darker the color of the line, the higher is the depicted value.

We also analyze how much change a diversity-aware planner can handle. Figure 7 shows the behavior of the three exemplary planners just around the moment of unexpected change for various amounts of change they are subjected to. Naturally, bigger (and thus un-reasonable) change can impact even diverse system. The increase in costs for the large alterations in the generation-49-line (dashed) shows that on the upper end of the scale we started generating problem instances that generally have fewer good solutions. For more reasonable change (, which still means that up to out of machine positions may be changed), both diversity-aware algorithms perform comparably and clearly better than the non-diverse planner. Most remarkably, the domain-dependent variant manages to cope with changes with almost no consequence for its performance.

Fig. 7: Analysis of the fitness amplitude around unexpected change of varying intensity for the non-diverse (black), the genealogically diverse (red) and the domain-dependent diverse (blue) evolutionary algorithm respectively, plotted against the parameter for the alternation amount of the change function , all results averaged over 100 runs. The dashed line shows the population’s top fitness value just before the change (generation 49). The solid line shows the top fitness just at the moment of unexpected change (generation 50). The dotted line shows the fitness one generation later (generation 51), when it has started to improve again.

Vi Related Work

The notion of diversity is researched in many different fields of science. In this Section, we discuss a few of these and their relation to the issue presented in this paper. To the author’s knowledge, the issue of planning resilience is a rather novel one and applying genetic diversity to promote it is a unique contribution of this paper.

Planning and Reinforcement Learning

Rolling horizon genetic algorithms for online planning are widely used in real-time general video game playing [16, 17]. However, these approaches typically optimize with respect to the expectation of reward, thus they suffer from the drawbacks of non-diverse planning as discussed in this work. Statistical online planning has recently attracted a fair amount of research interest [23, 24, 25]

, also due to the successful combination with deep learning technology

[26, 27, 28]. In general, diversity as a consideration for resilient planning is orthogonal to these approaches and could straightforwardly be combined with recent developments from the online planning literature.

We also see a relation to another recent line of research in reinforcement learning that explores ways of modeling aleatoric or epistemic uncertainty about future rewards or action selection mechanisms as distributions rather than by expectation

[29, 30, 31, 32, 33, 34]. This enables learning and decision making agents to explicitly deal with multimodal distributions of utility. It allows to incorporate risk and uncertainty into the decision making process, and to effectively deal with the exploration-exploitation tradeoff. In particular, distributional approaches foster the learning of diverse behavior, yielding robust transfer of learned skills to new, unseen situations [31].

Diversity in Software

In software engineering, diversity often takes the form of generating, offering, or using functionally equivalent variants of software artifacts during software development or deployment [35]. An extensive survey of current techniques is given in [19]. However, all of these differ from the approach in this paper in that we explicitly search for functional alternatives in the context of this paper, i.e., we want our diverse solutions to represent solutions to different problems (in order to possibly anticipate future problems) and not different solutions of equal quality to the same problem.

Still, the techniques presented in literature to exploit the prevalence of multiple instances of the same software artifact during runtime might be applied to variants generated by a diverse genetic algorithm as well. The work in this paper can be regarded as a first step to expose the population-based view of diversity within an automated search process with the process of software development. Similar trends in software engineering are discussed in [3, 36].

Diversity in Genetic Algorithms

Genetic algorithms make up a vast field of research. Regarding the basic definitions, this work follows the comprehensive description in [12]. Diversity has been recognized as an important indicator for good performance, although mainly applied to the static scenarios of offline adaptation: The authors of [37] provide an extensive survey of various methods to enforce diversity in genetic algorithms. These fall into the categories of external methods controlling the evolutionary process “from the outside” [38, 18] and methods integrating diversity as an additional objective into the genetic algorithm, using the concepts of multi-objective genetic optimization [39, 40]. Following the biological inspiration, the aptitude of genetic algorithms to an online setting with a changing environment has been thoroughly analyzed [41, 42].

The author of [14] describes a problem setting not unlike the one presented in this paper, i.e., the combination of maintaining diversity and searching in a changing environment. The issue of premature convergence is tackled by integrating a certain amount of random search into the genetic algorithm by performing hyper-mutation. This has since become standard procedure and is included in all genetic algorithms presented in this paper, which aims to further improve the resilience of the search process.

Most recently, the authors of [43] tackled the issue of using an evolutionary algorithm as an online planner for a complex software system. While they discuss high diversity as a key factor in achieving better re-planning results, they use diversity purely as an observation not as a direct goal of the evolutionary process. Instead, they too resort to an operator akin to high amounts of hyper-mutation to increase diversity by creating a new population that only inherits certain parts of the old population. To this end, the system must be able to directly recognize the event of an unexpected change after it has happened.

It is important to note that a lot of literature about diversity in genetic algorithms (or metaheuristic search in general) is concerned about covering the frontier of Pareto-optimal solutions in the search space [44]. The notion of diversity used in this paper, however, is a more genetic one and has as one of its main features that is not

automatically derived from the nature of the fitness function. Interesting connections to game theory may still be made but are outside the scope of this work.

Resilience and Robustness

The preparation for unexpected or previously wrongly modeled change is an important issue for the practical application of machine learning in industry

[4]. From an engineer’s point of view, the diversity of the population of plans can be regarded as a typical non-functional requirement (NFR) with the cost of the plan representing the functional requirement. Applying NFR engineering processes to self-adaptive systems is still a new idea and a clear canon of relevant NFRs for these new challenges has not yet been found [2, 9].

Vii Conclusion

Since we expect future software systems to be increasingly self-adaptive and self-managing, we can also expect them to feature one or multiple components tasked with online planning. Online planning allows systems to learn to optimize their behavior in the face of a moving target fitness. However, it comes with a few pitfalls, one of which is the fact even small changes in the target fitness can have detrimental effects on the current plans’ performance. It is thus imperative to keep an eye on a healthy level of diversity in our pool of alternative plans. As we have shown, this can severely soften the blow to overall performance, should only a few plans become impractical due to external circumstances.777It still holds that if we allow arbitrary changes in the environment, it is always possible to design a completely new fitness function so that any given instance of an evolutionary process becomes arbitrarily bad with respect to the new altered fitness function. This is due to the No-Free-Lunch theorem [45]. For realistic scenarios, however, there usually is a limit to how quickly and how drastically the fitness function is expected to change. A thorough analysis of those limits for some practical domains may present an interesting point for further research.

The diversity of a planner functions as a non-functional requirement for classic applications. Certain levels of desired diversity may be specified in order to augment system architectures that revolve around the optimization process of the system in order to provide flexibility on the component level [46]. This should be expected to strongly influence other properties commonly applied to complex self-adaptive systems like robustness or flexibility.

On an application level, the introduced concept of diversity-aware optimization may prove especially useful when the reduction in amplitude of fitness causes the system behavior to fall below a predefined quality threshold (or to do so more often at least). A diversity-aware planner might then be able to continue working as usual as its back-up plans fulfill the required quality agreement just as well while a non-diverse planner might more often feel the need to stop the execution of its plans (and thus halt the system in general) until it reaches a new plan of acceptable quality. In this case, we may formulate a non-functional requirement such as planning resilience, measuring how frequent and how big unexpected changes need to be in order to push the planner out of its quality requirements. Using the parameter , engineers can adjust the focus point of the planning component between performance and resilience optimization. How well statistical judgements can be made about said resilience property still needs to be evaluated, though.

It is up to future research to determine how the concept of diversity (especially genealogical diversity) generalizes for other optimization techniques like the cross-entropy method or simulated annealing. One way to integrate these techniques into the framework defined in this paper may be to set up a pool of solution candidates via ensemble learning [47].

Embracing diversity seems especially promising in search-based software testing (SBST) as test suites need to adapt faster to new possible exploits. In DevOps, developers push relatively small updates that need testing more frequently. Nonetheless, the changes applied to the code by the developer usually fall into the category of unexpected change as we defined it in this paper. That means, that diverse test generators could possible adapt quicker to the new software system under test. The mutual influence between diversity-aware evolutionary algorithms and co-evolutionary approaches888For example, when SBST is used to analyze a system under test that is by itself capable of adapting and evolving, the complete testing cycle features two adversary evolutionary algorithms and is thus considered co-evolutionary [1]. may be an interesting point of further research [21]. A likewise connection in biological systems has been found [48].

Many of the theoretical foundations explaining the ideal structure of a population for various optimization purposes are still unexplored. For instance, we assumed an unpredictable but neither explicitly hostile nor cooperative environment. Any scenario where the change occurs not only unexpected but intentional is likely to have fundamentally different properties.

We focused our study on the implications of using diversity within a planner and how the resilience to environmental change may be indicated in a quantifiable way. We have shown that diversity during planning can aid planning resilience in the face of change. Furthermore, we can employ such method in a domain-independent way using genealogical diversity and still achieve valuable results. Software engineering frameworks and processes are now needed to expose desired NFRs like planning resilience to the software and system design and test them adequately.


  • [1] M. Harman, S. A. Mansouri, and Y. Zhang, “Search-based software engineering: Trends, techniques and applications,” ACM Computing Surveys (CSUR), vol. 45, no. 1, p. 11, 2012.
  • [2] R. De Lemos, H. Giese, H. A. Müller, M. Shaw, J. Andersson, M. Litoiu, B. Schmerl, G. Tamura, N. M. Villegas, T. Vogel et al., “Software engineering for self-adaptive systems: A second research roadmap,” in Software Engineering for Self-Adaptive Systems II.   Springer, 2013.
  • [3] M. Wirsing, M. Hölzl, N. Koch, and P. Mayer, Software Engineering for Collective Autonomic Systems: The ASCENS Approach.   Springer, 2015.
  • [4] D. Amodei, C. Olah, J. Steinhardt, P. Christiano, J. Schulman, and D. Mané, “Concrete problems in AI safety,” CoRR, vol. abs/1606.06565, 2016. [Online]. Available:
  • [5] K.-D. Thoben, S. Wiesner, and T. Wuest, ““industrie 4.0” and smart manufacturing–a review of research issues and application examples,” Int. J. of Automation Technology Vol, vol. 11, no. 1, 2017.
  • [6] J. O. Kephart and D. M. Chess, “The vision of autonomic computing,” Computer, vol. 36, no. 1, pp. 41–50, 2003.
  • [7] B. Chen, X. Peng, Y. Yu, and W. Zhao, “Requirements-driven self-optimization of composite services using feedback control,” IEEE Transactions on Services Computing, vol. 8, no. 1, pp. 107–120, 2015.
  • [8] P. Arcaini, E. Riccobene, and P. Scandurra, “Modeling and analyzing mape-k feedback loops for self-adaptation,” in Proceedings of the 10th International Symposium on Software Engineering for Adaptive and Self-Managing Systems.   IEEE Press, 2015, pp. 13–23.
  • [9] L. Belzner, M. T. Beck, T. Gabor, H. Roelle, and H. Sauer, “Software engineering for distributed autonomous real-time systems,” in Proceedings of the 2nd International Workshop on Software Engineering for Smart Cyber-Physical Systems.   ACM, 2016, pp. 54–57.
  • [10] S. C. Bankes, “Robustness, adaptivity, and resiliency analysis.” in AAAI fall symposium: complex adaptive systems, vol. 10, 2010.
  • [11] V. D. Florio, “On the constituent attributes of software and organizational resilience,” Interdisciplinary Science Reviews, vol. 38, no. 2, 2013.
  • [12] A. E. Eiben and J. E. Smith,

    Introduction to evolutionary computing

    .   Springer, 2003, vol. 53.
  • [13] T. Gabor and L. Belzner, “Genealogical distance as a diversity estimate in evolutionary algorithms,” in Genetic and Evolutionary Computation Conference, Berlin, Germany, July 15-19, 2017, Companion Material Proceedings, P. A. N. Bosman, Ed.   ACM, 2017.
  • [14] J. J. Grefenstette et al., “Genetic algorithms for changing environments,” in PPSN, vol. 2, 1992, pp. 137–144.
  • [15] I. K. Nikolos, K. P. Valavanis, N. C. Tsourveloudis, and A. N. Kostaras, “Evolutionary algorithm based offline/online path planner for uav navigation,” IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 33, no. 6, pp. 898–912, 2003.
  • [16] D. Perez, S. Samothrakis, S. Lucas, and P. Rohlfshagen, “Rolling horizon evolution versus tree search for navigation in single-player real-time games,” in Proceedings of the 15th annual conference on Genetic and evolutionary computation.   ACM, 2013, pp. 351–358.
  • [17] R. D. Gaina, S. M. Lucas, and D. Perez-Liebana, “Rolling horizon evolution enhancements in general video game playing,” in Computational Intelligence and Games (CIG), 2017 IEEE Conference on.   IEEE, 2017.
  • [18] R. K. Ursem, “Diversity-guided evolutionary algorithms,” in Internat. Conference on Parallel Problem Solving from Nature.   Springer, 2002.
  • [19] A. Konak, D. W. Coit, and A. E. Smith, “Multi-objective optimization using genetic algorithms: A tutorial,” Reliability Engineering & System Safety, vol. 91, no. 9, pp. 992–1007, 2006.
  • [20] T. Gabor, L. Belzner, and C. Linnhoff-Popien, “Inheritance-based diversity measures for explicit convergence control in evolutionary algorithms,” in The Genetic and Evolutionary Computation Conference (GECCO).   ACM, 2018.
  • [21] C. D. Rosin and R. K. Belew, “New methods for competitive coevolution,” Evolutionary Computation, vol. 5, no. 1, pp. 1–29, 1997.
  • [22] M. Wineberg and F. Oppacher, “The underlying similarity of diversity measures used in evolutionary computation,” in Genetic and Evolutionary Computation (GECCO 2003).   Springer, 2003, pp. 206–206.
  • [23] A. Weinstein and M. L. Littman, “Open-loop planning in large-scale stochastic domains.” in AAAI, 2013.
  • [24] R. Eastwood, R. Alexander, and T. Kelly, “Safe multi-objective planning with a posteriori preferences,” in High Assurance Systems Engineering (HASE), 2016 IEEE 17th International Symposium on.   IEEE, 2016.
  • [25] L. Belzner, “Time-adaptive cross entropy planning,” in Proceedings of the 31st Annual ACM Symposium on Applied Computing.   ACM, 2016.
  • [26] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot et al.

    , “Mastering the game of go with deep neural networks and tree search,”

    Nature, vol. 529, no. 7587, 2016.
  • [27] D. Silver, J. Schrittwieser, K. Simonyan, I. Antonoglou, A. Huang, A. Guez, T. Hubert, L. Baker, M. Lai, A. Bolton et al., “Mastering the game of go without human knowledge,” Nature, vol. 550, no. 7676, 2017.
  • [28] T. Anthony, Z. Tian, and D. Barber, “Thinking fast and slow with deep learning and tree search,” CoRR, vol. abs/1705.08439, 2017. [Online]. Available:
  • [29] M. G. Bellemare, W. Dabney, and R. Munos, “A distributional perspective on reinforcement learning,” arXiv preprint arXiv:1707.06887, 2017.
  • [30] W. Dabney, M. Rowland, M. G. Bellemare, and R. Munos, “Distributional reinforcement learning with quantile regression,” arXiv preprint arXiv:1710.10044, 2017.
  • [31] T. Haarnoja, H. Tang, P. Abbeel, and S. Levine, “Reinforcement learning with deep energy-based policies,” CoRR, vol. abs/1702.08165, 2017. [Online]. Available:
  • [32] J. Schulman, P. Abbeel, and X. Chen, “Equivalence between policy gradients and soft q-learning,” arXiv preprint arXiv:1704.06440, 2017.
  • [33] M. Ghavamzadeh, S. Mannor, J. Pineau, A. Tamar et al., “Bayesian reinforcement learning: A survey,” Foundations and Trends® in Machine Learning, vol. 8, no. 5-6, pp. 359–483, 2015.
  • [34] B. O’Donoghue, I. Osband, R. Munos, and V. Mnih, “The uncertainty bellman equation and exploration,” arXiv:1709.05380 preprint, 2017.
  • [35] I. Schaefer, R. Rabiser, D. Clarke, L. Bettini, D. Benavides, G. Botterweck, A. Pathak, S. Trujillo, and K. Villela, “Software diversity: state of the art and perspectives,” 2012.
  • [36] M. Hölzl and T. Gabor, “Continuous collaboration: a case study on the development of an adaptive cyber-physical system,” in Proceedings of the First International Workshop on Software Engineering for Smart Cyber-Physical Systems.   IEEE Press, 2015, pp. 19–25.
  • [37] G. Squillero and A. Tonda, “Divergence of character and premature convergence: A survey of methodologies for promoting diversity in evolutionary optimization,” Information Sciences, vol. 329, 2016.
  • [38] K. Deb, S. Agrawal, A. Pratap, and T. Meyarivan, “A fast elitist non-dominated sorting genetic algorithm for multi-objective optimization: Nsga-ii,” in International Conference on Parallel Problem Solving From Nature.   Springer, 2000, pp. 849–858.
  • [39] M. Laumanns, L. Thiele, K. Deb, and E. Zitzler, “Combining convergence and diversity in evolutionary multiobjective optimization,” Evolutionary computation, vol. 10, no. 3, pp. 263–282, 2002.
  • [40] C. Segura, C. A. C. Coello, G. Miranda, and C. León, “Using multi-objective evolutionary algorithms for single-objective constrained and unconstrained optimization,” Annals of Operations Research, vol. 240, no. 1, pp. 217–250, 2016.
  • [41] F. Vavak and T. C. Fogarty, “Comparison of steady state and generational genetic algorithms for use in nonstationary environments,” in Proc. of IEEE Internat. Conference on Evolutionary Computation.   IEEE, 1996.
  • [42] N. Bredeche, E. Haasdijk, and A. Eiben, “On-line, on-board evolution of robot controllers,” in International Conference on Artificial Evolution (Evolution Artificielle).   Springer, 2009, pp. 110–121.
  • [43] C. Kinneer, Z. Coker, J. Wang, D. Garlan, and C. Le Goues, “Managing uncertainty in self-adaptive systems with plan reuse and stochastic search,” in Proceedings of the 13th International Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS), 2018.
  • [44] J. Horn, N. Nafpliotis, and D. E. Goldberg, “A niched pareto genetic algorithm for multiobjective optimization,” in Evolutionary Computation, 1994. IEEE World Congress on Computational Intelligence., Proceedings of the First IEEE Conference on.   IEEE, 1994.
  • [45] D. H. Wolpert and W. G. Macready, “No free lunch theorems for optimization,” IEEE Trans. on Evolutionary Comp., vol. 1, no. 1, 1997.
  • [46] M. Hölzl and T. Gabor, “Continuous collaboration for changing environments,” in Transactions on Foundations for Mastering Change I.   Springer, 2016, pp. 201–224.
  • [47] E. Hart, A. S. Steyven, and B. Paechter, “Evolution of a functionally diverse swarm via a novel decentralised quality-diversity algorithm,” arXiv preprint arXiv:1804.07655, 2018.
  • [48] C. Bérénos, K. M. Wegner, and P. Schmid-Hempel, “Antagonistic coevolution with parasites maintains host genetic diversity: an experimental test,” Proc. of the Royal Society of London B: Biological Sciences, 2010.