Optimizing genetic algorithm strategies for evolving networks

04/07/2004 ∙ by Matthew J. Berryman, et al. ∙ The University of Adelaide

This paper explores the use of genetic algorithms for the design of networks, where the demands on the network fluctuate in time. For varying network constraints, we find the best network using the standard genetic algorithm operators such as inversion, mutation and crossover. We also examine how the choice of genetic algorithm operators affects the quality of the best network found. Such networks typically contain redundancy in servers, where several servers perform the same task, and pleiotropy, where servers perform multiple tasks. We explore the effect of this trade-off between pleiotropy and redundancy on cost versus reliability, which we use as a measure of the quality of the network.


1 Introduction

Evolutionary computation uses solution space search procedures inspired by biological evolution [1]. These search procedures use ideas from biological evolution such as mating, fitness, and natural selection. Individuals undergo natural selection, whereby organisms with the most favorable traits are more successful in having offspring. Genetic algorithms (GAs) rely on describing systems in terms of their traits (or phenotype) and a fitness function (a measure of how well they reproduce). We can then evolve better solutions, with higher fitness, by allowing fit individuals to transfer their hereditary characteristics (genes) to the next generation. The idea of applying such biological concepts to evolutionary computing originated with John Holland in his seminal paper on the topic of adaptive systems [2].

Evolutionary computational techniques such as genetic algorithms have many advantages over traditional optimization algorithms. Traditional optimization algorithms require many assumptions to be made about the problem; gradient-based searches, for example, require the function to be smooth and differentiable. Evolutionary algorithms require no such assumptions, only a way of measuring the “fitness” of a solution [3]. With each succeeding generation, the algorithm tries to better fulfill the specifications described by the fitness function. The other advantage is adaptability to a changing problem. With traditional optimization procedures, any change in the specification or problem constraints requires solving the problem from the start. This is not necessary with evolutionary algorithms, where one can continue the algorithm with a different set of constraints or solution, using the current “population” or set of solutions [4]. GAs can offer advantages over related techniques such as hill climbing [5, 6]. Of great importance is the requirement that the genetic algorithm support mating, or crossover, between two graphs [5]. In our previous work [7], we only considered mutation operators. Thus our solution was never able to truly explore a wide region of the search space, and apart from initial hill climbing and other wide random jumps was quite slow at improving the network design. This paper details the existing background to our work and the ongoing work of implementing crossover operators and different fitness functions.

Although there are a large number of applications of genetic algorithms to designing neural networks [8, 9, 10], there are very few devoted to designing computer or telecommunication networks [11, 12], and none of these explicitly capture the issue of pleiotropy. An alternative evolutionary approach using cellular automata has been applied to pleiotropy versus redundancy tradeoffs in an organizational system [13]. Pleiotropy is a term used to describe components that perform multiple tasks [14, 15], while redundancy refers to multiple components performing the same task. Such pleiotropy and redundancy of components can be clearly seen in a client-server based network comprising server nodes and client nodes. In the conventional setup of such networks, servers can serve multiple clients, which is an example of pleiotropy, while clients can be connected to many servers, which is an example of redundancy.

A typical engineering problem is to determine the optimal design solution or set of solutions, while maximizing efficiency. The main aim of the project is to use evolutionary computation algorithms to search for an optimal client-server network, which minimizes cost and maximizes reliability and flexibility by exploring the pleiotropy-redundancy search space.

Redundancy is where one task or outcome is carried out by more than one agent – assuming the independence of agents, either acting independently or together. It has the advantage of conferring robustness or integrity upon the system, because if one agent were to fail, others are able to perform the task. However, redundancy may be costly, as the overlapping of agents may be inefficient or wasteful. Despite this, in some systems the wastage may be justified if the task or outcome is so important that the system will fail in its absence, and therefore be selectively disadvantaged. Figure 1 shows an example of a redundant system.

Figure 1: This figure shows the task, labeled 1, being performed by agents A, B, and C; thus two of these agents are redundant. An example in a network situation would be load-balancing a web server, where any one of three servers can serve a particular site to a client. Note that there is an extra cost associated with this redundancy, so in this example we are paying for three servers instead of one. However the system is robust, so if one of the servers is busy or breaks down, then the task (such as serving a web site) can still be performed.

The opposite of redundancy is pleiotropy, where one agent may perform many tasks. This has a number of distinct advantages. It is efficient, and allows for spatial and temporal flexibility. Its major cost is that it is dependent upon the history of the system, that is, any given agent may only be working under certain constraints imposed upon it by the peculiar evolutionary history of that system. Despite this, both temporal and spatial pleiotropy may exist, where a given agent can perform qualitatively different tasks, as well as perform the same task (or different tasks) at different times. As pleiotropy enables efficiency, it therefore confers selective advantages. How pleiotropic an agent is will depend upon the context in which it is working.

Figure 2: This figure shows a single agent, A, performing multiple tasks, labeled 1, 2 and 3. An example in a network situation could be a server handling multiple client requests, such as sending email to client 1 while sending a web page to client 2. While this is cost effective, it lacks the robustness to failure of a redundant system as shown in Figure 1.

What happens when you combine the two? When both pleiotropy and redundancy are combined, the system possesses properties that it otherwise lacks when pleiotropy or redundancy exist on their own. The advantages include an increase in the robustness of the system due to the redundancy built into it. It is also more efficient due to the pleiotropy. The system becomes inherently more flexible, and the costs of redundancy are offset by the increase in efficiency due to the presence of pleiotropy. An example of a system with a combination of pleiotropy and redundancy is shown in Figure 3.

Figure 3: This figure shows two agents, both of which perform multiple tasks. An example in a network situation would be two servers, both providing the same email and web services to a number of clients. This system is robust, since if one server fails, both email and web services can still be provided. It also minimizes cost, as it would cost the same as two servers with one providing just email services and the other hosting a web site.

2 Methods

In this section we describe the structure and representation of the network, the details of the genetic algorithm used, the initialization and parameters used for designing the network using the genetic algorithm, and the fitness function. We developed a graphical user interface (GUI) for running the genetic algorithm, which allows easy user modification of the network parameters and also allows one to watch the evolution of the network.

2.1 Network structure

The network consists of a set of servers, a set of clients (which can also function as routers), and a set of links between those various nodes. A graph data structure is used to represent the network, with each node (client or server) in the graph having the following properties:

  • node label, “C” for a client (including routers) or “S” for a server

  • node ID, which also serves as a grid reference of the node for display in a GUI

  • node failure rate, a value between zero and one giving the probability of failure per time step

  • current state, working or non-working

  • number of time steps since failure, zero if working

  • details of the inbound and outbound network connections.

The edges, the links in the network, have the following properties:

  • link label indicating whether the link is between clients (including routers) or between a client or router and a server

  • edge ID, which also serves as a pair of grid references for display in a GUI

  • edge failure rate, a value between zero and one giving the probability of failure

  • current state, working or non-working

  • number of time steps since failure, zero if working.

We note here that for the set of nodes in a network, , the set of edges is . This becomes important where we later consider the crossover operation.
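The node and edge properties listed above can be sketched as two small record types. This is only an illustration under stated assumptions, not the authors' implementation: the class names, field names, the "CC"/"CS" link codes, and the `step` helper are all hypothetical, and repair is not modeled because its rule is not given in this section.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str                    # "C" for a client (or router), "S" for a server
    node_id: tuple                # grid reference, which also serves as the ID
    failure_rate: float           # probability of failure per time step, in [0, 1]
    working: bool = True          # current state
    steps_since_failure: int = 0  # zero while working
    inbound: list = field(default_factory=list)   # IDs of inbound links
    outbound: list = field(default_factory=list)  # IDs of outbound links

@dataclass
class Edge:
    label: str                    # hypothetical codes: "CC" or "CS"
    edge_id: tuple                # pair of grid references
    failure_rate: float           # probability of failure, in [0, 1]
    working: bool = True
    steps_since_failure: int = 0

def step(component, rng):
    """Advance one time step for a node or an edge (assumed failure model)."""
    if not component.working:
        component.steps_since_failure += 1
    elif rng.random() < component.failure_rate:
        component.working = False

# Usage: a server that always fails and a link that never fails.
rng = random.Random(0)
server = Node("S", (0, 0), failure_rate=1.0)
link = Edge("CS", ((0, 0), (1, 1)), failure_rate=0.0)
step(server, rng)
step(link, rng)
```

Keeping the failure state on each component, rather than in a separate table, makes it straightforward to evaluate connectivity over working components only.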

2.2 Network construction

We initially start with a set of clients () and servers (), with no links. The positions of the clients and servers are set at random, with a minimum spacing between them. Each client is assigned a traffic value, , at random (), which indicates the amount of traffic requested by the client that is to be transmitted across the network. Each server has a fixed amount of traffic it can serve.

2.3 Mutation operator

We define a utilization parameter,

(1)

where is the size of the set of servers, describing how well the servers are able to deliver their available load to the clients. If the utilization is less than , then more links are added at random to carry the extra server capacity to clients. If, on the other hand, the utilization is greater than then either links are removed (reducing the amount of traffic that is able to be requested from servers) or more servers are added. The network thus evolves by starting without any connections, and through mutations including:

  • adding links to increase the utilization

  • removing links to decrease the utilization

  • adding servers to decrease the utilization

  • links failing

  • links being repaired.
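The utilization-driven part of the mutation step might be sketched as follows. This is a rough illustration, not the authors' code: the function `mutate`, the thresholds `low` and `high`, the 50/50 choice between removing a link and adding a server, and the placeholder `toy_utilization` are all assumptions made for the example. Link failure and repair are omitted, and the set of candidate links is kept fixed for simplicity.

```python
import random

def toy_utilization(links, n_servers):
    # Placeholder only, NOT the paper's utilization parameter:
    # links per unit of server capacity, assuming 2 links per server.
    return len(links) / (2 * n_servers)

def mutate(links, n_servers, candidate_links, utilization, low, high, rng):
    """Apply one utilization-driven mutation and return (links, n_servers)."""
    links = set(links)
    u = utilization(links, n_servers)
    if u < low:
        # Spare server capacity: add one random link that is not yet present.
        unused = sorted(set(candidate_links) - links)
        if unused:
            links.add(rng.choice(unused))
    elif u > high:
        # Servers over-requested: remove a random link or add a server.
        if links and rng.random() < 0.5:
            links.remove(rng.choice(sorted(links)))
        else:
            n_servers += 1
    return links, n_servers

# Usage: start with no links, so the first mutation adds one.
rng = random.Random(1)
candidates = [(c, s) for c in "abc" for s in "xy"]
links, n_servers = mutate(set(), 1, candidates, toy_utilization, 0.5, 1.0, rng)
```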

2.4 Crossover operator

Consider two graphs and , with nodes and , and edges and respectively. We then wish to consider forming a new graph , with a set of edges . To do this, we need to make sure that each node used by appears in , since we require . We could add each node associated with to , guaranteeing that , in time (for edges and nodes). Instead, we simply create the set of nodes in time. This (in the average case) overestimates the number of nodes required, somewhat breaking the role of the repair rate of the mutation operator. However, considering this is not clearly defined for crossover anyway, we choose the faster solution. For a selection of the top networks for mating, we have a total population size of . The default value of we use is , giving a total population size of , however we vary this to ascertain the effect on convergence on the optimal solution.
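A minimal sketch of this crossover, assuming the offspring's edge set is the union of the parents' edge sets (the combination rule is an assumption here) and taking the offspring's node set as the union of the parents' node sets, as the faster option described above. The helper names are hypothetical. The `population_size` formula, the selected parents plus one offspring per unordered pair of parents, is an inference: it reproduces the population sizes 6, 10, 15, 21, and 28 listed in Table 2, but it is not stated explicitly in the text.

```python
from itertools import combinations

def crossover(g1, g2):
    """Combine two graphs, each given as (set of nodes, set of (u, v) edges)."""
    nodes1, edges1 = g1
    nodes2, edges2 = g2
    edges = edges1 | edges2   # assumed combination rule: union of edges
    nodes = nodes1 | nodes2   # fast option: union of nodes, may overestimate
    return nodes, edges

def next_population(top):
    """Keep the selected parents and add one offspring per pair of parents."""
    offspring = [crossover(a, b) for a, b in combinations(top, 2)]
    return list(top) + offspring

def population_size(n_top):
    # n_top parents plus n_top * (n_top - 1) / 2 pairwise offspring.
    return n_top * (n_top + 1) // 2

# Usage: the offspring's edges only use nodes present in its node set.
g1 = ({"c1", "s1"}, {("c1", "s1")})
g2 = ({"c1", "c2", "s2"}, {("c2", "s2")})
child_nodes, child_edges = crossover(g1, g2)
```

Because each parent already satisfies the requirement that its edges join its own nodes, the union of node sets is guaranteed to contain every endpoint of the union of edge sets, without scanning the edges.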

Having established a network and a set of genetic algorithm operators, we then need a measure of fitness. In order to define it, we first introduce Dijkstra’s algorithm.

2.5 Dijkstra’s algorithm

Dijkstra’s algorithm is an efficient algorithm for finding the shortest path between two nodes (or vertices) in a graph [16, 17]. Define to be the shortest path from node to node . If there is an edge on then we can break down the path into and . The distance, of the shortest path from to , including at most edges, can be written as:

(2)

for the edge cost of and the set of nodes. What we wish to calculate is the cost for all pairs of nodes and . By starting with the adjacency matrix where for not an edge, we compute the shortest distance matrix by calculating in modified matrix multiplications where is replaced by and is replaced by in the computations of each element. Note that where no path exists between and , so it provides information on connectivity in addition to costs.
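The modified matrix multiplication described above can be sketched as follows. Note that this is the all-pairs "min-plus" matrix form rather than the priority-queue form of Dijkstra's algorithm, and it is an illustration under assumptions: the function names are hypothetical, the diagonal is set to zero so that shorter paths are retained from one product to the next, and the sketch uses n - 2 products for n nodes (enough for paths of up to n - 1 edges), which may differ from the count used by the authors.

```python
import math

INF = math.inf  # cost used where there is no edge

def min_plus(a, b):
    """Matrix 'product' with multiplication replaced by + and addition by min."""
    n = len(a)
    return [[min(a[i][k] + b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def all_pairs_shortest(cost):
    """Shortest-path costs between all pairs of nodes; INF where no path exists."""
    n = len(cost)
    # Zero diagonal: staying at a node costs nothing.
    w = [[0 if i == j else cost[i][j] for j in range(n)] for i in range(n)]
    d = w                      # paths of at most one edge
    for _ in range(n - 2):     # extend to paths of at most n - 1 edges
        d = min_plus(d, w)
    return d

# Usage: nodes 0, 1, 2 are linked; node 3 is isolated.
cost = [
    [INF, 1, 5, INF],
    [1, INF, 2, INF],
    [5, 2, INF, INF],
    [INF, INF, INF, INF],
]
dist = all_pairs_shortest(cost)
```

Entries of `dist` that remain infinite mark pairs of nodes with no connecting path, which is the connectivity information referred to above.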

2.6 Fitness and cost functions

The aim is to find an optimal network, which minimizes cost () and maximizes reliability (). With this in mind, we define our fitness function, , to be:

(3)

for a cost function and a reliability function . Our cost function is given by

(4)

for and as defined above. Minimizing this cost function decreases the total cost of the links towards the minimum cost graph for that connectivity. Using just the cost alone results in a system that tends towards no links over many generations. Our reliability function is defined by

(5)

which describes the connectivity of the graph.

2.7 Redundancy and pleiotropy functions

We define the overall measure of redundancy for the whole network as

(6)

where is the redundancy, the out-degree or number of links out of client , and the set of servers. Similarly, the overall measure of pleiotropy is

(7)

where is the pleiotropy, the in-degree or number of links into server , and the set of clients.
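The per-node quantities used in these measures, the out-degree of each client and the in-degree of each server, can be computed as in the sketch below. The function names are hypothetical, links are assumed to be stored as (client, server) pairs, and averaging over nodes is only one possible normalization, chosen here for illustration; it should not be read as the paper's definition of the overall measures.

```python
from collections import Counter

def degrees(links, clients, servers):
    """Return (out-degree per client, in-degree per server) for (client, server) links."""
    out_deg = Counter(c for c, _ in links)
    in_deg = Counter(s for _, s in links)
    redundancy = {c: out_deg[c] for c in clients}   # links out of each client
    pleiotropy = {s: in_deg[s] for s in servers}    # links into each server
    return redundancy, pleiotropy

def mean(values):
    """Average of a collection, or 0.0 if it is empty (assumed normalization)."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0

# Usage: three clients, two servers, four links.
clients = ["c1", "c2", "c3"]
servers = ["s1", "s2"]
links = [("c1", "s1"), ("c1", "s2"), ("c2", "s1"), ("c3", "s1")]
r, p = degrees(links, clients, servers)
overall_r = mean(r.values())
overall_p = mean(p.values())
```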

3 Results

3.1 Overview

We used both GA strategies to build a network for a number of different link failure probabilities and repair rates, evaluated the performance of the strategies, and found the best network parameters to use. Using these network parameters, we then used the best GA strategy to find the best network possible. An example of an evolved network is shown in Figure 4. The initial network contains no links; the first links are added in the first mutation step of the GA.

Figure 4: This figure shows an example of an evolved network, with clients and servers and a set of links between them. The clients and servers have been positioned at random, with a minimum spacing to avoid clutter.

3.2 Varying failure probability

Here we consider the rate of convergence of the genetic algorithm, the cost, and fitness functions for varying link failure probabilities. The rate of convergence is defined as the time (in generations) to three successive generations with the same maximum fitness. As a worst-case scenario, we consider a link failure probability of 10%, and consider a range of failure probabilities as low as 0.001%. The results of this are shown in Table 1 and Figure 5.
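The convergence criterion can be made concrete with a short sketch. The function name is hypothetical, and two details are assumptions: the time is reported as the zero-based index of the first generation in the run of three, and fitness values are compared for exact equality (a tolerance may be preferable for floating-point fitness values).

```python
def convergence_time(max_fitness, run=3):
    """Index of the first generation that starts `run` successive
    generations with the same maximum fitness, or None if there is none."""
    for g in range(len(max_fitness) - run + 1):
        window = max_fitness[g:g + run]
        if all(f == window[0] for f in window):
            return g
    return None

# Usage: maximum fitness per generation for a short hypothetical run.
history = [1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 4.0]
t = convergence_time(history)
```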

Link failure prob. Convergence time Max fitness Final cost Final pleiotropy Final redundancy
10%
1%
0.1%
0.01%
0.001%
Table 1: This table shows the convergence time in generations, the maximum fitness (no units), the final cost (’000s of arbitrary units), final pleiotropy (network links) and final redundancy (network links) for varying link failure probabilities. We have averaged the results over five runs of the network optimizer, considering 75 generations each run, and show the average result ± one standard deviation.

(a) The fitness, pleiotropy, and redundancy measures for networks evolving with a link failure probability of 10%.
(b) The fitness, pleiotropy, and redundancy measures for networks evolving with a link failure probability of 0.001%.
Figure 5: Here we plot the fitness, pleiotropy, and redundancy measures of the current best network out of a set of networks that are evolving through 75 generations, where the networks are evolving under different link failure probabilities. As expected, a system with greater probability of link failure converges to a lower pleiotropy and higher redundancy.

3.3 Impact of population size on convergence to solution

Here we consider the same parameters as for the previous subsection, but with the link failure probability set to 1% and for varying population sizes. Again, the rate of convergence is defined as the time to three successive generations with the same maximum fitness. Not shown here is the average time to evolve through the 75 generations, which is longer for larger populations, since more offspring have to be generated and therefore the fitness function has to be computed more often. The results of this are shown in Table 2.

Pop. size Convergence time Max fitness Final cost Final pleiotropy Final redundancy
6
10
15
21
28
Table 2: This table shows the convergence time in generations, the maximum fitness (no units), the final cost (arbitrary units), final pleiotropy (network links) and final redundancy (network links) for varying population sizes. We have averaged the results over five runs of the network optimizer, considering 75 generations each run, and show the average result ± one standard deviation.

4 Discussion & Conclusions

The crossover operator allows the genetic algorithm to converge much faster to solutions than the mutation operator alone, and allows it to explore a much wider range of networks in the search for a solution. Our improved cost function allows the genetic algorithm to search the space around a given reliability much more effectively, since it no longer favors removing links to reduce the cost to zero (giving an unrealistic infinite fitness). The pleiotropy and redundancy converge to a rate of about one to two links from each client to each server. For the population ranges and link failure probabilities we considered, the convergence time shows no significant difference, although it seems to decrease for increasing link reliability.

More work is needed to analyze the convergence time in more detail, in particular, removing some of the initial graph randomness and link failures would eliminate most of the factors contributing to the high variance in convergence time for varying population sizes. It would also be an interesting idea to benchmark the genetic algorithm on a small network for which a human could easily determine the optimal solution. We also propose optimizing for reliability and cost separately, and combining the two populations using the crossover operator.

Acknowledgements.
We gratefully acknowledge funding from The University of Adelaide.

References

  • [1] P. H. Winston, Artificial Intelligence, 3rd Ed., Addison Wesley, 1993.
  • [2] J. H. Holland, “Outline for a logical theory of adaptive systems,” Journal of the Association for Computing Machinery 9(3), pp. 297–314, 1962.
  • [3] M. Mitchell, An Introduction to Genetic Algorithms, MIT Press, 1996.
  • [4] D. B. Fogel, “What is evolutionary computation,” IEEE Spectrum 37(2), 2000.
  • [5] M. Mitchell, J. H. Holland, and S. Forrest, “When will a genetic algorithm outperform hill climbing,” in Advances in Neural Information Processing Systems, J. Cowan, G. Tesauro, and J. Alspector, eds., 6, pp. 51–58, Morgan Kaufmann Publishers, Inc., 1994.
  • [6] M. Mitchell and S. Forrest, “Fitness landscapes: royal road functions,” in Handbook of Evolutionary Computation, T. Bäck, D. Fogel, and Z. Michalewicz, eds., pp. B.2.7.5:1–25, Oxford University Press, 1997.
  • [7] M. J. Berryman, W.-L. Khoo, H. Nguyen, E. O’Neill, A. Allison, and D. Abbott, “Exploring tradeoffs in pleiotropy and redundancy using evolutionary computing,” in Proc. SPIE: BioMEMS and Nanotechnology, D. Nicolau, ed., 5275, p. in press, 2003.
  • [8] R. K. Belew, J. McInerney, and N. N. Schraudolph, “Evolving networks: using the genetic algorithm with connectionist learning,” in Proc. Second Conference on Artificial Life, C. Langton, C. Taylor, J. Farmer, and S. Rasmussen, eds., pp. 511–547, Addison-Wesley, 1991.
  • [9] F. Gruau and D. Whitley, “Adding learning to the cellular development of neural networks: Evolution and the baldwin effect,” Evolutionary Computation 1(3), pp. 213–233, 1993.
  • [10] I. de Falco, A. Iazzetta, P. Natale, and E. Tarantino, “Evolutionary neural networks for nonlinear dynamics modeling,” Lecture notes in computer science 1498, pp. 593–602, 1998.
  • [11] A. Abuali, W. Wainwright, and D. Schoenfeld, “Determinant factorization: a new encoding scheme for spanning trees applied to the probabilistic minimum spanning tree problem,” in Proc. Sixth Intl. Conf. on Genetic Algorithms, pp. 470–477, 1995.
  • [12] H. Sayoud and K. Takahashi, “Designing communication network topologies using steady-state genetic algorithms,” IEEE Communications Letters 5(3), pp. 113–115, 2001.
  • [13] T. L. Hoo, A. Ting, E. O’Neill, A. Allison, and D. Abbott, “Real life: a cellular automaton for investigating competition between pleiotropy and redundancy,” in Proc. SPIE: Electronics and Structures for MEMS II, N. Bergmann, ed., 4591, pp. 380–389, 2001.
  • [14] S. N. Coppersmith, R. D. Black, and L. P. Kadanoff, “Analysis of a population genetics model with mutations, selection, and pleiotropy,” J. Statistical Physics 97, pp. 429–457, 1999.
  • [15] M. Morange, “Gene function,” C.R. Acad. Sci. III 323, pp. 1147–1153, 2000.
  • [16] E. Dijkstra, “A note on two problems in connection with graphs,” Numerische Mathematik 1, pp. 269–271, 1959.
  • [17] P. Narváez, K.-Y. Siu, and H.-Y. Tzen, “New dynamic algorithms for shortest path tree computation,” IEEE/ACM Trans. Networking 8, pp. 734–746, Dec. 2000.