Aging, double helix and small world property in genetic algorithms

by Marek W. Gutowski, et al.

More than a quarter of a century after the invention of genetic algorithms, and after myriads of modifications and successful implementations, we still lack many essential details of a thorough analysis of their inner workings. One such fundamental question is: how many generations do we need to solve the optimization problem? This paper tries to answer this question, albeit in a fuzzy way, making use of the double helix concept. As a byproduct we gain a better understanding of the ways in which the genetic algorithm may be fine-tuned.



1 Introduction

The carrier of genetic information in Nature, the DNA molecule, has a very peculiar structure. It is a spiral-staircase shaped object whose steps are made from pairs selected from among only four other kinds of molecules: cytosine (C), guanine (G), adenine (A) and thymine (T). There are only two correct kinds of those pairs, far fewer than one might naively suspect. Namely, cytosine always forms a pair with guanine (C-G), and adenine “likes” thymine (A-T). So the sequence of bases in one half of the DNA double helix completely determines the sequence in the other half: each C faces a G, each A faces a T, and vice versa.

Such redundancy has not yet been exploited by genetic algorithms; at least, the present author failed to find anything similar in the available literature in conjunction with this class of optimization algorithms (or any other class …).

In this paper we will show that the idea of doubled (or duplicated) genetic information may be very useful in fine-tuning genetic algorithms and in formulating stopping criteria that are quite general and independent of the problem under study.

2 The role of mutations

Many elements of existing genetic algorithms are straightforward computer implementations of the natural evolutionary phenomena. We have an evolving population, usually fixed in size, in which individuals mate, have offspring and are mutated. All those processes are driven by one or more random number generators, that is by purely stochastic forces, and, additionally, by the Darwinian rule of survival of the fittest.

It is well known that crossover processes alone cannot guarantee finding the optimal region in a search space, at least when the population is small. Without mutations, premature convergence is then almost certain, revealing the inability of the algorithm to find the desired solution. Mutations cause rapid changes of location in the search space, thus making it possible to reach and explore regions that are not accessible at all without them.

The majority of researchers and practitioners prefer a rather low mutation rate. This is because very frequent mutations turn the genetic algorithm into a generic Monte Carlo procedure, which is to be avoided, since we hope that some kind of intelligence will lead us to the desired solution much faster than completely blind trials.

3 The fate of mutated genes

Suppose, for simplicity, that the smallest part of the chromosome, called here a gene, consists of only a single bit. Let this bit have the initial value $0$. The mutation process is then nothing else than negating (flipping) this bit. The question we want to examine now is:

What will be the state of this bit after $n$ generations?

Let the probability of a bit flip per generation be equal to $p$. We can write

$$
\begin{aligned}
p_{n+1}(1) &= (1-p)\,p_n(1) + p\,p_n(0)\\
p_{n+1}(0) &= p\,p_n(1) + (1-p)\,p_n(0)
\end{aligned}
\tag{1}
$$

where $n$ numerates consecutive generations and the number in parentheses means the state of the bit in question. Taking into account that for any $n$

$$p_n(0) + p_n(1) = 1$$

we can rewrite the first row of relation (1) as

$$p_{n+1}(1) = (1-p)\,p_n(1) + p\,\bigl(1 - p_n(1)\bigr)$$

or, in shorter form

$$p_{n+1} = p + (1-2p)\,p_n \tag{5}$$

where $p_n$ is the probability that the examined bit is in the state $1$ after $n$ generations.

The recurrence formula (5) seems very similar to the construction of a simple random number generator, so one might expect that it produces chaotic sequences of numbers. Surprisingly (?) this is not the case, as we shall see.

Consider the behavior of $p_n$ when $n \to \infty$. It is easy to see that for $p = 0$ (no mutations) the sequence is constant (and thus convergent); it remains at whatever the initial state was, in our case $0$. The other limiting case, $p = 1$, is also easy: now the recurrence relation (5) simplifies to

$$p_{n+1} = 1 - p_n$$

and our bit permanently oscillates between two possible states, so there is no convergence. For all other cases, i.e. $0 < p < 1$, the sequence (5) is always convergent and we have

$$\lim_{n \to \infty} p_n = \frac{1}{2}$$

We skip the easy proof. It is interesting, however, that the convergence has oscillatory character when $1/2 < p < 1$, while for $0 < p < 1/2$ the sequence grows monotonically. There is no chaotic behavior. The case $p = 1/2$ is special again: it produces the constant sequence $p_n = 1/2$ for every $n \ge 1$.
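The skipped proof rests on a single identity. Reading recurrence (5) as $p_{n+1} = p + (1-2p)\,p_n$ and subtracting $1/2$ from both sides gives $p_{n+1} - 1/2 = (1-2p)(p_n - 1/2)$, so the distance from $1/2$ shrinks by the factor $|1-2p|$ in every generation and changes sign whenever $p > 1/2$. The following Python sketch iterates the recurrence so that this behavior can be checked numerically. The function name and the sample values of $p$ are illustrative assumptions, not part of the original paper.

```python
def flip_probability_sequence(p, generations, p0=0.0):
    """Iterate p_{n+1} = p + (1 - 2p) * p_n.

    Returns the list [p_0, p_1, ..., p_generations], where p_n is the
    probability that a bit starting in state 0 (p0 = 0) is found in
    state 1 after n generations. The name is hypothetical.
    """
    seq = [p0]
    for _ in range(generations):
        seq.append(p + (1.0 - 2.0 * p) * seq[-1])
    return seq


if __name__ == "__main__":
    # Sample mutation probabilities chosen for illustration only.
    for p in (0.0, 0.01, 0.3, 0.5, 0.8, 1.0):
        seq = flip_probability_sequence(p, 200)
        head = ", ".join(f"{x:.3f}" for x in seq[:5])
        print(f"p = {p}: first terms {head}; p_200 = {seq[200]:.3f}")
```

Running the loop shows the three regimes described in the text: monotonic growth towards $1/2$ for $p < 1/2$, damped oscillation around $1/2$ for $1/2 < p < 1$, and permanent oscillation for $p = 1$.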

4 Aging of the population

It is widely believed that mutations are essentially a bad thing for any individual. They usually damage the genetic material, sometimes to an extent that makes further existence of the individual impossible. Only rarely is a mutated individual better fitted to its environment than the average one in a given population. On the other hand, not all mutations are lethal. In the well-known Penna model of biological evolution [1], [2], for example, subsequent mutations are accumulated in every chromosome. The older the chromosome, the more mutations it carries, and finally it dies, either after a single lethal mutation or due to the accumulation of many less damaging defects (we do not discuss here the so-called Verhulst factor, which describes the decrease of the population size due to overcrowding and thus the limited access of any single individual to necessary resources). If no replication occurs, the entire population eventually becomes extinct. By contrast, the “small” changes in the genotypes are silently transmitted to the offspring chromosomes.

In Nature, however, an error-correcting capability is at work. The double helix structure of the DNA strand limits the proliferation of defective genes. The two parts of DNA (and RNA as well) are not independent, but complementary. After unzipping, just before replication, only one strand of DNA contains the distorted genetic information, while the other sequence is correct. Therefore only 50% of offspring DNA helices will be damaged.

5 Implementation in genetic algorithms

The idea is to use “doubled” chromosomes in genetic algorithms, i.e. chromosomes that consist of the regular, well-known part (“visible”) and another (“invisible”) parallel structure of identical length, which is initially the exact negation of the first part, as below:

0 0 1 0 0 1 1 0 1 1 1 0 1 0 0 1 — visible part
1 1 0 1 1 0 0 1 0 0 0 1 0 1 1 0 — invisible part

We do not mark any gene boundaries here. A single gene may consist of one or more bits; the lengths of consecutive genes need not be equal to each other. The chromosome becomes mutated when at least one of its bits is flipped.

The genetic material, representing a trial solution of the original problem, is placed in the visible part. Every mutation is reflected in the state of this very part, the invisible part being completely insensitive to mutations. By contrast, the crossover operation is performed on both parts, mixing independently two visible and two invisible parts of the involved chromosomes, at the same crossover point(s). It is easy to see that the crossover operation alone, whether one-point or multipoint, will never destroy the complementarity of the visible and invisible parts of any chromosome. Of course, the fitness factor will always be computed for the visible part only.

Now the process of aging of the population, generation after generation, may be tracked by looking at the invisible part of every chromosome. We can easily count all the bits in the entire population that were changed by mutations: it is enough to compare the visible and invisible parts of all chromosomes and count the bits that are identical in both parts. Of course, an even number of mutations applied to the same bit will go unnoticed. This is sometimes called backward evolution.

From the earlier considerations, and assuming that the mutation probability per bit and per generation is small (more precisely $p < 1/2$, as is always the case), we can conclude that the fraction of (effectively) mutated bits in the entire population will be an increasing function of time. The stochastic limit for this fraction is equal to $1/2$. The value $p = 1/2$ uniquely separates genetic algorithms from Monte Carlo type optimization.
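The operations on doubled chromosomes described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: chromosomes are stored as a pair of bit lists, crossover is one-point, and all function names are hypothetical rather than taken from the paper.

```python
import random


def make_doubled(visible):
    """Return (visible, invisible); the invisible part is the bitwise negation."""
    visible = list(visible)
    return visible, [1 - b for b in visible]


def mutate(chrom, p, rng):
    """Flip each visible bit with probability p; the invisible part is untouched."""
    visible, invisible = chrom
    new_visible = [1 - b if rng.random() < p else b for b in visible]
    return new_visible, list(invisible)


def crossover(a, b, point):
    """One-point crossover applied to both parts at the same point."""
    (av, ai), (bv, bi) = a, b
    child1 = (av[:point] + bv[point:], ai[:point] + bi[point:])
    child2 = (bv[:point] + av[point:], bi[:point] + ai[point:])
    return child1, child2


def count_flipped(population):
    """Number of positions where visible and invisible bits are identical."""
    return sum(v == i for vis, inv in population for v, i in zip(vis, inv))


def flipped_fraction(population):
    """Fraction of effectively mutated bits in the whole population."""
    total = sum(len(vis) for vis, _ in population)
    return count_flipped(population) / total


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, for reproducibility only
    a = make_doubled([0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1])
    b = make_doubled([1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0])
    a, b = mutate(a, 0.05, rng), mutate(b, 0.05, rng)
    c, d = crossover(a, b, point=7)
    print(flipped_fraction([a, b]), flipped_fraction([c, d]))
```

Because crossover moves visible and invisible bits together, the two printed fractions are always equal: only mutation changes the count of identical positions.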

6 Discussion

For small $p$ the fraction of mutated bits initially increases linearly with the number of generations. This observation takes into account directly neither the structure of the individual chromosome (the number of genes or bits it consists of) nor the number of chromosomes in the population. Such information is hidden in the value of the parameter $p$, the probability of a particular bit being flipped during a single cycle of simulation (single epoch). Thus the number of generations needed to reach 50% of flipped bits may be roughly estimated as $n \approx 1/(2p)$. In practice, as the performed simulations show, the fraction of effectively mutated bits reaches the value of $1/2$ for the first time only after a number of epochs several times greater. The exact analytical formula for $p_n$ as a function of $n$ is neither easy to obtain nor very informative. For any $n$, $p_n$ is a polynomial of order $n$ in the variable $p$. The graph of $p_n$ vs. $n$, however, shows striking similarity to the graph of the expression:


This is only an observation based on several evolutionary processes, guessed rather than formally derived. Anyway, utilizing this approximation, we can say that after epochs the population is mature and gradually loses its innovative forces, and after


epochs it becomes old and practically useless. It is then time to switch to another, locally searching routine to further improve the optimal set of unknown parameters, if appropriate. Formula (10) is the main result of this paper. It sets the upper limit for the number of necessary generations.

In conclusion, during calculations we should monitor the behavior of the fraction of mutated bits. When this variable reaches the value of $1/2$, we may say that every second bit in the population was flipped at least once. This is another way of saying that the search space has been explored quite thoroughly and no significant improvement of the fitness should be expected.
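One possible way to wire this monitoring into a genetic algorithm is sketched below. It is a toy illustration, not the author's implementation: tournament selection, one-point crossover, the default parameter values and the name `run_ga` are all assumptions made here, and the run is also capped by `max_generations` because the threshold need not be reached.

```python
import random


def run_ga(fitness, n_bits=16, pop_size=20, p=0.02, max_generations=500, seed=0):
    """Toy GA on doubled chromosomes; stops when half of the bits look flipped.

    Returns (generations_used, best_fitness, fraction_of_flipped_bits).
    Selection scheme and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Each individual is (visible, invisible); invisible starts as the negation.
    pop = []
    for _ in range(pop_size):
        vis = [rng.randint(0, 1) for _ in range(n_bits)]
        pop.append((vis, [1 - b for b in vis]))

    def fraction_flipped(population):
        same = sum(v == i for vis, inv in population for v, i in zip(vis, inv))
        return same / (len(population) * n_bits)

    def pick():
        # Binary tournament on the fitness of the visible part only.
        a, b = rng.choice(pop), rng.choice(pop)
        return a if fitness(a[0]) >= fitness(b[0]) else b

    generation, frac = 0, fraction_flipped(pop)
    for generation in range(1, max_generations + 1):
        new_pop = []
        while len(new_pop) < pop_size:
            (av, ai), (bv, bi) = pick(), pick()
            cut = rng.randrange(1, n_bits)
            # Crossover acts on both parts at the same point.
            cv, ci = av[:cut] + bv[cut:], ai[:cut] + bi[cut:]
            # Mutation acts on the visible part only.
            cv = [1 - b if rng.random() < p else b for b in cv]
            new_pop.append((cv, ci))
        pop = new_pop
        frac = fraction_flipped(pop)
        if frac >= 0.5:  # stopping criterion: every second bit flipped
            break
    best = max(fitness(vis) for vis, _ in pop)
    return generation, best, frac


if __name__ == "__main__":
    # Hypothetical usage: maximize the number of ones in the visible part.
    print(run_ga(sum, n_bits=16, pop_size=20, p=0.02, max_generations=500, seed=1))
```

The fitness is evaluated on the visible part only, and the invisible part serves purely as a reference against which accumulated mutations are counted.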

Why do we make such a claim?

If the fitness of all chromosomes was the same, i.e. there was no evolutionary pressure of any kind, then after generations the search space would be covered quite uniformly, although irregularly, with trial points. Further sampling would only make finer the already existing irregular grid of trial points. When there is a preference for better fitted chromosomes, then after generations all interesting regions, or at least their neighborhoods, should have already been found. Observing the values of the fitness function alone, either as an average for the population or for the best individual only, may be very misleading.

Considering the efficiency of the algorithm, understood as the number of generations necessary to find the optimal fitness, we can see that this number is inversely proportional to $p$. For every practical purpose the condition

$$N p = 1 \tag{11}$$

(the average number of bits flipped in the whole population during one epoch is equal to one) sets the lowest sensible limit for $p$. Here $N$ is the total number of bits in the population (in its visible part). Of course, higher values of $p$ will speed up the evolutionary search.
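As a small worked example of this lower limit, the Python sketch below computes the smallest sensible per-bit mutation probability for a given population layout, using the condition that on average one bit is flipped in the whole population per epoch. The population size and chromosome length in the usage lines are hypothetical numbers chosen for illustration, and the function names are not from the paper.

```python
def min_mutation_rate(pop_size, bits_per_chromosome):
    """Lowest sensible per-bit mutation probability: one expected flip per epoch."""
    total_bits = pop_size * bits_per_chromosome  # visible parts only
    return 1.0 / total_bits


def expected_flips_per_epoch(pop_size, bits_per_chromosome, p):
    """Average number of bits flipped in the whole population in one epoch."""
    return pop_size * bits_per_chromosome * p


if __name__ == "__main__":
    # Hypothetical layout: 50 chromosomes of 32 bits each (1600 bits in total).
    p_min = min_mutation_rate(50, 32)
    print(p_min, expected_flips_per_epoch(50, 32, p_min))
```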

Is this prescription perfect?

Certainly not. It gives a reasonable estimate of the number of generations needed in fairly regular cases. The genotype space with only one well-fitted chromosome and all others equally bad is an evident exception. What can be done in such cases is to increase the number of bits assigned to every continuous unknown (a kind of oversampling), at the price of increased computational effort, of course, since this approach is equivalent to the use of a finer grid of points in the search space. For purely combinatorial problems (only integer and/or logical unknowns) our estimate should be of similar value.

As an example let us take the data from [4]. Here the population consisted of 30 chromosomes, bits long each. and was set equal to . According to (11), should be set as at least — while the quoted mutation rate was times higher than that number. The population should become mature after some generations and the evolutionary search should be terminated after generations at the latest. In fact, the author of [4] reports that satisfactory results were achieved after generations (every run was limited to generations). The search space had only points, so this example may be considered small and not representative. Nevertheless, no more than only evaluations of the fitness factor were enough to reach valuable conclusions.

7 Why is genetic algorithm so effective anyway?

We must be aware that the search space, no matter how large, is always discrete and finite for this class of optimization algorithms. It may be considered as a graph in which every chromosome is a vertex (we are not interested in the edges). This graph is by no means random, but clearly exhibits the small world property, i.e. the average (Hamming) distance between its vertices scales as the logarithm of their number. Indeed, $n$-bit chromosomes are points in a $2^n$-element universe. The maximum distance between any two chromosomes is equal to $n$, so the average distance can never exceed this number. Incidentally, $n = \log_2 2^n$. This fact alone explains why genetic algorithms are fairly insensitive to the number of unknowns.

On the other hand, speaking of neighboring points in the search space makes sense, especially when we think of nearest neighbors. It is therefore intuitively appealing that we should be able to find a (hopefully small) subset of points in the search space with the property that any point is distant from this set by no more than $1$ unit, something similar to the backbone or spanning tree known from graph theory. In the case $n = 3$ it is easy to check that exactly two points, namely $000$ and $111$, are enough to form such a subset with the requested property. Evaluating fitness for each member of this subset should be nearly equivalent to the exhaustive (“brute force”) search, since then, for an arbitrarily chosen point in the search space, either this point itself or one of its nearest neighbors was visited and evaluated during the evolutionary process. Unfortunately, today we don't know how to find such a set in the general case; we don't even know its cardinality, which is hopefully much lower than that of the original search space. It is certain, however, that the solution of this problem need not be unique.¹ Perhaps the genetic algorithm is the best available tool for approximate construction of such subsets?

¹ Consider the simple case of $2$-bit chromosomes. The universe consists of just $4$ elements: $00$, $01$, $10$ and $11$. Both $\{00, 11\}$ and $\{01, 10\}$ have the desired property.
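For very short chromosomes the covering property discussed here can be checked by brute force. The Python sketch below is an illustration only, with hypothetical function names: it tests whether a given subset leaves every point of the search space within Hamming distance 1, searches exhaustively for the smallest such subset, and computes the average Hamming distance between distinct chromosomes. The exhaustive search is feasible only for a handful of bits, which is exactly why the general problem remains open.

```python
from itertools import combinations, product


def hamming(a, b):
    """Hamming distance between two equal-length bit tuples."""
    return sum(x != y for x, y in zip(a, b))


def covers(subset, n_bits, radius=1):
    """True if every n-bit point lies within `radius` of some subset member."""
    universe = product((0, 1), repeat=n_bits)
    return all(any(hamming(pt, c) <= radius for c in subset) for pt in universe)


def smallest_cover_size(n_bits, radius=1):
    """Size of the smallest covering subset, found by exhaustive search."""
    universe = list(product((0, 1), repeat=n_bits))
    for size in range(1, len(universe) + 1):
        if any(covers(s, n_bits, radius) for s in combinations(universe, size)):
            return size


def average_distance(n_bits):
    """Average Hamming distance over all pairs of distinct chromosomes."""
    universe = list(product((0, 1), repeat=n_bits))
    pairs = list(combinations(universe, 2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # 3-bit chromosomes, chosen here as the smallest non-trivial illustration.
    print(covers([(0, 0, 0), (1, 1, 1)], 3))
    print(smallest_cover_size(3), average_distance(3))
```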

8 Comments on performance

Using “doubled” chromosomes we successfully mimic the double helix structure of DNA. The cost is moderate: we need to double the storage for the population. Counting the flipped bits is performed only once per generation, so this cost should be negligible in comparison with evaluations of fitness function. Instead of the true implementation of the “double helix”, one can use the simplified formula (10) as a stopping criterion. Direct observation of the fraction of effectively mutated bits will signal the end of calculations usually much earlier.

9 Acknowledgment

This work was done as part of the author's statutory activity at the Institute of Physics, Polish Academy of Sciences.