Recursion and evolution: Part I

by   A. D. Arvanitakis, et al.

A self-editing algorithm is one that edits its program. The present paper studies evolution of self-editing algorithms that undergo some form of natural or artificial selection.




1. Introduction

As the abstract announces, in the present paper we study the notion of self-editing algorithms, i.e. algorithms that may edit their program. Such an algorithm will be represented by a state code (Almost) following Gödel, we use the symbol to denote the algorithm that codes for. It is easily seen that a self-editing algorithm may reproduce its code: For if is a code for the algorithm:

and is self-editing, then the result of on input (denoted as ) will be two exact copies of We can turn this into a more complicated but more meaningful procedure as follows:

Given a code of (any) algorithmic procedure, let be a code for the following algorithm:

Run on the input and output the result.

Assuming again that is self-editing, the result of running it on will be the set of codes Thus a self-editing algorithm may compute its descendant (instead of just copying it) and exactly this simple observation is behind the idea of combining self-editing algorithms with selection: For let us assume that selection evolves the codes as it regards perpetuation. Such a system thus would evolve to be fitter, not only in computing an answer to the environment, but also in computing the codes of its descendants, formalizing thus better answers. This seems to give clues of an answer to the question whether it is possible to design a computational system that not only learns, but also learns how to learn.

The paper is self-contained and does not assume any familiarity with mathematics other than mathematical reasoning and some experience with the notion of computability. However, the ideas involved are closely related to recursion, which goes back to the ancient Greeks, and to self-reference and the diagonal method, which were introduced and studied by various mathematicians such as Cantor, Church, Gödel, Kleene, Tarski and Turing. The interested reader may find relevant material in any text on set theory, logic and recursion theory, such as [9, 7].

Finally, it is well known that the idea of evolution by means of proliferation and selection is due to Darwin.

The research in this paper has been done in the course of many years and many people (mostly mathematicians) have contributed to it in various ways. (In chronological order) Maria Avouri [2], James Stein [11], Despoina Zisimopoulou [12], Dimitris Apatsidis [1], Fotis Mavridis [8], Antonis Charalambopoulos [3], Vanda Douka [4], Antonis Karamolegos [6], Helena Papanikolaou [10] and Miltos Karamanlis [5] are among them.

2. Structured codes and locations

2.1. The set of (structured) codes

Let be a finite set not containing the symbols “(”, “)” and “,”, to be used as an alphabet. Assume for simplicity that We wish to define the set of structured codes that can be written using Obviously any finite sequence on should be a structured code. We’ll call such a sequence a word. For example are words. (In the case of words we will identify a sequence of the form with the string ) Assume now that are words. We’ll call the sequence a phrase. A phrase is therefore a sequence of sequences of elements of and we’d like to consider it also a structured code. In the same way, sequences of phrases should also belong to this set of structured codes, and so on.

The following definition clarifies further the notion:

Definition 1 (Definition of the set of structured codes.).

The set of (structured) codes on a finite alphabet is defined as the -least set with the following properties:

  1. and

  2. Whenever we have also that

Structured codes of the form or are also considered and we’ll explain their use when necessary.
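The informal description above (words, phrases, sequences of phrases and so on) can be illustrated with a minimal sketch. It assumes, purely for illustration, that a code is modelled as a nested Python tuple whose leaves are single alphabet symbols, and that the alphabet is the set of decimal digits; the names `ALPHABET`, `is_code` and `word` are hypothetical and do not come from the paper.

```python
# Assumed alphabet for illustration only; the paper fixes its own.
ALPHABET = frozenset("0123456789")

def is_code(x, alphabet=ALPHABET):
    """Return True if x is a structured code over the alphabet.

    Assumed model: a single symbol is a code, and any finite tuple
    of codes (including the empty tuple) is again a code.
    """
    if isinstance(x, str):
        return len(x) == 1 and x in alphabet
    if isinstance(x, tuple):
        return all(is_code(part, alphabet) for part in x)
    return False

def word(text):
    """Build a word (a flat sequence of symbols) from a string."""
    return tuple(text)

phrase = (word("12"), word("345"))   # a sequence of words
nested = (phrase, word("6"))         # a sequence containing a phrase
```

In this model the reserved symbols "(", ")" and "," never occur as leaves, which mirrors the requirement that they are not part of the alphabet.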

2.2. Locations

We are going to need a way to navigate through a structured code. This is done using locations. Assume for example the code

We wish to be able to refer to its various subcodes: For example in order to specify the subcode to the extreme left, we may say that it lies in the location of Similarly lies at location and at location We can specify the digit which is the leftmost component of the code written in location of by saying that it lies in location That means the first component of the second component of In location lies and so on. Notice that conventionally lies in location

Thus by the term location we mean a finite sequence of elements of the set to be thought of as describing a part of a structured code that we want to refer to.

We use the Greek letters for locations to avoid confusion.

For simplicity we assume that both environmental input and output are coded in some specially reserved locations of the code.

We are going to establish the ”meaning” of locations, by defining the way to use them. This can be done by the use of two algorithmic procedures and

Let us first describe the inputs of these functions and what they are intended to compute.

Read has two variables. The first one is a location, say and the second one a code, say Read then should output the code written in location of

Replace has three variables. The first one is a location, say the second and third ones are codes, say and The function then should replace the code written in the location of by For example

The following definition of these two functions is somewhat technical (and optional to read); it is included here for the sake of completeness.

Definition 2 (Recursive definition of functions related to locations).

So, let be structured codes and a location. Next we define the functions Read and Replace, recursively on the length of the sequence


For set and by the recursive definition, let Then

For the recursive step of Replace, let as before and

  • if ( and ) then set where

  • else set

Finally, using the recursive definition step, define

Let be a code and a location.

  • The notation stands for a simplified version of

  • Assume that is also a code.

    • If is a location in then by we denote the output of

    • Else if is a location not in the same notation will stand for a new code that results from by adding the new location containing

    In both cases we’ll simplify the notation to c(s) whenever is easily understood.
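The behaviour of Read and Replace can be sketched as follows. This is only an illustration under stated assumptions: codes are modelled as nested Python tuples, a location is a tuple of 1-based indices, the empty location refers to the whole code, and a new location can only be created directly after the last existing component. The function names `read` and `replace` are chosen here for illustration.

```python
def read(loc, code):
    """Return the subcode of `code` written at location `loc`."""
    for i in loc:
        code = code[i - 1]          # locations use 1-based indices
    return code

def replace(loc, code, new):
    """Return a copy of `code` with the subcode at `loc` replaced by `new`.

    If the location does not exist yet and is the next free position
    of its parent, it is created (assumed convention).
    """
    if not loc:
        return new                  # the empty location is the whole code
    i, rest = loc[0], loc[1:]
    if i == len(code) + 1 and not rest:
        return code + (new,)        # add a new location at the end
    return code[:i - 1] + (replace(rest, code[i - 1], new),) + code[i:]

c = (("1", "2"), (("3",), ("4", "5")))
```

With this `c`, `read((2, 1), c)` gives `("3",)`, and `replace((3,), c, ("7",))` returns a code with a third component while leaving `c` itself unchanged.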

2.3. The problems of self-editing.

By the term state code we mean a complete description of a computing system. For example,

  • in a Turing machine, a description of its program, together with the tape on which the machine registers.

  • In a computer program, a description of the program itself together with all the information needed to conclude the next steps of calculation.

We will assume here that, given the state code of a computing system, we can deduce the algorithm (denoted by ) that computes the next step of the procedure that the state implies, so that the next state code of the system is calculated by this algorithm () acting on some data contained in the code We’ll refer to such a step using the term transition of the system.

It is evident that any kind of computation can be described thus, yet this general definition includes the case where in some computational step, the algorithm changes the executable program of its code. Half of the situation (namely algorithms that input their program) is well known in Logic, and useful to a lot of interesting and beautiful proofs therein.

This assumption, although perfectly reasonable, introduces a lot of ambiguity into the question of what such a system is capable of doing. To see a simplified example, if we assume a copying algorithm copy and it happens that then according to our assumption about transitions, the next state would be two identical codes This sounds surprising, and in more complex situations it may become totally confusing. So, in order to simplify things and make the transitions of such a system clearer, we handle this difficulty by assuming that programming instructions carried out by the algorithm have the general form:

(1) Run the code on location of the input


(2) Run the code on location with input on location and output on location

Concluding, we assume that any computation that the system performs is of the general form from state to state where


Below we study some simple transitions of the form (3).

2.4. Location creation

Let be a code. Define the algorithm by

which adds a new location into an existing code internally, and the algorithm

which also adds a new location externally. Assume that and are respectively their codes situated in locations and of the code and let be any of these two locations, thus if


is executed in the sense of (3), it will produce the output of or , namely the creation of a new location.

We will abbreviate this instruction in obvious ways as for example: Create a new location.

Unless otherwise stated, we will assume that the creation of a new location preserves the meaning of any old ones.

It is useful to remark that if the code in (4) is itself situated in a location of we can also run the code in the location with the same effect.

3. Programming and Self-editing.

3.1. Memory

Assume a general transition


where as in (2), is the algorithm

Run the code on location with input on location and output on location

which runs with input (the code)

We wish to transform (5) into


where is a new location (not existing in ), for the simple purpose of keeping track of the previous code. To this end we consider the following alternative form of (2):

(7) Let be the code on location Run with input and output in and let be the result. Next create a new location in and store there Output on location

It is easy to check that if the code of this algorithm is activated, then we’ll achieve the required transition (6).

A transition is called memory storing if there is a location such that Transition (6) is an example of such a transition.

We should remark that transition (6) is very space consuming and is considered only for theoretical reasons, since it is guaranteed that it stores all available information.
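A memory storing transition in the spirit of (7) can be sketched as below. The sketch makes several simplifying assumptions that are not in the paper: a state code is modelled as a Python dict from location names to contents, a code that can be run is modelled as a Python callable, and the helper names `run` and `run_with_memory` are hypothetical.

```python
def run(state, prog_loc, in_loc, out_loc):
    """Form (2): run the code at prog_loc on the content of in_loc and
    write the result to out_loc. Other locations keep their content."""
    new_state = dict(state)
    new_state[out_loc] = state[prog_loc](state[in_loc])
    return new_state

def run_with_memory(state, prog_loc, in_loc, out_loc, mem_loc):
    """Form (7): as run(), but also store the previous state code
    in a new location mem_loc."""
    if mem_loc in state:
        raise ValueError("the memory location must be new")
    new_state = run(state, prog_loc, in_loc, out_loc)
    new_state[mem_loc] = state      # keep track of the previous code
    return new_state

s0 = {"prog": lambda x: x + 1, "in": 3, "out": None}
s1 = run_with_memory(s0, "prog", "in", "out", "mem1")
```

Since every stored state contains all earlier stored states, the memory grows with each transition, which is the space cost remarked on above.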

3.2. Code generators.

We are going to need a general notion of a code generator: Assume that is a finite set of algorithmic procedures Let be a code for A code generator

with lexicon

is a computational system with initial configuration such that:

  1. contains for all in some locations

  2. If is a calculation with then for every is the result of a transition of the form (1) or (2) or (7).

The output of the search is considered or a code in a fixed location of

A set of algorithmic functions and operations will be called adequate if its closure is the set of algorithmic functions. Candidate examples for this definition are any programming language, Turing machines or μ-recursive functions. It is known (see [7]) that all these models are equivalent regarding their closure.

One can see that if is an adequate set of algorithmic functions, then every algorithmic function is (theoretically) the output of a code generator with lexicon The reason is apparent if we use as a lexicon of a programming language. In this case we may simulate the execution of a given instruction of the language by the execution of its code which is situated in a location of the structure If necessary, the input of can naturally be stored in some locations created for this purpose.

What is more interesting here is that the selection of

greatly impacts the probability under which various outputs may occur.

3.3. Parallel computation.

A parallel action of two algorithms on a code denoted by is defined as a sequence of computational steps

where and in every transition algorithms and function as having as an input and one or more locations of as an output. Computational steps are considered behaving according to the following conventions:

  1. Locations on that are not output of any of retain their content in

  2. Locations on that exactly one of outputs, change according to this output in

  3. In the case where output to the same location, the computational step will be considered non-deterministic and thus may contain in this location either ’s output or ’s. For the moment we are not interested in this non-deterministic behavior, so we’ll postpone fixing the details until later.

Parallel computation is not essential, and it can be simulated by sequential computation, yet it greatly facilitates our study, as in the case of memory storing, so we’ll assume hereafter that any transition is a set of instructions of the form (1) or (2).
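The three conventions for a parallel computational step can be sketched as follows. As an assumption made only for this illustration, a state is a dict from location names to contents and each algorithm is a callable returning a dict of the locations it outputs to; the name `parallel_step` is hypothetical, and the conflict case is resolved by a uniform random choice, which the paper leaves unspecified.

```python
import random

def parallel_step(state, algorithms, rng=random):
    """Perform one parallel step of several algorithms on the same state."""
    proposals = {}
    for alg in algorithms:
        for loc, value in alg(state).items():
            proposals.setdefault(loc, []).append(value)
    new_state = dict(state)              # convention 1: untouched locations persist
    for loc, values in proposals.items():
        if len(values) == 1:
            new_state[loc] = values[0]   # convention 2: a single writer decides
        else:
            new_state[loc] = rng.choice(values)  # convention 3: non-deterministic
    return new_state

state = {"a": 1, "b": 2, "c": 3}
f = lambda s: {"a": s["a"] + 1}
g = lambda s: {"b": s["b"] * 10}
result = parallel_step(state, [f, g])
```

Both algorithms read the same input state, so the order in which they are listed does not affect the result unless they write to the same location.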

3.4. Recursors

Let be a transition of a computational system and a common location in By our convention about transitions, is the result of restricted to location For reasons that will become apparent soon below, we are going to identify not taking any action about with copying Obviously these two materialize the same algorithm.

On the other hand it will be very useful to override this behavior by means of a given algorithm that we wish to use to compute Such an algorithm will be called the recursor of or -recursor. A code for such an algorithm will be called also by the same term. Thus given a code the instruction

Use as a recursor

should be thought of as an instruction for the algorithmic procedure

that replaces the contents of in location with as intended.

It should be noticed here that we are going to assume (and use) that the location of a recursor is the same for fixed

We will refer to such a transition by the term diagonalisation of

Let a transition and a recursor. We say that is compatible with if

for all locations such that outputs to the -location. Notice here, that the definition has nothing to do with the actual computation of location of We are mainly interested in the situation in which running the code on the same input will produce the same result.

4. Diagonalisation procedures over memory

Assume is a state code such that the transitions with initial state are all memory storing ones. Thus a calculation beginning with and of length will be of the form


where all locations are new. Let

For any in the range in the transition the active algorithm may be regarded as having as an input the sequence being the ’now’ active state.

A diagonalisation procedure (over memory) is generally an instruction that relates the sort of application of a candidate recursor, based on a memory test to establish validity.

A simple yet basic example of this kind, given a location and an is the following instruction:

(9) If and for all transitions the same code has been used as a recursor, then use it also in the current transition.

In most cases we are going to use (9) as follows: Assume Let be a sequence of transitions, for which is a code which happens to have been used as a recursor for some location in both transitions and If (9) is active during the transition then will be also the recursor of the latter, and recursively (assuming the activation of (9)) of every transition thereafter.

Let us see a more concrete example of the application of (9). Assume is a location in the state code where an integer is written, that is Assume also that the recursor is the non deterministic algorithm with

Thus in the sequence of transitions (8), for every

Presumably, if diagonalisation (9) is active during the transitions and for all happens to be equal to then the same will happen also for deterministically.

Likewise, if for all then also for

It is worth remarking here, for future reference, that if (9) is active, then after transition, locations that incidentally have not been changed will be copied thereafter.

In order to use diagonalisation, we’ll have to combine it with proliferation and selection.

4.1. Proliferation

We assume here a population of state codes. A set of them consisting of say will be denoted by Clearly, the function defined on the set of structured codes as:


is recursive, so if is a code for it, situated in location of some state code its activation on location will produce the transition given by (10).

Finally, given codes situated in locations of a state code the transition

may be produced, and the latter combined with (10) will give


proliferating thus according to a given (by ) calculation of its descendants. Codes will be called also recursors of the corresponding descendants. We should remark here that every separate birth


may be regarded as any other transition we have discussed until now. (The main difference being the switch from not outputting to to copying ) For example, choosing appropriately we may (or, depending on the descendant, may not) assume that transition (12) is a memory storing one, etc.

4.2. Selection.

Hereafter, by the term selection we will mean the act of choosing a subset of the descendant codes in (11) to continue the calculation, excluding the rest. Clearly, such an act may be completely random and therefore contain no information at all.
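Proliferation followed by selection can be sketched in a few lines. The sketch assumes, for illustration only, that a state is a dict, that each recursor is a callable computing one descendant from the parent, and that the environment is a predicate on descendants; `proliferate`, `select` and the sample recursors are hypothetical names and values.

```python
def proliferate(parent, recursors):
    """Return one descendant per recursor, each computed from the parent."""
    return [recursor(parent) for recursor in recursors]

def select(descendants, survives):
    """Keep only the descendants accepted by the environment."""
    return [d for d in descendants if survives(d)]

parent = {"out": 1}
recursors = [
    lambda s: dict(s),                        # copy the parent unchanged
    lambda s: {**s, "out": s["out"] + 1},     # increment the output
    lambda s: {**s, "out": s["out"] * 10},    # scale the output
]
children = proliferate(parent, recursors)
survivors = select(children, lambda s: s["out"] == 2)
```

Here the selection predicate is deterministic; a random predicate would fit the definition equally well but, as noted above, would carry no information.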

4.3. Recursor validity.

Given a state code and a location an instantly valid recursor for is a code such that its activation as a recursor on is environmentally preferable.

A valid recursor (for ) is one such that for every surviving sequence is instantly valid for

In the special case where may be omitted from the previous terminology.

4.4. Effect of diagonalisation

Before passing to explore further instructions like (9), let us indicate how we are planning to use them: Assume a proliferating population of self-editing codes that undergo some sort of logical selection. Even though we do not understand anything about the nature of such a selection, we might be able to help such a system by trying to fill in the dots in a surviving sequence of descendants:


The reason is that with the aid of an appropriate structure we could be able to guess a pattern in the surviving sequence (13) that reveals the preferences of this logical selection. The point is that a self-editing algorithm can fill in the dots the same way as we would, and a very simple example of this is exactly (9). We will leave the consequences of filling in (13) for the examples that are presented after the diagonal instructions (16), (17) and (21).

Assume a proliferating transition


Let also be a location. We will call (14) searching, if the values vary according to If (14) is searching, every or will be called a attempt, and if the code has been used as a recursor for computing will be called attempting recursor.

If (14) is not searching, it will be called ruled, and in the case where in the calculation (14) the same recursor has been used for every descendant then we’ll call it ruled by In this case, is called a -ruler or a -stabilizer. So, if (14) is ruled by then

Let now


be a sequence of transitions. We will call (or ) a searching point of the sequence if the transition induced by is searching.

Consider now the following variation of (9):

(16) Given a location if there are more than -searching points in the memory and for every searching point the same code has been used as an attempting recursor in the transition then hereafter use as a ruler.

Let us explore the use of (16): Assume that for some code

  1. The algorithm

    Run on location and output to

    is valid.

  2. itself is attempting in every searching point.

Due to (1) and (2) above, it is expected that in every searching point of surviving branches (15), has been used as an attempting recursor. So granting that (16) is activated, after enough ( precisely) searching points for the system will continue using as a ruler.

The following is a straightforward consequence of the above principle:

Example 1.

In an evolving system, assume a location where copying is valid. If there are copying attempts for and moreover the system diagonalises by (16), then the system can learn to copy
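The test performed by instruction (16) can be sketched as follows. The record format is an assumption made for this illustration: the memory is summarised as a list with one entry per searching point, holding an identifier of the attempting recursor that produced the surviving descendant. The function name `diagonal_16` and the threshold parameter name `n` are hypothetical.

```python
def diagonal_16(attempt_history, n):
    """Return the code to be used as a ruler, or None if (16) does not apply.

    attempt_history: identifiers of the attempting recursors used at the
    searching points found in memory, oldest first (assumed format).
    n: the threshold on the number of searching points.
    """
    if len(attempt_history) <= n:
        return None                  # not more than n searching points yet
    first = attempt_history[0]
    if all(code == first for code in attempt_history):
        return first                 # the same recursor survived every time
    return None
```

In the setting of Example 1, a history such as `["copy", "copy", "copy", "copy"]` with `n = 3` would promote `"copy"` to a ruler, whereas a single differing entry would leave the transition searching.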

5. More diagonals.

There is no reason to restrict our study to the basic diagonal (16). For example, and for future use also, we can replace the search of a consistently repeated partial recursor, by the search of a consistently repeated sequence of partial recursors. We do not intend to investigate this now, instead we would like to introduce a more interesting kind of diagonal instruction which reads as follows:

(17) Given a location if there are more than -searching points in the memory, search to find a code such that for every searching point If you find such an use it hereafter as a ruler.

There are various parameters to consider about an actual use of the previous instruction:

First of all we should fix Secondly, we should fix the program to be used for the computation of such an And lastly we should also put a limit on the search, otherwise we run into the possibility of searching in vain for something that simply does not exist (or is practically very difficult to find). We won’t deal with these problems right now. Instead we are going to assume a basic functionality of (17), just enough to produce some simple codes. We’ll refer to such a code by this same term, simple code. Thus, a simple code is a code that we strongly expect, in a probabilistic setting, to be produced by the procedure used in (17) to compute

Obviously, we can generalize the discussion after (16), as follows:

Assume that is any simple code and any location such that

Run on and output to

is valid. Then by means of (17) the evolving system can learn to use as a ruler.
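One possible reading of instruction (17) is a bounded search for the shortest composition of lexicon entries that agrees with everything observed at the searching points. The sketch below is such a reading and not the paper's procedure: the lexicon, the representation of searching points as (input, output) pairs, the bound `max_len` and the name `search_17` are all assumptions introduced for illustration.

```python
from itertools import product

# Hypothetical lexicon of basic codes acting on integers.
LEXICON = {
    "add1": lambda x: x + 1,
    "copy": lambda x: x,
    "double": lambda x: 2 * x,
}

def search_17(pairs, n, lexicon=LEXICON, max_len=3):
    """Search for a code that fits every searching point.

    pairs: (input, output) observations at the searching points.
    Returns a tuple of lexicon names, applied left to right, or None
    if there are too few points or nothing fits within max_len.
    """
    if len(pairs) <= n:
        return None
    for length in range(1, max_len + 1):          # shorter codes first
        for names in product(sorted(lexicon), repeat=length):
            def candidate(x, names=names):
                for name in names:
                    x = lexicon[name](x)
                return x
            if all(candidate(a) == b for a, b in pairs):
                return names
    return None                                   # the search is bounded

found = search_17([(1, 4), (2, 6), (5, 12)], n=2)
```

Enumerating by increasing length is one way to make "simple" precise, and the bound `max_len` plays the role of the limit on the search mentioned above.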

5.1. Some discussion about validity

Before passing to demonstrate potential uses of (17), we should notice that obviously (17) is not valid, since for every finite set of pairs of transitions


there are infinitely many algorithms such that Nonetheless we are going to use its flexibility in order to detect valid transitions.

It is interesting here to notice that everything works very well when The reason is that if is a valid recursor for and any code with different algorithm, then for sufficiently large number in almost every set of at least pairs as in (18) with

we expect that for at least one

It can be easily seen that this observation yields that for any valid recursor and any code generator with an adequate lexicon, there is a (sufficiently large) such that it is expected that (or a code generating the same algorithm) is the simplest code that fits a total of -searching points.

5.2. Understanding implications empirically

We should notice here that we may use (17) with much more flexibility, for example for any code of a valid algorithm of the conditional form

if then

granting that is simple. Let us formalize this a bit better:

Notice first that if condition is not parametrized, then it is assumed that it is always either true or false, and we are interested in more complex situations. Therefore assume that and its validity depends upon Assume that is a code such that

About condition let be a code of a recursor, such that

It is obvious that we choose here to translate the validity of condition into the validity of a recursor. This is done for convenience, since it is going to be more useful in this form. Anyway we may reconstruct the full meaning of understanding assuming that the activation of the code simply means that the system understands that is correct. So the target of our mental experiment is to see whether

(19) If then activate

can be established by means of (17). In order to continue, let us compare this to a real life test: For it is not always the case that an intelligent being could empirically establish (19) even if it is valid. It is easy to check that this is so, granting that one notices being interested in the validity of We are going to translate this condition into our system: The analogue of ’interested in the validity of ’ would be to use (17) for the location of activation of the code

For ’noticing ’ the translation would then be to run the code say in location with input in location

It is easy now to check then that a code for the algorithm:

Run the code in location with input in and if the output is 1 then activate the code in location

satisfies the conditions in (17) to be validated to be used as a ruler (if simple).

5.3. Generalizing

We are going to investigate here if (17) may be used to generalize. So let us consider as an example a valid expression of the form:

By replacing as before the occurrences of and with the codes and we are going to assume that for a sufficient number of elements the system has experienced that if then is a valid recursor. Assuming that is the recursor that was randomly used in such cases, and as before, it can be easily seen that the code

Run the code in location with input in and if the output is 1 then activate the code in location with input in

if simple, it will be validated by (17) to be used as a ruler.

5.4. A self-editing example

A surprising consequence of self-editing and diagonalization is that an established diagonal instruction may establish another diagonal instruction granted that the second one has a simple code. This follows since a diagonal instruction may be described as a general algorithm of the form

where involves a test of over memory content and involves using as a recursor. As we discussed earlier a diagonal instruction need not be valid, yet if practically useful it should be almost valid in the sense we also discussed earlier. Thus this could be thought of as the case of a generalization over an implication and we have discussed both procedures in the previous examples.

An interesting example of this kind is a diagonal scheme that may enrich the lexicon used by a state code with memory and it reads as follows:

(20) Given and assume that is true for a proportion of -searching points Then use as a recursor with probability

6. Localized diagonals

We are interested here in diagonals that follow by testing a recent (usually short) part of memory. The action to be taken is to use a candidate code as a recursor according to some predetermined frequency. In the general case we assume that this is calculated by an algorithm of code which inputs a positive integer (which is supposed to be the number of -attempts) and outputs a positive integer (which is the number of exceptions to the rule). Thus the intended meaning of

Use as a recursor with exception frequency

is to ensure that in a proliferating transition with -attempting recursors

the number of ’s such that is the output of on input

To begin with, assume that is the memory input of the algorithm of a state code and let be a small positive integer. A localized diagonal, given a candidate recursor and an exception frequency code (or their locations), is of the general form:

(21) Given a location if there are more than -searching points in the recent memory, search to find a code such that for every searching point and use it hereafter as a recursor with exception frequency

Next, we are going to investigate through some examples the use of (21). We are going to assume as before that the candidate recursor in this instruction, is the result of some search using an appropriate lexicon of codes. In all examples that follow, we set yet we plan to initiate a discussion about an optimal value.
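The two ingredients of (21), a test over a recent part of memory and the use of a recursor with an exception frequency, can be sketched as below. Everything concrete here is an assumption for illustration: the memory is a list of (input, output) pairs, `window` is the length of the recent part, candidates are given as (name, callable) pairs, and the exception frequency is a callable from the number of attempts to the number of exceptions. The names `localized_diagonal_21` and `plan_descendants` are hypothetical.

```python
def localized_diagonal_21(memory, n, window, candidates):
    """Return the name of the first candidate that fits the recent
    searching points, or None if there are too few or nothing fits."""
    recent = memory[-window:]
    if len(recent) <= n:
        return None
    for name, code in candidates:
        if all(code(a) == b for a, b in recent):
            return name
    return None

def plan_descendants(ruler, alternatives, m, exception_frequency):
    """Assign recursors to m descendants: the ruler everywhere except
    for exception_frequency(m) descendants, which use alternatives."""
    k = min(exception_frequency(m), m)
    exceptions = [alternatives[i % len(alternatives)] for i in range(k)]
    return [ruler] * (m - k) + exceptions

memory = [(0, 5), (1, 2), (2, 3), (3, 4)]
candidates = [("copy", lambda x: x), ("add1", lambda x: x + 1)]
ruler = localized_diagonal_21(memory, n=2, window=3, candidates=candidates)
plan = plan_descendants(ruler, ["copy", "double"], m=5,
                        exception_frequency=lambda m: m // 5)
```

Because only the last `window` points are tested, the old observation `(0, 5)` does not prevent `"add1"` from being chosen, which is the sense in which the diagonal is localized.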

6.1. Simple lessons in a sequence

6.1.1. A simple example

In this example, we’ll assume a location called environmental output in which we wish to test the ability of the system to deduce and fill the gaps of given structured sequences. Such a simple test would be for example to fill the fourth place in the sequence We will make some conventions about this kind of test, which have to do with the initial lack of our system’s capability to communicate.

The main convention is that instead of giving the first digits of the sequence to be filled, we will require the system to guess the entire sequence. We will call the digits that we want to reveal, such as the digits and above, digits not intended to be guessed. The above convention has two aspects:

First, we will have to give the system some attempts to guess a digit which is not intended to be guessed, for example the very first digit. Doing so, we will identify attempts to guess a particular digit with attempts (as they have been already defined) of proliferating transitions. This convention is established in order to avoid more complex situations, which we’re going to deal with later on.

Secondly, we’ll take for granted that (somehow) at least one of the descendants always guesses right about a digit not intended to be guessed.

Since we would like to deal with structured sequences, we will often use left and right parentheses to denote the beginning and the end of a sequence respectively. Parentheses are considered also part of the sequence, so they have to be guessed, intended or not.

Let us begin with describing the mental experiment of a simple sequence as We will need to formulate a language to speak about the parts of such tests and fortunately we have it already: We will refer to the first digit of the sequence as the digit in location of the test etc. In this experiment we assume that we begin with a state code with (the digit in location of the test) written in location environmental output of . Since now (the digit in location of the test) is not intended to be guessed, we will assume that there is a proliferating transition after which gives (at least one) descendant with written in environmental output. The same assumption holds true for the next digit, and at this point we have come up with a sequence where writes in location environmental output. Granting (21) and assuming that the code for the function is simple, all descendants of (except many) will write in the corresponding location, signaling the guessing of the fourth digit.

Let us make some first remarks:

  1. The example functions equally well if we would like two or more digits to be guessed.

  2. Exception frequency does not play a major role here. We will see its use later on.

6.1.2. A more complicated example of the same kind

Let us try to generalize the previous example, by demanding a sequence of sequences of guesses. This will enable us to see what happens both with the exception frequency and with the use of parentheses. So let us assume a simple example of this form:


We may use here also, the terms test or experiment to refer to the various experiments included in a composite one like this, so for example test of (22) is and experiments are defined similarly and by experiment we mean all of (22).

The intended successful outcome of this test in all experiments, is the guess of digits after and these can be thought of as in the previous example. Yet we have here the chance to consider a far more serious problem to be solved, namely the guess of the experiments for According to the previous terminology, this is the experiment.

So, for the guess in , let be the recursor location of location environmental output. Then, during experiment, it is expected that the code written on is a code for Similarly, for and and are expected to be written in the latter being the codes for Add 2 and Add 3 respectively. Assume now in addition that the code of the number is, (as it is actually written here,) in some location of that is in a sublocation of Since it is assumed that Add 1 has a code that is simple, the same diagonal (21) will guess the value for the same location in experiment, so that (not counting parentheses) the guessed fourth sequence is expected to be

Again, if Start with 1 has a simple code, then, since in the first three cases the sequence starts with granting again the localized diagonal (21), the expected value of will again be resulting in the sequence

as wished.
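The two-level guess described above can be illustrated by a short sketch. It is only a simplified model under stated assumptions: each inner experiment is taken to be produced by a code "Start with s, then Add k", the parentheses are not modelled, and the concrete digits, the helper constant_step and the function guess_next_experiment are hypothetical. Only the codes Start with 1 and Add 1, Add 2, Add 3 come from the text.

```python
from typing import Optional, Sequence


def constant_step(values: Sequence[int]) -> Optional[int]:
    """Return the common difference of `values`, or None if it is not constant."""
    diffs = {b - a for a, b in zip(values, values[1:])}
    return diffs.pop() if len(diffs) == 1 else None


def guess_next_experiment(experiments: Sequence[Sequence[int]]) -> Optional[list[int]]:
    """Guess the next inner sequence from the codes of the previous ones."""
    starts = [e[0] for e in experiments]
    # The number k of the code "Add k" that explains each inner experiment.
    steps = [constant_step(e) for e in experiments]
    if any(k is None for k in steps):
        return None
    # Second level: a simple code acting on the numbers stored in the codes.
    start_rule = constant_step(starts)  # 0 means the start value is copied
    step_rule = constant_step(steps)    # 1 means "Add 1" is applied to k
    if start_rule is None or step_rule is None:
        return None
    next_start = starts[-1] + start_rule
    next_step = steps[-1] + step_rule
    length = len(experiments[-1])
    return [next_start + i * next_step for i in range(length)]


if __name__ == "__main__":
    # Hypothetical experiments coded by Add 1, Add 2 and Add 3, all starting with 1.
    seen = [[1, 2, 3, 4], [1, 3, 5, 7], [1, 4, 7, 10]]
    print(guess_next_experiment(seen))
```

With these illustrative digits, the numbers 1, 2, 3 stored in the three codes are themselves explained by Add 1, so the guessed fourth experiment is the one coded by Add 4, again starting with 1.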

Before we examine this approach further, let us make some remarks about the use of parentheses. First notice that the use of the exception frequency in the localized diagonal (21) becomes apparent here. Indeed, assume that in the guessing of test the digits in up to are intended to be guessed. As a first case, suppose that the numbers are not fixed. In this case, the exceptions of the frequency ensure that at this point any state code will produce both descendants that guess the next digit and some exceptions to this rule, in order to ensure survival. Here, since the digit is not intended to be guessed, according to our conventions we take for granted that in some attempt at least one descendant writes the right digit in environmental output, which in this case is a right parenthesis Up to now this convention holds for all experiments. Yet, evidently, we would like this to be guessed by our system as well.

To investigate this, assume a location that contains a code to be executed for the calculation of the descendants following the exception frequency of (21). By the same arguments, let be a code that changes this behavior into writing in environmental output in the case of an exception, and assume that is simple. By the localized diagonal (21) again, will apply to the code in causing (almost) all exceptions to write the (correct) right parenthesis digit after experiment

Similar arguments may be used to handle various other cases of such guessing, and it is interesting to analyze a few more. Assume, for example, that the number of guesses to be filled in any experiment is fixed. Again, if a code that forces this is simple and the first few experiments have been successful, then diagonal (21) will ensure the application of this code, thus guessing the number of guesses to be filled in the experiments.

6.1.3. Guessing in higher ranks

We may rank experiments according to the rank (level) of the recursor that is intended to be used. So the experiment in the first example is of rank whereas that of the second is of rank More accurately (for this example), the rank of an experiment may be defined as the rank of the tree of the experiment’s locations.

It is evident that the ideas about guessing an example of rank can be generalized further to guessing in higher ranks if necessary. An interesting question that arises naturally is whether guessing based on diagonals is closed under recursion experiments of greater than finite rank. As we see below, this should not be thought of as an extreme theoretical case.

Consider, for example, the case of a sequence to be filled of the form where is an unbounded sequence of integers. For example let Since a sequence to be filled has to be somehow "logical", it should be expected that for the intended to be guessed This is much more complicated and will not really be examined here, yet let us give some general directions, since they also contain some useful discussion:

Inductively, let be the sequence of state codes that solve experiments respectively. Then there is a substructure of that has been created exactly for solving the respective experiments: the local program, so to speak, for producing respectively. Since our algorithm has to guess digits that are not intended to be guessed in order to pass from solving to solving we will have to assume that and are the surviving branches of attempts to solve the problem.

We have several options here regarding both the input memory sequence of the localized diagonal (21) and the location to which it applies.

Ideally, we would like to have in place of the memory input the sequence and in place of the location that the localized diagonal applies to, the (one) location of the programs that solve respectively and The same argument would then apply to that location a simple code that stabilizes the sequence and hopefully this would be the required code to guess

There are nonetheless problems with the above scenario in two places. The first is that it requires a structured memory and not just the sequence of transitions. Until now this has been handled by the simplicity of the experiments: our assumption for solving experiments is that have to be the successors of the researching points of in other words, that the transition has exactly one researching point. Yet it is easily seen that in a more complicated and general experiment there might need to be more than one researching point between the two successful states and Thus it is evident that for more complex experiments we will need to consider a more sophisticated memory mechanism. Notice, for later use, that this could, somewhat surprisingly, be solved by assuming that we can ensure that memory mechanisms can evolve to be more successful (something we intend to discuss later on).

The second problem, that of location naming, can be treated in a similar manner, distinguishing a simple case and a non-simple one. In the simple case, we may assume that the recursor may be tested to location, that is on themselves. Obviously, this diminishes the efficacy of the search, yet on the other hand it does not require more sophisticated mechanisms. In the non-simple case, we would need more sophisticated mechanisms for naming substructures, which we have not discussed. The previous remark applies here as well: if we could ensure that "naming" can become an evolving mechanism, then the problem would be solved automatically.

Notice that in all cases the recursor that solves a sequence of unbounded rank has to be constructive, in the sense that it has to create new locations as recursor locations of old ones. This remark, on the other hand, suggests that diagonalisation in general may also occur in creating locations (and thus in naming).

6.2. Smoothly changing environment (yet another example)

This example is similar to the previous one, the main difference being that we omit the parentheses. It is intended to be applied in combination with a smoothly changing environment in order to check the ability of our system to adapt to it. We isolate a single changing parameter, such as temperature, and assume that our evolving system has to adapt accordingly. The assumptions about its ability to adapt are also simplified to the following: the adaptation takes place by editing a corresponding number in the state code. For our example assume that is such a location in which contains an integer. Adaptation is thought of as raising, lowering or just keeping the same number in As in the previous example, this can be done using proliferating attempts. Thus, no sensing of this particular environmental parameter is assumed, other than the life or death of the descendants. We may and do categorize such a single-parameter smoothly changing environment into degrees by considering the rate of change of the variable: Let


be the sequence of the values that should be guessed in The rate of change of that sequence is defined to be the sequence

A first degree smoothly changing environmental variable is one whose rate is constant.

It is easy to see that by (21), a system can guess a first degree smoothly changing environmental variable which is named in some location of an evolving sequence of codes: For if are the values of this variable that should be guessed in a location granting that the code of

is simple, then since the sequence is of first degree, for every so that the application of (21) with code should guess the intended to be guessed sequence.
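As a worked illustration of the first degree case, one natural reading can be written out explicitly. The symbols $a_n$ (the values to be guessed), $r_n$ (their rate) and $d$ (the constant rate) are assumptions introduced only for this illustration, as are the numerical values in the comment.

```latex
% Assumed notation: a_n are the values to be guessed, r_n is their rate of change.
r_n = a_{n+1} - a_n, \qquad r_n = d \ \text{ for every } n
\quad\Longrightarrow\quad a_{n+1} = a_n + d .
% Hypothetical instance: the values 20, 23, 26 have rate 3, 3,
% so the guess for the next value is 26 + 3 = 29.
```

Under this reading the simple code is "add $d$ to the number in the location", and applying it once more to the last surviving value gives the next guess.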

Similarly, a second degree smoothly changing environmental variable is one for which the rate of its rate is constant. Let us continue the previous example into an example where the sequence

has to be guessed in and is smoothly changing of the second degree. So, if (using the same code as above) where is the sequence of recursors of the location which is situated in some location then the sequence is of first degree and in some sublocation of so again by (21) the sequence can be guessed.

It is clear that we can generalize the notion of first and second degree to an arbitrary th degree. The previous example modified accordingly can show that (21) can be used also to guess a variable that is locally smooth, i.e. consists of non-disjoint intervals where the variable is smooth for some
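The notion of degree can be made concrete with repeated differences. The following sketch is only an illustration under assumptions: it takes the rate of a sequence to be the sequence of differences of consecutive values, and the function names differences, degree and guess_next are hypothetical. It does not model locations, recursors or the exceptions of (21).

```python
from typing import Optional, Sequence


def differences(values: Sequence[int]) -> list[int]:
    """Rate of a sequence, read here as differences of consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]


def degree(values: Sequence[int]) -> Optional[int]:
    """Smallest n such that the n-th rate is observed to be constant, else None."""
    current = list(values)
    n = 0
    # At least three entries are needed so that the next rate has two entries
    # and its constancy is actually observed rather than trivial.
    while len(current) >= 3:
        current = differences(current)
        n += 1
        if len(set(current)) == 1:
            return n
    return None


def guess_next(values: Sequence[int]) -> Optional[int]:
    """Guess the next value of a sequence that is smooth of some degree."""
    n = degree(values)
    if n is None:
        return None
    # Collect the last value and the last entry of each rate up to the n-th.
    lasts = []
    current = list(values)
    for _ in range(n + 1):
        lasts.append(current[-1])
        current = differences(current)
    # The constant n-th rate extends each lower rate by one step in turn.
    return sum(lasts)


if __name__ == "__main__":
    print(degree([5, 8, 11, 14]), guess_next([5, 8, 11, 14]))
    print(degree([0, 1, 4, 9, 16]), guess_next([0, 1, 4, 9, 16]))
```

In the second call the rate 1, 3, 5, 7 is itself of first degree, which mirrors the argument above: guessing the rate by a simple code is enough to guess the original sequence.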

Obviously, our discussion assumes that at the beginning the evolving system knows the correct value of the variable situated in A very interesting question that arises here is whether beginning with a false value would have the same effect.

So assume as a simple case that the sequence to be guessed is constant and our system guesses meaning that is written in In order to solve this using (21), we will assume that the system has information about the distance from the correct value in the following manner: units that better approximate this value are more likely to survive. We have to consider two similar cases here:

The first is that (the second being similar). This implies that descendants that compute a higher value in have a greater probability of surviving, so that we may expect that location of the surviving descendants contains a value greater than If it is still the case that then again a still greater value is needed, and we will assume that is computed randomly into the location of at least one descendant. Using (21) now for we conclude that the system will continue to produce still greater values thus guessing that the correct value is higher. Clearly, this is not always the case, yet it seems to be the best strategy among all such cases, since it is the obvious (for us) way to fill a sequence It is evident that, on approaching the correct value, the exception is going to be used to compute a descendant that guesses (randomly) the correct value for This is also to be expected, since there is no other information about the correct guess. Yet if a correct value is guessed, then again by (21) the system can guess that it is constant, by copying it. (As always, we have to assume that copying has a simple code.)
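The behaviour described here can be imitated by a very reduced sketch. It rests on strong simplifying assumptions that are not part of the text: survival is made deterministic (the descendant closest to the correct value is the one that survives), each unit produces exactly three descendants by the hypothetical edits -1, 0 and +1, and the names EDITS and adapt are invented for the illustration. The target value is used only to decide survival and is never read by the units themselves.

```python
from typing import Sequence

# Hypothetical edits applied to the number in the location: lower, copy, raise.
EDITS: Sequence[int] = (-1, 0, 1)


def adapt(start: int, target: int, generations: int) -> tuple[list[int], list[int]]:
    """Return the surviving values and the surviving edits, one per generation."""
    value = start
    values = [start]
    edits = []
    for _ in range(generations):
        # Proliferating attempt: one descendant per edit.
        descendants = [(value + e, e) for e in EDITS]
        # Selection, simplified to be deterministic: the closest one survives.
        value, edit = min(descendants, key=lambda d: abs(d[0] - target))
        values.append(value)
        edits.append(edit)
    return values, edits


if __name__ == "__main__":
    values, edits = adapt(start=3, target=7, generations=6)
    print(values)
    print(edits)
```

Starting from the false value 3 with correct value 7, the surviving edit is +1 for four generations and then the copying edit 0, which corresponds to the two phases of the argument: first guessing that the correct value is higher, then guessing that it is constant.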

For simplicity we have constructed the example using integers, yet it is obvious that what we really need is an appropriate value to use as a unit when trying to guess the correct value for Finding the best to be used is not treated here, yet it should be noted that diagonalisation could also play an important role: for, assuming that the value that is used is stored in a location we may also apply the same strategy as in the example to guess the correct number. Codes that we may use are, for example, to divide or multiply by the number in instead of adding or subtracting

Let us note that we may use diagonalisation by (21) to guess the correct (fitting the example) number in this same instruction, a classical effect of using self-editing. (We just have to use the instruction on the location of )

6.3. A crucial example: Memory sequence

Up to now we have assumed, for theoretical purposes, that memory storing is active at every transition. Obviously this is of use only for theoretical study: in practice, memory length simply cannot grow forever. So the question arises naturally: how can one compute the memory length necessary to accomplish the specific task at hand?

A related and more complicated problem about memory is the choice of locations that can serve for researching and diagonalising.

Due to the self-editing properties, we could handle both problems using diagonalisation.

For the memory length problem, it suffices to assume that the length of memory is recorded within the structure in some location. Then researching and diagonalisation in that same location will ensure, as in the previous examples, the adaptation of the memory length to the specific task at hand.

The same holds true for the second problem. For, assuming that every location is assigned a number indicating the likelihood of researching or diagonalising, then again due to the self-editing property, researching and diagonalisation can occur at the location of that number. The result would be that the probability with which this particular location is researched or diagonalised will adapt in the same manner as in the previous examples.

7. Some discussion about intelligence

Let us follow a simple real life example about intelligence:

Assume that a certain businessman runs an information technology company. A very common problem at the company is a well-known one, namely the famous halting problem:

Some program codes sometimes, under some inputs, run forever. We won’t make things more complicated, yet assume that decides to hire to investigate this matter. Now here is the point: although ’s job sounds interesting, it is not at all a job that a computer can do, in the sense that it is well known that there is no program for doing so. ’s job, on the other hand, does not sound like something that could not be a job for somebody.

The example implies that we cannot expect to find any fixed program that acts as an intelligent being would; nor could we expect that any of the diagonal instructions considered so far could serve as a program for being intelligent. The main intention was rather to show their potential to support a great number of steps in a far more complicated and sophisticated procedure, which we will try to investigate further in Part II.

