How swarm size during evolution impacts the behavior, generalizability, and brain complexity of animats performing a spatial navigation task

While it is relatively easy to imitate and evolve natural swarm behavior in simulations, less is known about the social characteristics of simulated, evolved swarms, such as the optimal (evolutionary) group size, why individuals in a swarm perform certain actions, and how behavior would change in swarms of different sizes. To address these questions, we used a genetic algorithm to evolve animats equipped with Markov Brains in a spatial navigation task that facilitates swarm behavior. The animats' goal was to frequently cross between two rooms without colliding with other animats. Animats were evolved in swarms of various sizes. We then evaluated the task performance and social behavior of the final generation from each evolution when placed with swarms of different sizes, in order to evaluate their generalizability across conditions. We find that swarm size during evolution matters: animats evolved in a balanced swarm developed more flexible behavior, higher fitness across conditions, and higher brain complexity.




1. Introduction

When watching swarms in real life, people often assume a global intelligence behind the swarm behavior; e.g., a flying flock of birds may seem to behave like a single organism (Garnier et al., 2007). However, we now know that individuals in a swarm often act based on local rules to achieve global goals (Reynolds, 1987). This principle underlies the development of dedicated algorithms to solve single- and multi-objective optimization problems, like Particle Swarm Optimization (PSO) or Ant Colony Optimization (ACO). Again, from the outside perspective, these abstract, virtual organisms seem to exhibit swarm behavior, but their most basic rules are predefined by the optimization algorithms; e.g., in ACO all actions are predefined by the algorithm and its parameters (Dorigo et al., 1996). The observed complexity of swarm intelligence is then a result of the optimization process through interaction with the environment (Ilie and Badica, 2013). Using machine learning approaches, it is also possible to evolve such swarm behavior without the need for a predefined algorithm that controls the organism, which means that the hard-wired decision rules of the aforementioned algorithms are replaced by unsupervised learning techniques (Stanley et al., 2005; Olson et al., 2012; König et al., 2009).

Being able to evolve swarm behavior brings up new questions, e.g. about the effects of swarm size on evolution and how swarm size during evolution influences the organisms’ decision rules. While in the scope of biology there are several studies on group-size effects in swarms and optimal group size for different species (Pacala et al., 1996; Brown, 1982), it is hardly feasible to conduct studies spanning evolutionary time-scales. Here, computational approaches using Evolutionary Algorithms (EA) resembling evolution in nature provide new tools to address these questions. However, since the cognitive decision rules of adaptive artificial organisms are not hard-wired but evolved over time, they are readily observable but often hard to interpret.

In this study, we want to advance on these open questions by evolving animats (artificial animals with the ability to produce specific motor reactions to sensory signals and to maintain internal states (Wilson, 1985)) with swarm behavior and analyzing their internal ‘brain’ states and decisions. We hypothesized that swarm size could have a great impact on the evolution and social interactions of the animats. For this reason, we designed a virtual experiment to test the effect of swarm size on the animats’ evolution and, moreover, to assess the generalizability of their evolved behavior when performing in swarms of different sizes. In particular, we simulated and evolved groups of animats equipped with Markov Brains (MB) (Hintze et al., 2017) in a novel spatial-navigation task environment that facilitated swarm behavior. As observed in previous work (Dorigo et al., 2004; Trianni et al., 2003), we found that the final environmental task fitness during evolution depended negatively on swarm size: task difficulty increased with swarm size as the 2-dimensional task environment became more crowded. In addition, however, the evolved animats showed significant differences (in 9 out of 10 pairwise tests) in the generalizability of their fitness when placed in different-sized swarms, which peaked for animats evolved in swarms of medium size. Interestingly, animats evolved in medium-sized swarms also evolved more complex, integrated brain structures, as evaluated by measuring the largest strongly connected component of their MBs' networks.

2. Related Work

Adaptive animats equipped with MBs were first introduced by Edlund et al. (Edlund et al., 2011). In several works, Olson et al. used these animats to investigate the evolution of swarm behavior in a predator-prey (co-)evolution environment (Olson et al., 2016, 2013; Olson et al., 2012). By contrast, swarm behavior in the present study emerged directly as a result of the implemented selection rule (see below). The cognitive setup of animats used here to study group evolution and behavior most closely resembled the MBs described by Marstaller et al. (Marstaller et al., 2013), whose animats could move left or right and were evolved to solve a temporal-spatial integration task. The same type of task and animats were also used later by Albantakis et al. (Albantakis et al., 2014), who evolved single animats in environments that required different degrees of context-dependent behavior and memory in order to investigate the evolution of integrated information (Oizumi et al., 2014), a measure of brain complexity developed to capture the quality and quantity of consciousness in organisms.

A different approach to the evolution of artificial swarm behavior is the neuroevolving robotic operatives (NERO) video game combined with real-time neuroevolution of augmenting topologies (rtNEAT) (Stanley et al., 2005; Miikkulainen et al., 2012; Karpov et al., 2015). Similar to Olson et al. (Olson et al., 2016), they also evolved swarm behavior in a predator-prey scenario, but with a learning technology based on artificial neural networks (ANN). Miikkulainen et al. (Miikkulainen et al., 2012) reviewed the work on neuroevolution and discussed future research topics in this field. They concluded that cooperative multi-agent systems are the next frontier of neuroevolution and that research in this field is still at an early stage. Another alternative to animats equipped with MBs could be Intelligent Distribution Agents (IDA) (Franklin et al., 1998) or, with an added self-learning component, Learning IDA (LIDA) (Franklin and Patterson, 2006; Franklin et al., 2012). The goal of this architecture was to develop a model of human cognition to investigate questions about the human brain and to apply the agents in real-life communication with humans.

Hamann (Hamann, 2014) showed that swarm behavior in general, and the type of swarm behavior in particular, depends on the density of the swarm. In his work, he used a one-dimensional changing environment to influence the evolution. His fitness values are not comparable to ours, since he rated the predictability of future states, not the behavior. Dorigo et al. implemented the evolution of self-organizing swarm-bots and also investigated changing group sizes (Dorigo et al., 2004). They showed that it is easier for smaller groups to organize themselves than for larger groups. An earlier work considered different numbers of agents in the environment (Trianni et al., 2003) and also demonstrated that fitness decreases with increasing group size in a similar self-organization task.

Apart from the above research on artificial systems, several studies have investigated different swarm sizes in biological systems. Pacala et al. (Pacala et al., 1996) argue, using the example of ants foraging on food sources or maintaining the nest, that a variation in swarm size implies that the organisms transfer different information and perform different tasks. They mention that larger swarms can be more efficient than smaller ones, but that very large swarm sizes can also be disadvantageous. In general, swarm behavior is the result of individual interaction with the environment combined with social interaction. Earlier, Brown (Brown, 1982) presented work on the threshold of sociality: the willingness of an organism to join a swarm depending on environmental qualities and swarm density. Here, optimal swarm size is expressed as a compromise between the advantages gained by sharing costs and the disadvantages arising from the faster loss of resources.

3. Methods

In order to design an environment in which the animats are able to evolve swarm behavior and allow for efficient analysis of their behavior and internal states, we have identified the following constraints to frame our model: (1) Animats must be able to co-exist (multiple organisms in one environment). (2) Respecting other animats (non-egoistic behavior) should help to gain higher fitness. (3) The task should be simple enough to be solved by animats with only a small number of sensors, motors and hidden nodes.

In this section, we describe the three main simulation components: (1) the animat design, (2) the 2-dimensional, grid-based environment design, and (3) the EA’s fitness function. The EA was configured using the MABE (Modular Agent-Based Evolver) framework (Clifford Bohm, Nitash C. G., 2017) for digital evolution. If not specified otherwise, we used MABE’s default parameters throughout, which can be reviewed in supplementary material A.1.

3.1. Animat Design

Each animat used in this simulation contains a set of 2 sensors (one for walls, one to detect other animats), 2 motors, and 4 hidden memory nodes. Figure 1 shows a schematic of the animats’ architecture.

Figure 1. Animat architecture. The green triangle/circle marks the sensor for the wall/other animats, the yellow circles mark hidden nodes and red triangles mark the motors. Sensors only connect to the hidden nodes and motors in a feedforward manner. Hidden elements and motors can feed back to all other nodes except sensors.

Each animat has two kinds of sensors: one sensor detecting obstacles, here the walls (green triangle), and one sensor detecting other animats (green circle). Both types of sensors have a range of 1 unit directly in front of the animat. Animats are built with feedback motors. The motor elements can thus also act as memory, just like the hidden nodes, which means that the current motor state at time t can be causal for the future ‘brain’ state at t+1 (this is easily observable in the example wiring diagram below, Figure 11). All nodes in the network can take two states, 0 and 1. A sensor switches to 1 if it detects an obstacle or animat, respectively. The movement model contains four possible states mapped by a 2-bit tuple (m_l, m_r), where m_l and m_r model the left and right motor. (1, 0) implies that only the left motor is active and therefore the animat turns left. The same holds for (0, 1), in which case the animat turns right. (0, 0) indicates a static animat and (1, 1) means that it moves forward.
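The movement model above can be sketched as a small lookup. This is our own illustration of the four-state mapping described in the text, not code from the MABE implementation; the function name is hypothetical.

```python
def decode_motor_state(m_left: int, m_right: int) -> str:
    """Map the 2-bit motor tuple (m_l, m_r) to one of the four actions."""
    actions = {
        (0, 0): "stay",          # no motor active: animat is static
        (1, 0): "turn_left",     # only the left motor is active
        (0, 1): "turn_right",    # only the right motor is active
        (1, 1): "move_forward",  # both motors active
    }
    return actions[(m_left, m_right)]
```

Because the motor nodes feed back into the network, these two bits are also part of the brain state that the next update reads.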

Designing the animats with limited sensors and motors increases relative task difficulty, which has to be compensated by more complex internal states (Albantakis et al., 2014) and, for the same reason, should also facilitate the evolution of cooperation. Using the sensor data as the environment’s representation, an animat has an internal representation stored in its artificial brain. There are several common types of such models: the brain could simply be a manually coded function, an ANN (Stanley et al., 2005), a simple finite state machine (König et al., 2009), or a MB (Edlund et al., 2011). In our case, the focus is on elucidating the animat’s behavior while observing its internal and external states. Therefore, a simple and interpretable cognitive system was required, which is why we chose to implement MBs. Moreover, MBs emulate principles of neocortical function due to their strong embodiment of neuronal cognitive processes (Marstaller et al., 2013). Future work should also consider other methods, such as ANNs and finite state machines.

A MB is composed of a set of nodes with a finite set of states and temporal dependencies between them. The nodes’ state-dependent update rules are implemented through Hidden Markov Gates (HMG), which indirectly connect the different nodes in the MB. Figure 2 shows an example architecture of a 4-node MB with one HMG. Each node can have a binary state (0 or 1). A set of input nodes is connected to a HMG. Inside the HMG there could be a lookup table, or any other mechanism, transforming the inputs at time t into an output at t+1. The HMG’s output is written to a set of output nodes, which could determine a motor state and/or be used as memory in the next time step. In this study, we exclusively used deterministic lookup tables to specify the HMGs’ input-output functions. The HMGs’ input-output functions and their inputs and outputs are encoded via a genome consisting of a string of integer values. At every generation, each locus in the genome has a probability of mutation; small sections of the genome also have a probability of being deleted or duplicated (Hintze et al., 2017; Edlund et al., 2011) (all parameters as in (Clifford Bohm, Nitash C. G., 2017), see also supplementary material A.1).
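A deterministic HMG update of the kind described above can be sketched as follows. This is a minimal sketch under our own assumptions about the data layout (brain state as a list of bits, lookup table as a dict); it is not the MABE data structure.

```python
def hmg_update(state, input_ids, output_ids, table):
    """Apply one deterministic Hidden Markov Gate.

    state: list of 0/1 node states at time t.
    input_ids / output_ids: indices of the gate's input and output nodes.
    table: dict mapping input bit-tuples to output bit-tuples.
    Returns {output_node: new_value} for time t+1.
    """
    inputs = tuple(state[i] for i in input_ids)
    return dict(zip(output_ids, table[inputs]))

# Example: a 4-node brain with one gate reading nodes 0, 1 and writing 2, 3.
state = [1, 0, 0, 0]
table = {(0, 0): (0, 0), (0, 1): (0, 1), (1, 0): (1, 1), (1, 1): (1, 0)}
new_vals = hmg_update(state, (0, 1), (2, 3), table)  # {2: 1, 3: 1}
```

In a full brain update, all gates fire on the state at t and their outputs are combined into the state at t+1; the genome would encode `input_ids`, `output_ids`, and `table` for every gate.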

Note that there is no active communication between the animats in this study. This means that the agents do not share information between each other, as, for instance, two organisms having a dialog. Animats can only sense whether another animat is directly in front of their position (or not).

Figure 2. A MB (Edlund et al., 2011) is composed of nodes, HMGs, and their connections. The HMGs specify the mechanisms to transform a brain state at time t into the future state at t+1, e.g. by fixed probabilities. The effective brain connectivity between nodes (upper diagram) is derived from the nodes’ hidden connections to and from the HMG.

3.2. Grid-Based Environment and the Challenge to Move Through the Gate

In this work, we were interested in swarms and the evolution of swarm behavior, not just single animats as in (Edlund et al., 2011; Marstaller et al., 2013; Albantakis et al., 2014). Accordingly, multiple animats were placed in the environment simultaneously. Here, a swarm contained only clones, meaning each animat had the same genome and thus the same MB. Swarm size stayed constant across generations during each evolution, and 5 different swarm sizes were tested, which made it possible to investigate dependencies between swarm size and the animats’ (swarm) behavior. We distributed the swarm sizes uniformly up to a maximum of 72 animats, corresponding to all predefined starting positions in the task environment (see below):

  1. 72 animats: the total number of available starting positions (constrained by the environment design).

  2. 54 animats: 75 percent of the starting positions.

  3. 36 animats: 50 percent of the starting positions.

  4. 18 animats: 25 percent of the starting positions.

  5. 1 animat: a single animat placed in the environment. (Obviously, one animat cannot form a swarm. We included this condition for comparison and treated it equivalently to simplify formulations.)

The grid-based, 2-dimensional environment (Figure 3) was partitioned into two rooms connected only by a narrow gate, which the animats were supposed to cross frequently as part of their task. At initiation, the animats were placed randomly, without overlap, on the 72 predefined starting positions marked in Figure 3 by gray triangles. The design was inspired by the work of Koenig et al. (König et al., 2009).

Figure 3. Grid-based environment design. The environment contains two rooms, which are connected by a narrow gate. There are 72 fixed starting positions for the animats, onto which the animats are placed at random, facing in random directions (up, down, left, right).

The task environment is designed to pose a multi-objective problem. On the one hand, an animat receives a reward if it travels through the gate; on the other hand, it receives a penalty if it collides with other animats (i.e., occupies the same location). We adopt a tailored weighted-sum approach to this multi-objective problem. The weights reflect the following consideration: since collisions between two animats are much more likely than gate crossings, the collision penalty is set to a lower value than the reward for traveling through the gate, while still allowing for optimal behavior (0.075 compared to 1). If the penalty is chosen too low, all animats learn to center around the gate area, colliding with each other all the time; if it is too high, all animats adapt to not move at all (data not shown). Additionally, since it is not desirable for animats to crowd around the gate area, a timeout of 100 time steps is implemented before an animat can receive further rewards after crossing the gate once. Each trial had a total duration of T time steps (with fewer time steps, early generations would hardly encounter the gate, impeding evolution). This time-out period can be interpreted as simulating a requirement to return to the organism’s nest and also promotes the evolution of unified behavior.

3.3. Setup of the Genetic Algorithm

In the following, we define the mathematical notation and equations for the fitness function; Table 1 lists all parameters and variables. Equation (1) defines the fitness of a single animat in the environment. Equation (2) specifies the overall fitness of the genome, or MB, which is also used for the selection process in the genetic algorithm (GA) of MABE. At each generation, we tested the swarm’s genome |R| times in the environment, with different random starting parameters (starting position and orientation), to obtain robust fitness values for each genome. In each of these trials, we randomly picked a single animat out of the swarm and averaged across their fitness values to obtain the overall fitness assigned to the genome. (Picking a single animat per trial maintains the same sample size across conditions with different swarm sizes; this avoids any bias in the fitness evolution merely due to differences in the variability of the fitness values across generations. Averaging across an entire large swarm, for example, would eliminate any variability in fitness due to the random starting positions.)

a: identifier of a single animat, a ∈ A
A: the set of all animats in a trial, i.e. a swarm
R: the set of all trials an animat is tested in
f(a): the fitness of a single animat
F(A, R): the average fitness of a genome across all trials
rand_A(A, R_i): picks a random animat from the swarm, depending on the trial
g(a, t_1, t_2): returns the count of gate-crossings between time t_1 and time t_2 for a single animat
c(x, y, t): returns the count of animats at a specific position (x, y) at time t
t: a single time step, with 0 ≤ t ≤ T
T: trial duration, i.e. the number of all time steps in a trial
x(a), y(a): returns the x and y position of animat a
Table 1. Definition of the Mathematical Notation for the Fitness Function

f(a) = \sum_{t=0}^{T-1} \begin{cases} 1 & \text{if } g(a, t, t+1) = 1 \text{ and } g(a, t-100, t) = 0 \\ 0 & \text{otherwise} \end{cases} \; - \; \sum_{t=0}^{T} \begin{cases} 0.075 & \text{if } c(x(a), y(a), t) > 1 \\ 0 & \text{otherwise} \end{cases} \qquad (1)

F(A, R) = \frac{\sum_{i=1}^{|R|} f(\mathrm{rand}_A(A, R_i))}{|R|} \qquad (2)
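Equations (1) and (2) can be sketched in code as follows. This is our own sketch, assuming per-trial logs of gate crossings and cell occupancy; the function names mirror the notation in Table 1, but the data layout is an assumption, not the MABE implementation.

```python
import random

def f(crossings, occupancy, T):
    """Fitness of a single animat over one trial, as in Equation (1).

    crossings: set of time steps at which the animat crossed the gate.
    occupancy: occupancy[t] = number of animats on the animat's cell at t.
    """
    # Reward a crossing only if no other crossing fell in the previous
    # 100 time steps (the time-out period).
    reward = sum(
        1 for t in crossings
        if t < T and not any(s in crossings for s in range(t - 100, t))
    )
    # 0.075 penalty whenever the animat shares its cell with another one.
    penalty = sum(0.075 for t in range(T + 1) if occupancy[t] > 1)
    return reward - penalty

def F(per_trial_fitness, trials, rng=random):
    """Genome fitness, as in Equation (2): average f of one randomly
    picked animat per trial.

    per_trial_fitness: dict trial -> list of f-values, one per animat.
    """
    picks = [rng.choice(per_trial_fitness[r]) for r in trials]
    return sum(picks) / len(trials)
```

For example, an animat crossing at t = 10, 50, and 200 in a 300-step trial is rewarded only for t = 10 and t = 200 (the crossing at t = 50 falls inside the time-out), minus 0.075 per time step spent on an occupied cell.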

A single evolution experiment was run for 10,000 generations. At each generation, a population of 100 genomes encoding the animats’ MBs was evaluated. After each generation, a set of 100 genomes was selected (with the possibility of duplicates) to enter the next generation, based on their fitness values and the selection rules of the GA. These genomes were then mutated according to the probabilities specified in A.1 (Clifford Bohm, Nitash C. G., 2017). Note that the genome population should not be confused with the swarm size: a swarm is a set of clones with identical genomes and thus MBs, whereas the population of genomes is the pool for selection. For each of the 5 conditions, we ran 30 evolution experiments with different random seeds.
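The selection-mutation loop described above can be sketched generically. This is not the MABE implementation: the roulette-wheel selection rule, mutation rate, and integer genome alphabet below are illustrative placeholders for MABE's configurable parameters.

```python
import random

def evolve(init_genome, evaluate, generations=10_000, pop_size=100,
           mut_rate=0.005, rng=random):
    """Generic GA loop: evaluate, select with replacement, mutate."""
    population = [list(init_genome) for _ in range(pop_size)]
    for _ in range(generations):
        fitnesses = [evaluate(g) for g in population]
        # Fitness-proportional selection with replacement (one of several
        # possible selection rules; MABE's actual rule is a parameter).
        weights = [max(fit, 1e-9) for fit in fitnesses]
        population = rng.choices(population, weights=weights, k=pop_size)
        # Per-locus point mutation to a random integer value.
        population = [
            [rng.randrange(256) if rng.random() < mut_rate else locus
             for locus in g]
            for g in population
        ]
    return max(population, key=evaluate)
```

With `evaluate` set to Equation (2), each genome's swarm of clones would be simulated in the environment before selection; genome duplication and deletion, which MABE also applies, are omitted here for brevity.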

Statistical differences were evaluated using a Kruskal-Wallis test, the non-parametric equivalent of a one-way ANOVA. Differences between pairs of task conditions reported in the results section were assessed by post-hoc Mann-Whitney U tests. See supplementary material A.2 for detailed comparisons.
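The statistical procedure maps directly onto SciPy. The arrays below are dummy data for illustration, not our experimental results; group names are hypothetical.

```python
from itertools import combinations
from scipy.stats import kruskal, mannwhitneyu

# Dummy final-fitness samples for three of the swarm-size conditions.
groups = {
    "size_72": [0.8, 0.9, 1.1, 0.7],
    "size_36": [1.9, 2.2, 2.0, 2.4],
    "size_1":  [3.1, 3.3, 2.9, 3.5],
}

# Omnibus test across all conditions (non-parametric one-way ANOVA).
h, p = kruskal(*groups.values())
print(f"Kruskal-Wallis: H={h:.2f}, p={p:.4f}")

# Post-hoc pairwise comparisons.
for (name_a, a), (name_b, b) in combinations(groups.items(), 2):
    u, p_pair = mannwhitneyu(a, b, alternative="two-sided")
    print(f"{name_a} vs {name_b}: U={u}, p={p_pair:.4f}")
```

In practice, the pairwise p-values would also be corrected for multiple comparisons (e.g. Bonferroni), which is omitted here.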

4. Results

To address our research questions, we performed a multi-level analysis of the evolved genomes resulting from the GA. First, we compared the average fitness evolution of the 30 evolution experiments per condition for the different swarm sizes. Second, we investigated the movement patterns of all agents in a swarm to determine whether the animats evolved swarm behavior. Third, we evaluated the animats’ generalizability across swarm sizes; an animat is generalizable if it maintains high fitness when tested with swarm sizes different from the one it was evolved in. We also report qualitative observations about behavioral differences between the various swarm-size conditions. Finally, we applied a simple graph-theoretical measure to the animats’ MBs as a proxy for brain complexity.

4.1. Evolution of Fitness

First, we analyzed the evolution of fitness across all test groups. As can be observed in Figure 4, task fitness strongly depends on swarm size: the smaller the swarm, the steeper the curve and the higher the final evolved task fitness. All groups differed significantly from each other in their final fitness (see A.2 for details). This result shows that the task can, in general, be solved without any cooperation and actually becomes more difficult for larger swarm sizes, since colliding with other animats results in a penalty (see Section 3.2).

Figure 4. Task fitness averaged across the 30 evolutions of each of the five swarm-size configurations. The overall fitness increases during evolution, while the final fitness decreases with swarm size. The shaded area indicates the Standard Error of the Mean (SEM).

4.2. Observation of Swarm Behavior

Secondly, we tested whether the animats developed swarm behavior or only independent movements. For this purpose, we generated heat-maps highlighting the movement patterns of the animats during a trial. Figure 5 shows the 5 heat-maps of the best genome for each swarm-size condition at the final generation of the GA (after 10,000 generations). We also generated and inspected animations of the swarms’ evolved behavior (final generation) for each of the 30 evolutions per condition. The most common movement patterns fit a ‘stop-and-go’ wall-following strategy, as in the study by Koenig et al. (König et al., 2009). According to Pacala et al. (Pacala et al., 1996), such a strategy qualifies as swarm behavior, as it results from social interactions combined with interaction with the environment. This means that the interaction with the wall is also part of the swarm behavior. For example, organisms in some conditions tended to receive fewer stimuli from the wall, as shown below in Figure 8.

Figure 5. Heat-maps of the best genome for each of the five swarm-size conditions. Color indicates occupation density during one trial. Black areas were never visited, while red to white areas mark low to high density. Yellow/white cells indicate spots where the animats frequently turned or stalled. This ‘stop-and-go’ behavior was more common for animats evolved in large swarms.

Most swarms with good fitness evolved such a wall-following strategy, but diversity in the movement patterns was also observed, particularly for animats with lower final fitness. Nevertheless, only animats in a single condition evolved high fitness using qualitatively different strategies. By distinguishing between (dark) red and yellow/white cells, it is possible to observe whether the swarm moved steadily or not. As the examples in Figure 5 show, large swarms moved more slowly along the walls, while smaller swarms exhibited only a few halting or turning points, particularly in the corners.

4.3. Generalizability of Animats

To test the generalizability of the evolved animats, we tested the final generation of each condition in swarm sizes other than the one in which they evolved, i.e., at the other fractions of the maximal swarm size of 72 individuals. We then compared the robustness of their performance across swarm sizes as a measure of generalizability (Figure 6).

Figure 6. Average performance of the final generation of each evolution experiment, grouped by swarm size during evolution, when tested at different trial swarm sizes.

While animats evolved alone failed to maintain their fitness within a swarm, all animats that were evolved in an actual swarm demonstrated a fair amount of generalizability. We quantified the fitness robustness of the different conditions by calculating the Area under the curve (AUC), which was largest for animats evolved in medium-sized swarms. All but one pair of conditions differed significantly from each other (see A.2 for details). Note also that the most generalizable animats showed fitness values comparable to those of all other conditions except at their original evolutionary swarm size. This suggests that adapting to intermediate swarm sizes may provide an advantage under changing environmental conditions, such as variation in swarm size due to rare environmental events. Our findings are also in line with Brown (Brown, 1982): both too low and too high swarm density is detrimental to overall swarm performance and, correspondingly, to the individual organism.
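The AUC-based robustness summary above can be sketched with a simple trapezoidal rule over the mean fitness at each tested swarm size. The fitness values below are made up for illustration only.

```python
def auc(xs, ys):
    """Trapezoidal area under the curve y(x), xs in ascending order."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:]))
    )

test_sizes = [1, 18, 36, 54, 72]          # tested trial swarm sizes
mean_fitness = [2.0, 1.6, 1.3, 1.0, 0.8]  # hypothetical mean fitness values
print(auc(test_sizes, mean_fitness))
```

A condition with a flatter, higher fitness curve across swarm sizes yields a larger AUC, i.e. higher generalizability.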

4.4. The Cognitive Processes of the Animats

To identify regularities in the decision rules of the respective swarm-size conditions, we evaluated the animats’ cognitive processes while performing the task. First, we evaluated the frequency of the animats’ various motor states (Figure 7) for all final animats while performing the task in different swarm sizes (same data as in Figure 6). Here, variation across conditions indicates cognitive flexibility. Moreover, these data allowed us to differentiate whether being part of the swarm made the animats act in a certain way, or whether, in turn, the swarm is merely the result of individual reactions. For animats evolved alone, it should be obvious that individuals were not influenced by the swarm; if seeming swarm behavior was observable, it was not due to interactions with other animats. This is also supported by the data: the solo condition shows no variation in its motor responses across conditions. By contrast, animats evolved in swarms adapted their behavior depending on the swarm size they were placed in. In particular, the conditions that demonstrated the greatest generalizability (Figure 6) also had the most dynamic reactions to the different swarm sizes.

Figure 7. Number of movements, turns, and no-movements, scaled by the trial duration, for all conditions, plotted against the different test swarm sizes.

Second, we measured how often specific sensory-motor state transitions occurred (Figure 8). Specifically, we recorded which actions at t+1 followed a particular input at t. This corresponds to a complete external representation of the animats, i.e., their input-output behavior. Having two sensors and two motors, there are 2^4 = 16 possible external states. These were condensed into 4 bits capturing the following information: (1) the animat senses a wall, (2) the animat senses another animat, (3) the animat turns left or right, and (4) the animat moves forward (both motors active). Note that, because of the nature of the task environment, instances such as sensing a wall and an animat at the same time, or turning and moving forward at the same time, are impossible and thus not considered, leaving 9 different state transitions to be evaluated. As an example, (0, 1, 0, 1) indicates that an animat sees another animat at t and moves forward at t+1.
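The 4-bit condensation described above can be sketched as a small encoding function. This is our own rendering of the scheme in the text; the function name and bit conventions are illustrative.

```python
def external_state(wall: int, animat: int, left: int, right: int):
    """Condense the sensor bits at t and motor bits at t+1 into the
    4-bit tuple (wall sensed, animat sensed, turn, move forward)."""
    turn = 1 if (left, right) in ((1, 0), (0, 1)) else 0
    forward = 1 if (left, right) == (1, 1) else 0
    return (wall, animat, turn, forward)

# E.g. sensing another animat at t and moving forward at t+1:
print(external_state(0, 1, 1, 1))  # prints (0, 1, 0, 1)
```

Counting occurrences of each tuple over a trial reproduces the first-order statistics shown in Figure 8.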

Figure 8 shows the cumulative state-transition counts during a trial, averaged across the 30 evolutions for each condition. Since the first-order statistics of sensor and motor states depend on the size of the swarm during a trial, each animat was tested on each swarm size up to the environment’s maximal capacity of 72 animats; we thus counted all transitions that occurred across the different trials of different swarm sizes. Moving forward without previously spotting another animat is by far the most frequent action in all conditions. It can also be observed that animats evolved alone simply ignore other animats colliding with them despite the penalty, while the other conditions instead evolved to turn in such a situation. Finally, animats in one condition seem to rely less on sensing the wall for guidance, which could mean that they react less to their environment overall and rather use memory to solve the task (see also the heat-map example in Figure 5).

Figure 8. The average number of times an animat enters a specific state transition, grouped by condition. The tuple is coded as (wall sensed, animat sensed, turn, move forward). Black bars mark the bootstrapped confidence intervals of the number of observations across evolutions.

To observe more detailed differences between the conditions than in Figure 8, we took one additional time step into account and considered the inputs at t, the reactions at t+1, the new inputs at t+1, and the corresponding reactions at t+2. To visualize the data, we generated a transition probability matrix (TPM) (Figure 9). For the sake of readability, we limited the labels in the plot and describe them here: the x-axis lists all inputs at t together with the reactions at t+1, and the y-axis lists all inputs at t+1 together with the reactions at t+2. Each tile of the matrix visualizes the scaled probability of the corresponding 3-time-step transition. Since we were more interested in differences across conditions than in the absolute number of transitions, we scaled the values according to their maximum and minimum over all swarm conditions, to better spot possible differences. Furthermore, values were averaged over all evolutions per condition, tested, as above, in 5 trials with different swarm sizes.

Figure 9. External states of all conditions over three time steps. The axes are labeled with the states (see text for details). Each tile shows how often the corresponding state transition occurs. The values of each tile are scaled between 0 and 1, where 1 is the maximum probability of entering that transition across conditions and 0 is zero probability.

Animats evolved in medium-sized swarms stand out regarding their generalizability. Reviewing the TPM, one can observe that such animats stayed static more often than average, especially after spotting an animat or a wall, and also remained static in the following time step. Another frequently static condition, by contrast, rather stayed static when sensing nothing at all. This also supports our conclusion from Figure 7 that the behavior of the most generalizable animats was influenced most by sensing other animats and the environment. While animats evolved alone always tried to move forward (whether spotting an animat or not), which had no negative effect in their original evolution environment, animats of other conditions turned more often, even without specific input.

4.5. Brain Complexity

The results presented above indicate that animats evolved the most generalizable behavior. Apart from the animats’ externally observable input-output behavior, we also wanted to take their internal structure into account. The node connectivity in a MB can be modeled as a directed graph. As a simple graph-theoretical measure of brain complexity, we thus used the Largest Strongly Connected Component (LSCC), which is also a simple measure of a graph’s integration (other graph-theoretic measures, which show the same trend, can be found in supplementary material A.3). In future work, the LSCC or similar metrics could also be used to determine which parts of the MB influence future brain states most. As shown in Figure 10, animats acting alone or in large groups tend to evolve significantly less complex brains even at high levels of fitness (see supplementary material A.3). Our assumption is that for the environment was comparatively simple and rules to achieve high fitness were easier to find. Animats evolved with could rely on sensing other animats with a high probability, which could serve as an orientation. evolved the most complex brain structures, which relates to our previous observations of the comparatively high behavioral complexity and generalizability of this group.

Figure 10. Fitness plotted against the LSCC of the animat’s brain, which we used as a proxy for brain complexity. One dot is the average LSCC over 30 experiments per generation (evaluated at every generation).


Figure 11 shows the wiring diagram of the best animat in with an average task fitness of . The animat has feed-forward sensors. All other nodes can feed back to each other, except to the sensors. This animat thus has the largest possible LSCC of 6. Feedback loops are also an indicator of memory in a MB.
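As an illustration of how the LSCC can be computed from a MB's connectivity, the following is a minimal sketch using Kosaraju's algorithm. The function name and the two-sensor example topology are our own (the exact sensor count in Figure 11 is not reproduced here); in practice a graph library such as networkx provides the same functionality.

```python
def largest_scc_size(n_nodes, edges):
    # Size of the Largest Strongly Connected Component (LSCC) of a
    # directed graph, via Kosaraju's algorithm: DFS finish order on the
    # forward graph, then component collection on the reversed graph.
    fwd = {v: [] for v in range(n_nodes)}
    rev = {v: [] for v in range(n_nodes)}
    for a, b in edges:
        fwd[a].append(b)
        rev[b].append(a)

    order, seen = [], set()
    def dfs(v):
        seen.add(v)
        for w in fwd[v]:
            if w not in seen:
                dfs(w)
        order.append(v)  # post-order = finish time
    for v in range(n_nodes):
        if v not in seen:
            dfs(v)

    largest, assigned = 0, set()
    for v in reversed(order):   # decreasing finish time
        if v in assigned:
            continue
        stack, size = [v], 0    # one SCC per outer iteration
        assigned.add(v)
        while stack:
            u = stack.pop()
            size += 1
            for w in rev[u]:
                if w not in assigned:
                    assigned.add(w)
                    stack.append(w)
        largest = max(largest, size)
    return largest

# Hypothetical topology mirroring Figure 11: two feed-forward sensors
# (nodes 0, 1) project into a fully recurrent block of six hidden/motor
# nodes (2-7).
edges = [(0, 2), (1, 2)] + [(a, b)
                            for a in range(2, 8)
                            for b in range(2, 8) if a != b]
```

Here `largest_scc_size(8, edges)` returns 6: the recurrent block forms one component, while the sensors are excluded because no node feeds back into them.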

Figure 11. Wiring Diagram of the best final genome in . The green circles mark sensor nodes, the yellow circles mark hidden nodes and the red circles mark motor nodes.

5. Future Work

The present results were obtained from one particular task environment. To strengthen the empirical evidence, more difficult and diverse tasks need to be implemented, e.g., the predator-prey scenario used in earlier work by Olson et al. and Miikkulainen et al. (Olson et al., 2016; Miikkulainen et al., 2012). It is also important to investigate variations in the animats’ design to determine how the kind and number of sensors influence their behavior. In this work, animats were evolved and tested in an isolated manner: a swarm only contained clones of one animat genome, which was necessary for a first, specific evaluation. Future work should consider the effect of heterogeneous swarms, which might increase performance and make the simulated experiment more realistic. Finally, the development of more rigorous statistical analyses of the animats’ external and internal state transitions is work in progress.

6. Conclusion

Evaluating the detailed behavior and interactions of organisms in a simulated swarm is an open field of research. In this work, we addressed the effect of swarm size during evolution in a 2-dimensional spatial navigation task, using a framework in which animats with Markov Brains were trained by a genetic algorithm. Moreover, we evaluated to what extent the resulting animats generalize by testing them in swarm sizes different from the one they evolved in, and how flexible their swarm behavior remains under these conditions. We found that swarm size matters in the evolution of swarm behavior. Even though the task did not require cooperation, animats reacted non-egoistically to other animats in their decisions and formed swarm behavior. We observed that animats evolved in very large or very small swarms generalized less well to other swarm sizes and showed less flexibility in their behavior. We assume that individuals in large swarms primarily acted to avoid collisions and the associated penalty, while animats in small swarms had less incentive to develop proper reactions to encountering other animats. Overall, our results suggest that animats evolved at intermediate swarm sizes may have adaptive advantages due to their more generalizable and flexible behavior, which is also reflected in their higher relative brain complexity.

We would like to thank Arend Hintze and Clifford Bohm at Michigan State University for their early advice on this project, sharing their extensive experience with the evolution of artificial organisms, and, in particular, their help with the MABE framework.


  • Albantakis et al. (2014) Larissa Albantakis, Arend Hintze, Christof Koch, Christoph Adami, and Giulio Tononi. 2014. Evolution of Integrated Causal Structures in Animats Exposed to Environments of Increasing Complexity. PLoS Computational Biology 10, 12 (dec 2014), e1003966.
  • Brown (1982) Jerram L. Brown. 1982. Optimal group size in territorial animals. Journal of Theoretical Biology 95, 4 (apr 1982), 793–810.
  • Bohm et al. (2017) Clifford Bohm, Nitash C. G., and Arend Hintze. 2017. MABE (Modular Agent Based Evolver): A framework for digital evolution research. In Proceedings of the European Conference on Artificial Life. MIT Press, 76–83.
  • Dorigo et al. (1996) Marco Dorigo, Vittorio Maniezzo, and Alberto Colorni. 1996. Ant system: Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 26, 1 (1996), 29–41.
  • Dorigo et al. (2004) Marco Dorigo, Vito Trianni, Erol Şahin, Roderich Groß, Thomas H. Labella, Gianluca Baldassarre, Stefano Nolfi, Jean-Louis Deneubourg, Francesco Mondada, Dario Floreano, and Luca M. Gambardella. 2004. Evolving Self-Organizing Behaviors for a Swarm-Bot. Autonomous Robots 17, 2/3 (sep 2004), 223–245.
  • Edlund et al. (2011) Jeffrey A Edlund, Nicolas Chaumont, Arend Hintze, Christof Koch, Giulio Tononi, and Christoph Adami. 2011. Integrated Information Increases with Fitness in the Evolution of Animats. PLoS Computational Biology 7, 10 (oct 2011), e1002236.
  • Franklin et al. (1998) Stan Franklin, Arpad Kelemen, and L. McCauley. 1998. IDA: a cognitive agent architecture. In SMC’98 Conference Proceedings. 1998 IEEE International Conference on Systems, Man, and Cybernetics (Cat. No.98CH36218), Vol. 3. IEEE, 2646–2651.
  • Franklin and Patterson (2006) Stan Franklin and F.G. Patterson. 2006. The LIDA architecture: Adding new modes of learning to an intelligent, autonomous, software agent. Integrated Design and Process Technology (2006), 1–8.
  • Franklin et al. (2012) Stan Franklin, Steve Strain, Javier Snaider, Ryan McCall, and Usef Faghihi. 2012. Global Workspace Theory, its LIDA model and the underlying neuroscience. Biologically Inspired Cognitive Architectures 1 (jul 2012), 32–43.
  • Garnier et al. (2007) Simon Garnier, Jacques Gautrais, and Guy Theraulaz. 2007. The biological principles of swarm intelligence. Swarm Intelligence 1, 1 (oct 2007), 3–31.
  • Hamann (2014) Heiko Hamann. 2014. Evolution of Collective Behaviors by Minimizing Surprise. 14th Int. Conf. on the Synthesis and Simulation of Living Systems (ALIFE 2014) (2014), 344–351.
  • Hintze et al. (2017) Arend Hintze, Jeffrey A. Edlund, Randal S. Olson, David B. Knoester, Jory Schossau, Larissa Albantakis, Ali Tehrani-Saleh, Peter Kvam, Leigh Sheneman, Heather Goldsby, Clifford Bohm, and Christoph Adami. 2017. Markov Brains: A Technical Introduction. (sep 2017). arXiv:1709.05601
  • Ilie and Badica (2013) Sorin Ilie and Costin Badica. 2013. Multi-agent approach to distributed ant colony optimization. Science of Computer Programming 78, 6 (2013), 762–774.
  • Karpov et al. (2015) Igor V. Karpov, Leif M. Johnson, and Risto Miikkulainen. 2015. Evaluating team behaviors constructed with human-guided machine learning. In 2015 IEEE Conference on Computational Intelligence and Games (CIG). IEEE, 292–298.
  • König et al. (2009) Lukas König, Sanaz Mostaghim, and Hartmut Schmeck. 2009. Decentralized evolution of robotic behavior using finite state machines. International Journal of Intelligent Computing and Cybernetics 2, 4 (nov 2009), 695–723.
  • Marstaller et al. (2013) Lars Marstaller, Arend Hintze, and Christoph Adami. 2013. The Evolution of Representation in Simple Cognitive Networks. Neural Computation 25, 8 (aug 2013), 2079–2107. arXiv:1206.5771
  • Miikkulainen et al. (2012) Risto Miikkulainen, Eliana Feasley, Leif Johnson, Igor Karpov, Padmini Rajagopalan, Aditya Rawal, and Wesley Tansey. 2012. Multiagent Learning through Neuroevolution. In Lecture Notes in Computer Science, Vol. 7311 LNCS. 24–46.
  • Oizumi et al. (2014) Masafumi Oizumi, Larissa Albantakis, and Giulio Tononi. 2014. From the Phenomenology to the Mechanisms of Consciousness: Integrated Information Theory 3.0. PLoS Computational Biology 10, 5 (may 2014), e1003588.
  • Olson et al. (2012) Randal S Olson, Arend Hintze, Fred C Dyer, David B Knoester, and Christoph Adami. 2012. Predator confusion is sufficient to evolve swarming behavior. Journal of the Royal Society, Interface / the Royal Society 10 (sep 2012), 20130305. arXiv:1209.3330
  • Olson et al. (2013) Randal S. Olson, David B. Knoester, and Christoph Adami. 2013. Critical interplay between density-dependent predation and evolution of the selfish herd. Proceeding of the fifteenth annual conference on Genetic and evolutionary computation conference - GECCO ’13 (2013), 247.
  • Olson et al. (2016) R. S. Olson, D. B. Knoester, and C. Adami. 2016. Evolution of Swarming Behavior is Shaped By How Predators Attack. Artificial Life 22 (2016), 299–318. arXiv:1411.7267
  • Pacala et al. (1996) Stephen W. Pacala, Deborah M. Gordon, and H. C. J. Godfray. 1996. Effects of social group size on information transfer and task allocation. Evolutionary Ecology 10, 2 (mar 1996), 127–165.
  • Reynolds (1987) Craig W. Reynolds. 1987. Flocks, herds and schools: A distributed behavioral model. In SIGGRAPH.
  • Stanley et al. (2005) Kenneth O Stanley, Ryan Cornelius, Risto Miikkulainen, Thomas D Silva, and Aliza Gold. 2005. Real-time Learning in the NERO Video Game. Proceedings of the First Artificial Intelligence and Interactive Digital Entertainment Conference 2003 (2005), 2003–2004.
  • Trianni et al. (2003) Vito Trianni, Roderich Groß, Thomas Halva Labella, Erol Şahin, and Marco Dorigo. 2003. Evolving Aggregation Behaviors in a Swarm of Robots. Lecture Notes in Computer Science, Vol. 2801. Springer Berlin Heidelberg, Berlin, Heidelberg. 865–874 pages.
  • Wilson (1985) S. W. Wilson. 1985. Knowledge Growth in an Artificial Animal. Proceedings of an International Conference on Genetic Algorithms and Their Applications (1985), 16–23.

Appendix A Supplementary Material

A.1. MABE

The following settings were used to configure the GA:

  1. Settings for the Genome

    1. Type: Circular

    2. Alphabet Size: 256

    3. Sites Type: char

    4. Initial Size: 5,000

    5. Mutation Point Rate: 0.005

    6. Mutation Copy/Delete Rate: 0.00002

    7. Minimal Mutation Copy/Delete Size: 128

    8. Maximum Mutation Copy/Delete Size: 512

    9. Minimal Size: 2,000

    10. Maximal Size: 20,000

  2. Settings for the Markov Brain:

    1. Type of Gates: Deterministic

    2. Range of Inputs/Outputs per Gate: 1-4

  3. Settings for the Optimizer:

    1. Type of Optimizer: Tournament

    2. Tournament Size: 5

    3. Population Size: 100

    4. Elitism: No
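To illustrate how the genome settings above interact, the following is a hypothetical sketch of the three mutation operators on a circular genome. The function and parameter names are our own, and MABE's internal application of these rates may differ; the sketch only shows how the point rate, copy/delete rate, chunk sizes, and genome size limits constrain each other.

```python
import random

def mutate(genome, rng, point_rate=0.005, copy_del_rate=0.00002,
           min_chunk=128, max_chunk=512,
           min_size=2000, max_size=20000, alphabet=256):
    # Illustrative point / copy / delete mutations on a circular genome
    # of integers in [0, alphabet), using the rates and limits above.
    g = list(genome)
    # Point mutations: each site is replaced by a random symbol with
    # probability point_rate.
    for i in range(len(g)):
        if rng.random() < point_rate:
            g[i] = rng.randrange(alphabet)
    # Duplication: copy a chunk (wrapping around the circular genome)
    # and reinsert it at a random position, respecting the maximum size.
    if rng.random() < copy_del_rate * len(g) and len(g) + max_chunk <= max_size:
        n = rng.randint(min_chunk, max_chunk)
        start = rng.randrange(len(g))
        chunk = [g[(start + k) % len(g)] for k in range(n)]
        at = rng.randrange(len(g))
        g[at:at] = chunk
    # Deletion: remove a chunk, respecting the minimum size.
    if rng.random() < copy_del_rate * len(g) and len(g) - max_chunk >= min_size:
        n = rng.randint(min_chunk, max_chunk)
        start = rng.randrange(len(g) - n)
        del g[start:start + n]
    return g
```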

A.2. Significance Tests

Our conclusions are mainly based on the evolution of fitness values, the fitness values of final evolved animats performing in different group sizes, and the brain structures of final evolved animats. The fitness-evolution data contains fitness values of 30 random evolutions; for the significance test, we only took the last generation into account. The generalizability data contains fitness values of final evolved animats tested in groups of different sizes (see Figure 6). The brain complexity data is computed from the MB’s connectivity matrix of all final evolved animats. In this section we present the significance tests for the analyses on these data.

To test for significance, we first performed a Kruskal-Wallis test across all groups, followed by Mann-Whitney tests for each pair of groups. Since the Kruskal-Wallis test was significant for all comparisons, we only provide the results of the Mann-Whitney tests.
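This two-stage procedure can be sketched as follows, assuming SciPy is available; the function name and return format are our own.

```python
from itertools import combinations
from scipy.stats import kruskal, mannwhitneyu

def pairwise_significance(groups, alpha=0.05):
    # Omnibus Kruskal-Wallis test over all groups; if it is significant,
    # follow up with two-sided Mann-Whitney U tests for every pair of
    # groups. `groups` maps group names to lists of fitness values.
    # Returns the omnibus p-value and a dict mapping name pairs to
    # (U statistic, p-value).
    _, p_omnibus = kruskal(*groups.values())
    pairs = {}
    if p_omnibus < alpha:
        for a, b in combinations(groups, 2):
            u, p = mannwhitneyu(groups[a], groups[b],
                                alternative="two-sided")
            pairs[(a, b)] = (u, p)
    return p_omnibus, pairs
```

Called on the per-condition fitness samples (with the swarm-size conditions as hypothetical group names), this yields the kind of pairwise p-value grid shown in Tables 2 to 6.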

p-value 0.0001
p-value 0.0000 0.0000
50 68
p-value 0.0000 0.0000 0.0012
101 118 225
p-value 0.0000 0.0000 0.0000 0.0000
0 0 64 168
Table 2. Fitness evolution significance: p-values of the Mann-Whitney test comparing the fitness of the last generation between the groups ().
p-value 0.0000
p-value 0.0000 0.0000
121,207 138,487
p-value 0.0000 0.0000 0.1548
149,688 170,764 191,888
p-value 0.0000 0.0000 0.0000 0.0000
84,366 74,146 66,428 69,566
Table 3. Significance of group-size generalizability using the fitness of the best animats, which were tested in different group sizes (see Figure 6): p-values of the Mann-Whitney test comparing the fitness generalizability between the groups ().
p-value 0.2464
p-value 0.0253 0.0042
328 282
p-value 0.3033 0.1099 0.0734
416 370 360
p-value 0.0614 0.2262 0.0003 0.0197
350 401 226 316
Table 4. Significance of the average LSCC values of the animat’s MB in the last generation: p-values of the Mann-Whitney test comparing the average LSCC values between the groups ().

A.3. Alternative Graph Theory Measures

We evaluated brain complexity using four different graph-theoretic measures: the LSCC, the average shortest path between any of the elements in the MB, the betweenness centrality (a measure of centrality based on shortest paths), and the average degree of the elements in the MB (the degree is the number of connected neighbors). This section displays the plots and significance values for the latter three measures.
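Two of these measures, the average degree and the average shortest path over reachable node pairs, can be sketched as follows; the function names are our own, and betweenness centrality is more involved and typically computed with a graph library such as networkx.

```python
from collections import deque

def average_degree(n_nodes, edges):
    # Average total (in + out) degree: each directed edge contributes
    # one out-degree and one in-degree.
    return 2 * len(edges) / n_nodes

def average_shortest_path(n_nodes, edges):
    # Mean shortest-path length over all ordered pairs of distinct
    # nodes that are reachable from one another (BFS from every node).
    adj = {v: [] for v in range(n_nodes)}
    for a, b in edges:
        adj[a].append(b)
    total = count = 0
    for s in range(n_nodes):
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        for v, d in dist.items():
            if v != s:
                total += d
                count += 1
    return total / count if count else 0.0
```

Averaging only over reachable pairs avoids the undefined distances that arise in a MB graph where sensors are never reached by other nodes.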

Figure 12. Fitness plotted against the average shortest path of the animat’s brain mechanisms in the last generation. One data point is the average shortest path length over 30 experiments per generation (evaluated at every generation).
p-value 0.2006
p-value 0.2814 0.0475
411 337
p-value 0.2227 0.0377 0.4970
398 330 449
p-value 0.1604 0.4793 0.0150 0.0181
383 446 303 308
Table 5. Significance of the average shortest path values of the animat’s MB in the last generation: p-values of the Mann-Whitney test comparing the average shortest path values between the groups ().
Figure 13. Fitness plotted against the average degree of all nodes in an animat’s brain. One data point is the mean of the average degree over 30 experiments per generation (evaluated at every generation).
p-value 0.2954
p-value 0.2624 0.0789
403 378
p-value 0.3183 0.4612 0.0518
394 361 424
p-value 0.0038 0.0292 0.0001 0.0076
417 426 336 337
Table 6. Significance of the average degree of the animat’s MB in the last generation: p-values of the Mann-Whitney test comparing the average degree per mechanism between the groups ().
Figure 14. Fitness plotted against the average betweenness centrality of all nodes in an animat’s brain. One data point is the mean of the average betweenness centrality over 30 experiments per generation (evaluated at every generation).

Since the Kruskal-Wallis test showed no significant difference between the groups (), we do not provide the values of the Mann-Whitney U test.