Increasing Behavioral Complexity for Evolved Virtual Creatures with the ESP Method

10/27/2015 ∙ by Dan Lessin, et al. ∙ The University of Texas at Austin; IT University of Copenhagen

Since their introduction in 1994 (Sims), evolved virtual creatures (EVCs) have employed the coevolution of morphology and control to produce high-impact work in multiple fields, including graphics, evolutionary computation, robotics, and artificial life. However, in contrast to fixed-morphology creatures, there has been no clear increase in the behavioral complexity of EVCs in those two decades. This paper describes a method for moving beyond this limit, making use of high-level human input in the form of a syllabus of intermediate learning tasks--along with mechanisms for preservation, reuse, and combination of previously learned tasks. This method--named ESP for its three components: encapsulation, syllabus, and pandemonium--is presented in two complementary versions: Fast ESP, which constrains later morphological changes to achieve linear growth in computation time as behavioral complexity is added, and General ESP, which allows this restriction to be removed when sufficient computational resources are available. Experiments demonstrate that the ESP method allows evolved virtual creatures to reach new levels of behavioral complexity in the co-evolution of morphology and control, approximately doubling the previous state of the art.


I Introduction

Fig. 1: The two versions of ESP described in this paper. This paper explores the ESP method for increasing behavioral complexity in evolved virtual creatures. Two implementations of this method are presented: Fast ESP and General ESP. (a) Fast ESP allows the open-ended development of behavioral complexity for EVCs with only a linear increase in computational time. See results produced using Fast ESP at http://youtu.be/dRLNnJlT8rY . (b) When sufficient computational resources are available, General ESP permits greater body adaptation to multiple tasks, while preserving Fast ESP’s ability to produce behavioral complexity. See results produced using General ESP at http://youtu.be/fyVr7gdGEPE .

Defining behavioral complexity as the number of discriminable behaviors in a creature’s repertoire, most evolved virtual creatures have minimal complexity, employing repertoires that contain only a single behavior. The original examples by Sims [26] of locomotion on land and in water, as well as jumping, fall into this category, as does much of the work that others have since completed. For example, locomotion in air [24], a specialized form of ground-locomoting EVCs that can be converted into functional real-world robots [17], soft-bodied virtual creatures [10, 8], and many other variations have this level of complexity [4, 9, 2, 13, 12, 14]. The highest level of behavioral complexity demonstrated by Sims—creatures with the ability to follow a target or a path by switching between perhaps up to five discriminable behaviors—has since been matched multiple times [21, 25, 18], but never clearly exceeded (Figure 2).

Fig. 2: Behavioral complexity in EVCs. Defined as the number of discriminable behaviors in a creature’s repertoire, the behavioral complexity achieved by Sims in 1994 has not been clearly exceeded in later work. In contrast, the ESP method described in this paper approximately doubles it.

Yet more complex behaviors would clearly be useful. Numerous examples of valued creature content from the real world (nature documentaries, animal and human combat, even internet cat videos such as “THE BEST CAT VIDEO YOU’LL EVER SEE” [sic], http://www.youtube.com/watch?v=20mrEtabOLM ) feature more complex behaviors than what has been demonstrated in EVCs to date. Perhaps if we can bring greater behavioral complexity to EVCs, they can begin to approach the entertainment value of their non-virtual counterparts.

In fact, there is suggestive evidence in support of this proposition. Cognitive science and psychology describe a striking effect in which the right kinds of relatively complex behaviors, even by the simplest of geometric figures, lead to the perception of intentionality and desires (perceptual animacy) [22, 7]. For a particularly clear non-academic example of this same effect, consider the Academy Award-winning animated short “The Dot and the Line” [11]. In much of this film, the only elements added to a simple dot and line to transform them into the protagonists of a compelling love story are their movements, that is, their behavioral complexity.

Motivated by this potential, this paper describes a method designed to increase behavioral complexity in evolved virtual creatures. The primary elements of this method, ESP (encapsulation, syllabus, and pandemonium), are defined as follows:

  1. A human-designed syllabus breaks the development of a complex behavior into a sequence of smaller learning tasks.

  2. Once each of these subskills is learned, it is encapsulated to preserve it throughout future evolution, and also to allow future skills to incorporate its function more easily.

  3. A mechanism inspired by Selfridge’s pandemonium [23] is used to resolve disputes between competing skills or drives within the increasingly complex brain.

ESP is presented in this paper in two complementary versions: Fast ESP and General ESP. By placing some key limitations on morphological changes after the first skill in a syllabus is learned, Fast ESP eliminates the need for retesting of previous skills as new skills are added. This approach allows computational time to grow approximately linearly as behavioral complexity is increased. In contrast, when sufficient computing resources are available and full morphological adaptation to multiple skills is important, General ESP removes the morphological constraints of Fast ESP through a process of retesting and reconciliation. This paper offers a unified treatment of both the Fast and General ESP versions, which were first described by Lessin et al. in conference papers [15] and [16], respectively.

In the remainder of this paper, Section II presents the background on EVCs and their typical behaviors. Section III reviews the mechanics of the EVC system underlying the ESP implementation. In Sections IV and V of this paper, Fast ESP is described, and it is employed to approximately double the state of the art in behavioral complexity for evolved virtual creatures. In Sections VI and VII, General ESP is described, and it is applied to demonstrate a significant increase in the useful variety and quality of evolved creatures, while still incrementally developing complex behaviors from a sequence of simpler learning tasks.

II Background

This section provides a review of relevant background material, including EVCs, typical EVC behaviors and their complexity, and task decomposition strategies related to ESP.

II-A Evolved Virtual Creatures (EVCs)

The first and most influential examples of evolved virtual creatures are due to Sims [26]. The genotypes for Sims’ creatures were directed graphs, able to encode complex body structures. The bodies of these creatures were composed of boxes, connected by joints with varying degrees of freedom and evolvable limits to their revolution. Actuation was provided by implicit joint motors, able to apply force at every degree of freedom of every joint. Sims’ brains were composed of nodes computing simple functions, with signals carried between nodes by evolved connections. In his implementation, brain elements may be embedded within body segments, where they can take advantage of the same kinds of repetition and recursion as the creature’s morphology. Evolution in Sims’ system made use of fitness-proportionate selection, crossover, mutation, and elitism.

Using this system, Sims demonstrated impressive results in multiple tasks. A variety of creatures were evolved for locomotion, both on land and in water, and creatures with the ability to jump off the ground were produced. Most impressively, Sims demonstrated creatures evolved for phototaxis (light seeking) behavior. Many of these behaviors have since become benchmarks for EVCs.

II-B Locomotion

The standard benchmark task for an EVC system is locomotion. Sims presented locomotion on land and water, and this result has been repeated for many different purposes by many different researchers.

Lipson and Pollack evolved creatures for locomotion in a system that allowed the results to be 3-D printed and activated in the real world [17]. Creatures composed of rigid segments and linear actuators were evolved for locomotion in physical simulation. The body parts (including joints) could then be 3-D printed, and only the fitting of actuators required special attention during assembly. Notably, these creatures were required to maintain static stability at all times (i.e., have their center of gravity always supported by the body). In this manner, a consistent transition to the real world was guaranteed, where dynamics might differ from simulation, but geometry should not.

Shim and Kim evolved virtual creatures for another type of locomotion—flight [24]. Auerbach and Bongard tested the environment’s influence on morphology when evolving locomotion [1]. Lehman and Stanley used locomoting EVCs as the subject for an investigation of novelty promotion [14]. Cheney et al. demonstrated their new encoding for soft-bodied EVCs by applying them to the locomotion benchmark [5].

II-C Phototaxis

Phototaxis (the ability to move to a light source) was the most complex behavior demonstrated by Sims. By testing the ability to move toward a light target placed at multiple positions, creatures were developed with a generalized ability to perform phototaxis. This remained the most complex EVC behavior for almost two decades until the work described here.

Pilat and Jacob reproduced the behavioral complexity of Sims’ phototaxis approximately in their 2010 work [21], although their implementation differed in some respects. While Sims’ photoreceptors were embedded in each body segment and produced signals relative to the segment’s orientation, Pilat and Jacob used a single sensor for the entire creature, and that sensor’s signals were preprocessed to give one output for heading to the light and another for the light’s elevation angle. Also, Pilat and Jacob’s creatures had simpler morphology, allowing only single-degree-of-freedom hinge joints between segments. Unlike the control networks of Sims, with nodes computing a variety of predefined functions, Pilat and Jacob’s creatures used a more conventional artificial neural network (ANN). Shim and Kim also achieved a similar result in 2004 with flying creatures able to follow paths [25]. The EVC system presented here demonstrates phototaxis as an intermediate step on the path to more complex behaviors.

Miconi’s work [18] is a particularly interesting case, as he is the first to produce a form of real combat between EVCs. With respect to behavioral complexity as defined above, however, his creatures do not differ significantly from those of Sims: their combat can essentially be viewed as target following with damage assignment layered on top. The target following leads to collisions, and these collisions produce a score interpreted as damage, but no additional behavioral complexity is required or produced.

II-D Task Decomposition

It is important to note that task-decomposition strategies similar to ESP have been employed in multiple related fields, but always in conjunction with a fixed morphology. Selfridge’s pandemonium, Minsky’s society of mind [19], and Brooks’ subsumption architecture [3] are prominent examples from artificial intelligence and robotics. In reinforcement learning and evolutionary computation, work such as layered learning and hierarchical task decomposition [28, 29, 6] explores similar concepts. In EVCs, however, with the particular challenges of simultaneously evolved morphology and control, no previous system has demonstrated the use of such an approach to increase behavioral complexity.

III Underlying EVC System

The underlying EVC system described here is largely derived from the work of Karl Sims [26]. This section briefly sets out the components of this system, which—while not the primary focus of this paper—are nevertheless fundamental to its comprehension. A representative sample of results is shown in Figure 3.

Fig. 3: Typical results from the underlying basic EVC system. These examples were all evolved to complete a forward locomotion task—a common baseline result for EVCs.

III-A Evolutionary Algorithm

The specifics of the evolutionary algorithm are conventional, making use of elitism, fitness-proportionate selection, and rank selection [20]. In addition, the most challenging tasks employ some degree of shaping [27]. Fitness is evaluated in a physically simulated virtual environment implemented with NVIDIA PhysX.
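As a minimal sketch of how elitism, fitness-proportionate selection, and rank selection can combine in a single generation, consider the following. The population representation, the `mutate` callback, the elite count, and the even mix of the two selection schemes are illustrative assumptions rather than details of the system described here, and fitness values are assumed to be non-negative.

```python
import random

def fitness_proportionate(pop, fits, rng):
    """Pick one individual with probability proportional to its fitness."""
    if sum(fits) <= 0:
        return rng.choice(pop)  # fall back to uniform choice when all fitness is zero
    return rng.choices(pop, weights=fits, k=1)[0]

def rank_select(pop, fits, rng):
    """Pick one individual with probability proportional to its rank (worst = 1)."""
    order = sorted(range(len(pop)), key=lambda i: fits[i])
    weights = [0] * len(pop)
    for rank, i in enumerate(order, start=1):
        weights[i] = rank
    return rng.choices(pop, weights=weights, k=1)[0]

def next_generation(pop, fits, mutate, n_elite=1, rng=None):
    """Build a new population of the same size (hypothetical helper)."""
    rng = rng or random.Random()
    ranked = sorted(zip(fits, range(len(pop))), reverse=True)
    new_pop = [pop[i] for _, i in ranked[:n_elite]]  # elitism: copy the best unchanged
    while len(new_pop) < len(pop):
        # Assumed: each child is drawn by one of the two schemes with equal probability.
        select = fitness_proportionate if rng.random() < 0.5 else rank_select
        new_pop.append(mutate(select(pop, fits, rng), rng))
    return new_pop
```

Rank selection is less sensitive than fitness-proportionate selection to the absolute scale of the fitness values, which is one common reason for using both.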

III-B Morphology

(a) Simple topology.
(b) Multiple edges for repeated substructures.
(c) Reflexive edge for recursive structure.
(d) Multiple and reflexive edges together.
Fig. 4: Hand-designed genotype/phenotype pairs (as in [26]) demonstrate the encoding power inherited from Sims’ original EVC system.

As in Sims’ original work [26], creature morphology is described by a graph-based genotype, with graph nodes representing body segments, and graph edges representing joints between segments. By starting at the root and traversing the graph’s edges, the phenotype is expressed. Reflexive edges as well as multiple edges between the same node pair are allowed, making it possible to easily define recursive and repeated body substructures, as illustrated in Figure 4. In addition, as in Sims’ work, reflection of body parts as well as body symmetry are made easily accessible to evolution. In this implementation, all PhysX primitives are made available for use as body segments: boxes, spheres, and capsules. Joints between segments may be of most of the types offered by PhysX, specifically: fixed, revolute, spherical, prismatic, and cylindrical. In contrast to the typical technique of separately evolving explicit joint limits, most limitations on joint movement in this system are provided implicitly by creature structure through natural collisions between adjacent segments.
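The expansion of a graph genotype into a body can be sketched as follows. This is a simplified illustration, not the system's implementation: the `SegmentGene` class, the per-node `recursive_limit` field (which bounds how often a node may repeat along one path, so that reflexive edges terminate), and the nested-tuple phenotype are all assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class SegmentGene:
    name: str
    recursive_limit: int = 1  # assumed: max appearances of this node on one root-to-leaf path
    edges: list = field(default_factory=list)  # child node names; repeated names are multiple edges

def express(genotype, root, path_counts=None):
    """Traverse the genotype from `root` and return the phenotype as (name, children)."""
    path_counts = dict(path_counts or {})  # copy so sibling branches do not share counts
    path_counts[root] = path_counts.get(root, 0) + 1
    children = []
    for child in genotype[root].edges:
        if path_counts.get(child, 0) < genotype[child].recursive_limit:
            children.append(express(genotype, child, path_counts))
    return (root, children)

def count_segments(tree):
    """Number of body segments in an expressed phenotype."""
    return 1 + sum(count_segments(c) for c in tree[1])

# A body with two edges to "leg" (repeated substructure), where "leg" has a
# reflexive edge (recursive structure) limited to three segments per chain.
genotype = {
    "body": SegmentGene("body", 1, ["leg", "leg"]),
    "leg": SegmentGene("leg", 3, ["leg"]),
}
phenotype = express(genotype, "body")
```

Here four genotype elements (two nodes, a doubled edge, and a reflexive edge) expand into a seven-segment body: one torso and two three-segment limbs.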

III-C Control

Again in a manner very similar to that of Sims, creature control is provided by a brain composed of a set of nodes connected by wires (as in Figure 7(a)). Nodes receive varying numbers of input wires, and use their inputs to compute an output value (always in the range [0,1]) which may be sent to other wires. Signals originate from sensors in the body as well as certain types of internal brain nodes, travel through the network of internal nodes and wires, and ultimately control the operation of actuators (muscles) in the physically simulated body. For each step of physical simulation, control signals move one step through the brain.

In addition to special node types for muscles and photoreceptors (described below) and one special type used in encapsulation (see Section IV-B), the following node types are allowed: sinusoidal, complement, constant, scale, multiply, divide, sum, difference, derivative, threshold, switch, delay, and absolute difference.
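The one-step-per-tick signal propagation can be illustrated with a synchronous update, in which every node reads the outputs its inputs produced on the previous step. The dictionary layout, the `brain_step` name, and the exact function computed by each node type are assumptions for this sketch; only a few of the listed node types are shown.

```python
def clamp01(x):
    """Keep every node output in the range [0, 1]."""
    return max(0.0, min(1.0, x))

# Assumed definitions for a handful of node types; `p` is a per-node parameter.
NODE_FUNCS = {
    "constant":   lambda ins, p: p,
    "complement": lambda ins, p: 1.0 - ins[0],
    "sum":        lambda ins, p: sum(ins),
    "multiply":   lambda ins, p: ins[0] * ins[1],
    "threshold":  lambda ins, p: 1.0 if ins[0] > p else 0.0,
}

def brain_step(nodes, outputs):
    """Advance the brain by one simulation step.

    nodes:   {name: (kind, [input node names], param)}
    outputs: {name: output value from the previous step}
    """
    new_outputs = {}
    for name, (kind, inputs, param) in nodes.items():
        ins = [outputs[i] for i in inputs]  # read only previous-step values
        new_outputs[name] = clamp01(NODE_FUNCS[kind](ins, param))
    return new_outputs

# A three-node chain: a change at "c" needs two steps to reach "t".
nodes = {
    "c": ("constant", [], 0.8),
    "n": ("complement", ["c"], None),
    "t": ("threshold", ["n"], 0.5),
}
state = {"c": 0.0, "n": 0.0, "t": 0.0}
s1 = brain_step(nodes, state)
s2 = brain_step(nodes, s1)
s3 = brain_step(nodes, s2)
```

Because each node sees only last step's values, the length of a path through the brain acts as a delay, which evolution can exploit for timing.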

III-D Photoreceptors

For tasks involving light sensing, creatures are allowed to develop simple photoreceptors ((a) in Figure 5), defined only by a direction from the center of their parent segment. This direction indicates a location on the creature’s surface as well as an orientation for the receptor. The signal produced by the receptor is determined by light strength, distance, occlusion, and the difference between the direction to the light and the sensor’s orientation. Multiple lights are allowed. For each photoreceptor in the body, a corresponding brain node is added which makes the receptor’s output signal available to the rest of the brain.

Fig. 5: Photoreceptors (a) and muscles (b) bring sensing and actuation to creatures in the underlying EVC system. For both, function depends upon placement, so creature form develops meaningfully as capabilities are evolved.

III-E Muscles

In a break with traditional EVC systems, which typically use forces exerted directly at joints, this system uses simulated muscles as actuators. Each muscle ((b) in Figure 5) is defined by two attachment points on adjacent segments, along with a maximum strength value. In simulation, the muscle is implemented as a spring, with muscle activation modifying the spring constant. In addition to acting as an effector, each muscle also produces a proprioceptive feedback signal based on its current length. For each muscle, one node is added to the brain which accepts an input to set the muscle’s activation, and another node is added which makes the muscle’s proprioceptive output signal available to the rest of the brain. Muscle drives bring the following potential benefits to EVCs: flexibility (they can be used even on creatures without joints), efficiency (effectors need only exist where useful, not at every degree of freedom of every joint), and aesthetic appeal (by tapping into the human affinity for elegant, functional body structure).
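A muscle of this kind can be sketched as a spring whose constant is scaled by the activation signal. The text does not give the force law, so the linear spring, the `rest_length` and `max_length` fields, and the normalization of the feedback signal are assumptions made for illustration.

```python
import math
from dataclasses import dataclass

@dataclass
class Muscle:
    max_strength: float  # spring constant at full activation
    rest_length: float   # assumed: length at which the spring exerts no force
    max_length: float    # assumed: used only to normalize the feedback signal

    def force(self, a, b, activation):
        """Pull along the muscle between attachment points a and b.

        Positive values draw the two points together. `activation` is the
        brain signal in [0, 1]; it scales the spring constant.
        """
        length = math.dist(a, b)
        k = activation * self.max_strength
        return k * (length - self.rest_length)

    def proprioception(self, a, b):
        """Feedback signal in [0, 1] based on the muscle's current length."""
        return max(0.0, min(1.0, math.dist(a, b) / self.max_length))

m = Muscle(max_strength=100.0, rest_length=1.0, max_length=4.0)
pull = m.force((0, 0, 0), (0, 2, 0), activation=0.5)
feedback = m.proprioception((0, 0, 0), (0, 2, 0))
```

With zero activation the spring constant vanishes and the muscle goes slack, so a creature can relax a limb simply by sending no signal.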

IV Fast ESP

Being simpler than General ESP, the Fast ESP method will be described first. Note that with the exception of limiting morphological changes, as described in Section IV-D, all elements of the method described here are also a part of General ESP (Section VI).

The Fast ESP method [15] consists of three elements added to the underlying EVC system: a syllabus, encapsulation, and pandemonium. In the beginning of this section, each of these components is described in detail. The section ends with a description of the morphological limitations specific to Fast ESP.

IV-A Syllabus

While it is certainly possible for human students to learn a complicated topic independently, their development is typically faster and surer with the benefit of an expert-designed syllabus. The syllabus acts as a sequence of landmarks through the space of possible solutions, decomposing the larger learning task into a succession of more manageable steps between these waypoints.

In the ESP system, the syllabus consists of an ordered sequence of fitness goals used to reach the ultimate, larger goal. This collection of intermediate goals (each one defined by a fitness function) is designed by a human expert with the aim of making attainable goals more reliably learnable, and bringing previously unattained goals within reach.

For example, assume that you want to evolve a virtual creature with some of the behavioral complexity demonstrated in an internet cat video. Rather than simply drifting smoothly toward a target, this creature might run to the target, then strike it, and perhaps even run away if the target is perceived as threatening. Without a syllabus, a single fitness test evaluating all of these skills might be constructed, but evolutionary progress would be unlikely.

Consider, instead, how this complex behavioral goal could be broken down into an ordered sequence of smaller learning tasks. The clearly achievable goal of locomotion will be the first target. The ability to turn left and the ability to turn right are of a similarly manageable difficulty, and will be attempted next. Then, with left and right turns mastered, and the ability to develop photoreceptors, it would seem relatively straightforward to maintain orientation toward a light source. And with the ability to face a light and the ability to move forward, navigating to that light might be a similarly achievable goal. Proceeding in this manner, a knowledgeable human designer might produce the following sequence of subskills to be learned, in which each subskill is probably attainable with basic EVC methods, and in which earlier subskills serve to make later skills easier to learn:

  1. forward locomotion

  2. left turn

  3. right turn

  4. turn to light (using left turn and right turn)

  5. move to light (using turn to light and forward locomotion)

  6. strike

  7. attack light (using move to light and strike)

  8. turn from light (using left turn and right turn)

  9. retreat from light (using turn from light and forward locomotion)

  10. fight or flight (switching between attack light and retreat from light based on external circumstances)

Fig. 6: An example syllabus as a graph. In this depiction, graph nodes represent individual subskills to be learned, directed edges indicate dependency between subskills, and the numbering indicates a proposed learning order which satisfies the dependency requirements. Pandemonium relationships are indicated by dashed red lines.

This information may be conveniently summarized in a graph, encompassing subskills to be learned, dependency between subskills, learning order, and pandemonium (Section IV-C), as seen in Figure 6.
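The syllabus listed above can be written down as a small data structure, together with a check that the proposed learning order respects the dependencies. The representation and the `order_is_valid` helper are assumptions for illustration. Only the left turn / right turn pandemonium pair is included, because it is the only pair named explicitly in the text; the remaining pairs appear only in the figure.

```python
# Each entry: (subskill, subskills it depends on), in the proposed learning order.
SYLLABUS = [
    ("forward locomotion", []),
    ("left turn", []),
    ("right turn", []),
    ("turn to light", ["left turn", "right turn"]),
    ("move to light", ["turn to light", "forward locomotion"]),
    ("strike", []),
    ("attack light", ["move to light", "strike"]),
    ("turn from light", ["left turn", "right turn"]),
    ("retreat from light", ["turn from light", "forward locomotion"]),
    ("fight or flight", ["attack light", "retreat from light"]),
]

# Pairs of subskills that should never be active at the same time.
PANDEMONIUM = [("left turn", "right turn")]

def order_is_valid(syllabus):
    """True if every subskill is learned after all the subskills it uses."""
    learned = set()
    for skill, deps in syllabus:
        if not all(d in learned for d in deps):
            return False
        learned.add(skill)
    return True
```

Any ordering that passes this check is a valid learning order; the numbering in the list is one such choice.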

At this point, using high-level human knowledge, a previously impractical learning task has been broken into a sequence of potentially attainable subgoals. But how can a single evolving creature learn new skills while retaining and making use of the ones it already has?

IV-B Encapsulation

(a) Before encapsulation.
(b) After encapsulation (with new nodes shaded).
Fig. 7: The automated encapsulation of an evolved skill—in this case, forward locomotion—ensures that it will persist throughout future evolution, while also allowing it to be easily activated as a unit by future skills.

The second important element of the ESP system is a mechanism to encapsulate previously learned skills. This accomplishes two important goals: It ensures that previously learned skills (and the body components they rely on) are preserved, and it makes these skills easily accessible to future evolutionary development. Both of these goals are achieved through the automated encapsulation process illustrated in Figure 7.

Figure 7(a) depicts a brain evolved for forward locomotion, and Figure 7(b) shows the result of encapsulation. Note the following aspects of this new brain. The nodes that compute the old skill have been preserved and locked (meaning that they have been marked so as to disallow any changes by future evolution). Also, a new multiply node has been inserted into every output wire leaving the encapsulated skill. The internals of the skill will continue to function as before, always trying to perform their forward locomotion task, but now, a second signal sent to each new multiply node will modify those outgoing forward-locomotion control signals, scaling them by a number in [0,1]. Finally, a single controlling node (called a sigma node for its function as a summation of zero or more inputs) is added, which sends its output to all of the new multiply nodes. So for each signal $s$ leaving a node in the forward locomotion skill (such as the complement node), the new signal after encapsulation ($s'$) is computed as $s' = s \cdot \sigma$, where $\sigma$ is the output of the controlling sigma node.

Now, with encapsulation complete, the entire forward locomotion skill can be activated and deactivated as a unit by using the controlling sigma node just as if it were a single muscle in the creature’s body. (Incidentally, note that this brain’s actual muscle nodes have been hidden behind additional sigma nodes to allow future evolution to share control over them when appropriate.) As progress through the syllabus continues and the next skill after forward locomotion is evolved, its newly added nodes will be the only ones in the brain that are not already locked, and will therefore be easily identifiable when it is their turn to be encapsulated.
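The encapsulation step can be sketched on a simple node-and-wire representation. The dictionary layout, the node naming scheme, and the decision to mark the new gate and sigma nodes as locked are assumptions for this example. As in the text, the skill being encapsulated is identified as the set of nodes not yet locked, so muscle nodes are assumed to be locked already.

```python
def gated(signal, sigma_out):
    """Output of an inserted multiply node: the skill's signal scaled by sigma."""
    return signal * sigma_out

def encapsulate(nodes, wires, skill):
    """Encapsulate all currently unlocked nodes as one skill.

    nodes: {name: {"kind": str, "locked": bool}}
    wires: list of (source, destination) pairs, modified in place
    Returns the name of the new controlling sigma node.
    """
    skill_nodes = {n for n, d in nodes.items() if not d["locked"]}
    sigma = f"{skill}/sigma"
    nodes[sigma] = {"kind": "sigma", "locked": True}
    for i, (src, dst) in enumerate(list(wires)):
        if src in skill_nodes and dst not in skill_nodes:
            gate = f"{skill}/gate{i}"
            nodes[gate] = {"kind": "multiply", "locked": True}
            wires[i] = (src, gate)       # the skill's output now feeds the gate
            wires.append((sigma, gate))  # second gate input: the sigma node
            wires.append((gate, dst))    # gated signal continues to the old target
    for n in skill_nodes:
        nodes[n]["locked"] = True        # preserve the skill through later evolution
    return sigma

nodes = {
    "osc": {"kind": "sinusoidal", "locked": False},
    "comp": {"kind": "complement", "locked": False},
    "muscle0": {"kind": "muscle", "locked": True},
}
wires = [("osc", "comp"), ("comp", "muscle0")]
sigma = encapsulate(nodes, wires, "forward locomotion")
```

Wires internal to the skill, such as the one from `osc` to `comp`, are left untouched; only signals crossing the skill boundary are gated.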

At this point, we have seen a system in which a complex skill can be broken into smaller subskills, and those subskills can be cumulatively acquired, but a potential problem still remains: How will competing signals from the multiple sub-brains within a single creature be resolved?

IV-C Pandemonium

Consider the following example based on the syllabus graph of Figure 6. A creature evolved through this syllabus will ultimately have parts of its brain devoted to both left and right turns. But it seems unlikely that both of these abilities should ever be used at the same time. So the syllabus designer might place the left and right-turn skills in a pandemonium relationship with each other, meaning that whichever one is most active at any given moment will be allowed to send its output at full strength, and the other will have its output entirely suppressed. Under a system like this, sub-brains within the creature can compete for overall control, with little risk of sabotaging the usefulness of the entire brain. In Figure 6, a full set of pandemonium relationships is indicated by red dashed lines between subskill nodes.
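A pandemonium relationship amounts to a winner-take-all rule over a group of competing skills. The sketch below returns a gating factor per skill. It interprets "full strength" as passing the winner's output through unattenuated, and it breaks ties in favor of the first skill listed; both are assumptions, as is the function name.

```python
def pandemonium(activations):
    """Resolve one pandemonium group.

    activations: {skill name: current activation in [0, 1]}
    Returns {skill name: gate}, where the most active skill gets 1.0
    (output passes unchanged) and every other skill gets 0.0 (suppressed).
    """
    winner = max(activations, key=activations.get)  # first maximum wins a tie
    return {skill: (1.0 if skill == winner else 0.0) for skill in activations}

gates = pandemonium({"left turn": 0.7, "right turn": 0.2})
```

In this example the right turn is silenced entirely, rather than blended with the left turn, so the two skills cannot cancel each other out.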

With this final component of the ESP system described, it is now possible to consider a full example, in which previously achieved levels of behavioral complexity are first matched, then exceeded.

IV-D Morphological Limitations in Fast ESP

In order to obviate retesting of previously learned skills, Fast ESP limits morphological changes after the first skill in the syllabus is complete. Because of this, only the new skill being learned must be evaluated during evolution, leading to an approximately linear growth in computation time with respect to the number of skills learned. Body changes with no significant impact on existing skill function (the addition of eyes and muscles) are permitted throughout development, but changes which might invalidate existing control abilities (those to the skeleton segments and joints) are prohibited after the acquisition of the first skill is complete.
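This restriction can be pictured as a filter on body mutations. The mutation names below are hypothetical labels chosen for the sketch, not operators taken from the system; the only rule encoded is the one stated above.

```python
# Hypothetical mutation labels, grouped by their effect on the body.
SKELETON_MUTATIONS = {"add_segment", "remove_segment", "resize_segment", "change_joint"}
ADDITIVE_MUTATIONS = {"add_muscle", "add_photoreceptor"}

def mutation_allowed(kind, skills_completed):
    """Fast ESP rule: the skeleton is frozen once the first skill is complete.

    Muscles and photoreceptors may still be added at any point, because
    they do not disturb the skills already encapsulated.
    """
    if skills_completed == 0:
        return True
    return kind in ADDITIVE_MUTATIONS
```

Since no permitted mutation can break an earlier skill, each candidate needs to be evaluated only on the skill currently being learned.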

V Fast ESP Results

Fig. 8: Better fitness development with ESP. Fitness graphs for a skill that controls the body directly and for a skill constructed hierarchically from existing skills. (a) Fitness graphs for all five runs of the left turn skill. Since this behavior must develop full morphological control from scratch, progress may be irregular and inconsistent. (b) Fitness graphs for all five runs of the turn to light skill. By taking advantage of ESP’s ability to re-use existing encapsulated behaviors, creatures can often acquire such abilities quickly and consistently. These plots demonstrate ESP’s ability to make complex skills easier to acquire.

The primary result of this paper is an application of the ESP method, using the syllabus of Figure 6, to evolve a virtual creature through a sequence of ten learning tasks, the first five of which approximately match the previously demonstrated behavioral-complexity limit for EVCs, and the second five of which approximately double it. In this section, this is demonstrated with the Fast ESP implementation. (These results are best viewed in the first accompanying video, http://youtu.be/dRLNnJlT8rY .)

Skill 1: FORWARD LOCOMOTION

A forward locomotion result from the basic EVC system has been chosen, and its control abilities have been encapsulated, as shown in Figure 9. This creature was evolved through traditional EVC techniques, including the use of shaping, with the ultimate fitness being defined by the interleaving of an efficiency score into a discretized score for speed. Specifically, the combined fitness is defined in terms of the creature’s speed, the maximum speed, the discretization step, and a measure of the creature’s efficiency (in [0, 1]).

This is intended to ensure that speed is the primary factor in fitness, but increased efficiency (while maintaining approximate speed) is also rewarded.
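One expression consistent with this description, offered as an assumption and not necessarily the authors' formula, rounds the speed down to a whole number of discretization steps and lets efficiency fill the gap up to the next step. The function and parameter names are hypothetical.

```python
import math

def locomotion_fitness(speed, max_speed, step, efficiency):
    """Interleave efficiency into a discretized speed score (illustrative form).

    speed:      the creature's measured speed
    max_speed:  speed at which the speed score saturates
    step:       width of one speed level
    efficiency: efficiency measure in [0, 1]
    """
    speed = max(0.0, min(speed, max_speed))
    level = math.floor(speed / step)  # discretized speed score
    # Efficiency adds at most one level, so a creature cannot rank above
    # one that is a full speed level faster.
    return (level + efficiency) * step / max_speed
```

Under this form, two creatures within the same speed level are ranked by efficiency, while a creature a full level faster is never ranked lower, whatever its efficiency.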

At this point, the creature has developed the rigid body segments, muscles, and control system it needs for successful locomotion and, as a part of the Fast ESP algorithm, these elements will be preserved as evolution continues.

Fig. 9: forward locomotion encapsulated.

Skill 2: LEFT TURN

With the locomotion skill preserved, a new run of evolution begins, this time with the fitness function rewarding the ability to rotate counterclockwise while largely maintaining position. The addition of new muscles is allowed during this process. The resulting completed skill is shown (after encapsulation) in Figure 10.

Fig. 10: left turn added.

Skill 3: RIGHT TURN

With the first two skills preserved, a clockwise turn is evolved in the same way as the counterclockwise turn, and the result is encapsulated (Figure 11). At this point, the creature has all of the low-level skills that it will need to reach any point on the ground, with the majority of future skills relying ultimately on reapplications of forward locomotion, left turn, and right turn.

Fig. 11: right turn added.

Skill 4: TURN TO LIGHT

At this point, the creature is allowed to develop photoreceptors, while being tested on its ability to orient itself to a target (which is perceived as a point light source) using the previously encapsulated left turn and right turn skills. The fitness evaluation is an average over four runs, each with a fixed light source at a different heading from the creature. Figure 12 shows the completed and encapsulated result, which is able to consistently aim its locomotion direction at a user-controlled target.
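Averaging over several light placements can be sketched as below. The `run_trial` callback stands in for a full physical simulation and is hypothetical, as are the per-trial score (one minus the normalized final heading error) and the four example headings; the text states only that four runs with different headings are averaged.

```python
import math

def heading_error(creature_heading, light_heading):
    """Smallest absolute angle between two headings, in radians."""
    diff = (light_heading - creature_heading + math.pi) % (2 * math.pi) - math.pi
    return abs(diff)

def turn_to_light_fitness(run_trial, light_headings):
    """Average a per-trial orientation score over all light placements.

    run_trial(h) is assumed to simulate one trial with the light at heading h
    and return the creature's final heading in radians.
    """
    scores = []
    for h in light_headings:
        final_heading = run_trial(h)
        scores.append(1.0 - heading_error(final_heading, h) / math.pi)
    return sum(scores) / len(scores)

# Four assumed light placements around the creature.
HEADINGS = [math.pi / 4, 3 * math.pi / 4, -3 * math.pi / 4, -math.pi / 4]

perfect = turn_to_light_fitness(lambda h: h, HEADINGS)     # always faces the light
passive = turn_to_light_fitness(lambda h: 0.0, HEADINGS)   # never turns
```

Testing several headings keeps evolution from settling on a creature that turns well in only one direction.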

Note that skills such as this, which take advantage of ESP’s ability to reuse previously encapsulated abilities, can for some tasks produce results very quickly and consistently. This is in contrast to skills such as left turn, which must solve the potentially much harder problem of full morphological control from scratch. This contrast is illustrated in the fitness graphs for the two skills, as seen in Figure 8.

Fig. 12: turn to light has been added, which keeps the locomotion direction (black dashed arrow) oriented toward a target.

Skill 5: MOVE TO LIGHT (Matches the Previous State of the Art)

Now, with turn to light and forward locomotion available, and with the evolution of new photoreceptors allowed, the creature is evaluated on its ability to navigate to a light source. As with turn to light, fitness is averaged over multiple runs (in this case five), again with a fixed light source at a different relative angle each time. The result (Figure 13) is a creature whose behavioral complexity approximately matches the previous state of the art.

Fig. 13: move to light has been added, allowing the creature to follow a target along a complex path, catching the target when it finally stops.

Skill 6: STRIKE

In anticipation of the upcoming attack task (see Figure 6), the creature must first learn to deliver a strike to the ground underneath it. For this creature, that goal is accomplished with a vertical jump, as seen in Figure 14. To facilitate the evolution of this new low-level skill, the development of new muscles is allowed.

Fig. 14: This creature’s strike solution employs a vertical jump.

Skill 7: ATTACK (Surpassing the Previous State of the Art)

Having learned move to light and strike, it is now possible to produce an ability slightly more complex than simply moving to a target. By first moving to the target, then striking, this creature (Figure 15) clearly surpasses the previous state of the art, and takes another small step toward the behavioral complexity of compelling creature content from the real world. For this task, fitness is an average across four directions of distance from the target when the first sufficiently strong ground impact occurs (with a penalty for producing such an impact when the scene contains no light).

Fig. 15: In the newly added attack, the creature navigates to the target, then strikes it.

Skill 8: TURN FROM LIGHT

Now, in preparation for the upcoming retreat skill (see Figure 6), the creature must learn to turn away from a light source (as shown in Figure 16). Although obviously similar to turn to light, this task also required a fitness term to discourage an initial wrong-direction turn, so as to achieve reasonable results for targets near the creature’s front. Also, significantly more evaluation directions (thirteen) were used (particularly near the front) to similarly motivate appropriate reactions in these cases.

Fig. 16: turn from light has been added, which keeps the locomotion direction (black dashed arrow) oriented away from the target.

Skill 9: RETREAT

Fig. 17: retreat added.

At this point, using turn from light and forward locomotion, the creature learns to maximize its average distance from a light target. As with turn from light, penalties for initial wrong-direction moves and multiple tests with targets near the front are combined to discourage inappropriate initial reactions. With this skill complete (Figure 17), the necessary components are in place for the final top-level skill of the syllabus.

Skill 10: FIGHT OR FLIGHT

Fig. 18: fight or flight has been added, completing the progression through the syllabus.

The task of this final, highest skill is to choose between attack and retreat based on the perceived environment. For this evaluation, the creature is either confronted with a vulnerable target (a single disc on the ground), which the creature should attack, or a dangerous target (a spinning vertical stack of three such discs), which will destroy the creature if touched.

The fitness score is again the result of averaging over initial light directions, but in this case there is some additional complexity. At each direction, one evaluation is made with a vulnerable target, and one with a dangerous target. While the proper reaction in a single case (vulnerable vs. dangerous) should be rewarded, the real challenge is to motivate a discrimination between the two, so that the right action can be taken in both cases. To accomplish this, a small fraction of the final score is based on the average maximum of the two component scores (to motivate any development, especially initially), and a much larger fraction of the final score is based on the average minimum of the two component scores (to reward the ultimate goal of finding the proper reaction in both cases). The weighting is chosen so that a single perfect result for a minimum component will be worth more than perfect scores in all of the maximum components. The final overall fitness is therefore a weighted sum of the average maximum score across all test directions and the average minimum score across all test directions, with the much larger weight on the minimum term.

Without these additional motivations, solutions emerged which chose a single (higher-scoring) hard-coded reaction to be used for each light position—regardless of target type—without making the leap to the increased scores available if discernment between the two types of target could be developed.
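The weighting constraint described above can be made concrete with a small sketch. The weights used here are hypothetical: the text only requires that one perfect minimum component outweigh perfect scores in all maximum components. The sketch also assumes that each component score lies in [0, 1], with higher values being better. With n test directions, one perfect minimum component contributes w_min / n to the total, so the constraint holds whenever w_max < w_min / n.

```python
def fight_or_flight_fitness(vulnerable_scores, dangerous_scores):
    """Combine per-direction scores for the two target types into one fitness.

    Both arguments are lists with one score per test direction. The weights
    are illustrative assumptions, not values taken from the paper.
    """
    n = len(vulnerable_scores)
    if n == 0 or n != len(dangerous_scores):
        raise ValueError("need one score per direction for each target type")

    pairs = list(zip(vulnerable_scores, dangerous_scores))
    avg_max = sum(max(pair) for pair in pairs) / n  # rewards any progress
    avg_min = sum(min(pair) for pair in pairs) / n  # rewards discrimination

    # Hypothetical weights satisfying w_max < w_min / n, so that a single
    # perfect minimum component beats perfect maxima in every direction.
    w_max = 1.0 / (2 * n + 1)
    w_min = 1.0 - w_max
    return w_max * avg_max + w_min * avg_min


# A creature that always attacks: perfect on vulnerable targets, zero on
# dangerous ones, in all four directions.
always_attack = fight_or_flight_fitness([1, 1, 1, 1], [0, 0, 0, 0])

# A creature that reacts correctly to both target types in one direction only.
one_direction = fight_or_flight_fitness([1, 0, 0, 0], [1, 0, 0, 0])
```

Under these assumed weights, the creature that discriminates correctly in even one direction scores higher than the one with a single hard-coded reaction, which is the pressure the combined score is meant to create.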

Figure 18 shows a successful result for this task, marking the completion of the full syllabus and the acquisition of its highest, most complex skill. This result demonstrates that the ESP system can enable evolved virtual creatures to achieve a level of behavioral complexity which is a clear advance on the state of the art.

VI General ESP

The Fast ESP method achieves the goal of breaking the upper limit on behavioral complexity previously demonstrated in EVCs, but it does so at the cost of constrained morphological changes after the first skill is complete. While that method remains useful due to its efficiency, when sufficient time or computational resources are available, it may be desirable to relax that strict morphological constraint. The General ESP method [16] makes this possible, enabling full morphological adaptation to multiple tasks, while maintaining Fast ESP’s ability to incrementally develop complex behaviors.

VI-A Replacing Morphological Constraints with Retesting

Fast ESP enforced strict limits on morphological changes after the completion of the first skill: Although changes to muscles and photoreceptors were allowed, segments and joints were fixed. Due to this constraint, previously learned skills could be expected to work reliably throughout the syllabus-based construction. On the other hand, this limitation makes it difficult to develop certain abilities later. For example, a creature may succeed in developing forward locomotion and the ability to turn left, but—due to the construction of a certain joint evolved for locomotion—be unable to learn to turn right, even after many generations of evolution.

Luckily, when sufficient computational resources exist, this limitation can be removed simply by expanding and modifying the fitness evaluations applied during learning: Instead of locking segments and joints after the first skill is developed, successive skills could be allowed to change these attributes, as long as new testing shows that such changes will not invalidate earlier abilities.

However, such an increase in testing threatens to make an already computationally demanding problem significantly more difficult, especially because the system is intended to be open ended. Assuming $n$ skills and one independent test for each skill, full retesting of all previous skills at each step of the syllabus would produce $O(n^2)$ growth in the required testing, instead of Fast ESP’s linear growth.

Fig. 19: In this presentation of the previous section’s syllabus graph, leaf nodes (shaded) affect only the body, rather than other nodes, and constitute the focus of the General ESP system.

Fortunately, the retesting can be considerably reduced by focusing it where it matters. Consider the previous syllabus graph, shown again in Figure 19. The skills that have a direct influence on the creature’s body are shaded, and will be referred to as leaf skills. These are: forward locomotion, left turn, right turn, and strike. Once these skills are successfully established, the remaining non-leaf skills can be evolved independently (in an order that meets dependency requirements), without the need for any retesting. This approach stops the growth in testing requirements significantly earlier than would otherwise be possible: in the syllabus of Figure 19, for example, after four skills instead of ten (assuming all leaf skills are learned first).
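The savings from leaf-only retesting can be illustrated by counting tests for the ten-skill syllabus. The sketch below is a simplified accounting under the one-test-per-skill assumption stated earlier; it is not the authors' implementation. The prerequisites listed for turn to light and turn from light are assumptions of this sketch, and the function names are hypothetical.

```python
from graphlib import TopologicalSorter

# Syllabus graph: each skill maps to the skills it builds on. Leaf skills have
# no prerequisites. The prerequisites given for "turn to light" and
# "turn from light" are assumed for illustration.
SYLLABUS = {
    "forward locomotion": [],
    "left turn": [],
    "right turn": [],
    "strike": [],
    "turn to light": ["left turn", "right turn"],
    "move to light": ["turn to light", "forward locomotion"],
    "attack": ["move to light", "strike"],
    "turn from light": ["left turn", "right turn"],
    "retreat": ["turn from light", "forward locomotion"],
    "fight or flight": ["attack", "retreat"],
}


def leaf_first_order(syllabus):
    """Return a learning order with all leaf skills first, then the
    remaining skills in an order that respects their dependencies."""
    topo = list(TopologicalSorter(syllabus).static_order())
    leaves = [skill for skill in topo if not syllabus[skill]]
    others = [skill for skill in topo if syllabus[skill]]
    return leaves + others


def total_tests(syllabus, order, mode):
    """Count skill tests per candidate creature, summed over syllabus steps."""
    total = 0
    leaves_learned = 0
    for step, skill in enumerate(order, start=1):
        if mode == "none":            # no retesting: test the new skill only
            total += 1
        elif mode == "full":          # new skill plus every earlier skill
            total += step
        elif mode == "leaf_only":     # retest only while leaves are being added
            if not syllabus[skill]:
                leaves_learned += 1
                total += leaves_learned
            else:
                total += 1
        else:
            raise ValueError(f"unknown mode: {mode}")
    return total


order = leaf_first_order(SYLLABUS)
counts = {mode: total_tests(SYLLABUS, order, mode)
          for mode in ("none", "full", "leaf_only")}
```

With these assumptions the counts are 10 tests without retesting, 55 with full retesting (1 + 2 + ... + 10), and 16 with leaf-only retesting (1 + 2 + 3 + 4 for the leaf skills, then one test for each of the six remaining skills).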

VI-B New Elements of the General ESP Algorithm

Fig. 20: The fitness graph for the evolutionary run that produced the creature of Figure 21(f). This graph illustrates the two stages of new-skill evolution in the General ESP algorithm. First, the new skill (in this case strike) is allowed to freely evolve both body and brain to its own ends, so long as any extant skills (in this case forward locomotion) are maintained to prescribed levels. The fitness for the new skill during this stage is graphed as (a). Second, morphology is preserved, while each previously developed skill is given a fixed number of generations to adapt to the new body. The fitness for the previous skill’s reconciliation to the body during this stage is graphed as (b). These two stages work together to allow morphological adaptation for new skills, while ensuring that old skills are not lost.

This section describes the elements added to Fast ESP to produce the General ESP algorithm, taking advantage of the previous section’s observation about leaf skills. The novel portion of General ESP is employed during the evolution of each new skill and consists of two stages. The first stage consists of a fixed number of generations during which the new skill’s control and body evolve, as described in Algorithm 1. During this stage, existing encapsulated skills in the brain do not change, but if any morphological changes reduce these skills’ fitness beyond a preset limit, the creature will be marked as unfit. In this way, the new skill is given free rein to adapt the body to its needs, provided that sufficient ability in all existing skills is retained (Figure 20a).

foreach generation do
    foreach individual in the population do
        mutate morphology;
        mutate control for the new skill;
        foreach existing skill do
            evaluate fitness for the existing skill;
            if fitness for the existing skill has decreased significantly then
                set individual fitness to 0;
                proceed to next individual;
            end if
        end foreach
        evaluate fitness for the new skill;
        set individual fitness to fitness for the new skill;
    end foreach
    produce new population from existing one;
end foreach
Algorithm 1: Full evolution of morphology and control for the new skill.

foreach existing skill do
    foreach generation do
        foreach individual in the population do
            mutate control for the existing skill;
            evaluate fitness for the existing skill;
            set individual fitness to fitness for the existing skill;
        end foreach
        produce new population from existing one;
    end foreach
end foreach
Algorithm 2: Reconciling existing skills to body changes made for the new skill.

The second stage (Algorithm 2) runs for a fixed number of generations for each of the old skills, during which the morphology is temporarily locked (ensuring that the abilities achieved by the new primary skill are preserved) and each of the already existing skills gets a chance to reconcile itself to the new body (Figure 20b). Since the morphology is fixed, these skills can develop completely independently: each skill can adapt to the new body without degrading any of the other skills in the brain.

Proceeding in this manner, General ESP allows new leaf skills to seek their own adaptations to morphology as well as control, with a reasonable expectation that–as in the old system–existing skills will be maintained, allowing abilities to accumulate incrementally, just as in Fast ESP.
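The two stages can be sketched in a few dozen lines. This is a toy illustration under heavy assumptions, not the authors' system: the creature is a dictionary with a `body` list and per-skill `control` lists, `toy_fitness` is a hypothetical stand-in for physical simulation, selection is simple truncation, and the best candidate seen so far is tracked explicitly. The retention floor of 80% of the starting fitness is likewise an assumed value for the "preset limit".

```python
import copy
import random

TARGETS = {"locomotion": 1.0, "strike": 2.0}  # hypothetical ideal outputs


def toy_fitness(skill, ind):
    """Stand-in for simulation: peaks when body * control hits the target."""
    output = sum(ind["body"]) * sum(ind["control"][skill])
    return 1.0 / (1.0 + (output - TARGETS[skill]) ** 2)


def mutate(values, rng, sigma=0.1):
    """Return a Gaussian-perturbed copy of a list of floats."""
    return [v + rng.gauss(0.0, sigma) for v in values]


def select(scored, rng):
    """Truncation selection: refill the population from the fitter half."""
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    parents = [ind for _, ind in ranked[: max(1, len(ranked) // 2)]]
    return [copy.deepcopy(rng.choice(parents)) for _ in ranked]


def evolve_new_skill(seed, new_skill, old_skills, fitness, floors, rng,
                     generations=30, pop_size=20):
    """Stage 1: evolve body and new-skill control; bodies that push an old
    skill below its floor are marked unfit (fitness 0)."""
    best, best_fit = copy.deepcopy(seed), fitness(new_skill, seed)
    population = [copy.deepcopy(seed) for _ in range(pop_size)]
    for _ in range(generations):
        scored = []
        for ind in population:
            ind["body"] = mutate(ind["body"], rng)
            ind["control"][new_skill] = mutate(ind["control"][new_skill], rng)
            if any(fitness(s, ind) < floors[s] for s in old_skills):
                scored.append((0.0, ind))
                continue  # proceed to next individual
            fit = fitness(new_skill, ind)
            scored.append((fit, ind))
            if fit > best_fit:
                best, best_fit = copy.deepcopy(ind), fit
        population = select(scored, rng)
    return best


def reconcile_old_skills(champion, old_skills, fitness, rng,
                         generations=30, pop_size=20):
    """Stage 2: body locked; each old skill's control adapts on its own."""
    best = copy.deepcopy(champion)
    for skill in old_skills:
        best_fit = fitness(skill, best)
        population = [copy.deepcopy(best) for _ in range(pop_size)]
        for _ in range(generations):
            scored = []
            for ind in population:
                ind["control"][skill] = mutate(ind["control"][skill], rng)
                fit = fitness(skill, ind)
                scored.append((fit, ind))
                if fit > best_fit:
                    best, best_fit = copy.deepcopy(ind), fit
            population = select(scored, rng)
    return best


rng = random.Random(0)
seed = {"body": [0.5, 0.5], "control": {"locomotion": [1.0], "strike": [0.0]}}
floors = {"locomotion": 0.8 * toy_fitness("locomotion", seed)}
stage1 = evolve_new_skill(seed, "strike", ["locomotion"], toy_fitness, floors, rng)
final = reconcile_old_skills(stage1, ["locomotion"], toy_fitness, rng)
```

Because the second stage never touches the body or the new skill's control, the strike fitness reached in the first stage is unchanged afterwards, while locomotion can only hold steady or improve on the new body.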

VII General ESP Results

(a) Initial locomoting creature.
(b) Heavy smashing arms.
(c) Smashing flail arms.
(d) Jump with anti-tip limbs.
(e) Smashing tail, stabilizers.
(f) Jump with heavier body.
Fig. 21: Adapting EVC morphology to multiple tasks. (a) A creature adapted for locomotion. From this creature, creatures (b) through (f) were evolved using the General ESP method described in this paper. Each of them has developed a new technique (with corresponding morphological changes) for accomplishing an additional task–in this case, delivering a strike to the ground–while still maintaining the ability to perform the initial skill (locomotion) to prescribed levels. With Fast ESP, these secondary adaptations would have been impossible.

Experiments demonstrate the advantages of the continuing morphological evolution enabled by the General ESP algorithm. In the first subsection (Strike Results), an experiment from the Fast ESP system is reproduced in the General ESP system, with dramatically different results. In the second subsection (High-Reach Results), a learning challenge designed to highlight General ESP’s advantages is presented, and detailed benefits are described. Note that, while General ESP maintains Fast ESP’s ability to construct complex hierarchical behaviors, that ability is inherited largely without modification. Therefore, the experiments in this section are used instead to demonstrate General ESP’s success in more challenging applications that would be impossible with Fast ESP. Video illustrating both of these result sections can be viewed online at http://youtu.be/fyVr7gdGEPE .

VII-A Strike Results

An important part of the Fast ESP system’s primary experimental result was to add a strike behavior to a locomoting creature (toward the larger goal of developing a complex fight-or-flight behavior). In this section, that portion of the old experiment is reproduced using General ESP, and a broad range of novel strategies and morphological changes is observed.

VII-A1 Strike in Fast ESP

Figure 21a depicts the creature evolved for locomotion from Section V. Using Fast ESP, that creature consistently solved the challenge of producing a striking behavior by using its existing skeletal structure to either jump up and down or smash the ground with its limbs (Section V), without any opportunity to explore the potential for new strategies or better adaptation that might result from continuing full morphological development.

VII-A2 Strike in General ESP

When the morphology is allowed to continue to evolve, however, new strategies become possible, and even old strategies may be better executed with morphological changes adapted to their specific needs. The General ESP system develops a variety of such solutions, as can be seen in Figures 21b through 21f.

VII-B High-Reach Results

The second experiment was designed to highlight the potential benefits of General ESP over Fast ESP. Specifically, a selection of three different locomoting creatures was evolved to learn the additional skill of reaching for a high target, and the subsequent differences of results for the Fast and General ESP implementations were examined in detail. General ESP led to two types of improvements: 1) greater variety of results, and 2) better fitness.

VII-B1 Greater Variety

(a) Tipping, long new limbs.
(b) Push-up, extended limbs.
(c) Telescoping limbs.
(d) Telescoping, anti-tip limbs.
(e) Tip with enlarged limbs.
(f) Jump, swing extensions up.
Fig. 22: Greater variety through continued morphology evolution. The locomoting creature of Figure 21a was further evolved using the General ESP system to adapt to a high-reach task. The results demonstrate the potential of continued morphology evolution to produce a great degree of useful variety.

The locomoting creature of Figure 21a was evolved toward the new high-reach goal, using both the Fast and the General ESP implementations.

With Fast ESP, only two strategies were observed, within which the results were extremely uniform. Using morphology unchanged from the original locomotion result, all such creatures evolved either to jump as high as possible or to reach one limb up by tipping over onto the other limb. In both cases, the results were limited by the inability of skeletal morphology to adapt to this new task.

With General ESP, in contrast, a wide variety of results was observed, in which a number of novel strategies were used, often to great effect. These solutions are illustrated in Figure 22 (a) through (f).

VII-B2 Better Fitness

(a) Fast ESP result.
(b) General ESP result.
Fig. 23: Improved fitness via continued morphology evolution. These results demonstrate how the General ESP system (b) can produce better fitness values (i.e., a higher reach) than the Fast ESP system (a) by allowing the addition of new body segments.

Another successful solution to the locomotion task produced by Fast ESP is shown in Figure 23a. This snake-like creature achieved a high reach by extending one end of its long morphology (best fitness in 10 runs: 0.174), while the rest of the body maintained balance.

General ESP improved upon this creature by changing its morphology for the secondary task, while its strategy remained unchanged (Figure 23b). With General ESP, the creature was able to develop an additional body segment that enabled the higher reach (fitness 0.267), while allowing it still to perform locomotion to acceptable standards.

VII-B3 Greater Variety and Better Fitness

(a) Initial locomoting creature.
(b) Subtle body changes.
(c) More obvious body changes.
(d) Dramatic changes in morphology.
Fig. 24: Greater variety and improved fitness. The initial locomoting quadruped (a) is evolved for high reach in the General ESP system (b)-(d). Through a variety of strategies, each of the General ESP creatures scores better on this new task than any creature from the Fast ESP system.

The relatively complex quadruped seen in Figure 24a was a third solution developed by the underlying EVC system for the locomotion task. In continued evolution of the high-reach task in the Fast ESP system, this creature’s results were again extremely uniform in approach and fitness. They all reached up with a single limb, and all with approximately equal success (best fitness in 10 runs: 0.164). In the General ESP system, the ability to continue to adapt morphology to this new task led to a diverse set of useful results, with all presented being more fit than any produced with Fast ESP.

Figure 24b depicts a creature that pursues the same strategy as the creature in Figure 24a, yet does so more effectively (fitness 0.209) due to subtle morphological adaptations. In Figure 24c, more obvious morphological adaptations have been added to further exceed the uniform performance limit experienced by this creature in Fast ESP, while still employing the same basic technique (fitness 0.294). In Figure 24d, even more dramatic changes to morphology provide a new way of solving the high reach: This creature (fitness 0.314) employs a new pair of tall, dedicated limbs to even further exceed the performance of Fast ESP.

VIII Discussion and Future Work

In this section, outstanding issues related to ESP are discussed, and potential avenues for future development are presented.

VIII-A ESP’s Requirement for Human Input

While it is true that some human input is required by the ESP system, it is important to note that the human input utilized by this method in the form of the syllabus is at a usefully abstract level—on a par with the kind of input employed by human learners. This syllabus, along with the opportunity for human selection among high scorers at the end of each subskill stage, offers great potential value as a mechanism for exerting relatively high-level creative control over creature development. In addition, specifying even a single fitness function in a traditional EVC system arguably places even greater demands on the human experimenter than the creation of such a syllabus.

VIII-B Benefits of Evolving Creature Content

Numerous benefits accrue from the fact that this system’s results are evolved and that this evolution takes place in physical simulation. Thanks to evolution, the creatures this system produces are unceasingly novel, developing new solutions for morphology, muscle and eye placement, and mechanism and style of movement each time the process is restarted. And the fact that these solutions are evolved to operate in a physically simulated environment adds a particular level of realism, demonstrating results that are convincingly physically plausible, and even include some of the subtle imperfections of action that bring so much character to creatures in the real world. Note, also, that creating controllers for bodies like these by hand would be impractical, but that this difficulty is in this case handled entirely by the evolutionary algorithm.

VIII-C ESP is Open-Ended

One of the most important aspects of the ESP system is that it is designed to be open ended. While a significant increase in behavioral complexity has been demonstrated, there are no obvious barriers to continued reapplication of this technique to achieve results of still greater complexity in the future. Regardless of the work it took to achieve the top-level fight-or-flight behavior described above, once complete and encapsulated, that entire skill can be easily utilized as a unit by future evolution. For example, it might next be useful to add a tip-crisis behavior: Any time the creature finds itself tipped over, it would work to right itself before continuing. This tip-recovery action could be learned by a creature which has completed the example fight-or-flight syllabus above, then a new top-level ability could be evolved that simply chooses between tip-recovery and normal fight-or-flight behaviors based on whether or not the creature is upright. The ESP system is designed so that solutions to this step (and those beyond it) are equally straightforward to evolve.

VIII-D Beyond General ESP

Although the General ESP algorithm has removed Fast ESP’s explicit limitations on body changes after the first skill, development of morphology throughout the acquisition of complex skills is still not fully general and completely unlimited. First, the retesting requirements would make morphological development impractical if continued through too many steps of leaf skill addition. To mitigate this issue in the future, it may be possible to do the retesting periodically rather than universally, and run the tests in parallel. Also, the more leaf skills there are, the more likely it is that the morphological change required by one skill is harmful to the others. This limitation is more difficult to overcome, and indeed it reflects the conflicting demands that any creature faces when dealing with complex environments.

VIII-E Decomposition of Perception

Just as the ESP system decomposes complex actions into simpler ones for piece-by-piece learning, an analogous process might decompose perception for a similar benefit. As part of a more complex syllabus, a human expert could develop a sequence of sensing tasks leading to useful perceptual abilities that might be difficult or impossible to achieve otherwise. This, in turn, could make possible greater behavioral abilities overall.

VIII-F Combat

While Miconi has already produced one limited form of combat for EVCs [18], there is a great deal more that can be done in this area. The ESP method, in combination with the future-work topics described above (and the ability to vary body-part materials, the importance of which was recognized by Miconi), could potentially produce a far richer and more compelling form of combat for evolved virtual creatures than what has been seen to date.

VIII-G Fauna on Demand

Finally, a more refined and automated version of the ESP system could make it possible to populate virtual worlds with continually novel creature content (especially with the help of techniques such as those seen in [14]). As virtual boundaries are pushed back, human users could (subject to limitations of computing power) continually encounter never-before-seen creatures, all developed from a single high-level human-designed syllabus.


Future work examples such as these illustrate that, beyond the demonstrated advances due to ESP, significant avenues of new research are made possible by this technique’s introduction.

IX Conclusion

The ESP system described in this paper allows evolved virtual creatures to achieve a level of behavioral complexity (as defined in the Introduction) which is approximately double the state of the art. In contrast to related techniques for fixed morphologies, this advance applies when morphology is evolved as well as control, demonstrating the first clear increase for that application in the past two decades.

Two versions of the ESP system were presented, Fast and General, distinguished by the relative importance of computation time and extended morphological adaptation. In exchange for some limitations on continuing morphological changes, Fast ESP makes it possible to increase behavioral complexity with only a linear increase in computational time. When computational resources permit, General ESP instead makes it possible to adapt morphology fully beyond the initial skill. It yields both a greater variety of solutions and solutions with higher fitness, while still permitting the same open-ended development of complex behaviors as Fast ESP.

These advances demonstrate that the potential for behavioral complexity in evolved virtual creatures has not yet been exhausted, and in fact suggest that it may continue to increase so as to one day match the behavioral complexity of creatures from the real world, with all of the promise for content creation that such complexity might bring.

Acknowledgments

This research was supported by NSF grants DBI-0939454 and IIS-0915038, and by equipment donations from Intel’s Visual Computing Program.

References

  • [1] J. E. Auerbach and J. C. Bongard. On the relationship between environmental and morphological complexity in evolved robots. In Proceedings of the Fourteenth International Conference on Genetic and Evolutionary Computation Conference, GECCO ’12, pages 521–528, New York, NY, USA, 2012. ACM.
  • [2] J. C. Bongard and R. Pfeifer. Repeated structure and dissociation of genotypic and phenotypic complexity in artificial ontogeny. pages 829–836. Morgan Kaufmann, 2001.
  • [3] R. Brooks. A robust layered control system for a mobile robot. Robotics and Automation, IEEE Journal of, 2(1):14–23, 1986.
  • [4] N. Chaumont, R. Egli, and C. Adami. Evolving virtual creatures and catapults. Artificial Life, 13(2):139–157, 2007.
  • [5] N. Cheney, R. MacCurdy, J. Clune, and H. Lipson. Unshackling evolution: Evolving soft robots with multiple materials and a powerful generative encoding. In Proceeding of the Fifteenth Annual Conference on Genetic and Evolutionary Computation Conference, GECCO ’13, pages 167–174, New York, NY, USA, 2013. ACM.
  • [6] J. Doucette, P. Lichodzijewski, and M. Heywood. Hierarchical task decomposition through symbiosis in reinforcement learning. In Proceedings of the fourteenth international conference on Genetic and evolutionary computation conference, pages 97–104. ACM, 2012.
  • [7] F. Heider and M. Simmel. An experimental study of apparent behavior. The American Journal of Psychology, pages 243–259, 1944.
  • [8] J. Hiller and H. Lipson. Automatic design and manufacture of soft robots. Robotics, IEEE Transactions on, 28(2):457–466, 2012.
  • [9] G. Hornby, J. Pollack, et al. Body-brain co-evolution using l-systems as a generative encoding. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2001), pages 868–875, 2001.
  • [10] M. Joachimczak and B. Wróbel. Co-evolution of morphology and control of soft-bodied multicellular animats. In Proceedings of the fourteenth international conference on Genetic and evolutionary computation conference, pages 561–568. ACM, 2012.
  • [11] C. Jones. The Dot and the Line. Metro-Goldwyn-Mayer, 1965.
  • [12] M. Komosinski. The Framsticks system: versatile simulator of 3D agents and their evolution. Kybernetes: The International Journal of Systems & Cybernetics, 32(1/2):156–173, 2003.
  • [13] N. Lassabe, H. Luga, and Y. Duthen. A new step for artificial creatures. In Artificial Life, 2007. ALIFE’07. IEEE Symposium on, pages 243–250. IEEE, 2007.
  • [14] J. Lehman and K. Stanley. Evolving a diversity of virtual creatures through novelty search and local competition. In Proceedings of the 13th annual conference on Genetic and evolutionary computation, pages 211–218. ACM, 2011.
  • [15] D. Lessin, D. Fussell, and R. Miikkulainen. Open-ended behavioral complexity for evolved virtual creatures. In Proceeding of the Fifteenth Annual Conference on Genetic and Evolutionary Computation Conference, GECCO ’13, pages 335–342, New York, NY, USA, 2013. ACM.
  • [16] D. Lessin, D. Fussell, and R. Miikkulainen. Adapting morphology to multiple tasks in evolved virtual creatures. In Proceedings of The Fourteenth International Conference on the Synthesis and Simulation of Living Systems (ALIFE 14) 2014, 2014.
  • [17] H. Lipson and J. B. Pollack. Automatic design and manufacture of robotic lifeforms. Nature, 406(6799):974–978, Aug 2000.
  • [18] T. Miconi. In silicon no one can hear you scream: Evolving fighting creatures. Genetic Programming, pages 25–36, 2008.
  • [19] M. Minsky. Society of mind. Simon & Schuster, 1988.
  • [20] M. Mitchell. An Introduction to Genetic Algorithms. MIT Press, Cambridge, MA, USA, 1998.
  • [21] M. L. Pilat and C. Jacob. Evolution of vision capabilities in embodied virtual creatures. In Proceedings of the 12th annual conference on Genetic and evolutionary computation, GECCO ’10, pages 95–102, New York, NY, USA, 2010. ACM.
  • [22] B. Scholl and P. Tremoulet. Perceptual causality and animacy. Trends in cognitive sciences, 4(8):299–309, 2000.
  • [23] O. G. Selfridge. Pandemonium: a paradigm for learning in Mechanisation of Thought Processes. In Proceedings of a Symposium Held at the National Physical Laboratory, pages 513–526, London, Nov. 1958. HMSO.
  • [24] Y. Shim and C. Kim. Generating flying creatures using body-brain co-evolution. In Proceedings of the 2003 ACM SIGGRAPH/Eurographics symposium on Computer animation, pages 276–285. Eurographics Association, 2003.
  • [25] Y. Shim, S. Kim, and C. Kim. Evolving flying creatures with path-following behavior. In ALife IX: Proceedings of the 9th International Conference on the Simulation and Synthesis of Living Systems, pages 125–132, 2004.
  • [26] K. Sims. Evolving virtual creatures. In Proceedings of the 21st annual conference on Computer graphics and interactive techniques, SIGGRAPH ’94, pages 15–22, New York, NY, USA, 1994. ACM.
  • [27] B. Skinner. The behavior of organisms: An experimental analysis. The Century psychology series, 1938.
  • [28] P. Stone and M. Veloso. Layered learning. In Machine Learning: ECML 2000: 11th European Conference on Machine Learning Barcelona, Catalonia, Spain May, 31-June 2, 2000 Proceedings, volume 1810, page 369. Springer, 2000.
  • [29] S. Whiteson, N. Kohl, R. Miikkulainen, and P. Stone. Evolving keepaway soccer players through task decomposition. Machine Learning, 59(1):5–30, 2005.