It is well known that many complex systems, both in technology and nature, exhibit modularity: independent modules, each providing a certain function, are combined to perform more complex functions. Additionally, modular systems are organized hierarchically: smaller modules are used within larger modules recursively. Examples of such systems exist in a wide range of environments. In natural systems, it is believed that hierarchical modularity enhances evolvability (the ability of the system to adapt to new environments with minimal changes) and robustness (the ability to maintain the current status in the presence of internal or external variations) [26, 31]. In the technological world, hierarchically modular designs are preferred because they lower design and development cost, ease maintenance, improve agility (e.g. less effort in producing future versions of a piece of software), and offer a better abstraction of the system design.
There are many hypotheses in the literature regarding the factors that contribute to either the hierarchy or the modularity property. Local resource constraints in social networks and ecosystems, modularly varying goals [15, 23, 24], selection for more robust phenotypes [12, 38], and selection for lower connection costs in a network are some of the mechanisms that have been explored and shown to lead to hierarchically modular systems. The main hypothesis that we follow in this paper is along the lines of the last mechanism: it assumes that systems in both nature and technology tend to minimize the cost of their interconnections or dependencies between modules.
An additional focus of our work is the hourglass effect in hierarchical systems. Across many fields, such as computer networking, deep neural networks, embryogenesis, metabolism, and many others, it has been observed that hierarchically modular systems often exhibit the architecture of an hourglass. Informally, an hourglass architecture means that the system of interest produces many outputs from many inputs through a relatively small number of highly central intermediate modules, referred to as the “waist” of the hourglass (Fig. 1). The waist of the hourglass (also referred to as the “core” in earlier work as well as in this paper) includes critical modules of the system that are also sometimes more conserved during the evolution of the system compared to other modules [1, 31]. Despite recent research on the hourglass effect in different types of hierarchical systems [1, 2, 17, 31], one question that remains open is under which conditions the hourglass effect emerges in hierarchies that are produced when the objective is to minimize the cost of interconnections.
In this paper, we present Evo-Lexis, a modeling framework for the emergence and evolution of hierarchical structure in complex systems. To develop Evo-Lexis, we extend a previously proposed optimization framework, called Lexis, that was designed for structure discovery in sequential data. Lexis models the most elementary modules of the system as symbols (“sources”) and the modules at the highest level of the hierarchy as sequences of those symbols (“targets”). Evo-Lexis is a dynamic or evolving version of Lexis, in the sense that the set of targets changes over time through additions (births) and removals (deaths) of targets. Evo-Lexis computes an (approximate) minimum-cost adjustment of a given hierarchy when the set of targets changes over time (a process we refer to as “incremental design”). For comparison purposes, Evo-Lexis also computes the (approximate) minimum-cost hierarchy that generates a given set of targets from a set of sources in a static (non-evolving) setting (referred to as “clean-slate design”). The premise behind the incremental design approach is that in practice systems are rarely designed from scratch; instead, they are incrementally modified over time to accommodate changes (e.g. to provide new outputs and potentially to support new inputs every time there is a change).
In general, a system interacts with its environment in a bidirectional manner: the environment imposes various constraints on the system and the system also affects its environment. To capture this co-evolutionary setting in Evo-Lexis, we study how changes in the set of targets affect the resulting hierarchy but also how the current hierarchy affects the selection of new targets (i.e. whether a new candidate target is selected or not depends on its fitness or cost – and that depends on how easily that target can be supported by the given hierarchy). By incorporating well-known evolutionary mechanisms, such as tinkering (mutation), recombination, and selection, Evo-Lexis can capture such co-evolutionary dynamics between the generation of new targets and the hierarchy that supports them.
The questions we focus on are:
How do key properties of the emergent hierarchies, e.g. depth of the network, reuse or centrality of each module, complexity (or sequence length) of intermediate modules, etc., depend on the evolutionary process that generates the new targets of the system?
Under what conditions do the emergent hierarchies exhibit the so-called “hourglass effect”? Why are a few intermediate modules reused much more than others?
Do intermediate modules persist during the evolution of hierarchies? Or are there “punctuated equilibria” where the highly reused modules change significantly?
What are the differences in terms of cost and structure between the incrementally designed and the corresponding clean-slate designed hierarchies?
The structure of the paper is as follows: In Section 2, we present an overview of Lexis, the static optimization framework that serves as the main building block in Evo-Lexis. (The static, i.e., non-evolving, version of the proposed modeling framework is referred to as “Lexis” and it has been published at the ACM KDD 2016 conference.) In Section 3, we present the components of the Evo-Lexis framework, along with the metrics that we use for the analysis of evolving hierarchies. In Section 4, we evaluate the evolution of hierarchies under different target generation models. Sections 5 and 6 present further analysis regarding the evolvability and major transitions in hierarchies produced using the most full-fledged (MRS) target generation model. Section 7 focuses on the comparison between clean-slate and incremental design in terms of cost and structure. In Section 8, we review related work in the context of Evo-Lexis. Finally, Section 9 discusses the results and presents some future research possibilities.
Blue, green and red nodes show source, intermediate and target nodes, respectively. Colored dots represent an instance of a source node and are used to show the extent of diversity among target nodes.
2 Lexis Background
In this section, we present an overview of Lexis, the optimization framework that we use as the main building block of the Evo-Lexis framework.
Given an alphabet $S$ and a set of “target” strings $T$ over the alphabet $S$, we need to construct a Lexis-DAG. A Lexis-DAG $D$ is a directed acyclic graph $D=(V,E)$, where $V$ is the set of nodes and $E$ the set of edges, that satisfies the following three constraints. (To simplify the notation, even though $D$ is a function of $S$ and $T$, we do not denote it as such.)
First, each node $v \in V$ in a Lexis-DAG represents a string $S(v)$ of characters from the alphabet $S$. The nodes $V_S$ that represent characters of $S$ are referred to as sources, and they have zero in-degree. The nodes $V_T$ that represent target strings $T$ are referred to as targets, and they have zero out-degree. $V$ also includes a set of intermediate nodes $V_M$, which represent substrings that appear in the targets $T$. So, $V = V_S \cup V_M \cup V_T$.
Second, each node in $V \setminus V_S$ of a Lexis-DAG represents a string that is the concatenation of two or more substrings, specified by the incoming edges from other nodes to that node. Specifically, an edge from node $u$ to node $v$ is a triplet $(u, v, i)$ such that the string $S(u)$ appears as a substring of $S(v)$ at index $i$ (the first character of a string has index 1). Note that there may be more than one edge from node $u$ to node $v$. The number of incoming and outgoing edges for a node $v$ is denoted by $d_{in}(v)$ and $d_{out}(v)$, respectively.
Third, a Lexis-DAG should only include intermediate nodes that have an out-degree of at least two, $d_{out}(v) \geq 2$ for all $v \in V_M$. In other words, every intermediate node $v$ in a Lexis-DAG should be such that the string $S(v)$ is re-used in at least two concatenation operations. Otherwise, $S(v)$ is either not used in any concatenation operation, or it is used only once and so the outgoing edge from $v$ can be replaced by re-wiring the incoming edges of $v$ straight to the single occurrence of $S(v)$. In both cases node $v$ can be removed from the Lexis-DAG, resulting in a more parsimonious hierarchical representation of the targets. Fig. 3 illustrates the concepts introduced in this subsection.
2.2 The Lexis Optimization Problem
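The Lexis optimization problem is to construct a minimum-cost Lexis-DAG $D=(V,E)$ for the given alphabet $S$ and target strings $T$. In other words, the problem is to determine the set of intermediate nodes $V_M$ and all required edges $E$ so that the corresponding Lexis-DAG is optimal in terms of a given cost function $\mathcal{E}(D)$. This problem can be formulated as follows:

$$\min_{E,\,V_M} \; \mathcal{E}(D) \quad \text{subject to } D=(V,E) \text{ being a Lexis-DAG for } S \text{ and } T. \tag{1}$$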
The selection of an appropriate cost function is somewhat application-specific. A natural cost function, as investigated in previous work, is the number of edges in the Lexis-DAG. More general cost formulations, such as a variable edge cost or a weighted average of a node cost and an edge cost, are interesting but they are not pursued in this paper. The edge cost $\mathcal{E}(v)$ to construct a node $v$ is defined as the number of incoming edges required to construct its string from its in-neighbors, which is equal to the in-degree $d_{in}(v)$. The edge cost of source nodes is zero. The edge cost $\mathcal{E}(D)$ of a Lexis-DAG $D=(V,E)$ is defined as the sum of the edge costs of all its nodes, which is equal to the number of edges in $D$:

$$\mathcal{E}(D) = \sum_{v \in V} \mathcal{E}(v) = \sum_{v \in V} d_{in}(v) = |E|. \tag{2}$$
We solve the Lexis optimization problem in Eq. (1) with a greedy heuristic, called G-Lexis. G-Lexis starts with the trivial flat Lexis-DAG, and at each iteration it chooses the substring that maximally reduces the edge cost when it is added as a new intermediate node to the Lexis-DAG and the corresponding edges are rewired. The algorithm terminates when there are no more substrings that reduce the cost of the Lexis-DAG. An example application of the G-Lexis algorithm is shown in Fig. 4. More details regarding the efficient implementation and complexity of the algorithm can be found in the original Lexis paper.
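The greedy idea can be illustrated with a brute-force sketch. This is not the efficient G-Lexis implementation: it enumerates every repeated substring, and the function names (`greedy_lexis`, `edge_cost`, `expand`) and the dictionary representation (node name mapped to the list of its in-neighbors, one entry per incoming edge) are assumptions made for illustration only. The saving of a candidate substring of length l with c non-overlapping occurrences is taken to be c(l-1) - l: each occurrence replaces l edges by one, and the new node needs l incoming edges of its own.

```python
def count_non_overlapping(body, pat):
    """Count left-to-right non-overlapping occurrences of tuple `pat` in `body`."""
    n, i, count = len(pat), 0, 0
    while i + n <= len(body):
        if tuple(body[i:i + n]) == pat:
            count += 1
            i += n
        else:
            i += 1
    return count


def replace_all(body, pat, name):
    """Replace non-overlapping occurrences of `pat` in `body` by the symbol `name`."""
    n, i, out = len(pat), 0, []
    while i < len(body):
        if tuple(body[i:i + n]) == pat:
            out.append(name)
            i += n
        else:
            out.append(body[i])
            i += 1
    return out


def greedy_lexis(targets):
    """Return {node: list of in-neighbors}; characters not used as keys are sources."""
    bodies = {f"T{i}": list(t) for i, t in enumerate(targets)}  # flat Lexis-DAG
    next_id = 0
    while True:
        candidates = {tuple(b[i:j]) for b in bodies.values()
                      for i in range(len(b)) for j in range(i + 2, len(b) + 1)}
        best, best_saving = None, 0
        for pat in sorted(candidates):  # sorted for deterministic tie-breaking
            c = sum(count_non_overlapping(b, pat) for b in bodies.values())
            saving = c * (len(pat) - 1) - len(pat)  # edges removed minus edges added
            if saving > best_saving:
                best, best_saving = pat, saving
        if best is None:  # no substring reduces the cost any further
            return bodies
        name = f"N{next_id}"
        next_id += 1
        bodies = {k: replace_all(b, best, name) for k, b in bodies.items()}
        bodies[name] = list(best)


def edge_cost(bodies):
    """Edge cost of the DAG: total number of incoming edges over all nodes."""
    return sum(len(b) for b in bodies.values())


def expand(bodies, sym):
    """Recover the string represented by node `sym`."""
    if sym not in bodies:
        return sym
    return "".join(expand(bodies, s) for s in bodies[sym])
```

For the two targets `"abcabcabc"` and `"xabcy"`, the flat Lexis-DAG has 14 edges, and the sketch adds a single intermediate node for `abc`, which lowers the edge cost to 9.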
2.3 Path-Centrality and the Core of a Lexis-DAG
After constructing a Lexis-DAG, an important question is how to rank the constructed intermediate nodes in terms of significance or centrality. In a Lexis-DAG, a path that starts from a source and terminates at a target represents a dependency chain in which each node depends on all previous nodes in that path. Thus, the higher the number of such source-to-target paths traversing an intermediate node $v$, the more important $v$ is in terms of the number of dependency chains it participates in. More formally, let $P(v)$ be the number of source-to-target paths that traverse node $v$; we refer to $P(v)$ as the path centrality of intermediate node $v$. Path centrality can be computed as:

$$P(v) = P_S(v) \cdot P_T(v), \tag{3}$$

where $P_S(v)$ is the number of paths from any source to $v$, and $P_T(v)$ is the number of paths from $v$ to any target. (A similar metric, called the stress centrality of a vertex, has been studied in the literature.) It is easy to see that $P_T(v)$ is equal to the number of times the string that corresponds to $v$ is used in the set of targets. Similarly, $P_S(v)$ is equal to the number of times any source node is used in the string of $v$, which is simply the length of that string. Hence, the path centrality of a node is simply the product of the length of the string of $v$ (proxy for complexity) and its number of appearances (proxy for generality).
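The product form of path centrality can be checked on a small example. The sketch below assumes a hypothetical representation in which a Lexis-DAG is a dictionary mapping each non-source node to the list of its in-neighbors (one entry per incoming edge); the function name `path_centrality` is likewise an illustration, not part of Lexis.

```python
from functools import lru_cache


def path_centrality(bodies, targets):
    """Path centrality of every intermediate node.

    bodies:  node -> list of in-neighbors, one entry per incoming edge
             (symbols that are not keys of `bodies` are sources).
    targets: set of target node names.
    """
    out_edges = {}
    for node, body in bodies.items():
        for sym in body:
            out_edges.setdefault(sym, []).append(node)

    @lru_cache(maxsize=None)
    def paths_from_sources(v):
        if v not in bodies:  # a source starts exactly one path
            return 1
        return sum(paths_from_sources(u) for u in bodies[v])

    @lru_cache(maxsize=None)
    def paths_to_targets(v):
        if v in targets:  # a target ends exactly one path
            return 1
        return sum(paths_to_targets(w) for w in out_edges.get(v, []))

    return {v: paths_from_sources(v) * paths_to_targets(v)
            for v in bodies if v not in targets}
```

With `bodies = {"T0": ["N0", "N0", "N0"], "T1": ["x", "N0", "y"], "N0": ["a", "b", "c"]}`, node `N0` has string length 3 and appears 4 times in the targets, so its path centrality is 12.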
An important follow-up question is to identify the core of a Lexis-DAG, i.e., a set of intermediate nodes that represent, as a whole, the most important substrings in that Lexis-DAG. The core set is the representative set of nodes that summarizes the structure of the targets. Intuitively, we expect that the core should include nodes of high path centrality, and that almost all source-to-target dependency chains of the Lexis-DAG should traverse at least one of these core nodes.
More formally, suppose $\mathcal{C}$ is a set of intermediate nodes and $P_{\mathcal{C}}$ is the set of source-to-target paths that remain after we remove the nodes in $\mathcal{C}$ from the Lexis-DAG $D$. The core of $D$ is defined as the minimum-cardinality set of intermediate nodes $\mathcal{C}(\tau)$ such that the fraction of remaining source-to-target paths after the removal of $\mathcal{C}(\tau)$ is at most $\tau$ (to simplify notation, we do not denote the core set as a function of $D$):

$$\mathcal{C}(\tau) = \arg\min_{\mathcal{C}} |\mathcal{C}| \quad \text{such that } |P_{\mathcal{C}}| \leq \tau \, |P|, \tag{4}$$

where $|P|$ is the number of source-to-target paths in the original Lexis-DAG, without removing any nodes (it is easy to see that $|P|$ is equal to the cumulative length of all target strings). Fig. 5 shows an example defining the concepts regarding the core of a Lexis-DAG.
Note that if $\tau = 0$, the core identification problem in Eq. (4) becomes equivalent to finding the min-vertex-cut of the given Lexis-DAG. In practice, a Lexis-DAG often includes some tendril-like source-to-target paths traversing a small number of intermediate nodes that very few other paths traverse. These paths can cause a large increase in the size of the core. For this reason, we prefer to consider the case of a positive, but potentially small, value of the threshold $\tau$.
We solve the core identification problem with a greedy algorithm referred to as G-Core. This algorithm adds in each iteration the node with the highest path-centrality value to the core set, updates the Lexis-DAG by removing that node and its edges, and recomputes the path centralities of the remaining nodes before the next iteration. The algorithm terminates when the desired fraction of source-to-target paths is achieved.
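The greedy loop just described can be sketched as follows. The representation (a dictionary from each non-source node to the list of its in-neighbors) and the name `g_core` are assumptions for illustration; the sketch recomputes all path counts from scratch after every removal, which is simple but not efficient.

```python
def g_core(bodies, targets, tau):
    """Greedy core: repeatedly remove the intermediate node of highest path centrality.

    Returns (core, fraction of source-to-target paths that remain).
    """
    out_edges = {}
    for node, body in bodies.items():
        for sym in body:
            out_edges.setdefault(sym, []).append(node)
    intermediates = [v for v in bodies if v not in targets]

    def centralities(removed):
        down, up = {}, {}

        def from_sources(v):
            if v in removed:
                return 0
            if v not in bodies:  # source
                return 1
            if v not in down:
                down[v] = sum(from_sources(u) for u in bodies[v])
            return down[v]

        def to_targets(v):
            if v in removed:
                return 0
            if v in targets:
                return 1
            if v not in up:
                up[v] = sum(to_targets(w) for w in out_edges.get(v, []))
            return up[v]

        remaining = sum(from_sources(t) for t in targets)
        scores = {v: from_sources(v) * to_targets(v)
                  for v in intermediates if v not in removed}
        return remaining, scores

    total, scores = centralities(frozenset())
    core, remaining = [], total
    while remaining > tau * total and scores:
        best = max(sorted(scores), key=scores.get)  # sorted: deterministic ties
        if scores[best] == 0:  # no remaining node carries any path
            break
        core.append(best)
        remaining, scores = centralities(frozenset(core))
    return core, remaining / total
```

Because the core is restricted to intermediate nodes here, paths that connect a source directly to a target can never be removed, so the loop may stop with a remaining fraction above the threshold.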
2.4 Hourglass score
Intuitively, a Lexis-DAG exhibits the hourglass effect if it has a small core. To make this intuition more precise, we compare the size of the core of a Lexis-DAG with the core size of a derived Lexis-DAG that maintains the source-target paths of the original Lexis-DAG but, by construction, does not have an hourglass structure.
We use a metric called the Hourglass Score, or H-score, to measure the “hourglass-ness” of a network. This metric was originally presented in earlier work on the hourglass effect.
To calculate the H-score, we create a flat Lexis-DAG $D_f$ containing the same targets as the original Lexis-DAG $D$. Note that $D_f$ preserves the source-target dependencies of $D$: each target in $D_f$ is constructed based on the same set of sources as in $D$. However, the dependency paths in $D_f$ are direct, without forming any intermediate modules that could be reused across different targets. So, by construction, the flat Lexis-DAG cannot have a non-trivial core since it does not have any intermediate nodes.
We define the H-score as follows:

$$H(\tau) = 1 - \frac{|\mathcal{C}(\tau)|}{|\mathcal{C}_f(\tau)|},$$

where $\mathcal{C}(\tau)$ and $\mathcal{C}_f(\tau)$ are the core sets of $D$ and $D_f$ for a given threshold $\tau$, respectively. Note that $\mathcal{C}_f(\tau)$ can include a combination of sources and targets, and it would never be larger than either the set of sources $S$ or the set of targets $T$, i.e., $|\mathcal{C}_f(\tau)| \leq \min\{|S|, |T|\}$.
Clearly, $H(\tau) < 1$. The H-score of $D$ is approximately one if the core size of the original Lexis-DAG is negligible compared to the core size of the corresponding flat Lexis-DAG. Fig. 5 illustrates the definition of this metric. An ideal hourglass-like Lexis-DAG would have a single intermediate node that is traversed by every single source-to-target path (i.e., $|\mathcal{C}(\tau)| = 1$), and a large number of sources and targets none of which originates or terminates, respectively, a large fraction of source-to-target paths (i.e., a large value of $|\mathcal{C}_f(\tau)|$). The H-score of this Lexis-DAG would be approximately equal to one.
3 Evo-Lexis Framework and Metrics
The Evo-Lexis framework includes a number of components that are described below. A general illustration of the framework is shown in Fig. 6.
Lexis-DAG: The network that encodes the system’s architecture at a given point in time. The inputs of the system are the sources of the DAG and the outputs are the targets.
Target Generation Model: This model specifies the evolutionary process that creates new targets. For simplicity, we consider the addition of only new targets, not new sources. The generation of new targets can be either independent of the current hierarchy (exogenous target generation) or it can depend on that hierarchy (endogenous target generation).
Target Removal Model: Models the removal of older targets. The total number of targets remains constant during the evolution of the network.
Hierarchy Design Algorithm: This is how the Lexis-DAG is adjusted whenever we introduce new targets. This procedure can be as simple as building a Lexis-DAG from scratch (by running the G-Lexis algorithm) on the set of existing targets. We refer to this approach as Clean-Slate design. Alternatively, the algorithm can be incremental, starting with the previously constructed hierarchy and incorporating new targets in a way that minimizes the adjustment cost. We refer to this approach as Incremental design, and it is described next.
3.1 Incremental Design Algorithm
The Evo-Lexis algorithm generates an optimized hierarchy for the given set of targets in every evolutionary iteration. As mentioned previously, the Clean-Slate design approach is to discard the existing hierarchy and redesign from scratch a new Lexis-DAG for the given set of targets using the G-Lexis algorithm. Such a design methodology is not realistic however in either technological or natural evolution. A more realistic approach is to adjust the existing Lexis-DAG incrementally, as described below.
In incremental design, given a Lexis-DAG with a set of targets , a set of new targets to be added, and a set of old targets to be removed, the problem is to construct a Lexis-DAG that supports the set of targets , and that minimizes the cost difference with respect to :
If (i.e., there is no initial Lexis-DAG), , and is the entire target set, the incremental design problem becomes equivalent to the original Lexis Optimization Problem in Eq. (1).
The incremental design problem is NP-Hard (as the original Lexis design problem in which and ), and so we rely on a heuristic that we refer to as Inc-Lexis. The algorithm proceeds in two phases: first, in the “expansion phase”, it adds the set of new targets attempting to reuse as much as possible existing intermediate nodes. Second, in the “pruning phase”, the algorithm removes the set of old targets , and it also removes any intermediate nodes that are left with zero or one outgoing edges.
In more detail, the expansion phase of Inc-Lexis consists of two stages: in stage-1, we reuse intermediate nodes present in to cover with minimum cost. In stage-2 of the expansion phase, we further optimize the hierarchy that supports the targets in by building an optimized Lexis-DAG for them using G-Lexis. The resulting new intermediate nodes and edges are added in the existing DAG.
Note that stage-1 relates to the well-known Optimal Parsing problem, which is: given a set of target strings , a set of substrings and the corresponding alphabet , what is the minimum number of substrings and letters that can construct from the elements of ? The optimal parsing problem can be formulated as a shortest-path problem in directed graphs . If the length of the targets is , it can be optimally solved in as the corresponding directed acyclic graph has nodes and unweighted edges.
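The shortest-path view of optimal parsing can be sketched for a single target as a dynamic program over string positions: position i is a node, and every dictionary substring (or single character) that matches at i is a unit-cost edge to the position where it ends. The function name `optimal_parse` and the plain set used for the dictionary of existing intermediate-node strings are assumptions for illustration.

```python
def optimal_parse(target, dictionary):
    """Fewest pieces (dictionary substrings or single characters) that spell `target`."""
    n = len(target)
    best = [0] + [float("inf")] * n  # best[i]: min pieces covering target[:i]
    back = [None] * (n + 1)          # back[j]: start position of the last piece
    for i in range(n):               # positions are visited in topological order
        pieces = {target[i]} | {w for w in dictionary if w and target.startswith(w, i)}
        for w in pieces:
            j = i + len(w)
            if best[i] + 1 < best[j]:
                best[j], back[j] = best[i] + 1, i
    parse, j = [], n
    while j > 0:                     # walk the back-pointers to recover the pieces
        i = back[j]
        parse.append(target[i:j])
        j = i
    return parse[::-1]
```

For example, parsing `"abcabd"` with the dictionary `{"abc", "ab", "bd"}` needs three pieces (such as `abc`, `ab`, `d`), which corresponds to three incoming edges for that target.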
In the pruning phase, we remove the oldest batch of targets. We also ensure that there is no redundant node in the Lexis-DAG, as implied by the constraint: . This ensures that the Lexis-DAG does not include two types of redundancies: nodes with zero out-degree and nodes that are only reused once.
3.2 Target Generation Models
The targets are generated through well-known evolutionary mechanisms, such as tinkering/mutation, recombination and selection:
The generation of new targets from minor changes in earlier targets is similar to Tinkering/Mutation. Tinkering is common in technological evolution: small “upgrades” to software or hardware artifacts are the most common example of this process. In biological systems, it is well known that mutation is basically “the engine of evolution”. In Evo-Lexis, tinkering/mutation is performed by replacing one character of a given target with a randomly chosen character.
In the technological world, Recombination is known to be one of the central mechanisms for the creation of new technologies. Technological design is often considered to be a search over a space of combinatorial possibilities. In fact, many breakthroughs in the history of technology were just a new combination of existing modules. A recent example is the first version of the iPhone in 2007, which was introduced as “a phone, an internet communicator and an iPod”. In biology, it is well known that recombination and crossover are essential because they produce highly diverse genotypes, compared to mutations.
Selection is an essential mechanism in evolution. In natural systems, selection determines whether a new genotype can survive the competition with existing genotypes (i.e., the incumbents) by evaluating the phenotypic fitness of the former relative to the latter. In the technological world, selection is the process of evaluating the functionality and cost of a new product, perhaps during an R&D cycle. In the Evo-Lexis framework, selection is performed to decide whether a candidate target can be accepted, by evaluating the cost of adding that target to the current hierarchy. In other words, selection creates an endogenous target generation process in which the existing hierarchy determines the cost of the potential new targets and thus, whether each new target is cost-competitive compared to the targets it evolved from.
3.2.1 MRS Model
The main target generation model we consider is based on Mutation, Recombination and Selection, and is thus called the MRS model. The mechanism for this model is illustrated in Fig. 9. In detail:
Two distinct targets and (referred to as “seeds”) are chosen randomly from the existing set of targets. Their cost is denoted by and , respectively, and it is equal to the number of incoming edges that form from the intermediate nodes in the current Lexis-DAG.
A randomly chosen “crossover index” is chosen (recall that is the length of the targets) and the following recombinations are generated:
where the numbers in braces show string indices, and is a randomly chosen character that represents the mutated element. In other words, each recombination also includes a single-character mutation.
For each of the four recombinations, we calculate its cost when it is added as a new target to the current Lexis-DAG. This cost can be seen as the marginal overhead that introduces when added to the current hierarchy :
where the new hierarchy is obtained by adding the recombination to the current Lexis-DAG using the Inc-Lexis algorithm.
The model selects a newly generated recombination if it satisfies the following selection constraint:
Suppose is formed by recombining the fragments (from ) and (from ), where the length of these target fragments are and .
The selection ratio is defined as:
If , we definitely accept .
If , we accept probabilistically with selection probability .
If none of the recombinations passes the previous selection constraint, the target generation process is repeated. However, if one or more recombinations pass the selection constraint, the model chooses one of them randomly and adds it as an accepted target in the batch of new targets.
determines how strongly the current hierarchy influences the selection of new targets. The larger the parameter is, the less likely it becomes that a new target that is more costly than its seeds (i.e. ) will be selected. For large , we get Strong Selection and refer to the model as MRS-strong. A small implies Weak Selection, and the model is referred to as MRS-weak. We use and for weak and strong selection, respectively. Fig. 10 shows the difference of the two values for typical values of (when ).
To analyze the effect of each evolutionary mechanism, we also consider target generation models obtained by removing certain elements from the MRS model, in the spirit of an ablation study.
3.2.2 MS Model
The MS model is derived from MRS by removing recombination (hence the name Mutations+Selection Model or MS Model). The model generates new targets as follows:
A target seed is chosen from the existing set of targets. Suppose the cost of is in the current Lexis-DAG .
The seed is mutated (single character mutation), as in MRS model, to .
We calculate the cost of adding to the current Lexis-DAG. This cost can be seen as the marginal overhead that introduces when it is added to the current Lexis-DAG:
The model will select the newly generated target if it satisfies the following constraint:
If , accept .
if , accept probabilistically where selection probability .
Otherwise, the newly generated target is rejected and the target generation repeats.
3.2.3 M Model
This is derived from the MS model by removing the Selection constraint. Note that with this change the target generation process is not influenced by the current Lexis-DAG and it operates “exogenously” to the hierarchy. This model is referred to as Mutation Model (or M Model) and it generates targets as follows:
Among the targets that exist in the current Lexis-DAG, a seed target is chosen randomly.
The seed target is mutated to through a random single character mutation.
If the newly generated target is a duplicate of one of the existing targets, the new target is rejected and the target generation repeats. If not, the generated target is added to the batch of new targets.
3.2.4 RND Model
We also consider a random target generation process, referred to as RND, obtained by removing tinkering/mutation from the M model. In this model, each new target is generated through random and independent choices among the sources.
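The two exogenous models (M and RND) are simple enough to sketch directly. The function names and the use of a string as the alphabet of sources are assumptions for illustration; the random generator is passed in explicitly so that runs can be reproduced.

```python
import random


def mutate(target, alphabet, rng):
    """Replace one randomly chosen character with a randomly chosen source character."""
    i = rng.randrange(len(target))
    return target[:i] + rng.choice(alphabet) + target[i + 1:]


def m_model_target(existing, alphabet, rng):
    """M model: mutate a random seed target; retry if the result is a duplicate."""
    while True:
        seed = rng.choice(existing)
        candidate = mutate(seed, alphabet, rng)
        if candidate not in existing:  # also rejects mutations that leave the seed unchanged
            return candidate


def rnd_model_target(length, alphabet, rng):
    """RND model: every character is an independent random choice among the sources."""
    return "".join(rng.choice(alphabet) for _ in range(length))
```

A call such as `m_model_target(["aaaa", "bbbb"], "ab", random.Random(0))` returns a string that differs from one of the two seeds in exactly one position.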
3.3 Key Metrics
3.3.1 Cost Metrics
Normalized Cost: This is the cost of the Lexis-DAG (the Lexis-DAG for the target set ) normalized by the total length of the targets, . We denote the normalized cost by :
Penalty of Incremental Design (PID): This measure evaluates the cost overhead of incremental design relative to a clean-slate design:
where is the incremental design for the target set , and is the clean-slate design for the same set of targets. The value of PID is bounded as follows:
because an incremental design cannot be more efficient than a clean-slate design (at least when the two design problems are optimally solved), and the maximum cost of incremental design is ).
3.3.2 Topological Metrics
Average Depth: This metric is an indicator of how deep a Lexis-DAG hierarchy is. For each target , we calculate the average length of all source-target paths ending on that target: . The average across all is defined as the average depth of the hierarchy:
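As a minimal sketch of this metric, the average path length per target can be computed without enumerating paths, by propagating two quantities from the sources upwards: the number of source-to-node paths and the total length (in edges) of those paths. The dictionary representation (node mapped to its list of in-neighbors) and the name `average_depth` are assumptions for illustration.

```python
def average_depth(bodies, targets):
    """Mean over targets of the average length of the source-to-target paths ending there."""
    memo = {}

    def stats(v):
        """Return (number of paths from sources to v, total length of those paths)."""
        if v not in bodies:  # a source: one path of length zero
            return 1, 0
        if v not in memo:
            n_paths = total_len = 0
            for u in bodies[v]:
                n_u, len_u = stats(u)
                n_paths += n_u
                total_len += len_u + n_u  # every path through u grows by one edge
            memo[v] = (n_paths, total_len)
        return memo[v]

    depths = [stats(t)[1] / stats(t)[0] for t in targets]
    return sum(depths) / len(depths)
```

For `bodies = {"T0": ["N0", "N0", "N0"], "T1": ["x", "N0", "y"], "N0": ["a", "b", "c"]}`, all nine paths into `T0` have length 2, the five paths into `T1` have average length 1.6, and the average depth is 1.8.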
Core Stability: We have already defined the core size and the H-score (Section 2). Here we define an additional metric, related to the stability of the core across time.
We track the stability of the core set by comparing two core sets at two different times. A direct comparison of the core sets via the Jaccard index leads to poor results. The reason is that often the strings of the two sets are similar to each other but not completely identical.
Thus, we define a generalized version of Jaccard similarity that we call Levenshtein-Jaccard Similarity:
The Levenshtein distance between two strings is the minimum number of deletions, insertions, or substitutions required to transform one string into the other. The higher the number of required operations, the more distant the two strings are from each other.
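Suppose we aim to compute the similarity of two sets $A$ and $B$ of strings. We define the mapping $M_A$ where every element $a \in A$ is mapped to the most similar element $b \in B$. We also define the mapping $M_B$ from every element $b \in B$ to the most similar element $a \in A$:

$$M_A = \{(a, b^*) : a \in A,\; b^* = \arg\max_{b \in B} w(a,b)\}, \qquad M_B = \{(a^*, b) : b \in B,\; a^* = \arg\max_{a \in A} w(a,b)\},$$

where $w(a,b)$ is the similarity of $a$ to $b$ and is calculated as:

$$w(a,b) = 1 - \frac{\mathrm{Lev}(a,b)}{\max(|a|, |b|)}.$$

Notice that $\max(|a|,|b|)$ is the maximum value of the Levenshtein distance $\mathrm{Lev}(a,b)$ between $a$ and $b$. This ensures that if $a = b$ then $w(a,b) = 1$, and if $a$ and $b$ have the maximum distance then $w(a,b) = 0$.
Considering both $M_A$ and $M_B$, we get the union of the two mappings, $M = M_A \cup M_B$, and define the Levenshtein-Jaccard similarity as follows:

$$LJ(A,B) = \frac{\sum_{(a,b) \in M} w(a,b)}{|M|}.$$

We can see that if $A = B$ (all weights are equal to one) then $LJ(A,B) = 1$. Also, if none of the elements in $A$ are similar to the elements in $B$ (all the element pairs take zero similarity value), then $LJ(A,B) = 0$.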
For example, suppose that and . The similarity of the most similar pairings is shown next:
Hence, we have:
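The Levenshtein-Jaccard computation can be sketched as below. Two choices are assumptions of this sketch rather than statements of the paper: the similarity of a pair is one minus the Levenshtein distance divided by the length of the longer string, and the final score is the mean similarity over the union of the two most-similar mappings. Both reproduce the boundary cases described above (identical sets give 1, completely dissimilar sets give 0).

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution or match
        prev = cur
    return prev[-1]


def similarity(a, b):
    """1 for identical strings, 0 for strings at the maximum possible distance."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def levenshtein_jaccard(A, B):
    """Mean similarity over the union of both most-similar mappings (A, B non-empty)."""
    pairs = set()
    for a in A:
        pairs.add((a, max(sorted(B), key=lambda b: similarity(a, b))))
    for b in B:
        pairs.add((max(sorted(A), key=lambda a: similarity(a, b)), b))
    return sum(similarity(a, b) for a, b in pairs) / len(pairs)
```

With exact matching only (similarity 0 or 1), this score reduces to the ordinary Jaccard index when the unmatched elements of the two sets map to distinct partners, which is the sense in which it generalizes it.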
3.3.3 Target Diversity Metric
Suppose we have a set of strings . The goal is to provide a single number that quantifies how dissimilar these elements are to each other.
We first identify the medoid within the set , i.e., the element that has the lowest average distance from all other elements. We use Levenshtein distance:
To compute how diverse the elements are with respect to each other, we average the distance of all elements from the medoid. We call this measure , the Diversity of set . The bigger the diversity metric, the more diverse the set of strings is (because the distance of each target from the medoid is the number of single-character operations needed to convert any element within the set to the medoid):
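A minimal sketch of the diversity computation follows. It assumes that the average is taken over all elements of the set, including the medoid itself (which contributes a distance of zero); the function names are hypothetical.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def diversity(strings):
    """Return (medoid, average Levenshtein distance of all elements from the medoid)."""
    # The medoid minimizes the total (hence the average) distance to the other elements;
    # on ties, the first such element in `strings` is used.
    medoid = min(strings, key=lambda s: sum(levenshtein(s, t) for t in strings))
    return medoid, sum(levenshtein(s, medoid) for s in strings) / len(strings)
```

For `["abc", "abd", "abe"]` every pair is at distance 1, so the first element is taken as the medoid and the diversity is 2/3.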
4 Computational Results
4.1 Parameter Values and Evolutionary Iteration
We can summarize an evolutionary iteration of the Evo-Lexis framework as follows:
Initially, we start with a small number of randomly constructed targets. Each target has the same length , and the number of possible sources is . An initial Lexis-DAG is constructed using the G-Lexis algorithm.
In every evolutionary iteration, the following steps are performed:
A new batch of targets is generated via a target generation model.
In the Incremental Design approach, the Evo-Lexis algorithm adjusts the existing hierarchy minimizing the marginal cost of adding each new target in the existing hierarchy.
If the total number of targets that are present in the system have reached a steady-state (the number of targets is ), we also remove the oldest batch of targets from the Lexis-DAG. This target removal process may also trigger the removal of intermediate nodes that are not reused by at least two other nodes in the hierarchy. The total number of targets remains constant () because the number of target additions is equal to the number of removals ().
The evolutionary process is repeated for a user-specified number of iterations. The parameters , and do not change during this process. We run each model ten times for a total of 5,000 iterations. We take the mean value of each metric.
The parameters used in the following experiments are presented in Table 1.
| Parameter | Value |
|---|---|
| Number of initial targets | 10 |
| Number of sources | 100 |
| Target length (characters) | 200 |
| Batch size for new targets birth/old targets death | 10 |
| Steady-state number of targets present in Lexis-DAG | 100 |
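The birth and death bookkeeping of an evolutionary run, with the parameter values of Table 1 as defaults, can be sketched as follows. This sketch only tracks the target population: it uses single-character mutation of a random seed as a stand-in target generation model and leaves the hierarchy adjustment as a comment. The function name and its arguments are assumptions for illustration.

```python
import random
from collections import deque


def evolve_targets(n_init=10, n_sources=100, length=200, batch=10,
                   steady=100, iterations=20, rng_seed=0):
    """Return the targets present after `iterations` evolutionary iterations."""
    rng = random.Random(rng_seed)
    alphabet = list(range(n_sources))  # sources are represented by integers
    first = [tuple(rng.choice(alphabet) for _ in range(length)) for _ in range(n_init)]
    batches = deque([first])           # oldest batch on the left
    for _ in range(iterations):
        current = [t for b in batches for t in b]
        new = []
        while len(new) < batch:        # births: mutate a random existing target
            mutant = list(rng.choice(current))
            mutant[rng.randrange(length)] = rng.choice(alphabet)
            mutant = tuple(mutant)
            if mutant not in current and mutant not in new:
                new.append(mutant)
        batches.append(new)
        # The hierarchy design algorithm (incremental or clean-slate) would run here.
        if sum(len(b) for b in batches) > steady:
            batches.popleft()          # deaths: remove the oldest batch
    return [t for b in batches for t in b]
```

With the default values the population grows by ten targets per iteration until it holds 100 targets, and stays at 100 afterwards.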
Emergence of low-cost hierarchies due to tinkering/mutation and selection
In Fig. (a) and (b), we observe a significant reduction in the normalized cost between the RND model and all other models. The main reason for this reduction is that in all other models, we generate targets that are similar to earlier targets and not randomly constructed. Moreover, the endogenous models with strong selection (MS-strong and MRS-strong) reduce the cost of the resulting hierarchies even further. The reason is the large bias for selecting targets that can be constructed with lower (or comparable) cost than the seed targets they evolved from. Thus, tinkering/mutation and selection both contribute to the emergence of more efficient hierarchies in the Evo-Lexis framework.
Low-cost design resulting in deeper hierarchies and reuse of more complex modules
Having a lower-cost hierarchy also means that intermediate nodes are reused more frequently and/or that those intermediate nodes are more complex (i.e., longer strings). We observe this across models in Fig. (c), (d), (e) and (f): models with lower normalized cost have deeper Lexis-DAGs and higher intermediate node length. These longer re-used nodes further decrease the cost of the hierarchy. Hence, tinkering/mutation and selection also develop deeper hierarchies with longer intermediate nodes. These two outcomes are ubiquitously observed in both natural and technological systems. Examples include call-graphs and metabolic networks. For instance, for the OpenSSH call-graph and the monkey metabolic network, it has been reported that the underlying dependency networks have an average depth of and , respectively .
The recombination mechanism creates target diversity
Realistic hierarchies should support a diverse set of requirements or outputs. For example, in network protocol stacks, many different functionalities at the top level of the hierarchy (application layer) are supported by the same hierarchical infrastructure. In our framework, this translates to having a set of targets with high diversity. In Fig. (g) and (h), we show the target diversity across different models. The RND model produces the highest target diversity as there are no correlations among the generated targets. In Fig. (h), we observe that the tinkering/mutation in the M model results in a 50% to 70% decrease in target diversity. Strong selection in the MS-strong model further decreases the diversity to the point that the targets are almost identical, with only minor variations of the same main string. Such low target diversity is not realistic in natural and technological systems. The reason that the MS-strong model behaves in this manner is that it generates new targets only through single-character mutations and only when the resulting mutants can be constructed using the existing intermediate nodes (otherwise they would have much higher cost and they would not be selected). Hence, the set of accepted new targets becomes very narrow and quite similar to its seed targets.
In biological systems, the evolution of complex species required recombination and sexual reproduction (i.e., crossover). Similarly, in the Evo-Lexis framework, the addition of recombination in the MRS model results in increased target diversity (Fig. (g)) while keeping the earlier properties of the Lexis-DAGs (i.e., low cost, large depth, long intermediate nodes).
Reuse of complex modules in the core set by strong selection
The contents of the core at the 5,000th iteration of all models (Fig. 13) show that in models without selection, or with weak selection, the core includes only a small number of intermediate nodes. The reason is that random mutations make the reuse of longer intermediate nodes unlikely. Note that this does not mean that long intermediate nodes do not exist in Lexis-DAGs under the M, MS-weak and MRS-weak models; such nodes are, however, less likely to be reused often. As a result, shorter nodes and mostly sources are more likely to appear in the new targets, and end up in the core set.
On the other hand, models with strong selection (MS and MRS) limit the locations where the seed(s) can be mutated when generating new targets. This constraint results in reusing longer intermediate nodes. Thus, selection creates a bias towards the reuse of longer intermediate nodes. In the long run, this results in some long nodes dominating the core set in the MS-strong and MRS-strong models (Fig. (d) and (f)).
Emergence of hourglass architecture due to the heavy reuse of complex intermediate modules in models with strong selection
The appearance of longer and heavily reused intermediate nodes in the models with strong selection means that the architecture exhibits the hourglass effect. Indeed, we observe in Fig. (a) and (b) that the core size gets significantly smaller in the presence of strong selection (MS and MRS models). Additionally, Fig. (c) and (d) show that the MS-strong and MRS-strong models also result in higher H-score values (0.4 and 0.65 on average, respectively). Lexis-DAGs with high H-score values have a small core size with respect to the equivalent flat Lexis-DAG whose core is made up of sources and targets only.
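To make the relation between the two core sizes concrete, the following is a minimal sketch that assumes the H-score takes the form H = 1 - |C| / |C_f|, where |C| is the core size of the Lexis-DAG and |C_f| is the core size of its flat counterpart. This form is an assumption consistent with the description above, and the function name and the example numbers are hypothetical.

```python
def h_score(core_size: int, flat_core_size: int) -> float:
    """Assumed H-score: 1 - |C| / |C_f|.

    core_size      -- number of nodes in the core of the Lexis-DAG
    flat_core_size -- number of nodes in the core of the equivalent flat
                      Lexis-DAG (sources and targets only)
    """
    if flat_core_size <= 0:
        raise ValueError("flat_core_size must be positive")
    return 1.0 - core_size / flat_core_size


# Hypothetical example: a core of 35 nodes against a flat core of 100 nodes.
print(h_score(35, 100))
```

Under this assumed form, a value close to 1 corresponds to a core that is much smaller than the flat core, and a value of 0 corresponds to a core as large as the flat one.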
Overall, the reuse of longer intermediate nodes caused by selection results in hierarchies with an hourglass architecture. This observation is consistent with a mechanism (known as Reuse-Preference ) that was proposed earlier for the emergence of the hourglass effect in general dependency networks.
Stability of the core set due to selection
Selection also promotes the stability of the core set, as shown in Fig. (h) for the MS-strong model. We see an increase in core stability (i.e., similarity of the core during evolution) compared to the MS-weak and M models, whose cores mostly consist of sources. Similarly, a stable core is also observed in the MRS-weak and MRS-strong models in Fig. (g). We have already seen that long intermediate nodes appear more often in the core set of models with strong selection. Hence the core stability results show that selection not only contributes to the emergence of a small core, consisting of a few highly reused intermediate nodes, but also promotes the conservation of these core nodes during evolution. This is in agreement with the properties of several systems in which the waist of the hourglass architecture includes critical modules of the system that are highly conserved [1, 31]. We return to this point in Section 6, where we further show that this core stability is occasionally interrupted by major transitions and punctuated equilibria.
Fragility caused by stronger selection
Fig. (e) and (f) show how the generated hierarchies perform in terms of robustness when we remove the most central nodes in the system, i.e., the members of the core. Robustness generally relates to the ability to maintain a certain function even when there are internal or external perturbations . The two figures show how the removal of one or more core nodes, in order of importance, contributes to cutting source-target paths in each of the Lexis-DAGs produced (at the 5,000th iteration of each model).
In hourglass architectures (MS-strong and MRS-strong models), core nodes contribute much more significantly to the overall hierarchy by covering many more source-target paths. Hence, such architectures are fragile if the core nodes are perturbed. This is similar to the concept of removal of hub nodes in scale-free networks . Weakening selection reduces the H-score (as in Fig. (c)) and hence reduces the contribution of core nodes in covering source-target paths.
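The fragility argument can be illustrated with a small sketch. It assumes that the importance of a node is the number of source-target paths that traverse it, and that nodes are removed greedily, recomputing the path counts after each removal. The edge lists, the function names `path_counts` and `removal_curve`, and the two toy DAGs are hypothetical and are not taken from the Evo-Lexis experiments.

```python
from collections import defaultdict


def path_counts(edges, sources, targets, removed=frozenset()):
    """Return (total source-target paths, paths through each remaining node)."""
    succ, pred = defaultdict(list), defaultdict(list)
    nodes = set()
    for u, v in edges:
        nodes.update((u, v))
        if u not in removed and v not in removed:
            succ[u].append(v)
            pred[v].append(u)
    alive = nodes - set(removed)
    up, down = {}, {}

    def paths_up(v):  # number of paths from any source down to v
        if v not in up:
            up[v] = (1 if v in sources else 0) + sum(paths_up(u) for u in pred[v])
        return up[v]

    def paths_down(v):  # number of paths from v down to any target
        if v not in down:
            down[v] = (1 if v in targets else 0) + sum(paths_down(w) for w in succ[v])
        return down[v]

    through = {v: paths_up(v) * paths_down(v) for v in alive}
    total = sum(paths_down(s) for s in sources if s in alive)
    return total, through


def removal_curve(edges, sources, targets, k):
    """Greedily remove up to k nodes; report the cumulative fraction of paths cut."""
    total, _ = path_counts(edges, sources, targets)
    if total == 0:
        return []
    removed, curve = set(), []
    for _ in range(k):
        remaining, through = path_counts(edges, sources, targets, removed)
        if remaining == 0:
            break
        # Ties are broken alphabetically so that the result is deterministic.
        best = max(sorted(through), key=through.get)
        removed.add(best)
        remaining_after, _ = path_counts(edges, sources, targets, removed)
        curve.append((best, 1 - remaining_after / total))
    return curve


# Toy hourglass: three sources feed one intermediate node that feeds three targets.
hourglass = [(s, "m") for s in "abc"] + [("m", t) for t in "xyz"]
# Toy flat DAG: every source connects directly to every target.
flat = [(s, t) for s in "abc" for t in "xyz"]

print(removal_curve(hourglass, set("abc"), set("xyz"), 1))
print(removal_curve(flat, set("abc"), set("xyz"), 1))
```

In the toy hourglass, removing the single intermediate node cuts every source-target path, whereas in the flat DAG the most central node accounts for only a third of the paths.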
Fig. 14 summarizes the properties of the hierarchies that emerge in the models we described in this section.
5 Evolvability and the Space of Possible Targets
As shown in the previous section, the MRS-strong model leads to hourglass hierarchies, maintaining at the same time significant target diversity. In this section, we further show that hourglass architectures have two important properties. On the positive side, they are more evolvable in the sense that new targets can be constructed at a low cost, mostly reusing the intermediate modules in the core of the hierarchy. On the negative side however, hourglass architectures only accept a small fraction of the candidate new targets, restricting what a biologist would refer to as the “phenotypic space” of the system. This interplay between evolvability and the space of feasible system phenotypes or functions is an important issue in both biological and technological systems (e.g. Internet architecture ).
We first look at the cost of targets produced with and without selection. For this purpose, we compare two models: one is the MRS-strong model that acts as an “endogenous” target generation process. The other is a variation of MRS without selection that we call the MR model (only mutations and recombination); this is an “exogenous” target generation process that does not depend on the current state of the hierarchy. Comparing against the MR model allows us to examine how selection affects the cost and the space of acceptable targets.
In Fig. (a), we calculate the ratio between the average cost of accepted targets per batch in the MRS-strong model over the corresponding cost in the MR model; we refer to this as the MRS-over-MR per-batch cost-ratio. The average and median values of this ratio are 0.53 and 0.52, respectively. This suggests that the targets generated under stronger selection are of much lower cost (around half) compared to the targets generated without selection. So, the presence of strong selection allows the system to construct new targets at a much lower cost because those selected targets can be constructed mostly by reusing the intermediate nodes present in the hierarchy.
As a result of strong selection, the acceptance-likelihood of new targets generated by the MRS-strong model is much lower than that with the MR model. Specifically, the acceptance-likelihood in Fig. (b) is defined as the fraction of accepted targets generated per batch. The mean and median of this likelihood in the MRS-strong model are equal to 0.2. In other words, about 80% of the new targets generated through mutations and recombination are not selected because their cost, given the existing architecture, would be prohibitively high.
It should also be noted that the MRS-weak model behaves quite similarly to the MR baseline in terms of both the MRS-over-MR cost ratio and the target acceptance likelihood.
Overall, the results in this section show that despite the benefit of lower-cost new targets, and thus higher evolvability, selection significantly restricts the phenotypic space of accepted new targets. Given that the MRS-strong model generates hourglass architectures, we can summarize as follows: hourglass-like hierarchies under the MRS-strong model allow the construction of new functions (accepted targets) at a low cost, by mostly reusing core modules, but at the same time such architectures significantly restrict which of these functions can be supported. Targets that are quite different from the intermediate modules of the existing hierarchy would most likely not be selected.
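The two per-batch quantities used above can be written down directly. The sketch below is only an illustration: the function names and the example costs are hypothetical, and the per-target costs would in practice come from the Lexis-DAG in each model.

```python
from statistics import mean


def per_batch_cost_ratio(mrs_costs, mr_costs):
    """Average cost of accepted targets under MRS-strong over that under MR."""
    return mean(mrs_costs) / mean(mr_costs)


def acceptance_likelihood(num_accepted, num_generated):
    """Fraction of the candidate targets generated in a batch that are accepted."""
    return num_accepted / num_generated


# Hypothetical batch: accepted targets cost about half as much under selection,
# and 10 out of 50 generated candidates pass the selection step.
print(per_batch_cost_ratio([50, 54], [100, 100]))
print(acceptance_likelihood(10, 50))
```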
6 Major Transitions
Major transitions have been an important and interesting phenomenon in both natural and technological evolution. Such transitions create significant shifts in evolutionary trajectories, ecosystems and “keystone species” . There are many examples of such events in natural systems, such as the “invention” of sexual reproduction and evolution of multicellularity . In technological evolution, innovations occasionally lead to the emergence of disruptive new technologies, such as the steam engine in the 19th century or air transportation in the 20th century. In the context of computing, the evolution of programming languages has gone through punctuated equilibria, interrupted by new languages that were developed by tinkering or combining different structural components of older languages .
The results of Fig. (g) suggest that the structure of the core is locally stable, when comparing core nodes in adjacent iterations. To further investigate the stability of the core during evolution, we focus on the most central node in the core of the Lexis-DAGs, i.e., the core node that covers the largest fraction of source-target paths. We refer to this node of the Lexis-DAG as the top-1 core node.
First, we track the variability of this node locally, by computing its normalized Levenshtein distance to the top-1 core node in the next iteration. Fig. 16 shows the results of this analysis for both MRS-strong and MRS-weak. In the MRS-strong model, we observe that in most iterations the top-1 core node does not change significantly. Even though there are some spikes in which the Levenshtein distance is larger than 0.2, in 82.6% of the evolutionary iterations the variability of the top-1 core node is less than that. Further, there are several stasis periods in which the top-1 core node is practically the same (Levenshtein distance lower than 0.1 or even 0). In Fig. 16 we highlight with red vertical lines a small number of stasis periods in which the top-1 core node remains exactly the same for tens or hundreds of iterations. On the other hand, the MRS-weak model has significantly higher variability in the top-1 core node, and fewer/shorter stasis periods. This suggests that selection is the key factor in generating these long periods of stability in the core of the hourglass architecture.
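For readers who want to reproduce this kind of measurement, the following is a minimal sketch of a normalized Levenshtein distance between two strings. Dividing the edit distance by the length of the longer string is an assumption made here for illustration; the normalization used in the experiments may differ.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute or match
        prev = curr
    return prev[-1]


def normalized_levenshtein(a: str, b: str) -> float:
    """Edit distance divided by the longer length (assumed normalization)."""
    longest = max(len(a), len(b))
    return 0.0 if longest == 0 else levenshtein(a, b) / longest


# Hypothetical top-1 core nodes in two successive iterations.
print(normalized_levenshtein("abcd", "abcf"))
```

With this normalization, a value of 0 means that the two nodes are identical strings, and a value of 1 means that no character position can be kept.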
To further quantify this point, we focus on stasis periods that last at least 100 iterations (recall that the entire evolutionary paths in these results consist of 5,000 iterations). Fig. 17 shows that there are fewer and shorter stasis periods in the MRS-weak model than in the MRS-strong model. The fraction of iterations that account for stasis conditions is in MRS-weak, and in MRS-strong, when the minimum Levenshtein distance is (also in MRS-weak and in MRS-strong when ).
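One way to extract such stasis periods from a series of iteration-to-iteration distances is sketched below. It assumes that a stasis period is a maximal run of consecutive iterations whose distance stays at or below a threshold, and that only runs of at least `min_length` iterations count. The function names, the threshold argument and the toy series are hypothetical.

```python
def stasis_periods(distances, threshold=0.0, min_length=100):
    """Return (start, end) index pairs, end exclusive, of qualifying runs."""
    periods, start = [], None
    for t, d in enumerate(distances):
        if d <= threshold:
            if start is None:
                start = t          # a new run of low distances begins
        else:
            if start is not None and t - start >= min_length:
                periods.append((start, t))
            start = None           # the run is interrupted
    if start is not None and len(distances) - start >= min_length:
        periods.append((start, len(distances)))
    return periods


def stasis_fraction(distances, threshold=0.0, min_length=100):
    """Fraction of all iterations that fall inside a qualifying stasis period."""
    periods = stasis_periods(distances, threshold, min_length)
    return sum(end - start for start, end in periods) / len(distances)


# Toy series with a shortened minimum length so that the example stays small.
toy = [0.0] * 3 + [0.5] + [0.0] * 5 + [0.3, 0.0]
print(stasis_periods(toy, threshold=0.0, min_length=3))
print(stasis_fraction(toy, threshold=0.0, min_length=3))
```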
The presence of stasis periods under strong selection suggests that the most central intermediate nodes at the waist (or core) of the hourglass architecture can be quite stable and time-invariant. What happens however across different stasis periods? Does that stability persist across different stasis periods, or does the architecture exhibit major transitions and punctuated equilibria?
To answer this question, we focus again on the top-1 core node and measure its variability across successive stasis periods. In Fig. 18, we consider three different stasis periods (one curve for each initial stasis period), and calculate the normalized Levenshtein distance between the top-1 core node in its initial stasis period and the top-1 core node in subsequent stasis periods. Note that the top-1 core node changes significantly across stasis periods. In fact, the Levenshtein distance is so high (often close to 1) that these appear to be completely different core nodes. This observation gives more evidence that the top contributors to the core can lose their importance during evolutionary time scales, causing major transitions in both the core set and, consequently, in the overall hierarchy. We have confirmed that this is even more common for lower-centrality core nodes, and that it is more pronounced still under weak selection.
7 Overhead of Incremental Design
In this section, we compare the cost and structural characteristics of Incremental design (INC) relative to Clean-Slate (CS) design, i.e., the ideal case in which a new Lexis hierarchy is designed from scratch every time the set of targets is changed. Of course such clean-slate designs are rare or infeasible in practice, especially in biological evolution. CS design is still valuable however as a baseline for evaluating the cost efficiency of INC, and the hierarchy that is produced by the latter.
In the Evo-Lexis framework, a key factor that quantifies the difference between INC and CS design is the batch size. If the batch size is equal to the total number of targets in steady state , INC and CS are equivalent because the set of targets completely changes in each iteration. At the other extreme, if the batch size is only one target and , INC performs a minimal adjustment of the hierarchy to support the new target while CS still redesigns the complete hierarchy. In other words, the fraction controls the degree of change in each evolutionary iteration. Both in natural and technological systems, evolution proceeds rather slowly – for this reason we only consider the lower range of this ratio, between 1/100 and 25/100.
In the following we only consider the MRS-strong model (based on the results of the earlier sections). Fig. 19 compares INC and CS in terms of four key metrics. The first metric relates to cost: recall that the Penalty of Incremental Design (PID) is the ratio of the cost of an evolving INC hierarchy over the cost of the corresponding CS hierarchy for the same set of targets. With the exception of the minimum possible batch size (=1), it is interesting that INC does not lead to much less efficient hierarchies than CS. The PID metric shows that INC is typically around 30% more costly than CS for a wide range of batch sizes, suggesting that INC is able to often reuse intermediate nodes in constructing the given targets, despite the fact that it cannot redesign the complete hierarchy. The PID is substantially higher when =1 however. The reason is that when the INC-Lexis algorithm is given only one new target in every iteration, it is unlikely to identify segments of that single target that repeat more than once. This means that, when =1, INC rarely adds new intermediate nodes in the hierarchy even though successive targets can be quite similar. CS, on the other hand, exploits the similarity of the set of targets in each iteration constructing more intermediate nodes, and reducing cost through their reuse.
Interestingly, even when the INC and CS designs have similar costs, they are very different in terms of the nodes that form the core. This is shown in Fig. (b): the similarity of the two cores according to the Levenshtein-Jaccard similarity is around 0.1. This implies that the two design approaches lead to substantially different architectures in terms of the actual intermediate nodes they reuse.
Additionally, the average hierarchical depth of CS architectures is larger (see Fig. (c)) because this design approach is able to identify more and longer intermediate nodes that can be reused to construct the entire set of targets. INC, on the other hand, is constrained to not adjust the existing portion of the hierarchy, and it can only form new intermediate nodes when it detects fragments in the set of new targets that are repeated more than once. So, the INC hierarchies are typically not as deep as those in CS.
Despite their differences, both design approaches lead to hourglass architectures when the targets are created with the MRS-strong model. This is shown in Fig. (d), and it suggests that even though INC is constrained, as described above, it is still able to identify a few intermediate nodes that can be reused many times to construct the time-varying set of targets.
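The effect of the batch size on the discovery of new intermediate nodes can be illustrated with a toy sketch. It simply lists the substrings of length at least two that occur more than once in a set of strings, counting overlapping occurrences as well. This is a simplification made for illustration and not the INC-Lexis algorithm, which also reuses existing intermediate nodes and weighs the cost saving of each candidate; the function name and the example strings are hypothetical.

```python
from collections import Counter


def repeated_substrings(targets, min_len=2):
    """Substrings of length >= min_len that occur at least twice overall."""
    counts = Counter()
    for s in targets:
        for i in range(len(s)):
            for j in range(i + min_len, len(s) + 1):
                counts[s[i:j]] += 1
    return {sub for sub, c in counts.items() if c >= 2}


# A single new target with no internal repetition offers no candidate fragments.
print(repeated_substrings(["abcdefgh"]))
# Two similar targets in the same batch expose shared fragments such as "abcd".
print(sorted(repeated_substrings(["abcdefgh", "abcdxfgh"])))
```

In this toy case, the single-target batch yields no repeated fragment at all, while the two-target batch yields nine, which mirrors why a clean-slate design over the full target set can find more and longer intermediate nodes.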
8 Discussion and Prior Work
The Evo-Lexis model is primarily related to three research themes: first, the emergence of modularity and hierarchy in complex systems; second, the hourglass architecture in hierarchical networks; and lastly, the comparison between offline (or “clean slate”) design and online (or incremental) design.
8.1 Modularity and Hierarchy
. By applying incremental changes in logic circuits and evolving neural networks for pattern recognition tasks, they show that modularity in the goals (what we refer to as “targets”) leads to the emergence of modularity in the organization of the system, whereas randomly varying goals do not lead to modular architectures. Similarly, Arthur et al. focus on the evolution of technology using a simple model of logic circuit gates. Each designed element is a combination of simpler existing elements. Their simulation model results in a modularly organized system, in which complex functions are only possible by first creating simpler ones as building blocks. These models are similar to Evo-Lexis in the following way: when the system targets are not randomly constructed but they are generated through an evolutionary process that involves mutations, recombination and selection, the target functions are computed through deep hierarchies that reuse common intermediate components.
Clune et al. show that modularity is a key driver for the evolvability of complex systems . The authors demonstrate that selection mechanisms that minimize the cost of connections between nodes in a networked system result in a modular architecture. This is shown by evolving networks that solve pattern recognition tasks and Boolean logic tasks. The inputs sense the environment (e.g. pixels) and produce outputs in a feed-forward manner (e.g. the existence of patterns of interest). In other words, the networks that have evolved for optimizing both performance (accuracy in recognition) and cost (network connections) are more modular and evolvable (in the sense of being adaptable to new tasks) than those optimized for performance only. In a follow-up study by Mengistu et al. in , it is shown that the minimization of the cost of connections also promotes the evolution of hierarchy, the recursive composition of sub-modules. When not modeling the cost of connections, even for tasks with hierarchical structure (e.g. a nested boolean function), a hierarchical structure does not emerge. These modeling frameworks are similar to Evo-Lexis because the latter also aims to minimize the number of connections in the resulting hierarchical network, and it is this cost minimization that provides the incentive for reuse of intermediate components.
On the empirical side, prior work has established that technology evolves similarly to biological evolution, through tinkering, new combinations of existing components, and selection. For instance, a study of USPTO data gives evidence for the combinatorial evolution of technology . The authors find that the rate of new technological capabilities is slowing down but a huge number of combinations allows for a “practically infinite space of technological configurations”. By considering technology as a combinatorial process,  uses USPTO data to investigate the extent of novelty in patents. They propose a likelihood model for assessing the novelty of combinations of patent codes. Their results show that patents are becoming more conventional (rather than novel) with occasional novel combinations.
8.2 Hourglass Architecture
A property of many hierarchical networks is the hourglass effect, which means that the system receives many inputs and produces many outputs through a relatively small number of intermediate modules that are critical for the operation of the entire system . This property is also one of the main themes investigated in our work.
Akhshabi et al. studied the developmental hourglass, which is the pattern of increasing morphological divergence towards earlier and later embryonic development . The authors conclude that the main factor that drives the emergence of the hourglass architecture in that context is that the developmental gene regulatory networks become increasingly more specific, and thus sparser, as development progresses. Earlier, the same authors in  were inspired by the hourglass resemblance of the Internet protocol stack, in which the lower and higher layers tend to see frequent innovations, while the protocols at the waist of the hourglass appear to be “ossified”. The authors present an abstract model, called EvoArch, to explain the survival of popular protocols at the waist of the protocol stack. The protocols which provide the same functionality in each layer compete with each other and, just as in , the increasing specificity and sparsity is what causes the network to have an hourglass architecture. The Evo-Lexis model is neither layered nor probabilistic, and so it is fundamentally different from EvoArch, but it also generates hierarchies in which the nodes that represent shorter strings (equivalent to lower-layer nodes in EvoArch) are reused more frequently and so they have a higher out-degree.
Friedlander et al. focus on layered networks that perform a linear input-output transformation  and show that in such systems the hourglass architecture emerges when that transformation is compressible. In their model, this is interpreted as rank-deficiency of the input-output matrix that describes the function of the system. A further requirement is that there should be a goal to reduce the number of connections in the network, similar to Evo-Lexis. This rank-deficiency in the input-output matrix resembles the case in which Evo-Lexis targets are not constructed independently but through an evolutionary process that generates significant correlations between different targets.
The hourglass architecture has been also investigated in general (non-layered) hierarchical dependency networks, similar to Evo-Lexis, by Sabrin and Dovrolis . That analysis is based on identifying the core of a dependency network, as the minimum set of nodes that cover at least a fraction of all source-to-target dependency paths. We have adopted that approach, as well as the hourglass metric proposed in . Their study shows the presence of the hourglass property in various technological, natural and information systems. The authors also present a model called Reuse-Preference, capturing the bias of new modules to reuse intermediate modules of similar complexity instead of connecting directly to sources or low complexity modules.
Despite this prior work, the interplay between the emergence of hourglass architectures and cost optimization in hierarchical networks has not been explored in previous research. Evo-Lexis identifies the conditions under which the hourglass property emerges in optimized dependency networks.
8.3 Interplay of Design Adaptation and Evolution
A main theme in our study is the interplay between changes in the environment (the targets that the system has to support) and the internal architecture of the system.
Bakhshi et al. investigate a network topology design scenario in which the goal is to design a valid communication network between a set of nodes . The authors formulate and compare the consequences of two different optimization scenarios for that goal: incremental design in which the modification cost between the two last snapshots of the design is minimized, and optimized design in which the total cost of the network is minimized in every increment. Focusing on the case of ring networks, even though the incremental designs are more costly, the relative cost overhead is shown to not increase as the network grows. In a follow-up study, focused on mesh networks, the same observation is made and further, the incremental design is shown to be producing larger density, lower average delay and more robust topologies .
Incremental design approaches are also considered in other contexts, such as in deep neural networks (DNNs). Specifically, an important problem in machine learning is how to transfer learned features of a deep network from one task to another . Transfer learning can be considered analogous to the way in which new targets are added in an Evo-Lexis hierarchy: new targets (output functions) are incrementally included in the Lexis-DAG (incrementally learned), by reusing previously constructed intermediate nodes (features of intermediate complexity) and then optimizing the part of the DAG between those nodes and the new targets (learning the weights between the existing features and the new outputs).
The incremental design policies that we consider in this paper are studied in computer science under the umbrella of online algorithms : an online algorithm finds a sequence of solutions based on the inputs it has seen so far, without knowing the entire input sequence in advance. The main emphasis of research in online algorithms is to perform competitive analysis, i.e., to derive worst-case theoretical bounds on the quality (or cost) of the solution of an online algorithm relative to its offline counterpart that knows the entire input sequence . The Incremental Design approach in Evo-Lexis is an online algorithm but our focus is quite different: we compare empirically the cost and topological structure of the hierarchies produced by incremental design relative to an optimized (“clean-slate”) algorithm that designs a minimum-cost hierarchy for the input sequence that has been seen so far.
8.4 From abstract modeling to specific evolving systems
The Evo-Lexis model is a quite general and abstract model and it does not attempt to capture any domain-specific aspects of biological or technological evolution. As such, it makes several assumptions that can be criticized as unrealistic, such as that all targets have the same length, their length stays constant, the fitness of a sequence is strictly based on its hierarchical cost, etc. We believe that such abstract modeling is still valuable because it can provide insights about the qualitative properties of the resulting hierarchies under different target generation models. Having said that however, we also believe that the predictions of the Evo-Lexis model should be tested using real data from evolving systems in which the outputs can be well represented by sequences.
One such system is the iGEM synthetic DNAs dataset . The target DNA sequences in the iGEM dataset are built from standard “BioBrick parts” (more elementary DNA sequences) that collectively form a library of synthetic DNA sequences. These sequences are submitted to the Registry of Standard Biological Parts in the annual iGEM competition. Previous research in [10, 34] has provided some evidence that these synthetic DNA sequences are designed by reusing existing components and, as such, have a hierarchical organization. In ongoing work, we investigate how to apply the Evo-Lexis framework to the time series of iGEM sequences, and whether the resulting iGEM hierarchies exhibit the same qualitative properties we observed in this study through abstract target generation models.
We presented Evo-Lexis, an evolutionary framework for modeling the interdependency between an incrementally designed hierarchy and a time-varying set of output functions, or targets, constructed by that hierarchy. We leveraged the Lexis optimization framework, proposed in earlier work , which allows the design of an optimized hierarchical network for a given set of sequences.
We developed the optimization framework, evolutionary target generation processes, and evaluation metrics needed to study the emergence and evolution of optimized hierarchies. We summarize the results of our study as follows:
- Tinkering/mutation in the target generation process is found to be a strong initial force for the emergence of low-cost and deep hierarchies. The presence of selection, however, intensifies these properties of the emergent hierarchies.
- Selection is also found to enhance the emergence of more complex intermediate modules in optimized hierarchies. The bias towards reuse of complex modules results in an hourglass architecture in which almost all source-to-target dependency paths traverse a small set of intermediate modules.
- The addition of recombination in the target generation process is essential in providing target diversity in optimized hierarchies.
- Hourglass-shaped optimized hierarchies are found to be fragile if the core nodes (i.e. nodes with highest centrality) are perturbed, similar to the concept of removal of hub nodes in scale-free networks.
- We show that an hourglass architecture introduces a trade-off between the cost of introducing new targets and the diversity between selected targets: hourglass architectures are evolvable in the sense that they allow the introduction of new targets at a low cost but they only explore a small part of the “phenotypic space” of all possible targets. These are targets that can be constructed at a low cost reusing the larger intermediate modules in the hierarchy.
- Our results suggest the existence of major transitions and punctuated equilibria in the evolutionary trajectory of hourglass-shaped hierarchies. The “extinction” of central modules is found to be the main factor behind this effect.
- The comparison between incremental design and clean-slate shows that although the former is much more constrained, it has similar cost and it also exhibits the hourglass effect under the proposed evolutionary scenarios. Despite these similarities, each of these design policies results in a very different set of core modules.
This research was supported by the National Science Foundation under Grant No. 1319549. We would also like to thank Matthias Gallé for his comments.
-  S. Akhshabi and C. Dovrolis. The evolution of layered protocol stacks leads to an hourglass-shaped architecture. SIGCOMM Comput. Commun. Rev., 41(4):206–217, August 2011.
-  S. Akhshabi, S. Sarda, C. Dovrolis, and S. Yi. An explanatory evo-devo model for the developmental hourglass. F1000Research, 3(156), 2014.
-  W. B. Arthur. The Nature of Technology: What It is and How It Evolves. Free Press, 2009.
-  W. B. Arthur and W. Polak. The evolution of technology within a simple computer model. Complexity, 11(5):23–31, 2006.
-  S. Bakhshi and C. Dovrolis. The price of evolution in incremental network design (the case of ring networks). In Bio-Inspired Models of Networks, Information, and Computing Systems: 6th International ICST Conference, BIONETICS 2011, York, UK, December 5-6, 2011, Revised Selected Papers, pages 1–15. Springer Berlin Heidelberg, Berlin, Heidelberg, 2012.
-  S. Bakhshi and C. Dovrolis. The price of evolution in incremental network design: The case of mesh networks. In 2013 IFIP Networking Conference, pages 1–9, May 2013.
-  C. Y. Baldwin and K. B. Clark. Design Rules: The Power of Modularity Volume 1. MIT Press, Cambridge, MA, USA, 1999.
-  A.L. Barabási and M. Pósfai. Network Science. Cambridge University Press, 2016.
-  T. C. Bell, J. G. Cleary, and I. H. Witten. Text Compression. Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 1990.
-  J. Blakes, O. Raz, U. Feige, J. Bacardit, P. Widera, T. Ben-Yehezkel, E. Shapiro, and N. Krasnogor. Heuristic for maximizing DNA re-use in synthetic DNA library assembly. ACS Synthetic Biology, 3(8):529–542, 2014.
-  A. Borodin and R. El-Yaniv. Online Computation and Competitive Analysis. Cambridge University Press, New York, NY, USA, 1998.
-  W. Callebaut and D. Rasskin-Gutman. Modularity: Understanding the Development and Evolution of Natural Complex Systems. Vienna series in theoretical biology. MIT Press, 2005.
-  T. Casci. Hourglass theory gets molecular approval. Nature Reviews Genetics, 12:76, Dec 2010.
-  M. Charikar, E. Lehman, D. Liu, R. Panigrahy, M. Prabhakaran, A. Sahai, and A. Shelat. The Smallest Grammar Problem. IEEE Trans. on Inf. Theory, 51(7), 2005.
-  J. Clune, J. Mouret, and H. Lipson. The evolutionary origins of modularity. Proceedings of the Royal Society of London B: Biological Sciences, 280(1755), 2013.
-  T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms, Third Edition. The MIT Press, 3rd edition, 2009.
-  T. Friedlander, A. E. Mayo, T. Tlusty, and U. Alon. Evolution of bow-tie architectures in biology. PLOS Computational Biology, 11(3):1–19, Mar 2015.
-  R. Hershberg. Mutation–The Engine of Evolution: Studying Mutation and Its Role in the Evolution of Bacteria. Cold Spring Harb Perspect Biol, 7(9):a018077, Sep 2015.
-  G. E. Hinton and R. R. Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507, 2006.
-  The iGEM Web Portal. http://igem.org/main_page
-  V. Ishakian, D. Erdös, E. Terzi, and A. Bestavros. A Framework for the Evaluation and Management of Network Centrality, pages 427–438.
-  S. Jain and S. Krishna. Large extinctions in an evolutionary model: The role of innovation and keystone species. Proceedings of the National Academy of Sciences, 99(4):2055–2060, 2002.
-  N. Kashtan and U. Alon. Spontaneous evolution of modularity and network motifs. Proceedings of the National Academy of Sciences of the United States of America, 102(39):13773–13778, 2005.
-  N. Kashtan, E. Noor, and U. Alon. Varying environments can speed up evolution. Proceedings of the National Academy of Sciences, 104(34):13711–13716, 2007.
-  D. Kim, D. B. Cerigo, H. Jeong, and H. Youn. Technological novelty profile and invention’s future impact. EPJ Data Science, 5(1):8, 2016.
-  H. Mengistu, J. Huizinga, J. Mouret, and J. Clune. The evolutionary origins of hierarchy. PLOS Computational Biology, 12(6):1–23, Jun 2016.
-  W. Miller. The hierarchical structure of ecosystems: Connections to evolution. Evolution: Education and Outreach, 1(1):16–24, Jan 2008.
-  C. R. Myers. Software systems as complex networks: Structure, function, and evolvability of software collaboration graphs. Phys. Rev. E, 68:046116, Oct 2003.
-  E. Ravasz and A. L. Barabási. Hierarchical organization in complex networks. Phys. Rev. E, 67:026112, Feb 2003.
-  J. Rexford and C. Dovrolis. Future internet architecture: Clean-slate versus evolutionary research. Commun. ACM, 53(9):36–40, September 2010.
-  K. M. Sabrin and C. Dovrolis. The hourglass effect in hierarchical dependency networks. Network Science, 5(4):490–528, 2017.
-  Johan Schot and Frank W. Geels. Niches in evolutionary theories of technical change. Journal of Evolutionary Economics, 17(5):605–622, Oct 2007.
-  A. M. Sharp. Incremental Algorithms: Solving Problems in a Changing World. PhD thesis, Ithaca, NY, USA, 2007. AAI3276789.
-  P. Siyari, B. Dilkina, and C. Dovrolis. Lexis: An optimization framework for discovering the hierarchical structure of sequential data. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’16, pages 1185–1194, New York, NY, USA, 2016. ACM.
-  J.M. Smith and E. Szathmary. The Major Transitions in Evolution. OUP Oxford, 1997.
-  R. Tanaka, M. Csete, and J. Doyle. Highly optimised global organisation of metabolic networks. IEE Proceedings - Systems Biology, 2(4):179–184, Dec 2005.
-  Sergi Valverde and Ricard V. Solé. Punctuated equilibrium in the large-scale evolution of programming languages. Journal of The Royal Society Interface, 12(107), 2015.
-  G. P. Wagner, M. Pavlicev, and J. M. Cheverud. The road to modularity. Nature Reviews Genetics, 8:921, Dec 2007.
-  J. Yosinski, J. Clune, Y. Bengio, and H. Lipson. How transferable are features in deep neural networks? In Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2, NIPS’14, pages 3320–3328, Cambridge, MA, USA, 2014. MIT Press.
-  H. Youn, D. Strumsky, L. M. A. Bettencourt, and J. Lobo. Invention as a combinatorial process: evidence from US patents. Journal of The Royal Society Interface, 12(106), 2015.