# Viable Algorithmic Options for Creating and Adapting Emergent Software Systems

Given the complexity of modern software systems, it is of great importance that such systems be able to autonomously modify themselves, i.e., self-adapt, with minimal human supervision. It is critical that this adaptation both results in reliable systems and scales reasonably in required memory and runtime to non-trivial systems. In this paper, we apply computational complexity analysis to evaluate algorithmic options for the reliable creation and adaptation of emergent software systems relative to several popular types of exact and approximate efficient solvability. We show that neither problem is solvable for all inputs when no restrictions are placed on software system structure. This intractability continues to hold relative to all examined types of efficient exact and approximate solvability when software systems are restricted to run (and hence can be verified against system requirements) in polynomial time. Moreover, both of our problems when so restricted remain intractable under a variety of additional restrictions on software system structure, both individually and in many combinations. That being said, we also give sets of additional restrictions that do yield tractability for both problems, as well as circumstantial evidence that emergent software system adaptation is computationally easier than emergent software system creation.

## 1 Introduction

Given the complexity of modern software systems and the increasing need to modify such systems to handle both unplanned changes in system requirements and varying operating environments, such systems must be able to autonomously modify themselves, i.e., self-adapt, with minimal human supervision [1, 2, 3]. Over the last 25 years, a great deal of research has been done on self-adaptive software systems, and a number of such systems based on various types of adaptation controllers (e.g., MAPE-K feedback loop [4], model-predictive [5], general control-theoretic [6]) have been created (see also [1, 7, 8]). Certifying that adaptation does not cause the resulting systems to violate functional and/or non-functional system requirements, i.e., validation and verification (V&V), is a major research challenge [8, page 21]. Such V&V must be done quickly enough that the proposed adaptations are still valid (to handle rapidly-changing operating environments). This is further complicated by the requirement that enough computational effort be put into the adaptation space search process to ensure that the adapted system optimizes system performance as much as possible (to satisfy system user expectations).

A valuable complement to algorithm development work would be to establish what the general algorithmic options are for software system creation and adaptation algorithms that are guaranteed to satisfy specified functional and non-functional system requirements while optimizing adapted system performance. This can be done using the tools and techniques of computational complexity analysis [9, 10, 11]. The results of such analyses can be used not only to establish those situations in which known algorithms are the best possible but also to guide the development of new algorithms (by highlighting relative to which types of efficient solvability such algorithms can and cannot exist).

A good test case for the utility of such complexity analyses would be the class of self-adaptive systems called emergent software systems [12, 13]. Such systems initially self-assemble from a provided library of software components to satisfy basic system requirements, and then continuously self-adapt with the assistance of a learning algorithm that uses collected system environment events and performance metrics to optimize a reward function such as system runtime or memory usage. An emergent web server based on a library of 30 components has been constructed under this paradigm and seems to function well [13]. However, as noted in [12, page 14],

…[the] learning algorithm, which is based on an exhaustive exploration phase, is not designed to scale up to large systems with thousands of compositions, but rather serves as a proof of concept and useful baseline against which to compare more sophisticated algorithms. This is an active research area for which we continue to develop more efficient and scalable solutions …

Ongoing work has focused on exploiting such strategies as making components encode single responsibilities and dividing large systems into distributed collections of smaller (and hopefully more manageable) subsystems [14, 15]. It is precisely at this current stage of new algorithm development that computational complexity analyses might be most useful.

### 1.1 Previous Work

Computational complexity analyses have been done previously for component-based software system creation by component selection [16, 17] and component selection with adaptation [18], with [18, 17] having the additional requirement that the number of components in the resulting software system be minimized. Given the intractability of all of these problems, subsequent work has focused on efficient approximation algorithms for component selection. Though it has been shown that efficient algorithms that produce software systems whose number of components is within a constant multiplicative factor of optimal are not possible in general [19], efficient approximation algorithms are known for a number of special cases [20, 21, 19]. All of these analyses assume that any component can be composed with any other component, in the sense that function definitions not given in a component can be obtained by composition with any other components that have the required function definitions, i.e., component composition is not regulated using component interfaces. Moreover, none of the formalizations used include specifications of the internal structure of software systems, system requirements, and components that are detailed enough to allow investigation of restrictions on these aspects that could make component selection or component selection with adaptation tractable.

Only one analysis to date has incorporated a component model which allows investigation of the tractability effects of restrictions on system requirement, component internal, and software system structure, namely that in [22] (subsequently reprinted as [23]; see also [24] for the full version with proofs of results). The focus in this work was on exact polynomial-time solvability and fixed-parameter tractability of component-based software system creation and adaptation, where system adaptation is in response to changes in the functional system requirements. The authors showed that both of these problems are intractable in general and remain so under a variety of restrictions on system requirement, component, and software system structure. This was done relative to radically restricted components (single if-then-else blocks) and software systems (two-level reactive systems). While components in this model cannot arbitrarily compose in the sense described above, component composition was implicitly regulated by the combination of two-level system structure and a parameter restricting the number of types of components in a system, and there was no notion of a component interface, let alone interfaces providing state variables or functions with input parameters and/or returned values.

### 1.2 Summary of Results

In this paper, we present the first computational complexity analyses of the problems of emergent software system creation and adaptation. These problems can be stated informally as follows (and are described in more detail in Section 2):

Emergent Software Creation (ESCreate):

Derive (possibly by using a system environment function) an emergent software system, relative to given libraries of software interfaces and components, that satisfies a given set of software requirements and has the best possible value for a specified reward function.

Emergent Software Adaptation (ESAdapt):

Given an emergent software system, relative to given libraries of software interfaces and components, that satisfies a given set of software system requirements, derive (possibly by using a system environment function) an emergent software system relative to the same libraries that satisfies the same requirements and has the best possible value for a specified reward function.

We consider the following types of efficient solvability (described in more detail in Section 3.1):

1. Polynomial-time exact solvability, such that a polynomial-time algorithm produces the correct output for a given input either (a) all the time [10] or (b) when such an output is known to exist (promise solvability).

2. Polynomial-time approximate solvability, such that a polynomial-time algorithm produces the correct output for a given input either (a) in all but a small number of cases [25] or (b) with a high probability [26], or, in the case of a problem that requires an output that optimizes some cost measure, (c) produces an output for a given input whose cost is within some arbitrarily small fraction of optimal [27].

3. Effectively polynomial-time exact restricted solvability (e.g., fixed-parameter (fp-)tractability [9]), such that an algorithm produces the correct output for a given input in what is effectively polynomial time when certain aspects of that input are of restricted value, e.g., the number of components in the given library or any valid assembled software system is small.

Relative to these types of solvability and various conjectures that are widely believed to be true within the Computer Science community, e.g. [28, 10], we prove the following results (Section 3):

• Neither ESCreate nor ESAdapt is solvable by an algorithm (regardless of runtime or memory usage) that gives the correct output for every input, i.e., both of these problems are unsolvable in the same sense as Turing’s classic Halting Problem [29] (Sections 3.2.1 and 3.3.1, respectively).

• Neither ESCreate nor ESAdapt is solvable by a polynomial-time algorithm in sense (1a) above and ESCreate is not solvable by a polynomial-time algorithm in sense (1b) above, even when software systems are restricted to run in (and hence can be verified against requirements in) polynomial time (Sections 3.2.2 and 3.3.2, respectively).

• Neither ESCreate nor ESAdapt is solvable by a polynomial-time algorithm in senses (2a), (2b), or (2c) above, even when software systems are restricted to run in (and hence can be verified against requirements in) polynomial time (Sections 3.2.3 and 3.3.3, respectively).

• Neither ESCreate nor ESAdapt is fixed-parameter tractable in sense (3) above relative to restrictions of the values of the following aspects of the input:

• The number of software component interfaces in the interface library.

• The number of software components in the component library.

• The maximum number of components implementing an interface.

• The maximum number of interfaces provided by a component.

• The maximum number of interfaces required by a component.

• The maximum number of components in a software system.

This fixed-parameter intractability holds both when software systems are restricted to run in (and hence can be verified against requirements in) polynomial time and relative to many combinations of the aspects listed above, often when aspects are restricted to small constant values. That being said, there are several combinations of aspect-restrictions that do yield fixed-parameter tractability (Sections 3.2.4 and 3.3.4, respectively).

All of the results above hold for ESCreate (except those related to cost-inapproximability) relative to any choices of reward and environment functions, and for ESAdapt (and ESCreate for cost-inapproximability) relative to any choice of environment function and one of two specified reward functions, namely the number of components in a software system and the total size of the interface and component code in a software system. This, in combination with the unresolved fixed-parameter status of certain aspect-combinations at this time, suggests that emergent software adaptation may in general be easier to do than emergent software creation (Section 4.2).

The list above may, at first glance, be read as saying that emergent software system creation and adaptation are not possible under any circumstances. However, this bleak interpretation is very much contrary to our intent. We consider only a small subset of possible restrictions on emergent software systems (Table 1), and our analysis, though complete with respect to some of these restrictions (Tables 5–7), is still incomplete with respect to the whole subset, let alone the universe of possible restrictions. The successes in real-world emergent software systems to date show that tractability is indeed possible in some circumstances. The key issue now is to determine in detail those circumstances in which tractability does and does not hold. Our results should thus be seen not as final statements but rather interim guidelines on how to address this issue. We see the involvement of software engineers in this process as essential (Section 4.3). To this end, we have tried to make the reasoning used to derive our results (both in general (Section 3.1) and in our proofs) accessible to software engineers, to make plain to them the rather limited circumstances under which our results hold and thus enable them to break these results by suggesting additional restrictions relative to which real-world emergent software system creation and adaptation are provably tractable.

Before we close out this subsection, two issues with respect to the results listed above should be noted. First, though certain notations (e.g., our conception of software system requirements) and some of the general ideas underlying proof techniques developed in [22, 24] are re-used in this paper, none of the results for component-based software system creation derived in [22, 24] carry over. This is because the restrictions on overall system structure and the number of types of components in a system critical to the proofs of those results in [22, 24] have no analogues in problems ESCreate and ESAdapt investigated here. Second, all results in this paper are derived relative to the classic Turing machine model of computation, and hence do not directly address issues of efficient solvability or unsolvability under other models of computation such as quantum computers. That being said, as will be discussed in Section 4.3, our results indirectly imply certain consequences for the efficient solvability of ESCreate and ESAdapt under such alternative models of computation.

### 1.3 Organization of the Paper

Our paper is organized as follows. In Section 2, we summarize the emergent software system model given in [12, 13] and formalize the problems of emergent software system creation and adaptation. In Section 3, we first in Section 3.1 describe several popular conceptions of efficient solvability and then in Sections 3.2 and 3.3 assess the efficient solvability of emergent software system creation and adaptation, respectively, relative to each of these conceptions. In order to focus in the main text on the implications of our results, proofs of several of these results are given in an appendix. Our results are summarized and discussed in Section 4. Finally, our conclusions and directions for future work are given in Section 5.

## 2 Formalizing Emergent Software Creation and Adaptation

In this section, we first review the basic entities in the model of emergent software given in [12, 30, 13] — namely, software system requirements, interfaces and components, component-based software systems, and emergent software systems. We then formalize two computational problems associated with emergent software system creation and adaptation.

The basic entities in our model are formalized as follows:

• Software system requirements: The requirements will be a set of input-output pairs, where each pair consists of an input defined by a particular assignment of truth-values to each of the Boolean variables in a given variable set and an output drawn from a given output set. As such, these are functional requirements describing wanted system input-output behaviors and correspond to the pre-specified abstract goal of the system [15, page 3]. An example set of software system requirements is given in part (a) of Figure 1.

• Interfaces and Components: We use the runtime component model underlying the Dana programming language as specified in [30, 13]. In particular, following [13, Page 335], let an interface be a set of function prototypes, each comprising a function name, return type and parameter types, and a set of transfer fields, typed pieces of state that persist across alternate implementations of the interface during runtime adaptation. A component has one or more provided interfaces and zero or more required interfaces; the component has implementations for all functions specified in the provided interfaces, and these implementations in turn call upon functions and transfer fields specified in the required interfaces. We will assume that all available interfaces and components are stored in an interface library and a component library, respectively. Example interface and component libraries relative to the software system requirements in part (a) of Figure 1 are shown in parts (b) and (c) of Figures 1 and 2, respectively.

Note that the interfaces and components in Figures 1 and 2 (and indeed all subsequent interfaces and components specified in this paper) are described using a notation approximating that of the Dana programming language, e.g., [13, Figure 2]. This is done to ensure both that the proofs of results given here do not invoke programming language features more powerful than those available in Dana and that these results are hence applicable to emergent software systems as described in [12, 30, 13], which are written in Dana.

• Component-based software systems: We use the model of component-based software systems specified in [12, 30, 13]. In particular, given interface and component libraries and a base component implementing a function, a valid component-based software system consists of a set of component-choices from the component library, including the base component, that not only implements all required interfaces of the base component but also recursively implements all required interfaces of those component-choices and their sub-component-choices, if any. The interface-connections between these components are called wirings. In order to allow data transfer fields to hold different values relative to different implementations of an interface by the same component, these implementations are done relative to copies of that component. Moreover, in cases where a component provides multiple interfaces, different implementations of that component relative to two of those interfaces are done relative to reduced copies of that component, both of which only contain code for and thus provide only those services in the component specified by their respective implementing interfaces.

Any valid component-based software system has an associated directed vertex- and arc-labeled tree in which the vertices are components, the arcs are the wirings, and the vertices and arcs are labeled with the names of the associated components and interfaces from the component and interface libraries, respectively; let this tree be called the component wiring tree associated with that system. The component wiring trees of two valid component-based software systems relative to the requirements, interface and component libraries, and base component Base given in parts (a)–(c) of Figures 1 and 2 are given in Figure 3. Note that such a tree has a single root vertex (namely, the base component), exactly one directed path from this root to any vertex, and is labeled such that a component vertex-label does not occur more than once on any directed path from the root to a leaf, i.e., there are no recursive dependencies [12, page 10]. This prevents the possibility of infinite-depth software systems resulting from the interface-component implementation sequence between two same-label components on such a path being repeated an infinite number of times.

Given a set of software system requirements, a valid component-based software system is a working component-based software system relative to those requirements if, for each input-output pair in the requirements, the output of the system given the input in that pair is the output in that pair. For example, the software system on the left in Figure 3 satisfies all requirements given in Figure 1(a) and hence is a working component-based software system relative to those requirements, but the software system on the right is not (because it produces different outputs (3, 1, 1, and 2, respectively) for four of the requirements).

• Emergent software system: As defined in [12, 13], an emergent (component-based) software system is one that, given initial functional software system requirements, interface and component libraries, and a base component, self-assembles and self-adapts as necessary to optimize system performance as its running environment changes over time. A system’s running environment and performance are quantified in terms of events and metrics whose values are sampled at discrete times during system operation [12, pages 10–11]; aspects of system performance used by a learning algorithm to guide both self-assembly and self-adaptation are in turn summarized in a reward function. As currently implemented [12], the initial self-assembly phase creates a list of all working software systems relative to the given requirements and libraries, where each system is described by a unique ID string that lists all components in the system and their interconnections. During the subsequent self-adaptation phase, the learning algorithm searches over this list to find appropriate alternatives to the currently-running system that might help optimize system performance [12, Section 3.2.1].

In this paper, we will assume that a system’s running environment and performance are sampled using an environment function that maps a given system onto a collection of events and metric-values and that the reward function maps the latest accumulated performance metric values for a system onto a single positive integer value. For simplicity, we shall further assume that both functions are computable in time polynomial in the size of the system to which they are applied, that the reward function is optimized by minimization, i.e., smaller values are preferred, and that the reward function is used to choose among but does not alter the set of possible working software systems relative to given requirements and libraries.
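The validity conditions on component wiring trees described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Dana runtime or the formalization used in the proofs: the class names `Component` and `WiringNode`, the function `is_valid`, and the interface names `App`, `Sys`, and `Proc` are all hypothetical, and component behavior (needed to decide whether a valid system is also a working one) is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Component:
    name: str
    provides: frozenset   # names of the interfaces this component implements
    requires: tuple = ()  # names of the interfaces it must be wired to


@dataclass
class WiringNode:
    component: Component
    # Maps each required interface name to the subtree that implements it.
    wirings: dict = field(default_factory=dict)


def is_valid(node, ancestors=frozenset()):
    """Check the two structural conditions on a component wiring tree."""
    comp = node.component
    # No component label may repeat on a root-to-leaf path.
    if comp.name in ancestors:
        return False
    # Every required interface must be wired, and nothing else.
    if set(node.wirings) != set(comp.requires):
        return False
    on_path = ancestors | {comp.name}
    for interface, child in node.wirings.items():
        # The chosen component must actually provide the wired interface.
        if interface not in child.component.provides:
            return False
        if not is_valid(child, on_path):
            return False
    return True


# Hypothetical library: names and interfaces are assumptions for illustration.
base = Component("Base", frozenset({"App"}), ("Sys",))
system1 = Component("System1", frozenset({"Sys"}), ("Proc",))
proc_a = Component("ProcA", frozenset({"Proc"}))
loop = Component("Loop", frozenset({"Sys"}), ("Sys",))
leaf = Component("Leaf", frozenset({"Sys"}))

good_tree = WiringNode(base, {"Sys": WiringNode(system1, {"Proc": WiringNode(proc_a)})})
inner = WiringNode(loop, {"Sys": WiringNode(leaf)})
bad_tree = WiringNode(base, {"Sys": WiringNode(loop, {"Sys": inner})})
```

Here `good_tree` passes both checks, while `bad_tree` is rejected because the label `Loop` occurs twice on one path from the root.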

We can now formalize computational problems corresponding to emergent software creation and adaptation as conceived in [12, 13]:

Emergent Software Creation (ESCreate)
Input: Software system requirements, interface and component libraries, a base component, and reward and environment functions.
Output: A working component-based software system based on the base component relative to the requirements and libraries that has the smallest reward-function value over all such working systems, if any working system exists, and a special symbol otherwise.

Emergent Software Adaptation (ESAdapt)
Input: Software system requirements, interface and component libraries, a working component-based software system based on a base component relative to the requirements and libraries, and reward and environment functions.
Output: A working component-based software system based on the same base component relative to the requirements and libraries that has the smallest possible reward-function value over all such working systems.
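To make the input-output contracts of the two problems concrete, the following is a minimal exhaustive-search sketch in Python. It is a simplification under stated assumptions rather than the algorithm of [12]: systems are flattened to a fixed list of interface slots, `is_working` stands in for checking a system against the requirements, `reward` stands in for the reward function applied to the environment function's output, and `None` plays the role of the special symbol. All function and parameter names are hypothetical.

```python
from itertools import product


def es_create(slots, candidates, is_working, reward):
    """Return a working assignment with the smallest reward, or None."""
    best, best_value = None, None
    # Enumerate every way of choosing one component per interface slot.
    for choice in product(*(candidates[slot] for slot in slots)):
        system = dict(zip(slots, choice))
        if not is_working(system):
            continue
        value = reward(system)
        if best is None or value < best_value:
            best, best_value = system, value
    return best


def es_adapt(current, slots, candidates, is_working, reward):
    """Return a working system at least as good as the given working one."""
    best = es_create(slots, candidates, is_working, reward)
    # The given system is assumed to be working, so an output always exists.
    if best is None or reward(current) <= reward(best):
        return current
    return best


# Toy instance (hypothetical component names and code sizes).
sizes = {"A": 3, "B": 1, "C": 2}
slots = ["p1", "p2"]
candidates = {"p1": ["A", "B"], "p2": ["A", "B", "C"]}
works = lambda system: system["p1"] == "A"
codebase = lambda system: sum(sizes[name] for name in system.values())

created = es_create(slots, candidates, works, codebase)
adapted = es_adapt({"p1": "A", "p2": "A"}, slots, candidates, works, codebase)
```

The loop visits every combination of component choices, so its running time grows exponentially with the number of slots; this is the scaling behavior that motivates asking which more efficient algorithms can exist.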

Two additional notes are in order about our definitions of ESCreate and ESAdapt. First, the given libraries and base component in an input of ESCreate may not allow a working software system relative to the given requirements, but there is always a working software system for any input of ESAdapt, namely the given system. Second, the reward and environment functions are part of the input for both ESCreate and ESAdapt and must be included in any instance of these problems, i.e., it cannot be the case that either or both of these functions are empty. That being said, the environment function (as well as the given working system in the case of ESAdapt) need not necessarily be used by any algorithm solving these problems but are provided as part of the problem inputs to make results derived here relevant to emergent systems as described in [12, 13] (see Section 4 for further discussion on the latter point).

In this paper, we will consider the following versions of the reward function:

• The number of components in a software system.

• The size of the codebase of a software system, i.e., the total number of lines of code in the interfaces and components comprising that system.

Relative to a particular reward function, we refer to our problems above as ESCreate and ESAdapt under that reward function. Note that results derived relative to the two reward functions above have broad applicability as the values of these functions correlate with the values of at least some of the reward functions studied to date in emergent software systems, e.g., system response time [12, page 11]. As none of our result proofs rely on the environment function, we need not specify its form further.

Let us now illustrate problems ESCreate and ESAdapt relative to the example emergent software system described in Figures 1–3:

• Valid Software Systems relative to the interface and component libraries: As each interface proc1, proc2, and proc3 in software systems based on component System1 can be implemented with any of the components ProcA, ProcB, ProcC, or ProcD, there are 4 × 4 × 4 = 64 valid software systems relative to the general structure on the left in part (d) of Figure 3. By analogous reasoning relative to component System2 and interface proc1, there are 4 valid software systems relative to the general structure on the right. Hence, there are in total 68 valid software systems relative to these libraries.

• Working Software Systems relative to the requirements: The reader can verify that (i) there are no working software systems relative to the requirements that incorporate component System2 and (ii) all working software systems relative to the requirements that incorporate component System1 must implement interface Proc1 with component ProcA and interface Proc3 with either of the components ProcC or ProcD, and that interface Proc2 (as its implementing component code is never executed) can be implemented by any of the components ProcA, ProcB, ProcC or ProcD. Thus, the implementations of Proc1, Proc2, and Proc3 in System1 that yield working software systems are (ProcA, ProcA, ProcC), (ProcA, ProcA, ProcD), (ProcA, ProcB, ProcC), (ProcA, ProcB, ProcD), (ProcA, ProcC, ProcC), (ProcA, ProcC, ProcD), (ProcA, ProcD, ProcC), and (ProcA, ProcD, ProcD), giving in total 8 working software systems relative to the requirements.

• Output of ESCreate and ESAdapt under different reward functions: As all working software systems relative to the requirements have 8 components, ESCreate under the component-count reward function can return any of them; however, as the system with the smallest codebase implements both Proc2 and Proc3 with ProcC, it is this system that would be selected by ESCreate under the codebase-size reward function. By analogous reasoning, ESAdapt under the component-count reward function can return any of the 8 working software systems given a system drawn from the same set; however, ESAdapt under the codebase-size reward function must (regardless of the choice of the given system) return the system that implements both Proc2 and Proc3 with ProcC.
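The counts in this example can be reproduced with a short Python enumeration. Note the assumption: the component code in Figures 1 and 2 is not executed here, so the filter below simply encodes the conditions stated in the text (Proc1 implemented by ProcA, Proc3 by ProcC or ProcD, Proc2 unconstrained, and no working system based on System2). The variable names are illustrative.

```python
from itertools import product

procs = ["ProcA", "ProcB", "ProcC", "ProcD"]

# System1 wires three interfaces and System2 wires one; each wiring can be
# implemented by any of the four Proc components.
valid_system1 = list(product(procs, repeat=3))
valid_system2 = list(product(procs, repeat=1))
num_valid = len(valid_system1) + len(valid_system2)

# Conditions for a working system, as stated in the text (not derived from
# the component code, which is only available in the figures).
working = [
    (p1, p2, p3)
    for (p1, p2, p3) in valid_system1
    if p1 == "ProcA" and p3 in ("ProcC", "ProcD")
]

print(num_valid)     # 68
print(len(working))  # 8
```

Even in this small example only 8 of the 68 valid systems are working, which illustrates why an exhaustive self-assembly phase must examine many candidates that are later discarded.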

This concludes our formalization of emergent software system creation and adaptation. A reasonable conjecture at this point is that ESAdapt will be easier to solve than ESCreate, given that the former is given a working system as part of its input. This will be assessed below.

## 3 Results

In this section, we will use computational complexity analysis to assess viable algorithmic options for efficient emergent software system creation and adaptation. This will be done relative to various desirable types of efficient solvability described in Section 3.1. The results of our analyses for emergent software creation and adaptation are given in Sections 3.2 and 3.3, respectively.

It turns out that both ESCreate and ESAdapt are unsolvable in the most general possible case. That is, neither ESCreate nor ESAdapt has an algorithm that always returns the correct output for an input incorporating any possible choices of reward and environment functions (in the case of ESCreate) or any possible choice of environment function and either of the two reward functions considered here (in the case of ESAdapt) and in which there are no restrictions on the form, size, or running times of the interface and component libraries, their member interfaces and components, or any software systems created using these libraries (see Sections 3.2.1 and 3.3.1, respectively). Hence, the remainder of our analyses will be done relative to restricted versions of these problems in which candidate component-based software systems run and hence can be verified against given system requirements in polynomial time.

### 3.1 Types of Efficient Solvability

Consider the following desirable forms of solvability:

1. Polynomial-time exact solvability: An exact polynomial-time algorithm is a deterministic algorithm whose runtime is upper-bounded by $c_1 n^{c_2}$, where $n$ is the size of the input and where $c_1$ and $c_2$ are constants, and which is guaranteed to produce the correct output for all inputs. A problem that has such an algorithm is said to be polynomial-time tractable. Polynomial-time tractability is desirable because runtimes increase slowly as input size increases, and hence allow the solution of larger inputs.

It is possible that the computational difficulty of a problem may be inflated in general by inputs that have no solutions, and hence force any algorithm to exhaustively consider all possible candidate solutions. In such cases, it is useful to assess whether a problem is polynomial-time exact promise solvable — that is, whether that problem is exactly solvable in polynomial time on those inputs which are guaranteed to have solutions, where these guarantees are known as promises.

2. Polynomial-time approximate solvability: A polynomial-time approximation algorithm is an algorithm that runs in polynomial time in an approximately correct (but acceptable) manner for all inputs. There are a number of ways in which an algorithm can operate in an approximately correct manner. Three of the most popular ways are as follows:

1. Frequently Correct (Deterministic) [25]: Such an algorithm runs in polynomial time and gives correct solutions for all but a very small number of inputs. In particular, if the number of inputs for each input-size on which the algorithm gives the wrong or no answer (denoted by the function ) is sufficiently small (e.g., for some constant ), such algorithms may be acceptable.

2. Frequently Correct (Probabilistic) [26]: Such an algorithm (which is typically probabilistic) runs in polynomial time and gives correct solutions with high probability. In particular, if the probability of correctness is at least 2/3 (and hence can be boosted by additional computations running in polynomial time to be correct with probability arbitrarily close to 1 [31, Section 5.2]), such algorithms may be acceptable.

3. Approximately Optimal [27]: Such an algorithm $A$ runs in polynomial time and gives a solution $A(I)$ for an input $I$ whose value $v(A(I))$ is guaranteed to be within a multiplicative factor $f(|I|)$ of the value $v_{opt}(I)$ of an optimal solution for $I$, i.e., $|v_{opt}(I) - v(A(I))| \leq f(|I|) \times v_{opt}(I)$ for any input $I$ for some function $f()$. A problem with such an algorithm is said to be polynomial-time $f(|I|)$-approximable. In particular, if $f(|I|)$ is a constant $\epsilon$ very close to 0 (meaning that the algorithm is always guaranteed to give a solution that is either optimal or very close to optimal), such algorithms may be acceptable.

3. Effectively polynomial-time exact restricted solvability: Even if a problem is not solvable in any of the senses above, a restricted version of that problem may be exactly solvable in close-to-polynomial time. Let us characterize restrictions on problem inputs in terms of a set of aspects of the input. For example, possible restrictions on the inputs to ESCreate could be the number of given software requirements, the number of components in , and the maximum number of components in a working software system relative to , , and (see also Table 1 in Section 3.2.4). Let each such aspect be called a parameter.

One of the most popular ways in which an algorithm can operate in close-to-polynomial time relative to restricted inputs is fixed-parameter (fp-) tractability [32]. Such an algorithm runs in time that is non-polynomial purely in terms of the parameters in $K$, i.e., in time $f(K)|x|^c$ where $f()$ is some function, $|x|$ is the size of input $x$, and $c$ is a constant. A problem with such an algorithm for parameter-set $K$ is said to be fixed-parameter (fp-)tractable relative to $K$. Fixed-parameter tractability generalizes polynomial-time exact solvability by allowing the leading constant of the input size in the runtime upper-bound of an algorithm to be a function of $K$. Though such algorithms run in non-polynomial time in general, for inputs in which all the parameters in $K$ have very small constant values and thus $f(K)$ collapses to a possibly large but nonetheless constant value, such algorithms (particularly if $f()$ is suitably well-behaved, e.g., ) may be acceptable.

In the following two subsections, we shall evaluate the algorithmic options for ESCreate and ESAdapt, respectively, relative to each of these types of solvability. Our unsolvability proofs will use reductions between pairs of problems, where a reduction from a problem to a problem is essentially an efficient algorithm for solving which uses a hypothetical algorithm for solving . Reductions are useful by the following logic:

• If reduces to and is efficiently solvable by algorithm then is efficiently solvable (courtesy of the algorithm that invokes relative to ).

• If reduces to and is not efficiently solvable then is not efficiently solvable (as otherwise, by the logic above, would be efficiently solvable, which would be a contradiction).
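The first bullet above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `solve_via_reduction`, `to_binary`, and `ends_in_zero` are hypothetical, and the two toy problems (integer evenness and a string test) are not taken from the paper. They simply show how a reduction composed with a solver for the second problem yields a solver for the first.

```python
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def solve_via_reduction(reduce_a_to_b: Callable[[A], B],
                        solve_b: Callable[[B], bool]) -> Callable[[A], bool]:
    """Build a solver for the first problem from a reduction to the second
    problem and a solver for the second problem."""
    def solve_a(instance_a: A) -> bool:
        # Transform the instance, then ask the solver for the second problem.
        return solve_b(reduce_a_to_b(instance_a))
    return solve_a


# Toy problems (hypothetical, for illustration only).
def to_binary(n: int) -> str:
    """Reduction: map an integer to its binary representation."""
    return bin(n)


def ends_in_zero(s: str) -> bool:
    """Solver for the second problem: does the string end in '0'?"""
    return s.endswith("0")


is_even = solve_via_reduction(to_binary, ends_in_zero)
```

If both the reduction and the solver for the second problem are efficient, so is the composed solver; the second bullet is the contrapositive of this observation.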

We will use the following three types of reducibility:

###### Definition 1

[11, Section 3.1.2] Given decision problems and , i.e., problems whose answers are either “Yes” or “No”, polynomial-time (Karp) reduces to if there is a polynomial-time computable function such that for any instance of , the answer to for is “Yes” if and only if the answer to for is “Yes”.

###### Definition 2

[11, Section 3.1.2] Given search problems and , i.e., problems whose answers are actual solutions rather than just “Yes” or “No”, polynomial-time (Levin) reduces to if there is a pair of polynomial-time functions and such that for any instance of , the answer to for is if and only if the answer to for is .

###### Definition 3

[32] (Note that the definition given here is actually Definition 6.1 in [33], which modifies that in [32] to accommodate parameterized problems with multi-parameter sets.) Given parameterized decision problems and , parameterized reduces to if there is a function which transforms instances of into instances of such that runs in time for some function and constant , for each for some function , and for any instance of , the answer to for is “Yes” if and only if the answer to for is “Yes”.

Our reductions will be from versions of the following problems:

Turing Machine Halting (TM Halting) [29]
Input: A Turing Machine and a binary string .
Question: Does halt when given as input?

Dominating set [10, Problem GT2]
Input: An undirected graph and a positive integer .
Question: Does contain a dominating set of size , i.e., is there a subset , , such that for all , either or there is at least one such that ?

Optimal Dominating set (Dominating set)
Input: An undirected graph .
Output: A dominating set in of minimum size.

For each vertex in a graph , let the complete neighbourhood of be the set composed of and the set of all vertices in that are adjacent to by a single edge, i.e., . We assume below for each instance of Dominating set an arbitrary ordering on the vertices of such that . Note that only the first of the three problems above is provably unsolvable (indeed, unsolvable in the sense that there can be no algorithm period that returns the correct output for every input [34, Section 9.2.4]). Versions of the others are only known to be unsolvable relative to the types of efficient solvability listed at the start of this subsection modulo the conjectures and ; however, this is not a problem in practice as both of these conjectures are widely believed within computer science to be true [9, 28].
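As a minimal sketch of the definitions above, the following Python code computes complete neighbourhoods, checks whether a vertex set is dominating, and finds a minimum dominating set by brute force. The adjacency-dictionary representation, the function names, and the five-vertex path graph are assumptions made for illustration; the exhaustive search takes exponential time in general and is not an efficient algorithm of any of the types discussed in this subsection.

```python
from itertools import combinations


def complete_neighbourhood(adj, v):
    """Return v together with all vertices adjacent to v."""
    return {v} | set(adj[v])


def is_dominating_set(adj, candidate):
    """True if every vertex is in `candidate` or adjacent to a member of it."""
    chosen = set(candidate)
    return all(complete_neighbourhood(adj, v) & chosen for v in adj)


def min_dominating_set(adj):
    """Brute force: try vertex subsets in order of increasing size."""
    vertices = sorted(adj)
    for size in range(1, len(vertices) + 1):
        for subset in combinations(vertices, size):
            if is_dominating_set(adj, subset):
                return set(subset)
    return set()  # only reached for the empty graph


# Path graph 1 - 2 - 3 - 4 - 5 (an illustrative example, not from the paper).
path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
```

On this path graph no single vertex dominates all five vertices, while two suitably chosen vertices do.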

As we shall often see in the following two sections, a single reduction may imply multiple results. For example, with respect to the third of the solvability options described above, additional and sometimes stronger fp-tractability and intractability results can often be derived using the following three lemmas.

###### Lemma 1

[35, Lemma 2.1.30] If problem is fp-tractable relative to parameter-set then is fp-tractable for any parameter-set such that .

###### Lemma 2

[35, Lemma 2.1.31] If problem is fp-intractable relative to parameter-set then is fp-intractable for any parameter-set such that .

###### Lemma 3

[35, Lemma 2.1.35]. If problem is -hard when all parameters in parameter-set have constant values then cannot be fp-tractable relative to any subset of unless .

There are a variety of techniques for creating a reduction from a problem to a problem ([10, Section 3.2]; see also [33, Chapters 3 and 6]). One of these techniques is component design, in which an instance of constructed by a reduction is structured as mechanisms that generate candidate solutions for the given instance of and check these candidates to see if any are actual solutions. We have already seen in the example software systems given in Figure 3 how interfaces with different implementing components (in that case, interfaces intSystem and intProc) can be used to generate choices when constructing a component-based software system. In subsequent subsections, we will use this and other features of interfaces and components under the Dana runtime model as described in Section 2 to structure mechanisms that generate candidate solutions (i.e., valid component-based software systems corresponding to vertex-sets of size in a given graph ) and check these candidates to see if they are actual solutions (e.g., working component-based software systems corresponding to dominating sets of size in ) in many of the reductions underlying our results for problems ESCreate and ESAdapt.

### 3.2 Results for Emergent Software Creation

Many of the results derived in this section for ESCreate will actually be derived relative to the following problem:

Component-based Software Creation (CSCreate)
Input: Software system requirements , interface and component libraries and , and a base component .
Question: Is there a working component-based software system based on relative to , , and ?

Note that each input for ESCreate has a corresponding input to CSCreate (namely, the input to ESCreate without and ). Moreover, any algorithm that solves ESCreate under some can also be used to solve CSCreate (namely, if run on the given input for CSCreate produces a working system, output “Yes”, otherwise output “No”). This yields the following useful observation.

###### Observation 1

For any choice of and , if there is an algorithm of solvability type for ESCreate under then there is an algorithm of solvability type for CSCreate.

#### 3.2.1 Unsolvability of Unrestricted Emergent Software Creation

We start off by considering if problem ESCreate is solvable in the most general possible case — that is, if ESCreate has an algorithm that always returns the correct output for an input incorporating any possible choices of and and in which there are no restrictions on the form, size, or running times of and , their member interfaces and components, or any software systems created using and . It turns out that such an algorithm cannot exist.

Result A.1

For any choice of and , ESCreate is unsolvable.

Proof: Consider the following polynomial-time Karp reduction from TM Halting to CSCreate: given an instance of TM Halting, construct an instance of CSCreate in which and , there is a single input-output pair in such that for , consists of the single interface

    interface base {
        void main(Input I)
    }


and consists of the single component

    component Base provides base {
        void main(Input I) {
            <CODEM(x)>
            output 1
        }
    }


where <CODEM(x)> is the Dana code simulating the computation of on input . As Dana contains both loops and conditional statements, it can readily simulate on input using code that is of size polynomial in the sizes of the given descriptions of and . Finally, let be component Base in . Note that the instance of CSCreate described above can be constructed in time polynomial in the size of the given instance of TM Halting. To conclude the proof, observe that the only possible component-based system for the constructed instance of CSCreate based on is that consisting of Base itself, and that this system satisfies the sole input-output constraint in if and only if halts on input for the given instance of TM Halting. It is known that TM Halting cannot have an algorithm that is correct for all possible instances [34, Section 9.2.4], and hence is unsolvable. Hence, the reduction above implies in turn that CSCreate cannot have an algorithm either. The unsolvability result for ESCreate then follows by contradiction from Observation 1.

This result is especially disconcerting as it holds relative to not just some choices but every possible choice of and (this is because the proof of this result ignores these functions entirely). However, it is ultimately not surprising, given the computational power inherent in the Dana programming language and the folklore result that a number of problems in software engineering, e.g., checking if a software system satisfies a set of given requirements, are known to be unsolvable as a consequence of Rice’s Theorem [34, Section 9.3.3].

That being said, restricted versions of ESCreate may yet have correct and even efficient algorithms. One reasonable such restriction is that any candidate component-based software system created from a given and runs in time polynomial in the input size and hence can be checked against the system requirements in in time polynomial in the size of (as ), i.e., created software systems not only operate but can also be verified quickly. Indeed, such a restriction is implicit in the requirement that emergent software systems be autonomously verifiable at runtime [12, page 5]. In the remainder of our analyses in this paper, we will assume ESCreate and CSCreate to be so restricted, and will denote these restricted versions as ESCreate and CSCreate, respectively.

#### 3.2.2 Polynomial-time Exact Solvability of Restricted Emergent Software Creation

We now consider if ESCreate is efficiently solvable in the first of the senses listed at the start of Section 3.1, namely polynomial-time exact solvability and polynomial-time exact promise solvability. One might initially think that, given the somewhat radical nature of the restriction on ESCreate proposed at the end of the previous subsection, ESCreate so restricted is now efficiently solvable in both of these senses. However, this turns out not to be the case.

These intractability results are shown using the following reduction. This reduction creates valid component-based systems with component wiring trees of the form shown in Figure 4 in which the multiply implemented interfaces cond1, cond2, …, condk are used to create valid software systems corresponding to all possible vertex-sets of size in the graph in the given instance of Dominating set. As each input-output pair in the constructed corresponds to a vertex-neighbourhood in , the code in component Base ensures that working software systems correspond to dominating sets of size in .

###### Lemma 4

Dominating set polynomial-time Karp reduces to CSCreate.

Proof: Given an instance of Dominating set, construct the following instance of CSCreate: Let , i.e., there is a unique Boolean variable corresponding to each vertex in , and . There are input-output pairs in such that for , , if and is otherwise and . Let consist of interfaces broken into two groups:

1. A single interface of the form

    interface base {
        void main(Input I)
    }

2. A set of interfaces of the form

    interface condJ {
        Boolean inSetJ(Input I)
    }


for .

Let consist of components broken into two groups:

1. A single component of the form

    component Base provides base
            requires cond1, cond2, ..., condk {
        void main(Input I) {
            if inSet1(I) then output 1
            elsif inSet2(I) then output 1
            ...
            elsif inSetk(I) then output 1
            else output 0
        }
    }

2. A set of components of the form

    component InSetJK provides condJ {
        Boolean inSetJ(Input I) {
            return v_I(x_K)
        }
    }


for and .

Note that in , there are implementations of each cond-interface. Finally, let be component Base in . Note that the instance of CSCreate described above can be constructed in time polynomial in the size of the given instance of Dominating set; moreover, as there is only a -clause if-then statement block and no loops in the component code and , any candidate component-based software system created relative to , , and runs in time linear in the size of input .

Let us now verify the correctness of this reduction:

• Suppose that there is a dominating set of size at most in the given instance of Dominating set. We can then construct a component-based software system consisting of and the InSet-components corresponding to the vertices in ; the choice of which interface to implement for each vertex is immaterial, and if there are fewer than vertices in , the final required cond-interfaces can be implemented relative to InSet-components corresponding to arbitrary vertices in . Observe that for each , this software system produces output given input .

• Conversely, suppose that the constructed instance of CSCreate has a working component-based software system based on relative to , , and . In order to correctly accommodate all input-output pairs in , the if-then statements in must implement InSet-components whose corresponding vertices form a dominating set in of size at most . Hence, the existence of a working component-based software system for the constructed instance of CSCreate implies the existence of a dominating set of size at most for the given instance of Dominating set.

This completes the proof.
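The reduction in Lemma 4 can be simulated on small graphs. The Python sketch below is an illustration under stated assumptions rather than the authors' construction in Dana: requirements are modelled as (assignment, output) pairs with one pair per vertex-neighbourhood, each InSetJK component is modelled as a function that reads one variable of the assignment, and `has_working_system` (a hypothetical name) tries every wiring of the k cond-interfaces by brute force.

```python
from itertools import product


def build_requirements(adj):
    """One input-output pair per vertex: the input marks that vertex's
    complete neighbourhood as True, and the required output is 1."""
    vertices = sorted(adj)
    reqs = []
    for v in vertices:
        nbhd = {v} | set(adj[v])
        assignment = {u: (u in nbhd) for u in vertices}
        reqs.append((assignment, 1))
    return reqs


def make_in_set(k_vertex):
    """Stand-in for component InSetJK: returns the value of variable x_K."""
    return lambda assignment: assignment[k_vertex]


def run_base(in_set_components, assignment):
    """Stand-in for Base.main: output 1 if any inSetJ(I) holds, else 0."""
    return 1 if any(c(assignment) for c in in_set_components) else 0


def has_working_system(adj, k):
    """Try every way of wiring one InSet component to each of the k
    cond-interfaces and test the result against all requirements."""
    vertices = sorted(adj)
    reqs = build_requirements(adj)
    for choice in product(vertices, repeat=k):
        comps = [make_in_set(v) for v in choice]
        if all(run_base(comps, i) == o for i, o in reqs):
            return True, set(choice)
    return False, None


# Path graph 1 - 2 - 3 - 4 - 5 (an illustrative example, not from the paper).
path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
```

In this simulation a wiring passes all requirements exactly when the chosen vertices hit every complete neighbourhood, which is the correspondence the correctness argument above relies on.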

Result A.2

For any choice of and , if ESCreate is polynomial-time exact solvable then $P = NP$.

Proof: Given the $NP$-hardness of Dominating set, the reduction in Lemma 4 implies that CSCreate is $NP$-hard, and hence not solvable in polynomial time unless $P = NP$. The polynomial-time intractability result for ESCreate then follows by contradiction from Observation 1.

Result A.3

For any choice of and , if ESCreate is polynomial-time exact promise solvable then $P = NP$.

Proof: Suppose that for some choice of and , ESCreate is polynomial-time promise solvable by an algorithm . (It may initially seem puzzling why, in light of Observation 1, we here directly evaluate the polynomial-time promise solvability of ESCreate. This is necessary because the promise solvability of any decision problem such as CSCreate is established by the trivial constant-time algorithm which always answers “Yes” (and hence is always correct if a solution exists).) Consider the following algorithm for Dominating set:

1. Given an instance of Dominating set, construct an instance Rew()Env(), c of ESCreate using the reduction from Dominating set to CSCreate described in Lemma 4 to create , , , and .

2. Run on to produce output for ESCreate.

3. As specified in the converse part of the proof of correctness of the reduction in Lemma 4, use the invoked if-then components in to derive a candidate solution for the given instance of Dominating set.

4. If is a correct solution for , output “Yes”; otherwise, output “No” (as by the definition of promise solvability, if the answer was “Yes” then would have had to output such that was a correct solution to the given instance of Dominating set).

As all steps in this algorithm run in polynomial time, the above is a polynomial-time algorithm for Dominating set. However, given the $NP$-hardness of Dominating set, this would imply that $P = NP$, completing the proof.
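The four-step algorithm in this proof can be sketched as follows. This is a hedged illustration only: `decide_dominating_set` and `toy_promise_solver` are hypothetical names, the constructed ESCreate instance is represented directly by the graph and k rather than by Dana libraries, and the toy solver runs in exponential time (the proof assumes a polynomial-time one). The point illustrated is step 4: because a promise solver may return arbitrary output when no solution exists, its output must be verified before answering.

```python
from itertools import product


def dominates(adj, subset):
    """True if every vertex is in `subset` or adjacent to a member of it."""
    return all(({v} | set(adj[v])) & subset for v in adj)


def toy_promise_solver(adj, k):
    """Hypothetical stand-in for the assumed promise algorithm: correct when
    a solution exists, arbitrary output otherwise."""
    vertices = sorted(adj)
    for choice in product(vertices, repeat=k):
        if dominates(adj, set(choice)):
            return choice
    return tuple(vertices[:k])  # arbitrary output when the promise fails


def decide_dominating_set(adj, k, promise_solver):
    """Steps 1-4 of the proof, with the instance kept as (adj, k)."""
    # Steps 1-2: construct the instance and run the promise solver on it.
    wiring = promise_solver(adj, k)
    # Step 3: read a candidate vertex set off the returned wiring.
    candidate = set(wiring)
    # Step 4: answer "Yes" only if the candidate really is a solution.
    return len(candidate) <= k and dominates(adj, candidate)


# Path graph 1 - 2 - 3 - 4 - 5 (an illustrative example, not from the paper).
path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
```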

#### 3.2.3 Polynomial-time Approximate Solvability of Restricted Emergent Software Creation

We now consider if ESCreate is efficiently approximately solvable in any of the three senses (frequently correct (deterministic), frequently correct (probabilistic), or approximately optimal) listed at the start of Section 3.1. As can be seen below, the polynomial-time exact intractability of ESCreate proved in the previous section rules out all three of these types of efficient approximability.

We start by considering the two types of frequently correct approximability.

Result A.4

For any choice of and , if ESCreate is solvable by a polynomial-time algorithm with a polynomial error frequency (i.e., is upper bounded by a polynomial of ) then $P = NP$.

Proof: That the existence of such an algorithm for CSCreate implies $P = NP$ follows from the $NP$-hardness of CSCreate (which is established in the proof of Result A.2) and Corollary 2.2 in [25]. The polynomial-time inapproximability result for ESCreate then follows by contradiction from Observation 1.

Result A.5

For any choice of and , if $P = BPP$ and ESCreate is polynomial-time solvable by a probabilistic algorithm which operates correctly with probability then $P = NP$.

Proof: It is widely believed that $P = BPP$ [31, Section 5.2] where $BPP$ is considered the most inclusive class of decision problems that can be efficiently solved using probabilistic methods (in particular, methods whose probability of correctness is and can thus be efficiently boosted to be arbitrarily close to one). Hence, if CSCreate has a probabilistic polynomial-time algorithm which operates correctly with probability then CSCreate is by definition in $BPP$. However, if $P = BPP$ and we know that CSCreate is $NP$-hard by the proof of Result A.2, this would then imply by the definition of $NP$-hardness that $P = NP$. The polynomial-time inapproximability result for ESCreate then follows by contradiction from Observation 1.

To assess cost-approximability, we need the following problem.

Optimal Component-based Software Creation (CSCreate)
Input: Software system requirements , interface and component libraries and , a base component , and a reward function .
Output: A working component-based software system based on relative to , , and that has the smallest value of over all working systems based on relative to , , and , if such a system exists, and special symbol otherwise.

Let CSCreate be the version of CSCreate such that any component-based system runs in time polynomial in the input size . Note that each input for ESCreate has a corresponding input to CSCreate (namely, the input to ESCreate without ). Moreover, any algorithm that solves ESCreate under some can also be used to solve CSCreate (namely, return whatever run on the given input for CSCreate produces). This yields the following useful observation.

###### Observation 2

For any choice of and , if there is an algorithm of solvability type for ESCreate then there is an algorithm of solvability type for CSCreate under .

We first give a reduction that will be used to establish the cost-inapproximability of ESCreate under . This reduction builds on that in Lemma 4 by further exploiting the ability of interfaces to be implemented by multiple components to allow a set of BaseJ components that effectively encode all possible candidate dominating sets of size 1 to in (see Figure 5).

###### Lemma 5

Dominating set polynomial-time Levin reduces to CSCreate under such that there is a dominating set of size for the given instance of Dominating set if and only if there is a working component-based software system with reward value for the constructed instance of CSCreate.

Proof: Given an instance of Dominating set, construct the following instance of CSCreate: Let be as in the proof of Lemma 4. Let consist of interfaces broken into three groups:

1. A single interface of the form

    interface topBase {
        void main(Input I)
    }

2. A single interface of the form

    interface base {
        void mainBase(Input I)
    }

3. A set of interfaces of the form

    interface condJ {
        Boolean inSetJ(Input I)
    }


for .

Let consist of components broken into three groups:

1. A single component of the form

    component TopBase provides topBase requires base {
        void main(Input I) {
            mainBase(I)
        }
    }

2. A set of components of the form

    component BaseJ provides base
            requires cond1, cond2, ..., condJ {
        void mainBase(Input I) {
            if inSet1(I) then output 1
            elsif inSet2(I) then output 1
            ...
            elsif inSetJ(I) then output 1
            else output 0
        }
    }


for .

3. A set of components of the form

    component InSetJK provides condJ {
        Boolean inSetJ(Input I) {
            return v_I(x_K)
        }
    }


for and .

Note that in , there are implementations of the base-interface and implementations of each cond-interface. Finally, let be component TopBase in . Note that the instance of CSCreate described above can be constructed in time polynomial in the size of the given instance of Dominating set; moreover, as there is only an at most -clause if-then statement block and no loops in the component code and , any candidate component-based software system created relative to , , and runs in time linear in the size of input .

Let us now verify the correctness of this reduction:

• Suppose that there is a dominating set of size in the given instance of Dominating set. We can then construct a component-based software system consisting of , component Basek, and the InSet-components corresponding to the vertices in ; the choice of which interface to implement for each vertex is immaterial. Observe that for each , this software system produces output given input ; moreover, .

• Conversely, suppose that the constructed instance of CSCreate has a working component-based software system based on relative to , , and such that . (Note that the existence of at least one such working system is guaranteed for all instances of CSCreate constructed as described above (namely, the system consisting of components TopBase and Base(|V|) and the components InSetJJ for , which corresponds to the dominating set consisting of all vertices in ). This is necessary for our reduction, as each instance of Dominating set has at least one dominating set (namely, ), and cannot correspond to a constructed instance of CSCreate whose solution is .) As is component TopBase which requires a Base component and this Base component requires some number of InSet components, this system consists of components TopBase, Base(), and InSet components. In order to correctly accommodate all input-output pairs in , the if-then statements in Base() must implement InSet-components whose corresponding vertices form a dominating set in of size at most . Hence, the existence of a working component-based software system such that for the constructed instance of CSCreate implies the existence of a dominating set of size for the given instance of Dominating set.

To complete the proof, note that the required functions and in the definition of a Levin reduction correspond respectively to the algorithm given at the beginning of this proof for constructing an instance of CSCreate under from the given instance of Dominating set and the algorithm implicit in the converse clause of the proof of reduction correctness above for constructing a dominating set from a valid component-based software system for the constructed instance of CSCreate.

The reduction above can also be used to establish the cost-inapproximability of ESCreate under .

###### Lemma 6

Dominating set polynomial-time Levin reduces to CSCreate under such that there is a dominating set of size for the given instance of Dominating set if and only if there is a working component-based software system with reward value for the constructed instance of CSCreate.

Proof: In the proof of Lemma 5, observe that for a dominating set of size in the given instance of Dominating set, a software system for the constructed instance of CSCreate consists of components TopBase and Basek, InSet components, interfaces topBase and base, and cond interfaces. The total number of lines of code in this system (and hence the value of ) is therefore . The result then follows by a slight modification to the proof of correctness of the reduction described in Lemma 5.

Result A.6

For any choice of , if ESCreate under is polynomial-time -approximable for any constant then .

Proof: Observe that in the proof of the reduction in Lemma 5, the size of a dominating set in in the given instance of Dominating set is always a linear function of the value of in the constructed instance of CSCreate, i.e., . This means that a polynomial-time -approximation algorithm for CSCreate under for any constant , when combined with the reduction from Dominating set to CSCreate described in the proof of Lemma 5, implies the existence of a polynomial-time -approximation algorithm for Dominating set (as for ). However, if Dominating set has a polynomial-time -approximation algorithm for any constant then [36], which means that CSCreate under cannot have a polynomial-time -approximation algorithm for any unless . The polynomial-time inapproximability result for ESCreate under then follows by contradiction from Observation 2.

Result A.7

For any choice of , if ESCreate under is polynomial-time -approximable for any constant then .

Proof: Observe that in the proof of the reduction in Lemma 6, the size of a dominating set in in the given instance of Dominating set is always a linear function of the value of in the constructed instance of CSCreate, i.e., . This means that a polynomial-time -approximation algorithm for CSCreate under for any constant , when combined with the reduction from Dominating set to CSCreate described in the proof of Lemma 6, implies the existence of a polynomial-time -approximation algorithm for Dominating set (as for ). However, if Dominating set has a polynomial-time -approximation algorithm for any constant then [36], which means that CSCreate under cannot have a polynomial-time -approximation algorithm for any unless . The polynomial-time inapproximability result for ESCreate under then follows by contradiction from Observation 2.

#### 3.2.4 Fixed-parameter Tractability of Restricted Emergent Software Creation

Given the plethora of intractability results in the previous three subsections, we now consider to what extent and relative to which parameters ESCreate is and is not fp-tractable. In our analyses below, we will focus on parameter-sets drawn from the parameters listed in Table 1. These parameters can be divided into four main groups:

1. Parameters characterizing interface and component libraries ();

2. Parameters characterizing interfaces ();

3. Parameters characterizing components (); and

4. Parameters characterizing component-based software systems ().

We first consider those parameter-sets which yield fp-intractability.

Result A.8

For any choice of and , if -ESCreate is fp-tractable then .

Proof: Given the -hardness of -Dominating set, the reduction in Lemma 4 implies that CSCreate is -hard when , , , and and hence not fp-tractable relative to these parameters unless . The fp-intractability result for ESCreate then follows by contradiction from Observation 1.

The reductions underlying the following three results exploit the tricks previously used to such good effect in Lemmas 4 and 5 as well as other features of our software component model. The reduction underlying Result A.9 reduces the number of InSet components by invoking a larger encoding of candidate dominating sets and more complex but still polynomial-time checking computations in the Base component. The reduction underlying Result A.10 reduces the number of interfaces required by any component to a constant by splitting the creation of the candidate dominating sets in component Base in the reduction in the proof of Result A.9 over multiple components. Finally, the reduction underlying Result A.11 reduces the number of components in to 3 by exploiting the ability of components providing multiple interfaces to provide only the code required by an interface in that interface’s copy of the component. Readers interested in details can consult the full proofs of these results in the appendix.

Result A.9

For any choice of and , if -ESCreate is fp-tractable then .

Result A.10

For any choice of and , if -ESCreate is fp-tractable then .

Result A.11

For any choice of and , if -ESCreate is fp-tractable then .

We now consider those parameter-sets that yield fp-tractability. All of these results are based on the same brute-force solution enumeration algorithm relative to different worst-case runtime analyses.

Result A.12

For any choice of and , -ESCreate is fp-tractable.

Proof: The largest possible component-based software system relative to a given and has a component wiring tree rooted at base component with branching factor and depth . This tree has non-root vertices, each corresponding to an interface required by a component. As each of these interfaces can be implemented by at most components, there are at most possible component-based software systems of depth at most based on (the “+ 1” term at the lowest level denotes labeling a vertex with a special symbol that triggers deletion of all descendant-vertices of ).

Consider the algorithm that exhaustively generates all such systems and for each system , (i) determines if is a working system relative to and, if so, (ii) computes reward value . The output of this algorithm is the working system with the lowest or highest reward value, depending on the intent of . Given the above and our assumption that a candidate component-based software system can be checked against software system requirements in time polynomial in the sizes of and , this algorithm runs in fp-time relative to , , and , completing the proof of this result.
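A minimal Python sketch of this kind of exhaustive enumeration is given below. It is an illustration under stated assumptions and not the exact procedure analysed in the proof: libraries are modelled as two dictionaries (`requires` maps a component to its required interfaces, `providers` maps an interface to its implementing components), wiring trees are nested tuples, the depth bound is enforced by recursion rather than by the special deletion symbol, and the lowest reward is taken as best. All names and the toy library are hypothetical.

```python
from itertools import product


def enumerate_systems(component, requires, providers, depth):
    """Yield every wiring tree rooted at `component` of depth at most `depth`.
    A tree is (component, tuple of subtrees), one per required interface."""
    needed = requires.get(component, [])
    if not needed:
        yield (component, ())
        return
    if depth == 0:
        return  # requirements cannot be satisfied within the depth bound
    options = []
    for interface in needed:
        subtrees = [t for impl in providers.get(interface, [])
                    for t in enumerate_systems(impl, requires, providers,
                                               depth - 1)]
        options.append(subtrees)
    for combo in product(*options):
        yield (component, combo)


def components_of(tree):
    """List the component names in a wiring tree in preorder."""
    name, subtrees = tree
    return [name] + [c for t in subtrees for c in components_of(t)]


def best_system(base, requires, providers, depth, works, reward):
    """Return the working system with the lowest reward, or None."""
    candidates = [s for s in enumerate_systems(base, requires, providers, depth)
                  if works(s)]
    return min(candidates, key=reward) if candidates else None


# Toy library (hypothetical): Base requires two interfaces.
requires = {"Base": ["cond1", "cond2"]}
providers = {"cond1": ["A1", "A2"], "cond2": ["B1", "B2", "B3"]}
cost = {"Base": 1, "A1": 1, "A2": 2, "B1": 3, "B2": 1, "B3": 2}
works = lambda s: "A2" in components_of(s)            # stand-in for checking requirements
reward = lambda s: sum(cost[c] for c in components_of(s))  # stand-in reward function
best = best_system("Base", requires, providers, 1, works, reward)
```

The number of candidate trees grows with the branching factor, the depth bound, and the number of implementations per interface, while each candidate is checked once; this mirrors the shape of the runtime argument above.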

Result A.13

For any choice of and , -ESCreate is fp-tractable.

Proof: As no path from the root to a leaf in the component wiring trees for our software systems can contain duplicate component vertex-labels, the length of the longest path in such a tree from base component is bounded by ; this means that