On August 1, 2012, the financial technology firm Knight Capital Group, Inc. deployed a faulty update to its automated trading system, which issued erroneous orders at large scale and caused losses of more than $450 million in under an hour [10.1257/jep.27.2.51]. In the software engineering community, the root cause of the incident is ascribed to problematic software development processes that did not ensure a sufficient degree of quality assurance automation and testing at different development and deployment stages [Bass2015].
To address this and similar issues (Knight Capital’s system is one of numerous autonomous software systems already in operation within socio-technically complex organizations [DelaPrieta2019]), new software development practices have emerged during the last decade, most notably the Developer Operations (DevOps) approach [ebert2016devops]. DevOps aims to reduce the time for deploying high-quality (validated and verified) software artifacts, and their updates, to complex and heterogeneous production environments [Bass2015]. Desirable qualities of DevOps-oriented software engineering are reliability, predictability, and security [NicoleForsgren2019]. For example, DevOps facilitates the autonomy of teams and their individual members to prevent, discover, and fix software bugs quickly and effectively [NicoleForsgren2019].
However, even the best industrial-scale software engineering processes, combined with traditional programming paradigms, cannot fully prevent problems like the one that occurred during the Knight Capital incident. Indeed, from an artificial intelligence perspective, an alternative root cause is the single-mindedness and lack of meaningful goal-orientation of the software subsystem (or: agent) that kept issuing orders without re-assessing over time whether doing so was aligned with the overall objectives of the trading system. From this perspective, it can be questioned whether the current conception of DevOps is sufficient to ensure quality and to facilitate the fast-paced development of highly autonomous software systems.
Consequently, one may call for the application of approaches to Engineering Multi-Agent Systems (EMAS) that treat the agency of autonomous software artifacts, as well as the environments and organizations these artifacts act in, as first-class abstractions. Along these lines, this paper proposes a bridge between DevOps and EMAS, with the aim of addressing the need for a robust method for delivering autonomous software artifacts faster and more safely. Nevertheless, this paper maintains a critical perspective on the mainstream-readiness of EMAS. Indeed, the lack of industry-scale tools for engineering autonomous software curbs EMAS adoption in practice [engineering-gsi-article-2019, logan2018agent], and we argue that the application of EMAS should always consider efforts to mature EMAS tooling as a prerequisite.
Developer Operations (DevOps) describes the industry best practices that integrate software development, quality assurance, and operations teams, from both organizational and technological perspectives [ebert2016devops]. DevOps can be considered a continuation of the trend towards iterative software development, which started at the turn of the century with the publication of the Agile Manifesto [DINGSOYR20121213]. In particular because iterative software development approaches require a fast-paced transition between requirement adjustments, software changes, tests, and deployments, handovers across traditional organizational and technological boundaries become increasingly impractical. To address this issue, DevOps recommends the integration of software developers, Quality Assurance (QA) engineers, and system administrators into autonomous cross-functional teams that are in charge of developing, testing, deploying, and operating a system or system component [Bolscher2019]. This stands in contrast to traditional approaches that segment functional specializations and hence require frequent handovers between teams or even departments, each of which is in charge of one specific task [Pettigrew2000]. To support cross-functional teams with the broad range of tasks that fall into the DevOps scope, a plethora of tools exists, many of which have found wide-spread adoption. For example, continuous integration tools and services allow for the configuration of automated tests and deployments using simple declarative specification and script languages, whereas containerization [merkel2014docker] and container orchestration tools [10.1145/2806777.2809955] help speed up and automate the deployment and scaling of complex IT systems across heterogeneous infrastructure.
The DevOps development life-cycle (illustrated in Figure 1) can be described as follows:
- Plan and code.
DevOps development teams implement features in fast, incremental iterations, which is facilitated by the organizational structure and technological setup. As a consequence, DevOps reduces the overhead of QA, releases, and deployments.
- Build and test.
Each update of the code base triggers the automated execution of one or several test suites. Ideally, all technical aspects of software artifact generation (build) and quality assurance are executed automatically; passing tests and builds imply that the software artifact works reliably and can be released without concerns. This requires the development team to treat QA as a key responsibility.
- Release and deploy.
After tests and builds have been successfully executed, deployments (for example to cloud environments) and/or releases (e.g., to package management services) are triggered in an automated or semi-automated manner.
- Operate and monitor.
During operations, a key feature of DevOps is the automation of many system administration tasks, like the provision of additional resources if the load on the system increases. To reduce the overhead of system administration, teams often rely on cloud-based service offerings that abstract away technical details.
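The build-test-deploy gate at the heart of this life-cycle can be sketched in a few lines. The following Python fragment is purely illustrative (the `Pipeline` class and suite names are invented for this sketch, not part of any real CI tool): each code-base update triggers all test suites, and deployment happens only when every suite passes.

```python
# Illustrative sketch of the DevOps "build and test -> release and deploy"
# gate: a commit triggers automated test suites, and deployment is only
# attempted when all suites pass. All names here are hypothetical.

from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Pipeline:
    test_suites: List[Callable[[], bool]] = field(default_factory=list)
    log: List[str] = field(default_factory=list)

    def on_commit(self) -> bool:
        """Triggered by each update of the code base."""
        for suite in self.test_suites:
            if suite():
                self.log.append(f"PASS {suite.__name__}")
            else:
                self.log.append(f"FAIL {suite.__name__}")
                return False  # a failing suite blocks the release
        self.log.append("DEPLOY")  # all suites green: auto-deploy
        return True


def unit_tests() -> bool:
    return True


def integration_tests() -> bool:
    return True


pipeline = Pipeline(test_suites=[unit_tests, integration_tests])
deployed = pipeline.on_commit()
```

Real pipelines are usually specified declaratively (e.g., in the configuration formats of continuous integration services) rather than coded by hand, but the control flow is essentially the one above.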
During the past decades, the EMAS sub-field has emerged as a research direction within the field of artificial intelligence [shehory2016agent]. One of the key lines of work within EMAS is the refinement of the Agent-Oriented Programming (AOP) paradigm, which provides abstractions for implementing autonomous and social software artifacts (agents). However, the scope of EMAS entails more than AOP, in particular because EMAS is concerned with the holistic software engineering perspective and not only programming. With the increase in prevalence of (somewhat) autonomous software systems in distributed information system landscapes [DelaPrieta2019], it was initially a reasonable expectation that EMAS would gain attention from the software engineering mainstream. However, EMAS approaches have not seen wide-spread adoption in practice, neither directly, nor as derivations that are implemented in industry-scale programming language ecosystems.
4 Integrating EMAS with DevOps
Let us highlight that the main objective of DevOps is not automation, which could also be achieved with traditional, homogeneous team constellations, but rather autonomy of teams within a software development organization, which is achieved by relying on automation technologies. From the description provided in Subsection 2, one can see that DevOps is, in the way it is currently practiced, concerned with autonomy on three levels:
On the organization level, DevOps facilitates team autonomy by avoiding the necessity of hand-overs between development, QA, and operations teams.
On the integration level, DevOps allows for continuous integrations and deployments, avoiding manual steps and hand-overs in the pipeline from code check-in to system deployment.
On the operations level, DevOps provides abstractions that allow operators to specify high-level infrastructure requirements and handles lower-level details like exact resource allocation and machine provisioning autonomously.
In contrast, EMAS focuses on the autonomy of the agents, as software artifacts, that a software engineering team or organization creates, i.e., it adds a fourth autonomy level to the three-level perspective DevOps provides. Table 1 shows an overview of the four levels and explains them by example.
| Autonomy level | Explanation by example | Origin |
| --- | --- | --- |
| Organization autonomy | Avoid handovers: one team is in charge of all steps in the life-cycle | DevOps |
| Integration autonomy | Avoid manual deployments and QA: run all tests before a merge and auto-deploy if all tests pass | DevOps |
| Operations autonomy | Avoid manual resource provisioning: auto-scale systems when load increases | DevOps |
| Artifact autonomy | Avoid manual low-level business decisions: approve (financial) transactions without human interference | EMAS |
At this point, it is worth highlighting that even when developing “traditional” software artifacts with little or no autonomy, it is widely acknowledged that total global supervision and coordination of all software design steps is practically impossible, even if the scope of the project is confined to a single organization. Hence, DevOps approaches integrate changes frequently and in a controlled manner in order to discover unknown dependencies and unexpected behavior early on. The autonomy levels (Table 1) allow teams of engineers to dynamically respond to challenges that arise and to minimize the effect these challenges have on the broader organization. When developing autonomous software artifacts, one can expect even more problems that cannot be identified at design-time, and hence the continuous integration approach requires even more attention; i.e., on the organizational level, the implementation of highly autonomous artifacts implies that the intensity of dependencies between teams that develop different sub-systems is not always apparent before these sub-systems are integrated. These emerging dependencies then need to be managed on the integration and operations levels, for example to ensure that communication failures do not lead to disastrous consequences when sub-systems with “hidden” incompatibilities are deployed.
Because of their dynamic nature, agents cannot be developed into mature software artifacts without exposing them to the environment they are supposed to act in [Winikoff2015]. The integration of agent-orientation and developer operations can be considered a methodological response to this distinguishing characteristic of agents. To allow for a gradual exposure of an agent to a progressively more realistic environment that increases the likelihood of catching critical errors early on, an agent-oriented variant of the DevOps life-cycle may require the following features:
Goal-oriented test-driven development. The behavior of social and goal-oriented software artifacts like agents is typically complex and non-deterministic [Coelho2007]. Hence, the common testing levels (unit tests, functional tests, and integration tests) are usually not sufficient to cover agents’ possible behaviors. Some approaches have been proposed to address this issue [Coelho2007, Earle2019, Nguyen2009], but a comprehensive solution has not yet been devised [Winikoff2018]. Goal-oriented tests can provide an extra test level that assesses whether an agent’s inference process from goals and beliefs to actions (and explanations of these actions) behaves as expected.
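As a minimal sketch of what such a goal-oriented test could look like, consider the following Python fragment. The `TradingAgent`, its belief names, and its deliberation rule are all invented for illustration; the point is that the test asserts on the goal-level outcome of the agent’s deliberation (and on the accompanying explanation) rather than on an individual method call.

```python
# Hypothetical sketch of a goal-oriented test: instead of asserting on a
# single low-level method (unit level), the test checks that the agent's
# inference from beliefs and goals leads to an expected action, and that
# the agent can explain that choice.

class TradingAgent:
    def __init__(self, beliefs):
        self.beliefs = dict(beliefs)
        self.goal = "stay_within_risk_limit"

    def deliberate(self):
        """Map goal + beliefs to an action, with a human-readable reason."""
        if self.beliefs["open_exposure"] > self.beliefs["risk_limit"]:
            return "halt_orders", "exposure exceeds risk limit"
        return "issue_order", "exposure within risk limit"


def test_agent_halts_when_limit_exceeded():
    agent = TradingAgent({"open_exposure": 500, "risk_limit": 100})
    action, explanation = agent.deliberate()
    assert action == "halt_orders"        # goal-level expectation
    assert "risk limit" in explanation    # the action is explainable


test_agent_halts_when_limit_exceeded()
```

A real goal-oriented test framework would, of course, have to handle non-deterministic deliberation, e.g., by asserting on sets of acceptable behaviors rather than a single expected action.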
Sandbox for real-time collaboration. Development teams can move agents that have passed static code analysis, unit tests, goal-oriented tests, and low-level integration tests (which may or may not be goal-oriented) to a sand-boxed environment that allows for the collaborative development of agents and multi-agent systems in (near) real-time. (We assume that these tests can be executed relatively quickly when the developer logs a change, which is – in the case of standard approaches to static code analysis (often called linting), unit tests, and some integration tests like micro-service handler tests – a common capability of development tool-chains.) This makes it easier for developers to consider their current development work in the context of other ongoing changes. Each sand-box features a fully-fledged multi-agent system, as well as version control and continuous integration support (automated testing and deployments). From a practical perspective, one can assume that the scope of a sandbox is restricted by organizational boundaries. For example, given a commercial enterprise A and a government organization B that both work on the same multi-agent system, it is safe to assume that the engineers of A cannot align in real-time with the engineers of B; a change made by organization A during development should not immediately (before verification and validation) affect the system organization B is developing against. The EMAS community has presented an initial prototype addressing part of this issue [amaraldemoaamas].
Cross-organizational staging system. To ensure quality across organizational boundaries, stable versions of local agents, artifacts, and environment updates that have been developed and thoroughly tested in a sand-boxed environment can be deployed to cross-organizational staging systems. Organizations that depend on each other’s work in a particularly critical manner (if not all organizations that contribute to the multi-agent system) have access to these staging systems and use them as a second-level testing environment; i.e., any run-time issue that may occur on the staging system does not affect system end-users. Still, errors are potentially more costly when they occur on the staging system rather than in the sand-box, as their root cause needs to be traced back – in a more complex environment – to a particular organization and then to a team. Cross-organizational staging systems can potentially make use of concepts and tools the EMAS community provides for managing multi-agent organizations (e.g., Moise [Hubner2007]).
Beta agents in production environments. When the tests have passed in the cross-organizational staging environment, a step-wise production deployment can be executed. As a first step, agent instances can be exposed to “sense” the production environment, without being able to act upon it. Then, some agent instances can be fully deployed to the production environment, but at limited scale, analogously to the way beta-feature roll-outs are handled in many software-as-a-service environments (so-called canary deployments [Bass2015]). Still, in contrast to typical canary deployments, which only affect a small portion of a system’s users, beta agent deployments are potentially more critical because of the interconnectedness of multi-agent systems. Only if these beta agents pass all tests after extensive monitoring is the full update of the production environment executed. This step reflects the tests on real traffic scenarios used by the automotive industry [Huang2016, Stadler2016]. (In Vehicle-in-the-Loop (VEHIL) simulations, domain-specific concepts similar to the sandbox for real-time collaboration and the cross-organizational staging system are employed.)
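The staged roll-out described above can be sketched as follows. The agent representation, phase names, and promotion criterion are assumptions made for this sketch, not a prescribed API: agents first run in a sense-only mode, a small beta cohort is then fully enabled (the canary step), and the whole fleet is promoted only if monitoring reports no errors for the beta instances.

```python
# Illustrative sketch of a staged "beta agent" roll-out: sense-only phase,
# then a small active beta cohort, then fleet-wide promotion if monitoring
# is clean. Agent model and thresholds are hypothetical.

import random


def rollout(agents, beta_fraction=0.1, error_monitor=lambda agent: 0):
    # Phase 1: all agents observe production traffic; actions are disabled.
    for agent in agents:
        agent["mode"] = "sense_only"

    # Phase 2: fully enable a small beta cohort (the canary step).
    cohort_size = max(1, int(len(agents) * beta_fraction))
    beta = random.sample(agents, cohort_size)
    for agent in beta:
        agent["mode"] = "active"

    # Phase 3: promote the whole fleet only if no beta instance errored.
    if all(error_monitor(agent) == 0 for agent in beta):
        for agent in agents:
            agent["mode"] = "active"
        return "promoted"
    return "rolled_back"


fleet = [{"id": i, "mode": "staging"} for i in range(20)]
status = rollout(fleet)  # default monitor reports no errors
```

In a multi-agent setting, the `error_monitor` would also have to observe the stable agents that interact with the beta cohort, since errors can propagate through agent interactions.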
Explainable monitoring. Given the complex and non-deterministic behavior of multi-agent systems, it can be assumed that traditional monitoring facilities provide only limited utility. New ways of filtering and aggregating log entries for human or machine interpretation need to be devised. To address this issue, one can draw from an emerging body of works on explainable agents and multi-agent systems [10.5555/3306127.3331806], and in particular from research that investigates the filtering of event data to generate human-digestible explanations [mualla2020human].
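A minimal sketch of such filtering and aggregation is shown below. The event schema (agent identifier, event type, detail) is an assumption made for illustration; the idea is to condense a raw event stream into a short, per-agent summary that surfaces goal-level events, which typically carry the most meaning for a human reader.

```python
# Hypothetical sketch of filtering and aggregating agent event logs into
# a human-digestible summary. The event schema is invented for this sketch.

from collections import Counter

events = [
    {"agent": "a1", "type": "belief_update", "detail": "price=101"},
    {"agent": "a1", "type": "action", "detail": "issue_order"},
    {"agent": "a2", "type": "action", "detail": "issue_order"},
    {"agent": "a1", "type": "goal_dropped",
     "detail": "stay_within_risk_limit"},
]


def explain(events, agent_id):
    """Filter one agent's events and aggregate them for a human reader."""
    mine = [e for e in events if e["agent"] == agent_id]
    counts = Counter(e["type"] for e in mine)
    summary = ", ".join(f"{n} {t}" for t, n in sorted(counts.items()))
    # Surface goal-level events explicitly: they carry the most meaning.
    goals = [e["detail"] for e in mine if e["type"] == "goal_dropped"]
    if goals:
        summary += f"; dropped goals: {', '.join(goals)}"
    return summary


report = explain(events, "a1")
```

A production-grade explainable-monitoring facility would go well beyond counting, e.g., by reconstructing causal chains from beliefs to actions, but the filter-then-aggregate structure remains the same.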
Table 2 lists these features and provides an overview of how they relate to mainstream software engineering practices. In Figure 2, we present a more comprehensive view of the development cycle of agents based on the DevOps life-cycle. Besides the aforementioned features, the figure also illustrates the place of goal-oriented/agent-oriented model-driven development and programming tools, subjects well covered by a range of studies produced by the EMAS community (e.g., [Bellifemine2001, Bordini2007, DeLoach2010, Uez2014, Winikoff2005]).
| Feature | Mainstream counterpart | Similarity | Difference |
| --- | --- | --- | --- |
| Goal-oriented test-driven development | Test-driven development (unit tests) | Testing before/while implementing | Higher declarative abstraction level at the intersection of unit and integration testing |
| Sandbox for real-time collaboration | Sandbox for exploratory development | Rapid prototyping support | Near-real-time interactive and collaborative programming support |
| Cross-organizational staging system | Traditional staging system | Production-like environment | Continuous deployments by different organizations |
| Beta agents in production environments | Beta-features in production environments | Pilot beta-features in production environment | Interaction between beta-agents and stable agents |
| Explainable monitoring | Operations monitoring systems | Explanation/analysis of a running system | System-centered versus agent-centered perspectives |
Let us highlight that the list of features is primarily an initial starting point, and each feature comes with limitations and trade-offs that may only emerge in industrial application scenarios (and may be specific to a given domain, technology stack, or DevOps variant). Consequently, it can make sense to consider a step-wise introduction of agent-oriented approaches to DevOps, focusing on the controlled assessment of a minimally viable agent-oriented abstraction. (In his Agent Programming Manifesto, Logan calls for modular approaches to AOP [logan2018agent]. We argue that the notion of a minimally viable abstraction goes a step further, as it suggests a focus on one particular benefit AOP can bring to mainstream software engineering approaches such as DevOps, and hence a radical simplification that may deliberately disregard many aspects of AOP to minimize technology overhead and learning curve when introducing a single abstraction.) For example, autonomous software systems that are not developed using an academic EMAS approach can potentially still be evaluated by goal-oriented tests.
5 Implications for EMAS
Traditionally, EMAS is primarily concerned with the implementation of theoretical perspectives, such as belief-desire-intention reasoning loops, that the artificial intelligence literature provides on the design of autonomous agents and multi-agent systems. In contrast, the approach outlined in this paper is pragmatically targeted at moving EMAS closer to modern industry practices for software development, and at identifying gaps in mainstream software development approaches and frameworks that EMAS can fill. Hence, the approach depends on the exposure of EMAS and AOP works to the context of mainstream software development tools and pipelines, and in particular to the technology ecosystem that has risen to popularity alongside DevOps. First prototypes that work towards this goal by treating continuous integration, collaboration features, and distributed version control as first-class citizens in the context of agent-oriented programming exist [amaral2020, amaraldemoaamas].
Consequently, the whole technology ecosystem that makes up the DevOps tool-chains needs to be thoroughly analyzed, and methodologies and re-usable software frameworks (or framework extensions) for identifying and addressing the specific requirements for the DevOps-oriented management of goal-oriented, autonomous software artifacts need to be developed. Logging, monitoring, and debugging facilities need to be devised that address the challenge of identifying anomalous behavior in a highly dynamic and heterogeneous environment, and facilitate the identification of software bugs that may be caused by intractable state and software version dependencies between autonomous software agents that are developed by different organizations.
Nevertheless, let us highlight that the integration of EMAS and DevOps can draw not only from AOP research, but also from other fundamental research on autonomous agents and multi-agent systems, for example theoretical work on topics like belief revision [10.5555/1643031.1643098], goal reasoning [aha2018goal], or agreement technologies [10.5555/2431387]. Still, EMAS and EMAS-related research that is of immediate relevance necessarily focuses on technologies, software engineering processes, and/or practical aspects of socio-technical systems. In contrast, research that primarily provides formal contributions would first need to be implemented as a generic and re-usable abstraction for a particular technology ecosystem, or be presented as a solution to a particular software engineering problem. In this context, the notion of a minimally viable abstraction may – again – serve as a guiding design principle; e.g., when devising a new formal approach to belief revision, it may not be necessary to provide a holistic integration with a full-fledged MAS conceptual meta-model and technology like JaCaMo. Instead, a small library for managing belief revision could be implemented and presented in a way that enables re-usability in software stacks and tool-chains that do not necessarily include other agent-oriented concepts or technologies.
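To make the idea of such a small, stand-alone library concrete, the following Python sketch implements a belief base over propositional literals with expand, contract, and revise operations (revision via the Levi identity: contract the negation, then expand). Restricting beliefs to literals is a deliberate simplification for illustration; a real library would handle full propositional or richer logics and rational contraction orderings.

```python
# Minimal sketch of a stand-alone belief-revision library: a belief base
# over propositional literals, usable without any other agent-oriented
# machinery. The literal-only treatment is a simplifying assumption.

class BeliefBase:
    def __init__(self):
        self.beliefs = set()  # literals such as "p" or "-p"

    @staticmethod
    def negate(literal):
        """Negate a literal: "p" <-> "-p"."""
        return literal[1:] if literal.startswith("-") else "-" + literal

    def expand(self, literal):
        """Add a belief without checking consistency."""
        self.beliefs.add(literal)

    def contract(self, literal):
        """Give up a belief."""
        self.beliefs.discard(literal)

    def revise(self, literal):
        """Incorporate new information consistently (Levi identity:
        contract the negation of the new belief, then expand)."""
        self.contract(self.negate(literal))
        self.expand(literal)


bb = BeliefBase()
bb.expand("market_open")
bb.revise("-market_open")  # new evidence overrides the old belief
```

Packaged this way, the library could be dropped into any Python-based tool-chain without requiring the adoption of a full agent platform.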
In this paper, we have proposed the integration of approaches to engineering multi-agent systems with the DevOps software engineering practice. The integration expands the scope of the agent-oriented programming paradigm to cover the full life-cycle of modern software engineering, from initial specification via implementation and continuous integration to operation and monitoring. Viewing EMAS from the perspective of modern software engineering approaches that cover the whole engineering life-cycle can facilitate the development of more practice-oriented perspectives on EMAS and AOP. The integration of EMAS and DevOps can draw from the breadth and depth of research on agents and multi-agent systems, and motivate future work at the intersection of theory and practice, for example on goal-oriented testing and goal reasoning.
This work was partially supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation and partially funded by Project AG-BR of Petrobras and by the program PrInt CAPES-UFSC “Automação 4.0”.