Inspired by service-oriented computing, microservices structure software applications as highly modular and scalable compositions of fine-grained and loosely-coupled services. These features support modern software engineering practices, like continuous delivery/deployment and application autoscaling. A relevant problem in these practices is the automated deployment of the microservice application, i.e. the distribution of its fine-grained components over the available computing nodes, and the dynamic modification of that distribution to cope, e.g., with positive or negative peaks of user requests. Although these practices are already beneficial, they can be further improved by exploiting the interdependencies within an architecture (functional dependencies among interfaces), instead of focusing on single microservices. Indeed, architecture-level deployment orchestration can:
Optimize global scaling - e.g., avoiding the overhead of redundantly detecting inbound traffic and sequentially scaling each microservice in a pipeline.
Avoid "domino" effects due to unstructured scaling - e.g., cascading slowdowns or outages.
In this paper, we report results from  and additional work on modeling and simulation, using the probabilistic and timed process algebra Abstract Behavioural Specification (ABS), for a case study: a real-world microservice architecture inspired by the email processing pipeline from Iron.io . The expressiveness of ABS allows us to devise a quite realistic and complex model of the case study. Moreover, the simulation shows the effectiveness of the deployment orchestrations generated with the theory and tools in ; in particular, the advantage of performing runtime adaptation via global system reconfigurations w.r.t. local component scaling.
1.1 Summary of Results from 
In  we address the problem of orchestrating the deployment, and re-deployment, of microservice architectures in a formal manner, by presenting an approach for modeling microservice architectures that allows us both to prove formal properties and to realize an implemented solution. We follow the approach taken by the Aeolus component model [12, 11, 9], which was used to formally define the problem of deploying component-based software systems and to prove that, in the general case, such a problem is undecidable . The basic idea of Aeolus is to enrich the specification of components with a finite state automaton that describes their deployment life cycle.
In  we modify the Aeolus model in order to make it suitable for formal reasoning on the deployment of microservices. To this end, we significantly revisit the formalization of the deployment problem, replacing Aeolus components with a model of microservices. The main difference between our microservices and Aeolus components lies in the modeling of their deployment life cycle. Instead of using the full expressive power of finite state automata, as Aeolus and other TOSCA-compliant deployment models  do, we consider microservices to have two phases: (i) creation and (ii) binding/unbinding. For creation, we use strong dependencies to indicate which microservices must be immediately connected to a newly created one. Afterwards, we use weak dependencies to denote which microservices can be bound/unbound. The rationale behind these changes comes from state-of-the-art microservice deployment technologies like Docker  and Kubernetes . In particular, we take the weak and strong dependencies from Docker Compose , a language for defining multi-container Docker applications, which allows users to specify different relationships among microservices using, e.g., the depends_on (resp. external_links) modalities that impose (resp. do not impose) a specific startup order, in the same way as our strong (resp. weak) dependencies. Weak dependencies are also convenient for modeling horizontal scaling, e.g. a load balancer that is bound to/unbound from many microservice instances during its life cycle.
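The two-phase life cycle can be conveyed with a small sketch. The following Python fragment is purely illustrative (it is not the paper's formal model, and all class and method names are made up): strong dependencies must already be running and are bound at creation, imposing a startup order, while weak dependencies can be bound and unbound at any later point.

```python
# Illustrative sketch (hypothetical names, not the paper's formalism):
# strong dependencies behave like depends_on, weak ones like external_links.
class Microservice:
    def __init__(self, name, strong_deps=(), weak_deps=()):
        self.name = name
        self.strong_deps = set(strong_deps)  # must be bound at creation
        self.weak_deps = set(weak_deps)      # may be bound/unbound later
        self.bound = set()

    @classmethod
    def create(cls, name, strong_deps, weak_deps, running):
        # Creation phase: every strong dependency must already be running,
        # and is connected immediately (a specific startup order).
        missing = set(strong_deps) - {m.name for m in running}
        if missing:
            raise RuntimeError(f"cannot create {name}: missing {missing}")
        svc = cls(name, strong_deps, weak_deps)
        svc.bound |= svc.strong_deps
        return svc

    def bind(self, other):
        # Binding phase: weak dependencies can be picked up at any time,
        # e.g. a load balancer acquiring a new replica.
        if other in self.weak_deps:
            self.bound.add(other)

    def unbind(self, other):
        if other in self.weak_deps:
            self.bound.discard(other)
```

For example, a service with a strong dependency on a database cannot be created before the database is running, whereas a cache can be attached and detached during its life cycle.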
Moreover, w.r.t. the Aeolus model, we also take into account resource/cost-aware deployments, following the memory and CPU resources found in Kubernetes. The amount of resources a microservice needs to run properly is directly added to its specification. In a deployment, a system of microservices runs within a set of computation nodes. Nodes represent computational units, e.g. virtual machines in an Infrastructure-as-a-Service Cloud deployment. Every node has a cost and a set of resources available to its guest microservices.
On the model above, it is possible to introduce the optimal deployment problem as follows: given a starting microservice system, a set of available nodes, and a new target microservice to be deployed, find a set of reconfiguration operations that, once applied to the starting system, leads to a new deployment that includes the target microservice. Such a deployment is optimal in the sense that the overall cost, i.e. the sum of the costs of the used nodes, is minimal. In  we prove this problem to be decidable by presenting an algorithm based on the generation of a set of constraints related to the distribution of microservices over nodes, the connections to be established, and optimization metrics that minimize the total cost of the computed deployment. In particular, we investigate the possibility of actually solving the deployment problem for microservices by exploiting Zephyrus2 , a configuration optimizer that was originally conceived for the Aeolus model  but later extended and improved to support a new specification language and the possibility of expressing preferences on the metrics to optimize, e.g. minimizing not only the cost but also the number of microservices.
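To give a concrete feel for the cost objective, the following Python sketch solves a toy instance of the problem by exhaustive search. This is only an illustration of the problem statement: the actual tool pipeline relies on the Zephyrus2 constraint optimizer, not brute force, and the service/node names and figures below are invented.

```python
# Toy brute-force solver: place each microservice on a node so that node
# capacities are respected and the total cost of the USED nodes is minimal.
from itertools import product

# Hypothetical resource demands and node capacities/costs (illustrative only).
services = {"parser": {"cpu": 1, "mem": 2}, "analyser": {"cpu": 2, "mem": 2}}
nodes = {"small": {"cpu": 2, "mem": 3, "cost": 5},
         "large": {"cpu": 4, "mem": 8, "cost": 9}}

def optimal_deployment(services, nodes):
    best = None
    names = list(services)
    for assignment in product(nodes, repeat=len(names)):
        # Check that each node's capacity covers its guests' demands.
        used = {n: {"cpu": 0, "mem": 0} for n in nodes}
        for svc, node in zip(names, assignment):
            for r in ("cpu", "mem"):
                used[node][r] += services[svc][r]
        if any(used[n][r] > nodes[n][r] for n in nodes for r in ("cpu", "mem")):
            continue
        # Cost is the sum of the costs of the nodes actually used,
        # so co-locating microservices on one node saves money.
        cost = sum(nodes[n]["cost"] for n in set(assignment))
        if best is None or cost < best[0]:
            best = (cost, dict(zip(names, assignment)))
    return best
```

On this instance, co-locating both microservices on the large node (cost 9) beats spreading them over both nodes (cost 14), mirroring the coexistence-based savings discussed later for the case study.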
1.2 Simulation with the Timed Process Algebra Abstract Behavioural Specification (ABS)
We have evaluated the actual exploitability of the approach of  by computing the initial optimal deployment, and run-time global reconfigurations, for a real-world microservice architecture inspired by the reference email processing pipeline from Iron.io . This architecture is modeled in the Abstract Behavioral Specification (ABS) language, a high-level object-oriented probabilistic and timed process algebra that supports asynchronous communication and deployment modeling . Our technique is then used to compute two types of deployments: an initial one, with one instance of each microservice, and a set of deployments that horizontally scale the system depending on small, medium or large increments in the number of emails to be processed. The experimental results are encouraging in that we were able to compute deployment orchestrations for global reconfigurations that add, at each adaptation step, more than 30 new microservice instances, assuming the availability of hundreds of machines of three different types, and guaranteeing optimality.
1.2.1 The Email Pipeline Processing System
The email processing pipeline described in , see Figure 1 (taken from ), is composed of several types of microservices, each with its own load balancer. The architecture can be divided into four pipelines analyzing different parts of an email message. Messages enter the system through the Message Receiver, which forwards them to the Message Parser. This component, in turn, extracts data from the email and routes them to the proper sub-pipeline. As expected, the processing of each email component entails a specific working time. Each microservice can handle a specific workload, called max computational load - e.g., the Header Analyser can handle a given maximal inbound frequency of requests per second, see . In the global adaptation approach, scaling actions are provided by three reconfiguration orchestrations, i.e. Scale 1, Scale 2 and Scale 3, which make the system capable of dealing with an increased message inbound frequency w.r.t. the maximum message workload supported by the base configuration, see . As we will show, these reconfiguration orchestrations minimize costs through the coexistence of microservices on the same computing node (virtual machine) and provide architecture-level scaling that avoids cascading slowdowns. The procedure governing the choice of the scaling orchestration is greedy: taking the current message inbound frequency as a target, it computes the best scaling actions to apply by minimizing the difference between the target value and the workload supported after the scaling action under examination, until the system supports at least the target inbound frequency. After the target system configuration has been computed, the required scaling actions are executed and the system scales out. On the contrary, the local adaptation approach simply replicates a microservice every time it becomes a bottleneck.
As we will show in our simulation, this produces a chain effect, due also to the time needed to deploy components at each step, which slows down the achievement of the target configuration necessary to handle the inbound message frequency. Furthermore, each replica is hosted on a new node (instead of coexisting with other microservices), further increasing costs.
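The greedy choice of scaling orchestrations described above can be sketched as follows. This Python fragment is a hedged illustration: the action names echo Scale 1-3, but the workload increments are invented, and the real procedure operates on the computed deployment orchestrations rather than plain numbers.

```python
# Illustrative greedy selection (hypothetical workload figures): repeatedly
# pick the scaling action whose resulting supported workload is closest to
# the target inbound frequency, until the target is reached or exceeded.
def choose_scaling(current_workload, target, actions):
    """actions maps an orchestration name to the workload increment it adds."""
    plan = []
    while current_workload < target:
        # Minimize the distance between the target and the workload
        # that would be supported after applying the candidate action.
        best = min(actions,
                   key=lambda a: abs(target - (current_workload + actions[a])))
        plan.append(best)
        current_workload += actions[best]
    return plan, current_workload
```

For instance, with increments {"Scale 1": 10, "Scale 2": 25, "Scale 3": 60}, a system supporting 20 messages/s with a target of 100 would first apply Scale 3 (reaching 80) and then Scale 2 (reaching 105), at which point the target is covered.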
1.2.2 System Modeling for Local and Global Adaptation
Thanks to the expressiveness of the object-oriented probabilistic and timed process algebra ABS, it was possible to model the email processing pipeline of Figure 1, including explicit modeling of load balancers, as ABS components/classes. Each ABS component communicates asynchronously with other components (via future return types). Multiple ABS components are, in turn, located at a given deployment component, which is associated with a speed modeling its computational power: the number of computations per time unit it can perform. In our case study, an ABS time unit is set to model milliseconds. We built two ABS process algebraic models: one realizing the local adaptation mechanism discussed above and the other implementing global reconfiguration via the scaling actions Scale 1, Scale 2 and Scale 3. In both models, request queues of a fixed maximal size, and the consequent message loss, are explicitly represented within load balancers, in order to prevent the system from overloading. Indeed, the implicit queue management underlying asynchronous communication leads the system to refuse no messages; thus, if finite queues were not explicitly represented, then even once the system reached the target configuration, the implicit queues would hardly be emptied, so that an acceptable latency could not be restored. Another complex aspect of our modeling with the ABS process algebra concerns setting a deployment component's speed: it must be calculated at run time on the basis of the number of cores of the node (represented by the ABS deployment component) that are actually used. This is particularly important for host nodes with cores that are not used by any of their deployed microservices: due to the ABS time model, if the speed were not adjusted at run time to reflect the unused cores, the deployment component would turn out to handle a higher inbound frequency than that provided by the actually used cores, distorting the results.
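The speed adjustment can be summarized by a one-line formula. The Python sketch below is an assumption-laden illustration of the idea (the actual models express this in ABS on deployment components): the effective speed is the nominal speed scaled by the fraction of cores actually occupied by microservices, so that idle cores do not inflate the node's simulated throughput.

```python
# Illustrative speed adjustment (hypothetical helper, not the ABS code):
# only the cores actually used by deployed microservices contribute
# computations per time unit; idle cores must not count.
def effective_speed(nominal_speed, total_cores, used_cores):
    if total_cores <= 0 or used_cores <= 0:
        return 0.0
    return nominal_speed * used_cores / total_cores
```

For example, a 4-core node running microservices on only 2 cores should be simulated at half its nominal speed; without this correction, the deployment component would appear to handle twice the inbound frequency that the used cores actually provide.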
We executed the two ABS process algebraic models (for local and global adaptation) by means of the Erlang backend provided as part of the ABS toolchain available at . In order to build a complete simulation environment, we modeled (via an ABS data structure) a message inbound frequency, see Figure 2, and the inner structure of messages via probabilistic contents (exploiting the probabilistic features of ABS). To the best of our knowledge, our two ABS models are the largest ever built with the ABS process algebra. Both models are publicly available via GitHub at .
1.2.3 Simulation Results
As shown in Figure 2, the simulated message inbound frequency grows rapidly until it reaches a stable plateau, so that we can test the adaptive responsiveness of the two approaches (local and global adaptation). We first examine system latency and message loss. As can be seen in Figures 3 and 4, the global adaptation approach adapts much faster than the local one because it takes advantage of the information on the interdependencies within the architecture, thus avoiding all the side effects described above that afflict the local adaptation approach. The optimization used to compute the deployment orchestration also leads to significant monetary savings: as can be seen in Figure 5, by means of global adaptation a greater number of microservice instances can be deployed at reduced cost. Figure 6 highlights the chain "domino" effect in local component scaling, which causes adaptation delay and makes the system less responsive to increasing workload. The performance and cost comparison shows that the global adaptation procedure, used to solve the optimal and automated deployment problem, is very effective: it reaches higher performance at lower cost than the classical local adaptation approach.
-  (2016) Zephyrus2: On the Fly Deployment Optimization Using SMT and CP Technologies. In SETTA, LNCS, Vol. 9984, pp. 229–245. Cited by: §1.1.
-  ABS toolchain. Note: https://abs-models.org/laboratory/ Accessed on May, 2020. Cited by: §1.2.2.
-  Docker compose documentation. Note: https://docs.docker.com/compose/ Accessed on May, 2020. Cited by: §1.1.
-  AWS auto scaling. Note: https://aws.amazon.com/autoscaling/ Accessed on May, 2020. Cited by: §1.
-  Code repository for the email processing examples. Note: https://github.com/LBacchiani/ABS-Simulations-Comparison Accessed on May, 2020. Cited by: §1.2.2.
-  (2019) Optimal and Automated Deployment for Microservices. In FASE. Cited by: §1.1.
-  (2020) A formal approach to microservice architecture deployment. In Microservices, Science and Engineering, pp. 183–208. Note: https://doi.org/10.1007/978-3-030-31646-4_8 Cited by: §1.1, §1.2.1, §1.2, §1.
-  (2015) Modelling and analysing cloud application management. In ESOCC, LNCS, Vol. 9306, pp. 19–33. Cited by: §1.1.
-  (2015) Automatic application deployment in the cloud: from practice to theory and back (invited paper). In CONCUR, LIPIcs, Vol. 42, pp. 1–16. Cited by: §1.1.
-  (2014) Automated synthesis and deployment of cloud applications. In ASE. Cited by: §1.1.
-  (2014) Aeolus: A component model for the cloud. Inf. Comput. 239, pp. 100–121. Cited by: §1.1.
-  (2012) Towards a Formal Component Model for the Cloud. In SEFM 2012, LNCS, Vol. 7504. Cited by: §1.1.
-  (2017) Microservices: yesterday, today, and tomorrow. In PAUSE, pp. 195–216. Cited by: §1.
-  Thinking Serverless! How New Approaches Address Modern Data Processing Needs. Note: …modern-data-processing-needs-part-1-af6a158a3af1 Accessed on May, 2020. Cited by: §1.2.1, §1.2, §1.
-  (2017) Kubernetes: up and running dive into the future of infrastructure. 1st edition, O'Reilly Media, Inc. Cited by: §1.1.
-  (2010) Continuous Delivery: Reliable Software Releases Through Build, Test, and Deployment Automation. Addison-Wesley Professional. Cited by: §1.
-  (2010) ABS: A Core Language for Abstract Behavioral Specification. In FMCO. Cited by: §1.2.
-  (2014) Docker: lightweight Linux containers for consistent development and deployment. Linux Journal 2014 (239), pp. 2. Cited by: §1.1.