Nowadays, we are witnessing the spread of Information Technologies across several industrial domains (e.g., railways, avionics, automotive). This is transforming industrial scenarios into edge-cloud environments populated by many Industrial Internet of Things (IIoT) devices, moving toward the Industry 4.0 (I4.0) vision [iiot_survey, stavdas2022networked]. Thus, these systems must meet not only mandatory regulatory requirements involving functional safety and control timeliness, but also performance scalability, interoperability, low latency, and reconfigurability through fast and efficient deployment.
Virtualization is an enabling technology for I4.0 since it responds to the needs of reconfiguration, modularity, and consolidation through resource partitioning and multiplexing. It allows the execution of heterogeneous Operating Systems (OSes), both real-time and general-purpose, on the same system-on-a-chip (SoC), making it a prominent way for industry to realize mixed-criticality systems, especially given the current COVID-19-induced silicon shortage [bloomberg_chip_shortage, cinque2021virtualizing, cilardo2021virtualization]. Partitioning hypervisors (e.g., Jailhouse [ramsauer2017look], Bao [martins2020bao], XtratuM [crespo2010partitioned]) have gained the attention of both academia and industry [hermes_project, selene] due to the strong isolation they provide through static allocation, at the cost of reduced deployment flexibility compared to classical virtualization solutions (recently named consolidating hypervisors).
However, in the I4.0 vision, the emphasis is on the automatic management, reconfiguration, and self-healing of IT systems. Thus, criticality-aware orchestration systems are paramount since they automatically place, deploy, monitor, and migrate packaged software across the infrastructure [barletta2022achieving], while remaining aware of the isolation guarantees required by critical workloads to prevent interference, in terms of faults and attacks, from non-critical jobs. Currently, containers are seamlessly integrated into orchestration systems, but ensuring their isolation is still an open issue, threatening the practicability of OS-level virtualization under strict real-time, safety, and security requirements. Unikernels seem to be a solution since they do not share the underlying host kernel, but their portability and real-time support are still open issues [chen2022unikernel].
In this position paper, we propose RunPHI, a framework that integrates partitioning hypervisors into container orchestration systems with the aim of leveraging the strong isolation provided by partitioning solutions while taking advantage of orchestration techniques. This project advances the state of the art since i) it simplifies the deployment of critical workloads in edge/cloud environments, useful for maintenance, upgrades, and new deployments; ii) it enables failure mitigation through migration and spawning of new partitions; iii) it is a driving force for the full reconfigurability of I4.0 for workloads with isolation requirements.
II Related Work
In the literature, several solutions (summarized in Table I) adapt general-purpose hypervisors with the aim of providing true isolation between containers, a.k.a. sandboxed containers. IBM Nabla (https://nabla-containers.github.io/) builds containers on top of unikernels. Google gVisor (https://gvisor.dev/) creates a dedicated guest kernel to run containers. Amazon Firecracker (https://firecracker-microvm.github.io/) is a lightweight hypervisor for sandboxing applications. Both KubeVirt (https://kubevirt.io/) and vSphere Integrated Containers (VIC, https://vmware.github.io/vic-product/) integrate VMs and containers under a single orchestration infrastructure. Kata Containers (https://katacontainers.io/) provides a secure container runtime based on lightweight VMs. RunX (https://github.com/Xilinx/runx) uses the Xen hypervisor to run containers in separate VMs, either with a provided custom-built Linux-based kernel or with a container-specific kernel/ramdisk.
These solutions are mainly based on general-purpose hypervisors, which do not fit well with mixed-criticality real-time requirements. In contrast, current partitioning hypervisor solutions seem to provide enough guarantees about both safety and security isolation. To the best of our knowledge, there are no solutions that support sandboxed containers in conjunction with partitioning hypervisors. The objective of RunPHI is to provide both isolation requirements and flexible orchestration capabilities for next-generation I4.0 scenarios.
III RunPHI Design
Figure 1 shows a first design of RunPHI. Users provide partition descriptions via classical tools inherited from container-based orchestration (e.g., Dockerfiles, Kubernetes manifests). Partitioned container descriptions can be extended with requirements related to physical resources, criticality levels (e.g., low, mid, or high), real-time constraints, etc. RunPHI leverages a partitioning hypervisor to provide strong isolation between containers. In particular, according to the partition descriptions, RunPHI, implemented in the privileged partition, tries to allocate physical resources in line with the free resources of the host node. The inference engine fills the gaps with predicted values for resources not specified in the partitioned container description, according to requests from the orchestration platforms and the current hardware usage. RunPHI manages the lifecycle of partitioned containers and is designed to be highly flexible, with the aim of orchestrating partitioned containers with: i) low-level criticality (e.g., partition A in Figure 1), which includes a classical container abstraction with several applications running on top of it and a number of virtual CPUs (vCPUs) with no specific affinity to physical CPUs (pCPUs) and no specific real-time guarantees; ii) mid-level criticality (e.g., partition B in Figure 1), which includes running a single-app RTOS/unikernel with strict temporal and memory isolation requirements (e.g., through 1-to-1 vCPU-pCPU mapping and cache/RAM coloring mechanisms, respectively) and the use of GPU accelerators for running machine learning algorithms; iii) high-level criticality (e.g., partition C in Figure 1), which includes the same mechanisms as mid-level criticality with the addition of running bare-metal tasks on real-time CPUs and the use of programmable logic blocks like FPGAs.
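To make the idea concrete, the following is a minimal sketch of what an extended partitioned-container descriptor and its consistency check could look like. All field names (`criticality`, `vcpus`, `pcpu_affinity`, `cache_colors`, `accelerators`) and the validation rules are illustrative assumptions for this sketch, not RunPHI's actual API.

```python
# Hypothetical partitioned-container descriptor and a minimal validation step.
# Field names and rules are assumptions for illustration, not RunPHI's API.

CRITICALITY_LEVELS = ("low", "mid", "high")

def validate_descriptor(desc: dict) -> list:
    """Return a list of consistency errors for a partition descriptor."""
    errors = []
    if desc.get("criticality") not in CRITICALITY_LEVELS:
        errors.append("criticality must be one of low/mid/high")
    # Assumed rule: mid/high criticality requires a 1-to-1 vCPU-pCPU mapping
    # and cache/RAM coloring for temporal and memory isolation.
    if desc.get("criticality") in ("mid", "high"):
        affinity = desc.get("pcpu_affinity", [])
        if len(affinity) != desc.get("vcpus", 0):
            errors.append("mid/high criticality requires 1-to-1 vCPU-pCPU mapping")
        if not desc.get("cache_colors"):
            errors.append("mid/high criticality requires cache coloring")
    return errors

# Example: a mid-criticality RTOS partition (cf. partition B in Figure 1).
partition_b = {
    "name": "rtos-ml-app",
    "criticality": "mid",
    "vcpus": 2,
    "pcpu_affinity": [2, 3],     # pinned physical CPUs
    "cache_colors": [0, 1],      # cache/RAM coloring for memory isolation
    "accelerators": ["gpu0"],    # GPU for machine learning workloads
}
assert validate_descriptor(partition_b) == []
```

In this sketch, a low-criticality partition would simply omit the affinity and coloring fields, leaving placement entirely to the inference engine.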
IV Research Questions and Objectives
In the following, we delineate research questions to be considered in the next steps of our project.
RQ1. How can I4.0 mixed-criticality systems be deployed via RunPHI?
To support partitioned containers running at different criticality levels with real-time constraints
To support running bare-metal applications
To support accelerator devices like FPGAs and GPUs
To induce minimal overhead in terms of CPU, memory, and I/O
RQ2. How to quantify the isolation between partitioned containers provided by RunPHI?
To support temporal, memory, and fault isolation assessment (e.g., fault injection testing)
To support security isolation assessment (e.g., fuzzing)
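As a concrete example of the temporal-isolation assessment mentioned above, one could sample the period jitter of a nominally periodic task inside a partition while interference runs elsewhere, and count deadline overruns. The sketch below is a minimal, assumed harness; the thresholds and the measurement strategy are illustrative, not a defined RunPHI test suite.

```python
# Minimal sketch of a temporal-isolation check runnable inside a partition:
# sample the deviation of each observed period from the nominal period and
# count overruns. Thresholds and workload are illustrative assumptions.
import time

def sample_jitter(period_s: float, iterations: int) -> list:
    """Return the absolute deviation of each observed wakeup from its target."""
    deviations = []
    next_wakeup = time.monotonic() + period_s
    for _ in range(iterations):
        time.sleep(max(0.0, next_wakeup - time.monotonic()))
        now = time.monotonic()
        deviations.append(abs(now - next_wakeup))
        next_wakeup += period_s
    return deviations

def overruns(deviations: list, deadline_margin_s: float) -> int:
    """Count samples whose deviation exceeds the allowed margin."""
    return sum(1 for d in deviations if d > deadline_margin_s)
```

Comparing the overrun count with and without a co-located interference workload (e.g., a memory-thrashing job in a neighboring partition) would give a first quantitative signal of how well the hypervisor's partitioning holds.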
RQ3. How to support orchestration for partitioned containers in RunPHI?
To support different container runtimes and OCI compliance
To support partitioned container descriptions via existing container runtime APIs and existing configuration-file tools (e.g., Dockerfiles, Kubernetes manifests)
To implement an inference engine to determine (sub)optimal resource allocation for partitioned containers
To support migration, checkpointing, and high-availability mechanisms for partitioned containers
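The inference-engine objective above can be illustrated with a first-fit placement sketch: fill resource fields that the description leaves unspecified from per-criticality defaults, then pick the first node with enough free pCPUs and memory. The default table, node-state shape, and first-fit policy are assumptions made for illustration; a real engine could use prediction from current hardware usage as described earlier.

```python
# Illustrative first-fit sketch of the inference engine's allocation step.
# Defaults, node-state fields, and policy are assumptions, not RunPHI's code.

# Hypothetical per-criticality defaults used when the partitioned-container
# description leaves resources unspecified.
DEFAULTS = {
    "low":  {"vcpus": 1, "memory_mb": 128},
    "mid":  {"vcpus": 1, "memory_mb": 256},
    "high": {"vcpus": 2, "memory_mb": 512},
}

def infer_and_place(desc: dict, nodes: list):
    """Fill missing resource fields from defaults, then place the partition
    on the first node with enough free pCPUs and memory.
    Returns the chosen node name, or None if no node fits."""
    level = desc.get("criticality", "low")
    req = {**DEFAULTS[level], **{k: v for k, v in desc.items() if v is not None}}
    for node in nodes:
        if (len(node["free_pcpus"]) >= req["vcpus"]
                and node["free_memory_mb"] >= req["memory_mb"]):
            # Reserve resources: pin vCPUs to concrete free pCPUs.
            pinned = [node["free_pcpus"].pop(0) for _ in range(req["vcpus"])]
            node["free_memory_mb"] -= req["memory_mb"]
            desc["pcpu_affinity"] = pinned
            return node["name"]
    return None  # no node can host the partition
```

For instance, a high-criticality description with no explicit resources would be expanded to 2 vCPUs and 512 MB, skip a node with a single free pCPU, and land on the next node with two, recording the resulting pCPU pinning back into the description.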
This work has been supported by the project COSMIC of UNINA DIETI.