Probabilistic Dynamic Hard Real-Time Scheduling in HPC

Industry 4.0 is fundamentally changing the way data is collected, stored and analyzed in industrial processes, enabling novel applications such as flexible manufacturing of highly customized products. Real-time control of these processes, however, has not yet realized its full potential in using the collected data to drive further development. We believe that modern virtualization techniques, specifically application containers, present a unique opportunity to decouple control functionality from associated plants and fully realize the potential for highly distributed and transferable industrial processes, even with real-time constraints arising from time-critical sub-processes. In this paper, we explore the challenges and opportunities of shifting industrial control software from dedicated hardware to bare-metal servers or (edge) cloud computing platforms using off-the-shelf technology. We present a purpose-built orchestration tool that can manage the execution of containerized applications on shared resources without compromising hard real-time execution determinism.




I Introduction

Industry 4.0 is the outcome of a strategic project of the German government, intended to increase the computerization of manufacturing. The major technological pillars of Industry 4.0 are Cyber-Physical Systems (CPS), Internet of Things (IoT), Information and Communications Technology (ICT), Enterprise Architecture (EA), and Enterprise Integration (EI) [1]. The four Industry 4.0 design principles [2] (interconnection, information transparency, decentralized decisions and technical assistance) require high levels of interconnection and interdependence. Consequently, Edge Computing, including cross-platform operation, third-party software and mixed-criticality applications, is taking center stage. The computation requirements arising from the migration of functionality towards the "edge" imply new system architectures [3, 4]. These requirements range from low-criticality data processing to hard real-time control applications. The allocation of resources on an edge node will thus impact performance, including response time, and installation cost.

Current trends in industry favor the flexibility of resource sharing through cloud computing, but merging cloud computing with real-time requirements is a challenging task. García-Valls et al. [5] state that guest OSs have only limited access to physical hardware and thus suffer from the unpredictability of non-hierarchical scheduling and thick-stack communications. Abeni et al. [6] extended the standard Completely Fair Scheduler hierarchically with a deadline-based algorithm, optimizing latency results for containerized software. While there exist real-time enabled hypervisors with direct access to hardware, such as the paravirtualized RT-Xen, the shared resources still exhibit latencies that may make real-time execution difficult.

Containerizing control applications as an alternative to traditional virtualization has also been addressed in the literature. Moga et al. [7], for instance, presented the concept of containerization for full control applications as a means of decoupling the hardware and software life-cycles of industrial automation systems. Telschig et al. [3] explore a platform-independent container architecture for real-time systems; with a dedicated architecture and a prototype agent that manages communication between dependent distributed software, the authors focus on isolating critical from non-critical tasks and on their respective portability. Tasci et al. [8] presented a solution with Linux as the host operating system, covering both the single-kernel, preemption-focused PREEMPT-RT patch and the co-kernel oriented Xenomai, and evaluated both migration feasibility and performance. However, these approaches are based on experimental research and do not consider resource optimization using Commercial-Off-The-Shelf (COTS) components.

In [9], we assessed the feasibility of real-time task execution with COTS products in virtualized environments. In a follow-up paper [10], we analyzed static real-time task grouping and scheduling in the same environments to optimize resource allocation. In this paper, we propose a method for resource management that actively monitors and improves resource utilization using a probabilistic approach. Using COTS technology, we assess the viability of our probabilistic resource sharing approach for the reduction of operating costs.

II Resource management

We use an orchestrator in our experiments: an agent that monitors and manages resources while maintaining determinism. Each container needs three main resources: computing power to execute algorithms, memory to store and manipulate data, and I/O to interact with the environment. The orchestrator's scheduling strategy maximizes resource utilization without exceeding the resource limits (over-subscription). Careful resource management allows efficient resource allocation while mitigating the risk of missing deadlines. To ensure determinism, one can set the Worst Case Execution Time (WCET) of periodic deadline-scheduled tasks to the maximum measured or expected amount, at the cost of potentially underutilizing available resources. We have conceived a solution that enables higher sharing, and thus saving, of low-use time-slices while retaining a low probability of deadline misses. To achieve this, the orchestrator confines the containers and running processes to their execution context and monitors them to compute the probability of missed deadlines. If these probabilities exceed application-specific thresholds, the orchestrator reschedules the respective tasks next in line onto a different, less critical resource. Thus, our orchestrator optimizes utilization for efficient resource allocation, while moderating the ensuing risks.
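The threshold check described above can be sketched as follows. This is a minimal illustration, not the orchestrator's actual implementation: the `Task` fields, the `plan_moves` helper, the resource names, and the policy of moving a task to the least-loaded other resource are all assumptions made for the example, and the miss probabilities are taken as given rather than estimated from monitoring data.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    resource: str            # resource (e.g. a CPU) the task currently runs on
    miss_probability: float  # estimate from monitoring (assumed given here)
    threshold: float         # application-specific limit


def plan_moves(tasks, load):
    """Map each task over its threshold to the least-loaded other resource.

    `load` maps a resource name to its current utilization in [0, 1].
    Choosing the least-loaded resource is an illustrative policy only.
    """
    moves = {}
    for task in tasks:
        if task.miss_probability > task.threshold:
            candidates = [r for r in load if r != task.resource]
            if candidates:
                moves[task.name] = min(candidates, key=lambda r: load[r])
    return moves


# Hypothetical snapshot of three monitored tasks and three resources.
tasks = [
    Task("camera_left", "cpu0", 0.002, 0.01),
    Task("camera_right", "cpu0", 0.030, 0.01),
    Task("logger", "cpu2", 0.200, 0.50),
]
load = {"cpu0": 0.95, "cpu1": 0.40, "cpu2": 0.70}

moves = plan_moves(tasks, load)
print(moves)
```

Only `camera_right` exceeds its threshold in this snapshot, so it is the only task selected for a move. A real orchestrator would additionally have to account for the criticality of the target resource and for the cost of the migration itself.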

Fig. 1: Image processing allocation, ideal vs deviation. Deadline misses may occur if images change significantly.

Optimal resource management in full generality is hard to achieve. To ease process allocation and resource assignment, we impose two initial assumptions. First, we assume independence of the real-time software running in containers, for which we have neither specifications nor source code. Accordingly, the scheduling problem effectively reduces to multiple single-core units where allocation reaches theoretical utilization rates of $\ln 2 \approx 69\%$ for Rate Monotonic (RM) scheduling and up to $100\%$ for Earliest Deadline First (EDF) [11]. Second, we assume knowledge of all data for all possible real-time tasks. By experimental measurement, we retrieve a task's probability distribution of execution times and its WCET. Reducing the reserved run-time CPU budget to values close to the average task run-time increases the probability of exceeding new deadlines but opens part of the reserved slot to shared use. Equation 1 determines the size of this shared buffer.
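To make the utilization bounds concrete, the sketch below checks the classic single-core tests for independent periodic tasks with deadlines equal to periods: the Liu and Layland bound $n(2^{1/n}-1)$ for RM, which tends to $\ln 2$, and the bound of 1 for EDF. It then compares reserving the full WCET against reserving a budget closer to the average run-time. The task parameters (budgets of 70 ms and 50 ms, a 125 ms period) and the `freed_share` helper are hypothetical values chosen for illustration and are not taken from the paper's measurements or its Equation 1.

```python
def rm_bound(n):
    """Liu and Layland utilization bound for n tasks under Rate Monotonic."""
    return n * (2 ** (1 / n) - 1)


def utilization(tasks):
    """Total utilization of (budget, period) pairs in the same time unit."""
    return sum(budget / period for budget, period in tasks)


def schedulable_rm(tasks):
    """Sufficient (not necessary) RM test for deadlines equal to periods."""
    return utilization(tasks) <= rm_bound(len(tasks))


def schedulable_edf(tasks):
    """EDF test on a single core for deadlines equal to periods."""
    return utilization(tasks) <= 1.0


def freed_share(wcet, budget, period):
    """Fraction of the period opened to shared use by shrinking the budget."""
    return (wcet - budget) / period


# Hypothetical numbers in milliseconds: two identical periodic tasks.
wcet_reserved = [(70, 125), (70, 125)]  # budget set to the WCET
avg_reserved = [(50, 125), (50, 125)]   # budget close to the average

print(utilization(wcet_reserved), schedulable_edf(wcet_reserved))
print(utilization(avg_reserved), schedulable_rm(avg_reserved))
print(freed_share(70, 50, 125))
```

With these numbers, reserving the WCET over-subscribes a single core (total utilization above 1), while the reduced budgets fit under both the RM and EDF bounds. The price is that any job running longer than its reduced budget now depends on the shared part of the slot being free.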


Figure 1 shows an example of a manufacturing use case involving a conveyor belt. Two cameras take 8 pictures per second on both sides of a product, requiring an assessment (pass/fail) every $125\,\mathrm{ms}$. If the image processing time remains below 50% of the sampling time, both cameras can run on the same computing resources, saving one CPU. However, image processing times depend on the image structure. A perfect product will always yield the same processing time, but defects and lighting variations can increase this value and thus cause missed deadlines, as shown in the lower part of the figure. With statically assigned resources, tasks use the resource slot between average and peak value only when images differ from the ideal. The orchestrator allows sharing part of this slot, increasing resource utilization at the cost of potential deadline misses with probability $P_{\mathrm{miss}}$. The joint value for $n$ tasks with independent, normally distributed execution times $X_i \sim \mathcal{N}(\mu_i, \sigma_i^2)$ is given by

$$X_s = \sum_{i=1}^{n} X_i \sim \mathcal{N}(\mu_s, \sigma_s^2), \qquad \mu_s = \sum_{i=1}^{n} \mu_i, \quad \sigma_s^2 = \sum_{i=1}^{n} \sigma_i^2 \tag{1}$$

resulting in another normal distribution. $\mu_s$ and $\sigma_s^2$ are the arithmetic sums of the means and variances of the probability distributions of the individual tasks. The probability of deadline misses $P_{\mathrm{miss}}$ is given by the area under the resulting density function to the right of the maximum utilization factor $U_{\max}$ (1 for pure EDF scheduling).
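As a worked illustration of this calculation, the sketch below sums the means and variances of the individual tasks and evaluates the tail of the resulting normal distribution above the maximum utilization factor. It assumes that the tasks are independent, that each task's load is expressed as a utilization (execution time divided by period), and that a miss corresponds to the joint utilization exceeding the limit. The function name `miss_probability` and the camera parameters (mean utilization 0.4, standard deviation 0.05) are hypothetical.

```python
import math


def miss_probability(means, variances, u_max=1.0):
    """P(sum of independent normal utilizations > u_max).

    `means` and `variances` hold the per-task mean and variance of the
    utilization; the sums define the joint normal distribution.
    """
    mu_s = sum(means)
    var_s = sum(variances)
    if var_s == 0:
        # Degenerate case: no variation, so the outcome is deterministic.
        return 1.0 if mu_s > u_max else 0.0
    z = (u_max - mu_s) / math.sqrt(var_s)
    # Upper tail of the standard normal via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical values: two camera tasks sharing one CPU under EDF (u_max = 1).
means = [0.4, 0.4]
variances = [0.05 ** 2, 0.05 ** 2]

p_miss = miss_probability(means, variances)
print(f"joint mean = {sum(means):.2f}, P(miss) = {p_miss:.5f}")
```

For these values the joint mean is 0.8 and the limit lies roughly 2.8 joint standard deviations above it, so the estimated miss probability is on the order of a few tenths of a percent. An orchestrator following the approach described here would compare such an estimate against the application-specific threshold.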

III Experimental results

We test our approach in two case studies. The first explores the behavior of our approach under varying load conditions, analyzing a group of event-driven real-time tasks, called workers, which are fed with information at variable intervals. The second study helps assess the orchestration behavior and effectiveness in more complex applications with strict real-time requirements. This includes mixed real-time applications, both event-driven and polling-based, that have different execution periods and real-time behavior.

Fig. 2: Image processing reallocation. The orchestrator tries to avoid a deadline miss by reallocating resources to other, less critical areas.

We expect the amount of I/O performed by the system to be the major influencing factor for resource reuse and re-schedulability. In a static allocation experiment [10], we found that for computation only, the subscription of resources can exceed 90%. Blocking I/O calls and higher-priority interrupts influence the execution queue. Software could be optimized for quasi-static execution [12], reducing branching conditions and making the resulting execution time variations dependent only on system and I/O delays. When scheduling tasks on the same resources, we expect a higher number of re-allocations and more resource depletion (Fig. 2). How large these effects are, and what consequences they have for orchestration in a real industrial case study, is hard to predict and open for future study.

IV Conclusions

Using dynamic, probabilistic resource allocation and scheduling, our orchestration approach addresses the demand for economical real-time capable computing. Mixed-criticality applications prepare environments for shared execution of tasks with different priorities, enabling us to exploit this variety to bridge peak demands for hard real-time tasks. The resulting resource savings reduce installation, operation and maintenance costs of smart industrial plants, leveraging the use of virtualization technologies for industrial control applications. However, the unpredictability of I/O and operating system fluctuations makes it hard to estimate the variability and impact in real use cases. Future work will explore application to industrial automation use cases, obtainable savings and the difference in impact between synthetic and real use cases. We will further explore additional orchestration algorithms for non-deadline-scheduled tasks and their efficiency.


  • [1] Y. Lu, “Industry 4.0: A survey on technologies, applications and open research issues,” Journal of Industrial Information Integration, vol. 6, pp. 1–10, Jun. 2017.
  • [2] M. Hermann, T. Pentek, and B. Otto, “Design principles for Industrie 4.0 scenarios,” in 2016 49th Hawaii International Conference on System Sciences (HICSS). IEEE, Jan. 2016.
  • [3] K. Telschig, A. Schönberger, and A. Knapp, “A real-time container architecture for dependable distributed embedded applications,” in 2018 IEEE 14th International Conference on Automation Science and Engineering (CASE). IEEE, Aug. 2018.
  • [4] F. Hofer, “Architecture, technologies and challenges for cyber-physical systems in industry 4.0 - a systematic mapping study,” in 12th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), 2018.
  • [5] M. García-Valls, T. Cucinotta, and C. Lu, “Challenges in real-time virtualization and predictable cloud computing,” Journal of Systems Architecture, vol. 60, no. 9, pp. 726–740, Oct. 2014.
  • [6] L. Abeni, A. Balsini, and T. Cucinotta, “Container-based real-time scheduling in the Linux kernel,” 2018.
  • [7] A. Moga, T. Sivanthi, and C. Franke, “OS-level virtualization for industrial automation systems: Are we there yet?” in Proceedings of the 31st Annual ACM Symposium on Applied Computing - SAC ’16. ACM Press, 2016.
  • [8] T. Tasci, J. Melcher, and A. Verl, “A container-based architecture for real-time control applications,” in 2018 IEEE International Conference on Engineering, Technology and Innovation (ICE/ITMC). IEEE, Jun. 2018.
  • [9] F. Hofer, M. Sehr, A. Iannopollo, I. Ugalde, A. Sangiovanni-Vincentelli, and R. Barbara, “Industrial control via application containers: Migrating from bare-metal to IAAS,” in 2019 IEEE 11th International Conference on Cloud Computing Technology and Science (CloudCom). IEEE, Dec. 2019. [Online]. Available:
  • [10] F. Hofer, M. Sehr, A. Sangiovanni-Vincentelli, and R. Barbara, “Industrial control via application containers: Maintaining determinism in IAAS,” 2019. [Online]. Available:
  • [11] G. C. Buttazzo, Hard real-time computing systems: predictable scheduling algorithms and applications. Springer Science & Business Media, 2011, vol. 24.
  • [12] C. Liu, A. Kondratyev, Y. Watanabe, and A. Sangiovanni-Vincentelli, “A structural approach to quasi-static schedulability analysis of communicating concurrent programs,” in Proceedings of the 5th ACM International Conference on Embedded Software, ser. EMSOFT ’05. New York, NY, USA: ACM, 2005, pp. 10–16.