MigrOS: Transparent Operating Systems Live Migration Support for Containerised RDMA-applications

09/15/2020
by Maksym Planeta, et al.

Major data centre providers are introducing RDMA-based networks for their tenants, as well as for operating the underlying infrastructure. Compared to traditional socket-based network stacks, RDMA-based networks offer higher throughput, lower latency and reduced CPU overhead. However, transparent checkpoint and migration operations become much more difficult. The key reason is that the OS is removed from the critical path of communication. As a result, some of the communication state resides in the NIC hardware and is no longer under the direct control of the OS. In particular, the OS loses the ability to virtualise communication, which is needed for live migration of communication partners. In this paper, we propose the basic principles required to implement a migration-capable RDMA-based network. We recommend some changes at the software level and small changes at the hardware level. As a proof of concept, we integrate the proposed changes into SoftRoCE, an open-source kernel-level implementation of the RoCE protocol. We claim that these changes introduce no runtime overhead when no migration takes place. Finally, we develop a proof-of-concept implementation for migrating containerised applications that use RDMA-based networks.


