RDMAbox: Optimizing RDMA for Memory Intensive Workloads

04/25/2021
by Juhyun Bae, et al.

We present RDMAbox, a set of low-level RDMA optimizations that provide better performance than previous approaches. The optimizations are packaged in easy-to-use kernel and userspace libraries and presented through simple node-level abstractions. We demonstrate the flexibility and effectiveness of RDMAbox by implementing a kernel remote paging system and a userspace file system using RDMAbox. RDMAbox employs two optimization techniques. First, we suggest Load-aware Batching to further reduce the total number of I/O operations to the RDMA NIC beyond existing doorbell batching. The I/O merge queue at the same time functions as a traffic regulator to enforce admission control and avoid overloading the NIC. Second, we propose Adaptive Polling to achieve higher efficiency of polling Work Completions than existing busy polling while maintaining the low CPU overhead of event triggering. Our implementation of a remote paging system with RDMAbox outperforms existing representative solutions with up to 6.48x throughput improvement and up to 83% lower average tail latency in big data workloads, and up to 83% shorter completion time in machine learning workloads. Our implementation of a userspace file system based on RDMAbox achieves up to 6x higher throughput over existing representative solutions.
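
The abstract describes Adaptive Polling only at a high level, so the following is a minimal sketch of one plausible hybrid scheme, not the RDMAbox implementation. It assumes a completion queue simulated with Python's `queue.Queue`; the function name `adaptive_poll`, the `max_empty_polls` budget, and the `None` stop marker are all hypothetical. The loop busy-polls while completions keep arriving and falls back to a blocking wait (standing in for an event trigger) once a run of polls comes back empty.

```python
import queue


def adaptive_poll(cq, handle, max_empty_polls=1000):
    """Drain a simulated completion queue with a hybrid polling loop.

    Assumption: `cq` is a queue.Queue standing in for an RDMA completion
    queue, and a None item is a hypothetical stop marker.
    Returns (busy_hits, event_waits): how many items were picked up by
    busy polling versus by the blocking fallback.
    """
    empty_polls = 0
    busy_hits = 0
    event_waits = 0
    while True:
        try:
            # Busy-poll path: non-blocking check, cheap while load is high.
            item = cq.get_nowait()
            busy_hits += 1
        except queue.Empty:
            empty_polls += 1
            if empty_polls < max_empty_polls:
                continue  # keep spinning within the empty-poll budget
            # Budget exhausted: block until something arrives, which
            # models sleeping on an event instead of burning CPU.
            item = cq.get()
            event_waits += 1
        empty_polls = 0  # any completion resets the budget
        if item is None:
            return busy_hits, event_waits
        handle(item)
```

For example, pre-filling the queue with three completions and the stop marker and then calling `adaptive_poll(cq, seen.append)` drains everything on the busy-poll path without ever blocking. The design choice in this sketch is that the empty-poll budget bounds how long the CPU spins when the queue goes idle; how RDMAbox actually chooses between the two modes is not specified in the abstract.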

research
08/03/2020

Efficient Orchestration of Host and Remote Shared Memory for Memory Intensive Workloads

Since very few contributions to the development of an unified memory orc...
research
08/05/2023

Towards Fast, Adaptive, and Hardware-Assisted User-Space Scheduling

Modern datacenter applications are prone to high tail latencies since th...
research
09/11/2021

A readahead prefetcher for GPU file system layer

GPUs are broadly used in I/O-intensive big data applications. Prior work...
research
11/22/2021

KML: Using Machine Learning to Improve Storage Systems

Operating systems include many heuristic algorithms designed to improve ...
research
10/22/2019

Mitigating the Performance-Efficiency Tradeoff in Resilient Memory Disaggregation

Memory disaggregation has received attention in recent years as a promis...
research
03/19/2022

No Provisioned Concurrency: Fast RDMA-codesigned Remote Fork for Serverless Computing

Serverless platforms essentially face a tradeoff between container start...
research
03/24/2021

RDMA is Turing complete, we just did not know it yet!

It is becoming increasingly popular for distributed systems to exploit n...