RDMAbox : Optimizing RDMA for Memory Intensive Workloads

04/25/2021

We present RDMAbox, a set of low-level RDMA optimizations that provide better performance than previous approaches. The optimizations are packaged in easy-to-use kernel and userspace libraries and presented through simple node-level abstractions. We demonstrate the flexibility and effectiveness of RDMAbox by implementing a kernel remote paging system and a userspace file system using RDMAbox. RDMAbox employs two optimization techniques. First, we suggest Load-aware Batching to further reduce the total number of I/O operations to the RDMA NIC beyond existing doorbell batching. The I/O merge queue at the same time functions as a traffic regulator to enforce admission control and avoid overloading the NIC. Second, we propose Adaptive Polling to achieve higher efficiency of polling Work Completion than existing busy polling while maintaining the low CPU overhead of event trigger. Our implementation of a remote paging system with RDMAbox outperforms existing representative solutions with up to 6.48x throughput improvement and up to 83% reduction in average tail latency in big data workloads, and up to 83% reduction in completion time in machine learning workloads. Our implementation of a userspace file system based on RDMAbox achieves up to 6x higher throughput over existing representative solutions.
