High-Performance Distributed RMA Locks

by Patrick Schmid, et al.

We propose a topology-aware distributed Reader-Writer lock that accelerates irregular workloads for supercomputers and data centers. The core idea behind the lock is a modular design built from three interacting distributed data structures: a counter of readers/writers in the critical section, a set of queues for ordering writers waiting for the lock, and a tree that binds all the queues and synchronizes writers with readers. Each structure is associated with a parameter for favoring either readers or writers, so a given configuration of the lock can be viewed as a point in a three-dimensional parameter space. We also develop a distributed topology-aware MCS lock that is a building block of the above design and improves on state-of-the-art MPI implementations. Both schemes use non-blocking Remote Memory Access (RMA) techniques for highest performance and scalability. We evaluate our schemes on a Cray XC30 and illustrate that they outperform state-of-the-art MPI-3 RMA locking protocols by 81% [...] hashtable that represents irregular workloads such as key-value stores or graph processing.
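
For readers unfamiliar with the MCS lock named above, the following is a minimal shared-memory sketch of the classic MCS queue lock in C++. It is not the paper's distributed, topology-aware RMA implementation; it only illustrates the queueing idea, in which each waiter appends its own node to a tail pointer and spins on its own flag. The names `McsNode`, `McsLock`, and `run_counter_demo` are hypothetical and chosen for illustration. In an RMA setting one would presumably replace the local atomic exchange and compare-and-swap with remote atomics such as MPI-3's `MPI_Fetch_and_op` and `MPI_Compare_and_swap`, but that mapping is an assumption here, not something the abstract states.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// One queue node per waiting thread; each thread spins only on its own flag.
struct McsNode {
    std::atomic<McsNode*> next{nullptr};
    std::atomic<bool> locked{false};
};

// Illustrative shared-memory MCS lock (hypothetical class, not the paper's code).
class McsLock {
public:
    void acquire(McsNode& me) {
        me.next.store(nullptr, std::memory_order_relaxed);
        me.locked.store(true, std::memory_order_relaxed);
        // Atomically append this node to the tail of the waiter queue.
        McsNode* pred = tail_.exchange(&me, std::memory_order_acq_rel);
        if (pred != nullptr) {
            // Link behind the predecessor, then wait until it hands over the lock.
            pred->next.store(&me, std::memory_order_release);
            while (me.locked.load(std::memory_order_acquire)) {
                std::this_thread::yield();
            }
        }
    }

    void release(McsNode& me) {
        McsNode* succ = me.next.load(std::memory_order_acquire);
        if (succ == nullptr) {
            // No visible successor: try to reset the queue to empty.
            McsNode* expected = &me;
            if (tail_.compare_exchange_strong(expected, nullptr,
                                              std::memory_order_acq_rel)) {
                return;
            }
            // A successor swapped the tail but has not linked itself yet; wait.
            while ((succ = me.next.load(std::memory_order_acquire)) == nullptr) {
                std::this_thread::yield();
            }
        }
        // Hand the lock directly to the successor.
        succ->locked.store(false, std::memory_order_release);
    }

private:
    std::atomic<McsNode*> tail_{nullptr};
};

// Demo helper: several threads increment a plain counter under the lock.
long run_counter_demo(int num_threads, int iters) {
    McsLock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&lock, &counter, iters] {
            McsNode node;  // each thread owns its queue node
            for (int i = 0; i < iters; ++i) {
                lock.acquire(node);
                ++counter;
                lock.release(node);
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter;
}
```

Calling `run_counter_demo(4, 5000)` should return the full count of increments if mutual exclusion holds. Because every waiter spins on a flag in its own node, a distributed variant can keep that spinning local to the waiting process, which is one reason the MCS design is a natural fit for RMA.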




