Accelerating MPI Collectives with Process-in-Process-based Multi-object Techniques

05/17/2023
by Jiajun Huang, et al.

In the exascale computing era, optimizing MPI collective performance in high-performance computing (HPC) applications is critical. Current collective algorithms suffer performance degradation from system-call overhead, page faults, or data-copy latency, which limits the efficiency and scalability of HPC applications. To address these issues, we propose PiP-MColl, a Process-in-Process-based Multi-object Inter-process MPI Collective design that maximizes small-message MPI collective performance at scale. PiP-MColl features efficient multi-sender and multi-receiver collective algorithms and leverages Process-in-Process shared-memory techniques to eliminate unnecessary system calls, page-fault overhead, and extra data copies, improving both intra- and inter-node message rate and throughput. Our design also boosts performance for larger messages, yielding improvements across a wide range of message sizes. Experimental results show that PiP-MColl outperforms popular MPI libraries, including OpenMPI, MVAPICH2, and Intel MPI, by up to 4.6X for MPI collectives such as MPI_Scatter and MPI_Allgather.
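
For readers less familiar with the two collectives named above, the following is a minimal sketch of their data-movement semantics only, as defined by the MPI standard. It is a single-process toy model in plain Python, not an MPI program, and it does not represent PiP-MColl's multi-object algorithms or its shared-memory mechanism. The function names `scatter` and `allgather` are hypothetical helpers introduced for illustration, and equal-sized chunks per rank are assumed.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def scatter(send_buffer: Sequence[T], num_ranks: int) -> List[List[T]]:
    """Toy model of MPI_Scatter: the root's buffer is split into equal,
    rank-ordered chunks, and rank r receives chunk r."""
    if num_ranks <= 0:
        raise ValueError("num_ranks must be positive")
    if len(send_buffer) % num_ranks != 0:
        raise ValueError("buffer length must be divisible by num_ranks")
    chunk = len(send_buffer) // num_ranks
    return [list(send_buffer[r * chunk:(r + 1) * chunk]) for r in range(num_ranks)]


def allgather(per_rank: Sequence[Sequence[T]]) -> List[List[T]]:
    """Toy model of MPI_Allgather: every rank ends up with the
    concatenation of all ranks' contributions, in rank order."""
    gathered = [item for contribution in per_rank for item in contribution]
    # Each rank gets its own copy of the gathered result.
    return [list(gathered) for _ in per_rank]


if __name__ == "__main__":
    chunks = scatter([0, 1, 2, 3, 4, 5], num_ranks=3)
    print(chunks)
    print(allgather(chunks))
```

In a real MPI library each chunk crosses process and node boundaries, which is where the system-call, page-fault, and copy costs discussed in the abstract arise; the model above deliberately ignores those costs.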

