Quantifying the Performance Benefits of Partitioned Communication in MPI

08/07/2023
by Thomas Gillis, et al.

Partitioned communication was introduced in MPI 4.0 as a user-friendly interface to support pipelined communication patterns, which are particularly common in the context of MPI+threads. It lets the user divide a global buffer into smaller chunks, called partitions, which can then be communicated independently. In this work we first model the performance gain that can be expected when using partitioned communication. Next, we describe the improvements we made to enable those gains and provide a high-quality implementation of MPI partitioned communication. We then evaluate partitioned communication in various common use cases and assess its performance in comparison with other MPI point-to-point and one-sided approaches. Specifically, we first investigate two scenarios commonly encountered for small partition sizes in a multithreaded environment: thread contention and the overhead of using many partitions. We propose two solutions to alleviate the measured penalty and demonstrate their use. We then focus on large messages and the gain obtained when exploiting the delay resulting from computations or load imbalance. We conclude with our perspectives on the benefits of partitioned communication and the various results obtained.
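To make the trade-off described in the abstract concrete, the following is a minimal toy sketch and not the performance model developed in the paper. It assumes a single link with a fixed per-message latency and a fixed bandwidth, and it assumes each partition becomes ready at a known time (for example, when a thread finishes computing it). The function names and parameters are hypothetical and chosen only for illustration.

```python
def bulk_completion_time(ready_times, part_bytes, latency, bandwidth):
    """Completion time when the whole buffer is sent as one message.

    Assumption: the send cannot start until every partition is ready,
    and the single message pays the latency once.
    """
    total_bytes = len(ready_times) * part_bytes
    return max(ready_times) + latency + total_bytes / bandwidth


def partitioned_completion_time(ready_times, part_bytes, latency, bandwidth):
    """Completion time when each partition is sent as soon as it is ready.

    Assumption: partitions are serialized on one link, and every
    partition pays the per-message latency.
    """
    link_free = 0.0
    for ready in sorted(ready_times):
        # A partition starts once it is ready and the link is idle.
        start = max(ready, link_free)
        link_free = start + latency + part_bytes / bandwidth
    return link_free
```

Under these assumptions, staggered readiness favors partitioning: with `ready_times=[1, 2, 3, 4]`, `part_bytes=100`, `latency=0`, and `bandwidth=100`, the partitioned send overlaps transfers with the remaining computation, while the bulk send must wait for the last partition. Conversely, when all partitions are ready at once and the per-message latency is large relative to the transfer time, the partitioned variant pays that latency once per partition and finishes later, which mirrors the small-partition overhead the abstract mentions.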


