Lightweight MPI Communicators with Applications to Perfectly Balanced Schizophrenic Quicksort

10/22/2017
by Michael Axtmann, et al.

MPI uses the concept of communicators to connect groups of processes. It provides nonblocking collective operations on communicators to overlap communication and computation. Flexible algorithms demand flexible communicators. For example, a process can work on different subproblems within different process groups simultaneously, new process groups can be created, or the members of a process group can change. Depending on the number of communicators, the time for communicator creation can drastically increase the running time of the algorithm. Furthermore, a new communicator synchronizes all processes, as communicator creation routines are blocking collective operations. We present RBC, a communication library based on MPI that creates range-based communicators in constant time without communication. These RBC communicators support (non)blocking point-to-point communication as well as (non)blocking collective operations. Our experiments show that the library reduces the time to create a new communicator by a factor of more than 400, while the running time of collective operations remains about the same. We propose Schizophrenic Quicksort, a distributed sorting algorithm that avoids any load imbalance. Using RBC communicators, we improve the performance of this algorithm by a factor of 15 for moderate inputs. Finally, we discuss different approaches to bring nonblocking (local) communicator creation of lightweight (range-based) communicators into MPI.
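To make the contrast concrete, here is a minimal C++/MPI sketch of the idea behind range-based communicators. The names `RangeComm` and `create_range_comm` are hypothetical illustrations, not the actual RBC interface: the point is that a group defined by a rank range over a parent communicator is fully described by a few integers, so each process can "create" it locally in constant time, whereas `MPI_Comm_split` is a blocking collective over all processes of the parent communicator.

```cpp
#include <mpi.h>

// Hypothetical range-based communicator: the triple (first, last, stride)
// over the ranks of a parent communicator fully describes the group, so
// creation is just storing three integers -- no messages, no synchronization.
// (Illustrative only; see the RBC library for the real interface.)
struct RangeComm {
    MPI_Comm parent;  // underlying MPI communicator
    int first;        // first rank of the range in `parent`
    int last;         // last rank of the range (inclusive)
    int stride;       // distance between consecutive group members
};

// Constant-time, purely local creation of a sub-communicator descriptor.
RangeComm create_range_comm(MPI_Comm parent, int first, int last,
                            int stride = 1) {
    return RangeComm{parent, first, last, stride};
}

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Standard MPI: splitting a communicator is a blocking collective --
    // every process of MPI_COMM_WORLD must participate and synchronize.
    MPI_Comm half;
    MPI_Comm_split(MPI_COMM_WORLD, rank < size / 2 ? 0 : 1, rank, &half);

    // Range-based alternative: each process builds the descriptor of any
    // group locally, in O(1), without exchanging a single message.
    RangeComm lower = create_range_comm(MPI_COMM_WORLD, 0, size / 2 - 1);
    RangeComm upper = create_range_comm(MPI_COMM_WORLD, size / 2, size - 1);
    (void)lower; (void)upper;

    MPI_Comm_free(&half);
    MPI_Finalize();
    return 0;
}
```

This locality is what makes recursive algorithms such as the paper's Schizophrenic Quicksort cheap to run on ever-smaller process groups: each recursion level needs new sub-communicators, and with the range-based scheme no level pays for a blocking collective just to set them up.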


