Accurate runtime selection of optimal MPI collective algorithms using analytical performance modelling

04/23/2020
by Emin Nuriyev, et al.

The performance of collective operations has been a critical issue since the advent of MPI. Many algorithms have been proposed for each MPI collective operation, but none has proved optimal in all situations. Different algorithms demonstrate superior performance depending on the platform, the message size, the number of processes, etc. MPI implementations select the collective algorithm empirically, executing a simple runtime decision function. While efficient, this approach does not guarantee the optimal selection. As a more accurate but equally efficient alternative, the use of analytical performance models of collective algorithms for the selection process has been proposed and studied. Unfortunately, previous attempts in this direction have not been successful. We revisit the analytical model-based approach and propose two innovations that significantly improve the selection accuracy of analytical models: (1) We derive analytical models from the code implementing the algorithms rather than from their high-level mathematical definitions. This results in more detailed models. (2) We estimate model parameters separately for each collective algorithm and include the execution of this algorithm in the corresponding communication experiment. We experimentally demonstrate the accuracy and efficiency of our approach using Open MPI broadcast and gather algorithms and a Grid5000 cluster.
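
The idea of model-based selection can be illustrated with a minimal sketch. It is not the authors' method: the two cost formulas below are textbook Hockney-style models (a latency term `alpha` plus a per-byte term `beta`), exactly the kind of high-level definition the abstract says is less accurate than models derived from the implementation code. The function names, the 8 KiB segment size, and all parameter values are hypothetical and chosen only for illustration. The one feature borrowed from the abstract is that each algorithm carries its own separately estimated parameters.

```python
import math

def binomial_bcast_time(P, m, alpha, beta):
    """Textbook estimate: ceil(log2 P) rounds, each sending the full m bytes."""
    return math.ceil(math.log2(P)) * (alpha + beta * m)

def chain_bcast_time(P, m, alpha, beta, seg=8192):
    """Textbook estimate for a pipelined chain with segments of `seg` bytes.

    The last process receives the final segment after
    (P - 1) + (n_seg - 1) segment transfers.
    """
    n_seg = max(1, math.ceil(m / seg))
    seg_bytes = min(m, seg)
    return (P - 1 + n_seg - 1) * (alpha + beta * seg_bytes)

# Hypothetical per-algorithm parameters (seconds, seconds per byte).
# In practice these would be estimated from measurements on the target platform.
MODELS = {
    "binomial": (binomial_bcast_time, {"alpha": 2.0e-6, "beta": 1.0e-9}),
    "chain": (chain_bcast_time, {"alpha": 2.5e-6, "beta": 1.1e-9}),
}

def select_bcast(P, m):
    """Return the algorithm with the smallest predicted time, plus all predictions."""
    predictions = {
        name: model(P, m, **params) for name, (model, params) in MODELS.items()
    }
    best = min(predictions, key=predictions.get)
    return best, predictions

if __name__ == "__main__":
    for size in (64, 4 * 1024 * 1024):
        best, preds = select_bcast(16, size)
        print(size, best, preds)
```

With these made-up parameters and 16 processes, the sketch picks the binomial tree for a 64-byte message and the pipelined chain for a 4 MiB message, which shows how the preferred algorithm can change with message size. Evaluating the formulas costs a few arithmetic operations per call, so a selector of this shape adds little overhead at run time.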


