AITuning: Machine Learning-based Tuning Tool for Run-Time Communication Libraries

09/13/2019
by Alessandro Fanfarillo, et al.

In this work, we address the problem of tuning communication libraries using a deep reinforcement learning approach. Reinforcement learning is a machine learning technique that is particularly effective in game-like settings. Tuning a set of parameters in a communication library to improve the performance of a parallel application can itself be expressed as a game: find the combination of parameter values (the path) that yields the best reward. Although AITuning has been designed to work with different run-time libraries, this work focuses on applying it to OpenCoarrays, a run-time communication library built on top of MPI-3. This work not only shows the potential of reinforcement learning for tuning communication libraries, but also demonstrates how the MPI Tool Information Interface, introduced by the MPI-3 standard, can be used effectively by run-time libraries to improve performance without human intervention.
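To make the "tuning as a game" framing concrete, here is a minimal sketch, not the authors' method. It replaces the deep reinforcement learning agent described in the abstract with a much simpler tabular epsilon-greedy search. The parameter names (`eager_limit_bytes`, `async_progress`) and the `synthetic_runtime` function are hypothetical stand-ins: a real tool would presumably set library control variables (for example through the MPI Tool Information Interface) and time an actual application run. The reward is assumed to be the negative run time.

```python
import itertools
import random

# Hypothetical tunables; names and candidate values are illustrative only.
SEARCH_SPACE = {
    "eager_limit_bytes": [1024, 4096, 16384, 65536],
    "async_progress": [0, 1],
}


def synthetic_runtime(config):
    """Stand-in for one timed application run (seconds); purely synthetic."""
    penalty = abs(config["eager_limit_bytes"] - 16384) / 16384
    return 1.0 + penalty + (0.0 if config["async_progress"] else 0.25)


def tune(measure, space, episodes=200, epsilon=0.1, seed=0):
    """Epsilon-greedy search over all parameter combinations.

    Each combination is one "move"; the reward is the negative run time,
    so maximizing reward means minimizing run time.
    """
    if episodes < 1:
        raise ValueError("episodes must be at least 1")
    rng = random.Random(seed)
    keys = sorted(space)
    arms = [dict(zip(keys, values))
            for values in itertools.product(*(space[k] for k in keys))]
    totals = [0.0] * len(arms)  # summed reward per combination
    counts = [0] * len(arms)    # number of trials per combination

    def mean_reward(i):
        return totals[i] / counts[i]

    for episode in range(episodes):
        if episode < len(arms):
            i = episode                      # try every combination once
        elif rng.random() < epsilon:
            i = rng.randrange(len(arms))     # explore a random combination
        else:
            i = max(range(len(arms)), key=mean_reward)  # exploit the best so far
        totals[i] += -measure(arms[i])
        counts[i] += 1

    tried = [i for i in range(len(arms)) if counts[i] > 0]
    best = max(tried, key=mean_reward)
    return arms[best], -mean_reward(best)


if __name__ == "__main__":
    best_config, best_runtime = tune(synthetic_runtime, SEARCH_SPACE)
    print(best_config, best_runtime)
```

An exhaustive first pass is affordable here only because the toy search space has eight combinations. With many interacting parameters the space grows combinatorially, which is one reason a learned policy is attractive.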


