Measuring Thread Timing to Assess the Feasibility of Early-bird Message Delivery

04/21/2023
by W. Pepper Marts, et al.

Early-bird communication is a communication/computation overlap technique that combines fine-grained communication with partitioned communication to improve application run time. Communication is divided among the compute threads so that each thread can initiate transmission of its portion of the data as soon as that portion is complete, rather than waiting for all of the threads to finish. However, the benefit of early-bird communication depends on the completion timing of the individual threads. In this paper, we measure and evaluate the potential overlap, i.e., the idle time each thread experiences between finishing its computation and the final thread finishing. These measurements help us understand whether a given application could benefit from early-bird communication. We present our technique for gathering this data and evaluate data collected from three proxy applications: MiniFE, MiniMD, and MiniQMC. To characterize the behavior of these workloads, we study the thread timings at both a macro level, i.e., across all threads and all runs of an application, and a micro level, i.e., within a single process of a single run. We observe that these applications exhibit significantly different behavior. MiniFE and MiniQMC appear to be well suited for early-bird communication because of their wider distribution of thread timings and more frequent laggard threads, whereas the behavior of MiniMD may limit its ability to leverage early-bird communication.
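The potential-overlap metric described in the abstract can be illustrated with a minimal sketch. This is not the authors' measurement tooling: the function name `potential_overlap` and the sample finish times are hypothetical, and the sketch assumes only that each thread's completion time within one parallel region is already available as a number of seconds.

```python
from typing import List


def potential_overlap(finish_times: List[float]) -> List[float]:
    """Return each thread's idle time: the gap between its own finish
    time and the finish time of the last (laggard) thread."""
    if not finish_times:
        return []
    last = max(finish_times)  # completion time of the final thread
    return [last - t for t in finish_times]


# Hypothetical finish times (seconds) for four threads in one region.
finish = [1.0, 1.5, 1.25, 2.0]
idle = potential_overlap(finish)
print(idle)  # [1.0, 0.5, 0.75, 0.0]
```

Under this reading, a thread with a large idle time could have started sending its partition that much earlier, while the laggard thread (idle time 0.0) gains nothing. A wider spread of these values suggests more room for early-bird communication.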

