A Study of Single and Multi-device Synchronization Methods in Nvidia GPUs

04/11/2020
by Lingqi Zhang, et al.

GPUs are playing an increasingly important role in general-purpose computing. Many algorithms require synchronization at different levels of granularity within a single GPU, and the emergence of dense GPU nodes also calls for multi-GPU synchronization. Nvidia's latest CUDA provides a variety of synchronization methods, but until now there has been no full understanding of their characteristics. This work explores important undocumented features and provides an in-depth analysis of the performance considerations and pitfalls of the state-of-the-art synchronization methods for Nvidia GPUs. The analysis is intended to inform design choices for applications, libraries, and frameworks running in single- and/or multi-GPU environments. We provide a case study of the commonly used reduction operator to illustrate how the knowledge gained in our analysis can be applied. We also describe our micro-benchmarks and measurement methods.
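As a rough illustration of what "different levels of granularity" can mean for a reduction, the following minimal CUDA sketch combines three common synchronization points: warp-level shuffles, a block-level `__syncthreads()` barrier, and the implicit grid-wide ordering between two kernel launches on the same stream. It is not the authors' benchmark code. The names `warp_reduce_sum`, `block_reduce_sum`, and `reduce_sum_on_gpu`, as well as the launch configuration, are assumptions chosen for illustration, and error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Warp level: lanes exchange registers through shuffles. The full mask
// assumes all 32 lanes of the warp take part in the call.
__inline__ __device__ float warp_reduce_sum(float v) {
    for (int offset = warpSize / 2; offset > 0; offset /= 2)
        v += __shfl_down_sync(0xffffffffu, v, offset);
    return v;  // lane 0 holds the warp's total
}

// Block level: assumes blockDim.x is a multiple of 32 and at most 1024.
__global__ void block_reduce_sum(const float* in, float* out, int n) {
    __shared__ float warp_sums[32];
    int lane = threadIdx.x % warpSize;
    int warp = threadIdx.x / warpSize;

    // Grid-stride loop: each thread accumulates a private partial sum.
    float v = 0.0f;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += gridDim.x * blockDim.x)
        v += in[i];

    v = warp_reduce_sum(v);
    if (lane == 0) warp_sums[warp] = v;
    __syncthreads();  // block-wide barrier before reading warp_sums

    if (warp == 0) {
        int num_warps = blockDim.x / warpSize;
        v = (lane < num_warps) ? warp_sums[lane] : 0.0f;
        v = warp_reduce_sum(v);
        if (lane == 0) out[blockIdx.x] = v;  // one partial sum per block
    }
}

// Hypothetical host wrapper. The second launch starts only after the first
// has finished (same stream), which acts as a grid-wide synchronization.
float reduce_sum_on_gpu(const std::vector<float>& host) {
    const int n = static_cast<int>(host.size());
    const int threads = 256, blocks = 64;  // illustrative, untuned values
    float *d_in = nullptr, *d_partial = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_partial, blocks * sizeof(float));
    cudaMalloc(&d_out, sizeof(float));
    cudaMemcpy(d_in, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    block_reduce_sum<<<blocks, threads>>>(d_in, d_partial, n);
    block_reduce_sum<<<1, threads>>>(d_partial, d_out, blocks);

    float result = 0.0f;
    cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_partial);
    cudaFree(d_out);
    return result;
}
```

Calling `reduce_sum_on_gpu(std::vector<float>(1000, 1.0f))` from host code should return the sum of the input. Which of these synchronization levels is cheapest for a given problem size is the kind of question the analysis addresses, so the split used here should be read as one possible arrangement rather than a recommendation.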


