A high-level characterisation and generalisation of communication-avoiding programming techniques

09/24/2019
by   Tobias Weinzierl, et al.

The explosion of concurrency in today's hardware, together with the explosion of data that both machine learning and scientific simulations build upon, has a multifaceted impact on how we write our codes. Both have changed our notion of performance and, hence, of what good code is: good code has, first of all, to be able to exploit the unprecedented levels of parallelism. To do so, it has to move the data into the compute facilities on time. As communication and memory bandwidth cannot keep pace with the growth in compute capabilities, and as latency increases (at least relative to what the hardware could do), communication-avoiding techniques gain importance. We characterise and classify the field of communication-avoiding algorithms. A review of some examples of communication-avoiding programming by means of our new terminology shows that we are well-advised to broaden our notion of "communication-avoiding" and to look beyond numerical linear algebra. Abstracting, generalising and weakening the term enriches our toolset for tackling the data movement challenges. Through this, we eventually gain access to a richer set of tools that we can use to deliver proper code for current and upcoming hardware generations.

research
12/13/2019

Flexible Communication Avoiding Matrix Multiplication on FPGA with High-Level Synthesis

Data movement is the dominating factor affecting performance and energy ...
research
05/30/2022

Closing the Performance Gap with Modern C++

On the way to Exascale, programmers face the increasing challenge of hav...
research
06/11/2021

COSTA: Communication-Optimal Shuffle and Transpose Algorithm with Process Relabeling

Communication-avoiding algorithms for Linear Algebra have become increas...
research
10/16/2020

Communication-Avoiding and Memory-Constrained Sparse Matrix-Matrix Multiplication at Extreme Scale

Sparse matrix-matrix multiplication (SpGEMM) is a widely used kernel in ...
research
10/23/2017

Communication-avoiding Cholesky-QR2 for rectangular matrices

The need for scalable algorithms to solve least squares and eigenvalue p...
research
01/15/2018

Improving Communication Patterns in Polyhedral Process Networks

Embedded system performances are bounded by power consumption. The trend...
research
01/16/2018

Inter-thread Communication in Multithreaded, Reconfigurable Coarse-grain Arrays

Traditional von Neumann GPGPUs only allow threads to communicate through...
