From array algebra to energy efficiency on GPUs: Data and hardware shapes with dimension-lifting to optimize memory-processor layouts

06/19/2023
by Lenore M. R. Mullin, et al.

We present a new formulation for parallel matrix multiplication (MM) that outperforms the standard row-column code design. The algorithm is formulated in the MoA formalism (A Mathematics of Arrays) and combines an array view of hardware (dimension-lifting), which extends indexing to physical memory/processing units, with a contiguous data layout derived from static transformations. This view of a hardware-software model is thus a bridging model in the sense of Valiant's BSP. OpenACC code was derived from the normal form of the MoA expressions, producing optimal block sizes using the static information of types and shapes. Experiments run on Nvidia V100 GPUs reveal energy consumption that is quadratic in N, i.e. linear in the size of the matrix. More generally, this approach may be an ideal way of formulating, optimizing, and mapping array algorithms to embedded hardware. This work builds upon recently published results of NREL scientists.
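To illustrate the contrast the abstract draws between the row-column design and a contiguous data layout, here is a minimal sketch in plain Python. It is not the paper's MoA-derived OpenACC code and it does no blocking or dimension-lifting; it only shows how reordering the loops changes the memory access pattern. The function names and the flat row-major storage are assumptions made for this example.

```python
def matmul_row_column(a, b, n):
    """Row-column design: each c[i][j] is the dot product of row i of a
    and column j of b. Matrices are flat row-major lists of length n*n
    (an assumption of this sketch)."""
    c = [0.0] * (n * n)
    for i in range(n):
        for j in range(n):
            acc = 0.0
            for k in range(n):
                # b is read with stride n here (column walk).
                acc += a[i * n + k] * b[k * n + j]
            c[i * n + j] = acc
    return c


def matmul_contiguous(a, b, n):
    """Same product with the j and k loops swapped, so the innermost loop
    walks both b and c at unit stride (contiguous access)."""
    c = [0.0] * (n * n)
    for i in range(n):
        for k in range(n):
            a_ik = a[i * n + k]
            for j in range(n):
                # b[k*n + j] and c[i*n + j] are adjacent in memory as j grows.
                c[i * n + j] += a_ik * b[k * n + j]
    return c


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0, 4.0]  # 2x2 matrix [[1, 2], [3, 4]]
    b = [5.0, 6.0, 7.0, 8.0]  # 2x2 matrix [[5, 6], [7, 8]]
    print(matmul_row_column(a, b, 2))
    print(matmul_contiguous(a, b, 2))
```

Both functions perform the same N³ multiply-adds and return the same result; only the order in which memory is touched differs. The approach described in the abstract goes further by deriving the layout and block sizes statically from shapes.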


