Impact of Traditional Sparse Optimizations on a Migratory Thread Architecture

12/14/2018
by   Thomas B. Rolinger, et al.

Achieving high performance for sparse applications is challenging due to irregular access patterns and weak locality. These properties preclude many static optimizations and degrade cache performance on traditional systems. To address these challenges, novel systems such as the Emu architecture have been proposed. The Emu design uses light-weight migratory threads, narrow memory, and near-memory processing capabilities to address weak locality and reduce the total load on the memory system. Because the Emu architecture is fundamentally different from cache-based hierarchical memory systems, it is crucial to understand the cost-benefit tradeoffs of standard sparse algorithm optimizations on Emu hardware. In this work, we explore sparse matrix-vector multiplication (SpMV) on the Emu architecture. We investigate the effects of different sparse optimizations such as dense vector data layouts, work distributions, and matrix reorderings. Our study finds that initially distributing work evenly across the system is inadequate to maintain load balancing over time due to the migratory nature of Emu threads. In severe cases, matrix sparsity patterns produce hot-spots as many migratory threads converge on a single resource. We demonstrate that known matrix reordering techniques can improve SpMV performance on the Emu architecture by as much as 70% [...] performance gain of no more than 16% [...]
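
For readers unfamiliar with the kernel, the following is a minimal sketch of SpMV over a matrix in compressed sparse row (CSR) form, together with a static split of rows into parts with roughly equal nonzero counts. It is illustrative only: the function names `spmv_csr` and `split_rows_by_nnz` are hypothetical, the CSR format and the nonzero-based split are assumptions, and nothing here is taken from the authors' Emu implementation.

```python
from bisect import bisect_left


def spmv_csr(row_ptr, col_idx, vals, x):
    """Compute y = A @ x for a matrix A stored in CSR form."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # x[col_idx[k]] is the irregular access: which entry of x is
            # touched depends entirely on the sparsity pattern.
            y[i] += vals[k] * x[col_idx[k]]
    return y


def split_rows_by_nnz(row_ptr, parts):
    """Return row boundaries giving each part a similar number of nonzeros.

    Part p covers rows bounds[p] .. bounds[p + 1] - 1 (hypothetical helper).
    """
    nnz = row_ptr[-1]
    bounds = [0]
    for p in range(1, parts):
        target = p * nnz / parts
        # First row boundary whose cumulative nonzero count reaches the target.
        bounds.append(max(bounds[-1], bisect_left(row_ptr, target)))
    bounds.append(len(row_ptr) - 1)
    return bounds


# 3x3 example: [[1, 0, 2], [0, 3, 0], [4, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
vals = [1.0, 2.0, 3.0, 4.0, 5.0]
y = spmv_csr(row_ptr, col_idx, vals, [1.0, 1.0, 1.0])
bounds = split_rows_by_nnz(row_ptr, 2)
```

A split like this balances work only at the start. It says nothing about where the entries of `x` referenced by each part reside, which is the kind of effect the abstract attributes to thread migration.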


