Composing Loop-carried Dependence with Other Loops

11/24/2021
by Kazem Cheshmi, et al.

Sparse fusion combines a compile-time loop transformation with runtime scheduling and is implemented as a domain-specific code generator. It generates efficient parallel code for the combination of two sparse matrix kernels where at least one of the kernels has loop-carried dependencies. Available implementations optimize individual sparse kernels. When kernels are optimized separately, their irregular dependence patterns create synchronization overheads and load imbalance, and their irregular memory access patterns result in inefficient cache usage, both of which reduce parallel efficiency. Sparse fusion uses a novel inspection strategy with code transformations to generate parallel fused code for sparse kernel combinations that is optimized for data locality and load balance. Code generated by Sparse fusion outperforms the existing implementations ParSy and MKL by an average of 1.6X and 5.1X, respectively, and outperforms the LBC and DAGP coarsening strategies applied to a fused data dependence graph by an average of 5.1X and 7.2X, respectively, across various kernel combinations.
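To make the idea of fusing a kernel that has loop-carried dependencies with another kernel concrete, here is a minimal sequential sketch in Python. It is not the Sparse fusion inspector or its generated code. It assumes a sparse lower-triangular solve (L x = b) followed by a sparse matrix-vector product (y = A x), both in CSR format with the diagonal of L stored last in each row. The function names (`solve_row`, `spmv_row`, `inspect`, `fused`, `unfused`) and the simple "run each product row as soon as its last input is solved" schedule are illustrative assumptions, and the sketch does not address parallelism or load balance.

```python
def solve_row(Lp, Li, Lx, b, x, i):
    """Compute x[i] for lower-triangular CSR L (diagonal stored last in the row)."""
    s = b[i]
    for p in range(Lp[i], Lp[i + 1] - 1):
        s -= Lx[p] * x[Li[p]]  # loop-carried: reads x[j] written by earlier rows
    x[i] = s / Lx[Lp[i + 1] - 1]


def spmv_row(Ap, Ai, Ax, x, i):
    """Return row i of A times x for CSR A."""
    s = 0.0
    for p in range(Ap[i], Ap[i + 1]):
        s += Ax[p] * x[Ai[p]]
    return s


def unfused(n, Lp, Li, Lx, b, Ap, Ai, Ax):
    """Baseline: finish the whole solve, then run the whole product."""
    x = [0.0] * n
    for i in range(n):
        solve_row(Lp, Li, Lx, b, x, i)
    y = [spmv_row(Ap, Ai, Ax, x, i) for i in range(n)]
    return x, y


def inspect(n, Ap, Ai):
    """Hypothetical inspector: group product rows by the last solve row they read."""
    ready = [[] for _ in range(n)]
    for i in range(n):
        cols = Ai[Ap[i]:Ap[i + 1]]
        last = max(cols) if cols else 0
        ready[last].append(i)
    return ready


def fused(n, Lp, Li, Lx, b, Ap, Ai, Ax):
    """Fused executor: interleave product rows with the solve while x is still hot."""
    ready = inspect(n, Ap, Ai)
    x = [0.0] * n
    y = [0.0] * n
    for k in range(n):
        solve_row(Lp, Li, Lx, b, x, k)
        for i in ready[k]:  # every x[j] that row i of A needs is now available
            y[i] = spmv_row(Ap, Ai, Ax, x, i)
    return x, y


# Small example: L = [[2,0,0],[1,1,0],[0,3,4]], A = [[1,0,2],[0,1,0],[1,0,0]]
Lp, Li, Lx = [0, 1, 3, 5], [0, 0, 1, 1, 2], [2.0, 1.0, 1.0, 3.0, 4.0]
Ap, Ai, Ax = [0, 2, 3, 4], [0, 2, 1, 0], [1.0, 2.0, 1.0, 1.0]
b = [2.0, 3.0, 11.0]

x_u, y_u = unfused(3, Lp, Li, Lx, b, Ap, Ai, Ax)
x_f, y_f = fused(3, Lp, Li, Lx, b, Ap, Ai, Ax)
```

Both schedules produce the same `x` and `y`; the fused one only changes when each product row runs. The inspector here depends only on the sparsity pattern of A, so under these assumptions it could be reused whenever the pattern stays fixed and only the numerical values change.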

research
11/24/2021

Vectorizing Sparse Matrix Codes with Dependency Driven Trace Analysis

Sparse computations frequently appear in scientific simulations and the ...
research
03/17/2022

FUSED-PAGERANK: Loop-Fusion based Approximate PageRank

PageRank is a graph centrality metric that gives the importance of each ...
research
03/21/2021

Graph Transformation and Specialized Code Generation For Sparse Triangular Solve (SpTRSV)

Sparse Triangular Solve (SpTRSV) is an important computational kernel us...
research
10/24/2017

High-Performance Code Generation though Fusion and Vectorization

We present a technique for automatically transforming kernel-based compu...
research
05/18/2017

Sympiler: Transforming Sparse Matrix Codes by Decoupling Symbolic Analysis

Sympiler is a domain-specific code generator that optimizes sparse matri...
research
12/28/2019

A Unified Iteration Space Transformation Framework for Sparse and Dense Tensor Algebra

We address the problem of optimizing mixed sparse and dense tensor algeb...
research
07/22/2021

Hyperbolic Diffusion in Flux Reconstruction: Optimisation through Kernel Fusion within Tensor-Product Elements

Novel methods are presented in this initial study for the fusion of GPU ...
