Vectorizing Sparse Matrix Codes with Dependency Driven Trace Analysis

by Zachary Cetinic, et al.

Sparse computations frequently appear in scientific simulations, and the performance of these simulations relies heavily on the optimization of the sparse codes. The compact data structures and irregular computation patterns in sparse matrix computations make these codes challenging to vectorize. Available approaches primarily vectorize regular regions of computation in the sparse code. They also reorganize data and computations, at a cost, to increase the number of regular regions. In this work, we propose a novel polyhedral model, called the partially strided codelet (PSC), that enables the vectorization of computation regions with irregular data access patterns. PSCs also improve data locality in sparse computations. Our DDF inspector-executor framework efficiently mines the memory accesses in the sparse computation, using an access function differentiation approach, to find PSCs. It generates vectorized code for the sparse matrix-vector multiplication kernel (SpMV), a kernel with parallel outer loops, and for kernels with loop-carried dependences, specifically the sparse triangular solver (SpTRSV). We demonstrate the performance of the DDF-generated code on a set of 60 large and small matrices (0.05-130M nonzeros). DDF outperforms the highly specialized library MKL with average speedups of 1.93X and 4.5X for SpMV and SpTRSV, respectively. For the same matrices, DDF outperforms the state-of-the-art inspector-executor framework Sympiler [1] for the SpTRSV kernel by up to 11X and the work by Augustine et al. [2] for the SpMV kernel by up to 12X.
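To make the idea of differencing memory accesses concrete, the following is a minimal sketch and not the DDF implementation. It assumes a CSR matrix stored in three plain lists (row_ptr, col_idx, vals) and looks only for the simplest regular pattern: runs of consecutive column indices within a row, found by taking first-order differences of col_idx. The function names unit_stride_runs and spmv_runs are hypothetical and chosen for illustration; the PSC model described in the abstract covers more general patterns than this.

```python
from typing import List, Tuple


def spmv_csr(row_ptr: List[int], col_idx: List[int],
             vals: List[float], x: List[float]) -> List[float]:
    """Reference CSR SpMV: y = A * x, one nonzero at a time."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += vals[k] * x[col_idx[k]]
    return y


def unit_stride_runs(col_idx: List[int], start: int,
                     end: int) -> List[Tuple[int, int]]:
    """Inspector step (illustrative assumption): split col_idx[start:end]
    into maximal runs whose consecutive differences equal 1.
    Returns (first_nonzero_position, run_length) pairs."""
    runs = []
    k = start
    while k < end:
        j = k
        # Extend the run while the first-order difference is exactly 1.
        while j + 1 < end and col_idx[j + 1] - col_idx[j] == 1:
            j += 1
        runs.append((k, j - k + 1))
        k = j + 1
    return runs


def spmv_runs(row_ptr: List[int], col_idx: List[int],
              vals: List[float], x: List[float]) -> List[float]:
    """Executor step (illustrative assumption): process each run as a
    contiguous slice of vals and x, the shape a SIMD load could use."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k, n in unit_stride_runs(col_idx, row_ptr[i], row_ptr[i + 1]):
            c = col_idx[k]  # x is read contiguously from c to c + n - 1
            y[i] += sum(v * xv for v, xv in zip(vals[k:k + n], x[c:c + n]))
    return y
```

In this sketch a run of length 1 falls back to a scalar access, while longer runs read both vals and x contiguously, so no gather through col_idx is needed inside the run. A real inspector would amortize the run detection over many executions of the kernel rather than repeating it on every call as spmv_runs does here.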



Composing Loop-carried Dependence with Other Loops

Sparse fusion is a compile-time loop transformation and runtime scheduli...

Intelligent-Unrolling: Exploiting Regular Patterns in Irregular Applications

Modern optimizing compilers are able to exploit memory access or computa...

Sympiler: Transforming Sparse Matrix Codes by Decoupling Symbolic Analysis

Sympiler is a domain-specific code generator that optimizes sparse matri...

Expressing Sparse Matrix Computations for Productive Performance on Spatial Architectures

This paper addresses spatial programming of sparse matrix computations f...

Graph Transformation and Specialized Code Generation For Sparse Triangular Solve (SpTRSV)

Sparse Triangular Solve (SpTRSV) is an important computational kernel us...

A Graph Transformation Strategy for Optimizing SpTRSV

Sparse triangular solve (SpTRSV) is an extensively studied computational...

SpArch: Efficient Architecture for Sparse Matrix Multiplication

Generalized Sparse Matrix-Matrix Multiplication (SpGEMM) is a ubiquitous...