Performance Enhancement Strategies for Sparse Matrix-Vector Multiplication (SpMV) and Iterative Linear Solvers
Iterative solutions of sparse linear systems and sparse eigenvalue problems have a fundamental role in vital fields of scientific research and engineering. The crucial computing kernel for such iterative solutions is the multiplication of a sparse matrix by a dense vector. Efficient implementation of sparse matrix-vector multiplication (SpMV) and linear solvers are therefore essential and has been subjected to extensive research across a variety of computing architectures and accelerators such as central processing units (CPUs), graphical processing units (GPUs), many integrated cores (MICs), and field programmable gate arrays (FPGAs). Unleashing the full potential of an architecture/accelerator requires determining the factors that affect an efficient implementation of SpMV. This article presents the first of its kind, in-depth survey covering over two hundred state-of-the-art optimization schemes for solving sparse iterative linear systems with a focus on computing SpMV. A new taxonomy for iterative solutions and SpMV techniques common to all architectures is proposed. This article includes reviews of SpMV techniques for all architectures to consolidate a single taxonomy to encourage cross-architectural and heterogeneous-architecture developments. However, the primary focus is on GPUs. The major contributions as well as the primary, secondary, and tertiary contributions of the SpMV techniques are first highlighted utilizing the taxonomy and then qualitatively compared. A summary of the current state of the research for each architecture is discussed separately. Finally, several open problems and key challenges for future research directions are outlined.
READ FULL TEXT