Auto-SpMV: Automated Optimizing SpMV Kernels on GPU

02/11/2023
by   Mina Ashoury, et al.

Sparse matrix-vector multiplication (SpMV) is an essential linear algebra operation that dominates the computing cost in many scientific applications. Because GPUs provide massive parallelism and high memory bandwidth, they are commonly used to accelerate SpMV kernels. Prior studies have mainly focused on reducing the latency of SpMV kernels on GPUs. However, few attempts have been made to improve the energy efficiency of SpMV kernels, which has left GPUs excluded from low-power applications. Furthermore, prior work has primarily focused on optimizing the sparse format of SpMV kernels, and the literature has ignored the impact of tuning compilation parameters. Lastly, little attention has been paid to preparing a comprehensive training dataset of SpMV kernel runs and to fine-tuning the learning hyperparameters. To address these limitations, we present a novel framework, dubbed Auto-SpMV, that enables energy-efficient and low-latency SpMV kernels on GPUs. To achieve the best run-time performance, Auto-SpMV offers two optimization modes: compile-time and run-time. In the compile-time mode, Auto-SpMV tunes the compilation parameters, while in the run-time mode, it selects the best sparse format for the sparse input matrix. To achieve the best classification results, 1) we collect the largest dataset to date, comprising 30 different sparse matrices run with more than 15K different configurations, and 2) we boost the classification models by automatically fine-tuning the learning hyperparameters. Experimental results reveal that Auto-SpMV optimizes latency, energy consumption, average power, and energy efficiency in the compile-time mode by up to 51.9 [...] setting. Auto-SpMV optimizes average power and energy efficiency in the run-time mode by up to 34.6 [...] setting.
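To make the role of the sparse format concrete, the following is a minimal sketch of SpMV (y = A x) under two common storage formats, CSR and ELL. It is not the Auto-SpMV implementation and runs on the CPU in plain Python rather than as a GPU kernel; the function names (dense_to_csr, spmv_csr, dense_to_ell, spmv_ell) and the small example matrix are assumptions chosen for illustration only. The abstract does not list which formats Auto-SpMV chooses among, so CSR and ELL here are just representative examples.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def dense_to_csr(dense: Matrix) -> Tuple[List[float], List[int], List[int]]:
    """Convert a dense matrix to CSR: (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, a in enumerate(row):
            if a != 0:
                values.append(a)
                col_idx.append(j)
        # row_ptr[i + 1] marks where row i ends in values/col_idx
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def spmv_csr(values, col_idx, row_ptr, x: List[float]) -> List[float]:
    """y = A x with A in CSR; work per row equals that row's nonzero count."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def dense_to_ell(dense: Matrix) -> Tuple[Matrix, List[List[int]], int]:
    """Convert a dense matrix to ELL: every row padded to the same width."""
    width = max((sum(1 for a in row if a != 0) for row in dense), default=0)
    ell_vals, ell_cols = [], []
    for row in dense:
        vals = [a for a in row if a != 0]
        cols = [j for j, a in enumerate(row) if a != 0]
        pad = width - len(vals)
        # Padding uses value 0.0 and column 0, so it contributes nothing to y
        ell_vals.append(vals + [0.0] * pad)
        ell_cols.append(cols + [0] * pad)
    return ell_vals, ell_cols, width


def spmv_ell(ell_vals, ell_cols, width: int, x: List[float]) -> List[float]:
    """y = A x with A in ELL; every row does exactly `width` multiply-adds."""
    y = []
    for vals, cols in zip(ell_vals, ell_cols):
        acc = 0.0
        for k in range(width):
            acc += vals[k] * x[cols[k]]
        y.append(acc)
    return y


# Hypothetical 4x4 example with an empty row and uneven row lengths
A = [
    [4, 0, 0, 1],
    [0, 3, 0, 0],
    [0, 0, 0, 0],
    [2, 0, 5, 0],
]
x = [1, 2, 3, 4]

y_csr = spmv_csr(*dense_to_csr(A), x)
y_ell = spmv_ell(*dense_to_ell(A), x)
```

Both formats produce the same result vector, but they lay out the work differently: CSR stores only the nonzeros and does a variable amount of work per row, while ELL pads every row to a common width and does uniform work per row at the cost of storing and multiplying padding. Which trade-off wins depends on how the nonzeros of the input matrix are distributed, which is the kind of per-matrix decision the run-time mode described above is meant to automate.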


