Input-Aware Auto-Tuning of Compute-Bound HPC Kernels

02/15/2018
by   Philippe Tillet, et al.

Efficient implementations of HPC applications for parallel architectures generally rely on external software packages (e.g., BLAS, LAPACK, cuDNN). While these libraries provide highly optimized routines for inputs with certain characteristics (e.g., square matrices), they generally do not retain optimal performance across the wide range of problems encountered in practice. In this paper, we present ISAAC, an input-aware auto-tuning framework for matrix multiplications and convolutions, which uses predictive modeling techniques to drive highly parameterized PTX code templates towards kernels that are not only hardware-specific but also application-specific. Numerical experiments on the NVIDIA Maxwell and Pascal architectures show up to 3x performance gains over both cuBLAS and cuDNN after only a few hours of auto-tuning.

research
04/10/2019

Cross-Platform Performance Portability Using Highly Parametrized SYCL Kernels

Over recent years heterogeneous systems have become more prevalent acros...
research
08/21/2022

IAAT: A Input-Aware Adaptive Tuning framework for Small GEMM

GEMM with the small size of input matrices is becoming widely used in ma...
research
06/19/2018

A model-driven approach for a new generation of adaptive libraries

Efficient high-performance libraries often expose multiple tunable param...
research
03/15/2020

Towards automated kernel selection in machine learning systems: A SYCL case study

Automated tuning of compute kernels is a popular area of research, mainl...
research
04/25/2023

Performance Optimization using Multimodal Modeling and Heterogeneous GNN

Growing heterogeneity and configurability in HPC architectures has made ...
research
08/30/2020

Performance portability through machine learning guided kernel selection in SYCL libraries

Automatically tuning parallel compute kernels allows libraries and frame...
research
08/14/2020

Toward an End-to-End Auto-tuning Framework in HPC PowerStack

Efficiently utilizing procured power and optimizing performance of scien...
