High-Performance Code Generation through Fusion and Vectorization

10/24/2017
by Jason Sewall, et al.

We present a technique for automatically transforming kernel-based computations in disparate, nested loops into a fused, vectorized form that can reduce intermediate storage needs and lead to improved performance on contemporary hardware. We introduce representations for the abstract relationships and data dependencies of kernels in loop nests, along with algorithms for manipulating them into a more efficient form. We similarly introduce techniques for determining data access patterns for stencil-like array accesses and show how these patterns can be used to elide storage and improve vectorization. We discuss our prototype implementation of these ideas, named HFAV, and its use of a declarative, inference-based front-end to drive transformations, and we present results for some prominent codes in HPC.
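To make the storage-elision idea concrete, here is a minimal hand-written sketch in C. It is not output from HFAV and does not reflect its actual representations or algorithms. The function names `laplace_unfused` and `laplace_fused`, and the choice of a 1D second-difference stencil, are assumptions made purely for illustration. The unfused version passes a full intermediate array between two loops. The fused version observes that the second kernel only ever reads the current and previous intermediate values, so a single carried scalar can stand in for the array.

```c
#include <stddef.h>

/* Unfused form: two separate loops communicate through a full
 * intermediate array `flux` holding n - 1 elements. */
void laplace_unfused(const double *u, double *out, double *flux, size_t n)
{
    /* Kernel 1: forward difference, flux[i] = u[i+1] - u[i]. */
    for (size_t i = 0; i + 1 < n; ++i)
        flux[i] = u[i + 1] - u[i];

    /* Kernel 2: difference of neighboring fluxes (interior points only). */
    for (size_t i = 1; i + 1 < n; ++i)
        out[i] = flux[i] - flux[i - 1];
}

/* Fused form: kernel 2 reads only flux[i] and flux[i-1], so the
 * intermediate array shrinks to one scalar carried across iterations. */
void laplace_fused(const double *u, double *out, size_t n)
{
    if (n < 3)
        return;                        /* no interior points to compute */

    double prev = u[1] - u[0];         /* plays the role of flux[0] */
    for (size_t i = 1; i + 1 < n; ++i) {
        double cur = u[i + 1] - u[i];  /* plays the role of flux[i] */
        out[i] = cur - prev;
        prev = cur;                    /* roll the one-element buffer */
    }
}
```

Both functions write the same values to `out[1]` through `out[n-2]` and leave the endpoints untouched. Whether a compiler vectorizes the fused loop as written depends on how it handles the carried scalar, so this sketch illustrates only the reduction in intermediate storage, not a performance result.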

research
05/07/2022

Monte Cimone: Paving the Road for the First Generation of RISC-V High-Performance Computers

The new open and royalty-free RISC-V ISA is attracting interest across t...
research
11/24/2021

Composing Loop-carried Dependence with Other Loops

Sparse fusion is a compile-time loop transformation and runtime scheduli...
research
12/22/2021

Survey the storage systems used in HPC and BDA ecosystems

The advancement in HPC and BDA ecosystem demands a better understanding ...
research
05/10/2017

Performance Evaluation and Modeling of HPC I/O on Non-Volatile Memory

HPC applications pose high demands on I/O performance and storage capabi...
research
05/24/2023

Model-Based Performance Analysis of the HyTeG Finite Element Framework

In this work, we present how code generation techniques significantly im...
research
08/20/2017

Fast Access to Columnar, Hierarchically Nested Data via Code Transformation

Big Data query systems represent data in a columnar format for fast, sel...
