
Tiramisu: A Code Optimization Framework for High Performance Systems

by Riyadh Baghdadi, et al.
Politecnico di Milano

This paper introduces Tiramisu, an optimization framework designed to generate efficient code for high-performance systems such as multicores, GPUs, FPGAs, distributed machines, or any combination of these. Tiramisu relies on a flexible representation based on the polyhedral model and introduces a novel four-level IR that fully separates algorithms, schedules, data layouts, and communication. This separation simplifies targeting multiple hardware architectures from the same algorithm. We evaluate Tiramisu by writing a set of linear algebra and DNN kernels and by integrating it as a pass in the Halide compiler. We show that Tiramisu extends Halide with many new capabilities, and that Tiramisu can generate efficient code for multicores, GPUs, FPGAs, and distributed heterogeneous systems. The performance of code generated by the Tiramisu backends matches or exceeds hand-optimized reference implementations. For example, the multicore backend matches the highly optimized Intel MKL library on many kernels and shows speedups of up to 4x over the original Halide.
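
To make the idea of separating an algorithm from its schedule concrete, here is a minimal Python sketch. It does not use Tiramisu or its API, and it covers only two of the four concerns named in the abstract (algorithm and schedule, not data layout or communication). All names in it (`matmul_point`, `row_major`, `tiled`, `run`) are hypothetical and chosen purely for illustration. The algorithm states what is computed at one iteration point; each schedule only decides the order in which the points are visited.

```python
from itertools import product


def matmul_point(A, B, C, i, j, k):
    """Algorithm: what to compute at a single iteration point (i, j, k)."""
    C[i][j] += A[i][k] * B[k][j]


def row_major(n, m, p):
    """Schedule 1 (illustrative): plain nested-loop order."""
    yield from product(range(n), range(m), range(p))


def tiled(n, m, p, tile=2):
    """Schedule 2 (illustrative): visit the same points tile by tile."""
    tile_origins = product(range(0, n, tile), range(0, m, tile), range(0, p, tile))
    for i0, j0, k0 in tile_origins:
        yield from product(
            range(i0, min(i0 + tile, n)),
            range(j0, min(j0 + tile, m)),
            range(k0, min(k0 + tile, p)),
        )


def run(A, B, schedule):
    """Apply the unchanged algorithm in the order chosen by the schedule."""
    n, p, m = len(A), len(B), len(B[0])
    C = [[0] * m for _ in range(n)]
    for i, j, k in schedule(n, m, p):
        matmul_point(A, B, C, i, j, k)
    return C


A = [[1, 2, 3], [4, 5, 6]]
B = [[7, 8], [9, 10], [11, 12]]

# Both schedules visit every point exactly once, so the results agree.
print(run(A, B, row_major))  # [[58, 64], [139, 154]]
print(run(A, B, tiled))      # [[58, 64], [139, 154]]
```

Swapping `row_major` for `tiled` changes the traversal order without touching `matmul_point`, which is the property the abstract credits with simplifying retargeting. Integer inputs are used so that reordering the reduction cannot change the result; with floating-point data, a different order could produce slightly different rounding.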




Tiramisu: A Polyhedral Compiler for Expressing Fast and Portable Code

This paper introduces Tiramisu, a polyhedral framework designed to gener...

Technical Report about Tiramisu: a Three-Layered Abstraction for Hiding Hardware Complexity from DSL Compilers

High-performance DSL developers work hard to take advantage of modern ha...

Bring the BitCODE – Moving Compute and Data in Distributed Heterogeneous Systems

In this paper, we present a framework for moving compute and data betwee...

HDCC: A Hyperdimensional Computing compiler for classification on embedded systems and high-performance computing

Hyperdimensional Computing (HDC) is a bio-inspired computing framework t...

Preparing Ginkgo for AMD GPUs – A Testimonial on Porting CUDA Code to HIP

With AMD reinforcing their ambition in the scientific high performance c...

AnySeq: A High Performance Sequence Alignment Library based on Partial Evaluation

Sequence alignments are fundamental to bioinformatics which has resulted...

Using Deep Neural Networks for Estimating Loop Unrolling Factor

Optimizing programs requires deep expertise. On one hand, it is a tediou...