Deinsum: Practically I/O Optimal Multilinear Algebra

The performance of multilinear algebra kernels on modern massively parallel systems is determined mainly by data movement. However, deriving data movement-optimal distributed schedules for programs with many high-dimensional inputs is a notoriously hard problem. State-of-the-art libraries rely on heuristics and often fall back to suboptimal tensor folding and BLAS calls. To address this problem, we present Deinsum, an automated framework for distributed multilinear algebra computations expressed in Einstein notation, built on rigorous mathematical tools. Our framework automatically derives data movement-optimal tilings and generates the corresponding distributed schedules, and it further improves the performance of local computations by increasing their arithmetic intensity. To show the benefits of our approach, we test it on two important classes of tensor kernels: Matricized Tensor Times Khatri-Rao Products (MTTKRP) and Tensor Times Matrix chains (TTMc). We report performance and scaling results on the Piz Daint supercomputer, with up to 19x speedup over state-of-the-art solutions on 512 nodes.
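For readers unfamiliar with the two kernel classes, the following is a minimal single-node sketch of what they look like in Einstein notation. It uses NumPy's `einsum`, not Deinsum's own interface, and the function names, the 3-way tensor order, and the choice of mode 0 are assumptions made only for illustration. The last function shows the "tensor folding plus BLAS call" fallback the abstract refers to: the tensor is flattened into a matrix and the Khatri-Rao product is materialized so that a single matrix multiplication can be used.

```python
import numpy as np


def mttkrp_mode0(X, B, C):
    # Mode-0 MTTKRP of a 3-way tensor, written directly in Einstein notation:
    #   M[i, r] = sum_{j, k} X[i, j, k] * B[j, r] * C[k, r]
    return np.einsum("ijk,jr,kr->ir", X, B, C)


def ttmc(X, U, V, W):
    # Tensor Times Matrix chain: contract every mode of X with one matrix:
    #   Y[a, b, c] = sum_{i, j, k} X[i, j, k] * U[i, a] * V[j, b] * W[k, c]
    return np.einsum("ijk,ia,jb,kc->abc", X, U, V, W)


def mttkrp_mode0_folded(X, B, C):
    # Folding-based formulation (illustrative): flatten X to an I x (J*K)
    # matrix, build the (J*K) x R Khatri-Rao product explicitly, then call
    # one matrix multiplication. The intermediate costs J*K*R extra memory.
    I, J, K = X.shape
    R = B.shape[1]
    khatri_rao = np.einsum("jr,kr->jkr", B, C).reshape(J * K, R)
    return X.reshape(I, J * K) @ khatri_rao


# Small usage example with arbitrary, hypothetical sizes.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 5, 6))
B = rng.standard_normal((5, 3))
C = rng.standard_normal((6, 3))
M_direct = mttkrp_mode0(X, B, C)
M_folded = mttkrp_mode0_folded(X, B, C)
```

Both MTTKRP variants compute the same result; they differ in the intermediate data they create and move, which is the cost the abstract identifies as dominant.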


research
07/28/2022

SpDISTAL: Compiling Distributed Sparse Tensor Computations

We introduce SpDISTAL, a compiler for sparse tensor algebra that targets...
research
05/23/2022

SparseLNR: Accelerating Sparse Tensor Computations Using Loop Nest Restructuring

Sparse tensor algebra computations have become important in many real-wo...
research
01/30/2017

Riemann Tensor Polynomial Canonicalization by Graph Algebra Extension

Tensor expression simplification is an "ancient" topic in computer algeb...
research
11/18/2022

Compiling Structured Tensor Algebra

Tensor algebra is essential for data-intensive workloads in various comp...
research
04/11/2017

Strassen's Algorithm for Tensor Contraction

Tensor contraction (TC) is an important computational kernel widely used...
research
08/20/2021

On the Parallel I/O Optimality of Linear Algebra Kernels: Near-Optimal Matrix Factorizations

Matrix factorizations are among the most important building blocks of sc...
research
06/13/2019

Post-Processing of High-Dimensional Data

Scientific computations or measurements may result in huge volumes of da...
