Automated Derivation of Parametric Data Movement Lower Bounds for Affine Programs

11/15/2019
by Auguste Olivry, et al.

For most relevant computations, the energy and time needed for data movement dominate those needed for arithmetic operations on all computing systems today. Hence it is critically important to understand the minimal total data movement achievable during the execution of an algorithm. The total data movement achieved by different schedules of an algorithm can vary widely depending on how efficiently the cache is used, e.g., untiled versus effectively tiled matrix-matrix multiplication. A significant current challenge is that no existing tool can meaningfully quantify how much the data movement of a computation could be reduced by using the cache more effectively through operation rescheduling. Asymptotic parametric expressions of data movement lower bounds have previously been derived manually for a limited number of algorithms, often without scaling constants. In this paper, we present the first compile-time approach for deriving non-asymptotic parametric expressions of data movement lower bounds for arbitrary affine computations. The approach has been implemented in a fully automatic tool (IOLB) that generates these lower bounds for input affine programs. IOLB's use is demonstrated by exercising it on all the benchmarks of the PolyBench suite. IOLB has several advantages: (1) It enables us to derive bounds for a few dozen algorithms for which such lower bounds had never been derived, a productivity gain made possible by automation. (2) Anyone can obtain these lower bounds through IOLB; no expertise is required. (3) For some of the most well-studied algorithms, the lower bounds obtained by IOLB are higher than any previously reported manually derived lower bounds.
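To make the untiled-versus-tiled contrast concrete, the following is a minimal sketch that counts loads into a small, fully associative LRU cache for two schedules of the same matrix-matrix multiplication. It is only an illustration of how data movement depends on the schedule; it is not IOLB's method and it does not compute a lower bound. The names `LRUCache`, `untiled_misses`, and `tiled_misses` are hypothetical, as are the problem size, tile size, and cache capacity. The model counts one load per cache miss and ignores write-backs.

```python
from collections import OrderedDict


class LRUCache:
    """Fully associative LRU cache holding `capacity` array elements (assumed model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # address -> True, ordered from oldest to newest
        self.misses = 0

    def access(self, addr):
        if addr in self.lines:
            self.lines.move_to_end(addr)  # hit: mark as most recently used
        else:
            self.misses += 1  # miss: one element moved from slow memory
            self.lines[addr] = True
            if len(self.lines) > self.capacity:
                self.lines.popitem(last=False)  # evict the least recently used


def untiled_misses(n, capacity):
    """Count cache misses for the plain i-j-k schedule of C += A * B."""
    cache = LRUCache(capacity)
    for i in range(n):
        for j in range(n):
            for k in range(n):
                cache.access(("A", i, k))
                cache.access(("B", k, j))
                cache.access(("C", i, j))
    return cache.misses


def tiled_misses(n, capacity, t):
    """Count cache misses for the same computation with t x t x t tiling."""
    cache = LRUCache(capacity)
    for ii in range(0, n, t):
        for jj in range(0, n, t):
            for kk in range(0, n, t):
                for i in range(ii, min(ii + t, n)):
                    for j in range(jj, min(jj + t, n)):
                        for k in range(kk, min(kk + t, n)):
                            cache.access(("A", i, k))
                            cache.access(("B", k, j))
                            cache.access(("C", i, j))
    return cache.misses


if __name__ == "__main__":
    n, capacity, t = 24, 64, 4  # illustrative values; three 4x4 tiles fit in 64 elements
    print("untiled loads:", untiled_misses(n, capacity))
    print("tiled loads:  ", tiled_misses(n, capacity, t))
```

Both schedules perform the same arithmetic, but under these assumptions the tiled one loads fewer elements because each tile is reused while it is still resident. A data movement lower bound answers the complementary question: how few loads any valid schedule could achieve for a given cache size.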


