
Tight Memory-Independent Parallel Matrix Multiplication Communication Lower Bounds

by Hussam Al Daas, et al.
Wake Forest University
Science and Technology Facilities Council

Communication lower bounds have long been established for matrix multiplication algorithms. However, most methods of asymptotic analysis have either ignored the constant factors or not obtained the tightest possible values. Recent work has demonstrated that more careful analysis improves the best known constants for some classical matrix multiplication lower bounds and helps to identify more efficient algorithms that match the leading-order terms in the lower bounds exactly and improve practical performance. The main result of this work is the establishment of memory-independent communication lower bounds with tight constants for parallel matrix multiplication. Our constants improve on previous work in each of three cases that depend on the relative sizes of the aspect ratios of the matrices.
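The abstract does not state the three cases or their constants, so the following is only a sketch of how such a case analysis might be evaluated. It assumes the form of the bound recalled from the full paper: with the three matrix dimensions sorted as m >= n >= k and P processors, the cases split at P = m/n and P = mn/k^2, and the per-processor bound is D minus the (mn + mk + nk)/P words a processor holds. The function name, the case labels, and the formulas themselves are assumptions here and should be checked against the paper's main theorem.

```python
def memory_independent_bound(n1, n2, n3, P):
    """Sketch of a three-case communication lower bound (assumed form).

    n1 x n2 times n2 x n3 multiplication on P processors.
    Returns (case_label, D, words_communicated_lower_bound).
    The thresholds and constants below are assumptions, not quoted
    from the abstract.
    """
    # Sort the dimensions so that m >= n >= k.
    k, n, m = sorted((n1, n2, n3))

    if P <= m / n:
        # Few processors relative to the largest aspect ratio.
        case = "1D"
        D = (m * n + m * k) / P + n * k
    elif P <= m * n / k**2:
        # Intermediate regime.
        case = "2D"
        D = 2 * (m * n * k * k / P) ** 0.5 + m * n / P
    else:
        # Many processors: the familiar 3 * (mnk/P)^(2/3) term.
        case = "3D"
        D = 3 * (m * n * k / P) ** (2 / 3)

    # Subtract the data a processor is assumed to hold under an even split.
    owned = (m * n + m * k + n * k) / P
    return case, D, D - owned


if __name__ == "__main__":
    for dims, P in [((64, 4, 4), 4), ((64, 64, 4), 16), ((8, 8, 8), 8)]:
        print(dims, P, memory_independent_bound(*dims, P))
```

Under these assumed formulas the two thresholds are consistent: the neighbouring expressions for D agree at P = m/n and at P = mn/k^2, and the bound is zero for a single processor.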


