
MatRox: A Model-Based Algorithm with an Efficient Storage Format for Parallel HSS-Structured Matrix Approximations

12/18/2018
by Bangtian Liu, et al.
UNIVERSITY OF TORONTO
Rutgers University

We present MatRox, a novel model-based algorithm and implementation of Hierarchically Semi-Separable (HSS) matrix computations on parallel architectures. MatRox uses a new storage format to improve the data locality and scalability of HSS matrix-matrix multiplication on shared-memory multicore processors. We build a performance model for HSS matrix-matrix multiplication and, based on that model, introduce a mixed-rank heuristic that finds an optimal HSS-tree depth for faster HSS matrix evaluation. Uniform sampling is used to improve the performance of HSS compression. MatRox outperforms the state-of-the-art HSS matrix multiplication codes GOFMM and STRUMPACK, with average speedups of 2.8x and 6.1x, respectively, on the target multicore processors.
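To make the idea of an HSS matrix-matrix multiplication concrete, here is a minimal sketch of a one-level HSS-like evaluation in plain Python. It is not MatRox's algorithm or storage format: the function names, the single-level tree, and the rank-1 off-diagonal blocks are all assumptions chosen for illustration. The matrix is assumed to have the form A = [[D1, U1 B12 V2^T], [U2 B21 V1^T, D2]], so A·W can be evaluated through the small factors without ever forming the off-diagonal blocks.

```python
def matmul(a, b):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def add(a, b):
    """Elementwise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def hss_matmul(D1, D2, U1, U2, V1, V2, B12, B21, W1, W2):
    """Evaluate A @ W for a one-level HSS-like matrix (hypothetical helper)."""
    # Project each half of W onto the (thin) column bases.
    t1 = matmul(transpose(V1), W1)
    t2 = matmul(transpose(V2), W2)
    # Dense diagonal blocks plus low-rank off-diagonal contributions.
    y1 = add(matmul(D1, W1), matmul(U1, matmul(B12, t2)))
    y2 = add(matmul(D2, W2), matmul(U2, matmul(B21, t1)))
    return y1 + y2  # stack the two row blocks

def to_dense(D1, D2, U1, U2, V1, V2, B12, B21):
    """Assemble the full matrix, only to check the structured product."""
    A12 = matmul(U1, matmul(B12, transpose(V2)))
    A21 = matmul(U2, matmul(B21, transpose(V1)))
    top = [r1 + r2 for r1, r2 in zip(D1, A12)]
    bottom = [r1 + r2 for r1, r2 in zip(A21, D2)]
    return top + bottom

# Illustrative integer data: 2x2 diagonal blocks, rank-1 off-diagonal blocks.
D1, D2 = [[4, 1], [1, 3]], [[5, 2], [2, 6]]
U1, U2 = [[1], [2]], [[3], [1]]
V1, V2 = [[2], [1]], [[1], [4]]
B12, B21 = [[3]], [[2]]
W1, W2 = [[1, 0], [2, 1]], [[0, 3], [1, 1]]

fast = hss_matmul(D1, D2, U1, U2, V1, V2, B12, B21, W1, W2)
dense = matmul(to_dense(D1, D2, U1, U2, V1, V2, B12, B21), W1 + W2)
assert fast == dense  # integer arithmetic, so the match is exact
```

In a real HSS matrix the same pattern recurses through a tree of nested bases, which is where the tree depth and the memory layout of the factors begin to matter for performance.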

12/23/2011

Sparse matrix-vector multiplication on GPGPU clusters: A new storage format and a scalable implementation

Sparse matrix-vector multiplication (spMVM) is the dominant operation in...
03/26/2023

Common Subexpression-based Compression and Multiplication of Sparse Constant Matrices

In deep learning inference, model parameters are pruned and quantized to...
03/26/2019

Matrix multiplication and universal scalability of the time on the Intel Scalable processors

Matrix multiplication is one of the core operations in many areas of sci...
04/11/2017

Strassen's Algorithm for Tensor Contraction

Tensor contraction (TC) is an important computational kernel widely used...
04/05/2018

High-performance sparse matrix-matrix products on Intel KNL and multicore architectures

Sparse matrix-matrix multiplication (SpGEMM) is a computational primitiv...
05/10/2022

The spatial computer: A model for energy-efficient parallel computation

We present a new parallel model of computation suitable for spatial arch...