Matrix multiplication and universal scalability of the time on the Intel Scalable processors

03/26/2019
by   Alexander Russkov, et al.
0

Matrix multiplication is one of the core operations in many areas of scientific computing. We present the results of the experiments with the matrix multiplication of the big size comparable with the big size of the onboard memory, which is 1.5 terabyte in our case. We run experiments on the computing board with two sockets and with two Intel Xeon Platinum 8164 processors, each with 26 cores and with multi-threading. The most interesting result of our study is the observation of the perfect scalability law of the matrix multiplication, and of the universality of this law.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
11/18/2019

General Matrix-Matrix Multiplication Using SIMD features of the PIII

Generalised matrix-matrix multiplication forms the kernel of many mathem...
research
08/16/2018

Novel Model-based Methods for Performance Optimization of Multithreaded 2D Discrete Fourier Transform on Multicore Processors

In this paper, we use multithreaded fast Fourier transforms provided in ...
research
11/07/2020

FusedMM: A Unified SDDMM-SpMM Kernel for Graph Embedding and Graph Neural Networks

We develop a fused matrix multiplication kernel that unifies sampled den...
research
04/05/2018

High-performance sparse matrix-matrix products on Intel KNL and multicore architectures

Sparse matrix-matrix multiplication (SpGEMM) is a computational primitiv...
research
12/18/2018

MatRox: A Model-Based Algorithm with an Efficient Storage Format for Parallel HSS-Structured Matrix Approximations

We present MatRox, a novel model-based algorithm and implementation of H...
research
05/03/2023

FT-GEMM: A Fault Tolerant High Performance GEMM Implementation on x86 CPUs

General matrix/matrix multiplication (GEMM) is crucial for scientific co...
research
10/17/2021

High Level Synthesis Implementation of a Three-dimensional Systolic Array Architecture for Matrix Multiplications on Intel Stratix 10 FPGAs

In this paper, we consider the HLS implementation of a three-dimensional...

Please sign up or login with your details

Forgot password? Click here to reset