Dynasor: A Dynamic Memory Layout for Accelerating Sparse MTTKRP for Tensor Decomposition on Multi-core CPU

09/17/2023
by Sasindu Wijeratne, et al.

Sparse Matricized Tensor Times Khatri-Rao Product (spMTTKRP) is the most time-consuming compute kernel in sparse tensor decomposition. In this paper, we introduce a novel algorithm to minimize the execution time of spMTTKRP across all modes of an input tensor on multi-core CPU platforms. The proposed algorithm leverages the FLYCOO tensor format to exploit data locality in external memory accesses. It uses computational resources effectively by enabling lock-free concurrent processing of independent partitions of the input tensor. The proposed partitioning ensures load balancing among CPU threads. Our dynamic tensor remapping technique reduces communication overhead along all modes. On widely used real-world tensors, our work achieves a 2.12x to 9.01x speedup in total execution time across all modes compared with state-of-the-art CPU implementations.
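To make the kernel concrete, the following is a minimal sketch of mode-0 spMTTKRP for a 3-mode tensor stored as plain coordinate (COO) triples. It is not the FLYCOO format or the paper's algorithm: the function names, the modulo-based partitioning by output row, and the sequential loop standing in for CPU threads are all assumptions made for illustration. It only shows why partitions that own disjoint output rows can be processed without locks.

```python
def spmttkrp_mode0(coords, vals, B, C, num_rows):
    """Mode-0 spMTTKRP: out[i][r] += x[i,j,k] * B[j][r] * C[k][r]."""
    rank = len(B[0])
    out = [[0.0] * rank for _ in range(num_rows)]
    for (i, j, k), v in zip(coords, vals):
        for r in range(rank):
            out[i][r] += v * B[j][r] * C[k][r]
    return out


def partition_by_output_row(coords, vals, num_parts):
    """Hypothetical partitioning: nonzeros sharing an output row i land
    in the same partition, so partitions never write the same row."""
    parts = [([], []) for _ in range(num_parts)]
    for c, v in zip(coords, vals):
        p = c[0] % num_parts  # assumed rule, not the paper's scheme
        parts[p][0].append(c)
        parts[p][1].append(v)
    return parts


def spmttkrp_mode0_partitioned(coords, vals, B, C, num_rows, num_parts):
    """Process partitions one by one (a stand-in for one thread each).
    No locks are needed because output rows are disjoint per partition."""
    rank = len(B[0])
    out = [[0.0] * rank for _ in range(num_rows)]
    for p_coords, p_vals in partition_by_output_row(coords, vals, num_parts):
        for (i, j, k), v in zip(p_coords, p_vals):
            for r in range(rank):
                out[i][r] += v * B[j][r] * C[k][r]
    return out


# Tiny example: a 2 x 2 x 2 tensor with three nonzeros, rank-2 factors.
coords = [(0, 0, 0), (0, 1, 1), (1, 0, 1)]
vals = [1.0, 2.0, 3.0]
B = [[1, 2], [3, 4]]
C = [[5, 6], [7, 8]]
result = spmttkrp_mode0(coords, vals, B, C, num_rows=2)
```

Computing the other modes with this layout would require regrouping the nonzeros by a different index, which is the kind of per-mode cost the abstract's dynamic remapping is described as targeting.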

research
10/12/2022

cuFasterTucker: A Stochastic Optimization Strategy for Parallel Sparse FastTucker Decomposition on GPU Platform

Currently, the size of scientific data is growing at an unprecedented ra...
research
04/25/2018

On Optimizing Distributed Tucker Decomposition for Sparse Tensors

The Tucker decomposition generalizes the notion of Singular Value Decomp...
research
10/20/2020

Sparse Tucker Tensor Decomposition on a Hybrid FPGA-CPU Platform

Recommendation systems, social network analysis, medical imaging, and da...
research
03/18/2021

Enhanced AGCM3D: A Highly Scalable Dynamical Core of Atmospheric General Circulation Model Based on Leap-Format

The finite-difference dynamical core based on the equal-interval latitud...
research
02/05/2020

Hybrid CUR-type decomposition of tensors in the Tucker format

The paper introduces a hybrid approach to the CUR-type decomposition of ...
research
04/11/2017

Strassen's Algorithm for Tensor Contraction

Tensor contraction (TC) is an important computational kernel widely used...
research
09/18/2021

Reconfigurable Low-latency Memory System for Sparse Matricized Tensor Times Khatri-Rao Product on FPGA

Tensor decomposition has become an essential tool in many applications i...
