Sparse Tucker Tensor Decomposition on a Hybrid FPGA-CPU Platform

10/20/2020
by Weiyun Jiang, et al.

Recommendation systems, social network analysis, medical imaging, and data mining often involve processing sparse high-dimensional data. Such data are naturally represented as tensors and cannot be processed efficiently by conventional matrix or vector computations. Sparse Tucker decomposition is an important algorithm for compressing and analyzing these sparse high-dimensional data sets. When energy efficiency and data privacy are major concerns, hardware accelerators on resource-constrained platforms become crucial for deploying tensor algorithms. In this work, we propose a hybrid computing framework that combines a CPU and an FPGA to accelerate sparse Tucker factorization. The algorithm has three main modules: tensor-times-matrix (TTM), Kronecker products, and QR decomposition with column pivoting (QRP). We accelerate the first two modules on a Xilinx FPGA and run the third on a CPU. Our hybrid platform achieves a 23.6× to 1091× speedup and 93.519% to 99.514% energy savings compared with a CPU on synthetic and real-world datasets.
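
To make the first two modules concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a 3-way tensor stored in coordinate (COO) form and computes the mode-0 unfolding of X ×₁ U1ᵀ ×₂ U2ᵀ by accumulating, for each nonzero, a Kronecker product of factor-matrix rows; it then checks the result against a dense TTM chain. All names here (`unfold`, `ttm`, `sparse_ttm_chain_mode0`, `U1`, `U2`) and the toy data are hypothetical and chosen for illustration only.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n matricization: bring `mode` to the front, flatten the rest (C order)."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def ttm(tensor, matrix, mode):
    """Dense tensor-times-matrix: contract axis `mode` of `tensor` with the
    columns of `matrix` (shape J x I_mode), leaving size J on that axis."""
    out = np.tensordot(matrix, tensor, axes=(1, mode))  # contracted axis comes out first
    return np.moveaxis(out, 0, mode)

def sparse_ttm_chain_mode0(coords, vals, shape, U1, U2):
    """Mode-0 unfolding of X x_1 U1^T x_2 U2^T for a 3-way COO tensor.
    Each nonzero X[i, j, k] adds v * kron(U1[j, :], U2[k, :]) to row i."""
    Y = np.zeros((shape[0], U1.shape[1] * U2.shape[1]))
    for (i, j, k), v in zip(coords, vals):
        Y[i] += v * np.kron(U1[j], U2[k])
    return Y

# Toy data (assumed for illustration): a 4 x 5 x 6 tensor with four nonzeros.
rng = np.random.default_rng(0)
shape = (4, 5, 6)
coords = [(0, 1, 2), (3, 4, 5), (2, 0, 1), (1, 3, 3)]
vals = [1.5, -2.0, 0.5, 3.0]
U1 = rng.standard_normal((5, 2))  # factor for mode 1, rank 2
U2 = rng.standard_normal((6, 3))  # factor for mode 2, rank 3

Y_sparse = sparse_ttm_chain_mode0(coords, vals, shape, U1, U2)

# Dense reference: build X explicitly and apply two TTMs.
X = np.zeros(shape)
for c, v in zip(coords, vals):
    X[c] = v
Y_dense = unfold(ttm(ttm(X, U1.T, 1), U2.T, 2), 0)

print(np.allclose(Y_sparse, Y_dense))
```

The sparse routine touches only the stored nonzeros, which is why the TTM and Kronecker-product steps are natural candidates for acceleration. The QRP step is not shown; on the CPU side one option would be `scipy.linalg.qr` with `pivoting=True` applied to a matrix such as `Y_sparse`, though the abstract does not say which routine the authors use.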

research
06/28/2019

Tucker Tensor Decomposition on FPGA

Tensor computation has emerged as a powerful mathematical tool for solvi...
research
04/29/2020

Synergistic CPU-FPGA Acceleration of Sparse Linear Algebra

This paper describes REAP, a software-hardware approach that enables hig...
research
07/17/2022

Towards Programmable Memory Controller for Tensor Decomposition

Tensor decomposition has become an essential tool in many data science a...
research
09/17/2023

Dynasor: A Dynamic Memory Layout for Accelerating Sparse MTTKRP for Tensor Decomposition on Multi-core CPU

Sparse Matricized Tensor Times Khatri-Rao Product (spMTTKRP) is the most...
research
09/22/2020

A reduced-precision streaming SpMV architecture for Personalized PageRank on FPGA

Sparse matrix-vector multiplication is often employed in many data-analy...
research
09/08/2023

Trade-Offs in Decentralized Multi-Antenna Architectures: Sparse Combining Modules for WAX Decomposition

With the increase in the number of antennas at base stations (BSs), cent...
research
05/12/2020

Centaur: A Chiplet-based, Hybrid Sparse-Dense Accelerator for Personalized Recommendations

Personalized recommendations are the backbone machine learning (ML) algo...
