A Unified Optimization Approach for Sparse Tensor Operations on GPUs

05/28/2017
by   Bangtian Liu, et al.
0

Sparse tensors appear in many large-scale applications with multidimensional and sparse data. While multidimensional sparse data often need to be processed on manycore processors, attempts to develop highly-optimized GPU-based implementations of sparse tensor operations are rare. The irregular computation patterns and sparsity structures as well as the large memory footprints of sparse tensor operations make such implementations challenging. We leverage the fact that sparse tensor operations share similar computation patterns to propose a unified tensor representation called F-COO. Combined with GPU-specific optimizations, F-COO provides highly-optimized implementations of sparse tensor computations on GPUs. The performance of the proposed unified approach is demonstrated for tensor-based kernels such as the Sparse Matricized Tensor- Times-Khatri-Rao Product (SpMTTKRP) and the Sparse Tensor- Times-Matrix Multiply (SpTTM) and is used in tensor decomposition algorithms. Compared to state-of-the-art work we improve the performance of SpTTM and SpMTTKRP up to 3.7 and 30.6 times respectively on NVIDIA Titan-X GPUs. We implement a CANDECOMP/PARAFAC (CP) decomposition and achieve up to 14.9 times speedup using the unified method over state-of-the-art libraries on NVIDIA Titan-X GPUs.

READ FULL TEXT
research
04/06/2019

Load-Balanced Sparse MTTKRP on GPUs

Sparse matricized tensor times Khatri-Rao product (MTTKRP) is one of the...
research
05/14/2019

Optimizing the Linear Fascicle Evaluation Algorithm for Multi-Core and Many-Core Systems

Sparse matrix-vector multiplication (SpMV) operations are commonly used ...
research
05/27/2020

Optimization of Tensor-product Operations in Nekbone on GPUs

In the CFD solver Nek5000, the computation is dominated by the evaluatio...
research
07/06/2023

Analyzing the Performance Portability of Tensor Decomposition

We employ pressure point analysis and roofline modeling to identify perf...
research
06/29/2017

Tensor-based approach to accelerate deformable part models

This article provides next step towards solving speed bottleneck of any ...
research
09/24/2018

Software for Sparse Tensor Decomposition on Emerging Computing Architectures

In this paper, we develop software for decomposing sparse tensors that i...
research
09/13/2023

Autotuning Apache TVM-based Scientific Applications Using Bayesian Optimization

Apache TVM (Tensor Virtual Machine), an open source machine learning com...

Please sign up or login with your details

Forgot password? Click here to reset