cu_FastTucker: A Faster and Stabler Stochastic Optimization for Parallel Sparse Tucker Decomposition on Multi-GPUs

04/14/2022
by Zixuan Li, et al.

High-Order, High-Dimension, and Sparse Tensor (HOHDST) data originates from real industrial applications, e.g., social networks, recommender systems, bio-information, and traffic information. Sparse Tensor Decomposition (STD) can project HOHDST data into a low-rank space. This work proposes a novel STD method that uses a Kruskal approximation of the core tensor and a stochastic strategy for approximating the whole gradient. It comprises two parts: (1) the matricization unfolding order of the Kruskal product for the core tensor follows the multiplication order of the factor matrices, so that the proposed theorem reduces the exponential computational overhead to a linear one; (2) the stochastic strategy adopts a one-step random sampling set, whose volume is much smaller than that of the original set, to approximate the whole gradient. The method also guarantees convergence and reduces memory overhead. Owing to the compactness of same-order matrix multiplication and the parallel access enabled by the stochastic strategy, cuFastTucker can be further accelerated on a GPU. Furthermore, for data that cannot be accommodated in a single GPU, a data division and communication strategy for cuFastTucker is proposed for data accommodation on multiple GPUs. cuFastTucker can achieve the fastest speed while keeping the same accuracy and a much lower memory overhead than state-of-the-art (SOTA) algorithms, e.g., P-Tucker, Vest, and SGD_Tucker. The code and partial datasets are publicly available at "https://github.com/ZixuanLi-China/FastTucker".
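To make part (1) concrete, the following is a minimal pure-Python sketch and not the authors' CUDA implementation. It assumes the standard Tucker model, in which one tensor entry is predicted from a core tensor and one factor-matrix row per mode, and it assumes the core is given in Kruskal form as a sum of R rank-one terms. The names `kruskal_core`, `predict_dense`, and `predict_kruskal`, and the list layouts `A[n][i][j]` and `B[n][j][r]`, are hypothetical choices made for illustration. The sketch shows only that the two ways of predicting an entry agree: the dense route touches all J^N core entries, while the Kruskal route needs N * J * R multiplications.

```python
import itertools
import random


def kruskal_core(B, J):
    """Expand Kruskal factors into a dense core (hypothetical helper).

    B[n][j][r] is the (j, r) entry of the mode-n Kruskal matrix.
    The result has J**N entries, so this is exponential in the order N.
    """
    N = len(B)
    R = len(B[0][0])
    core = {}
    for idx in itertools.product(range(J), repeat=N):
        total = 0.0
        for r in range(R):
            term = 1.0
            for n in range(N):
                term *= B[n][idx[n]][r]
            total += term
        core[idx] = total
    return core


def predict_dense(core, A, entry):
    """Tucker prediction of one entry by summing over every core element."""
    total = 0.0
    for idx, g in core.items():
        term = g
        for n, j in enumerate(idx):
            term *= A[n][entry[n]][j]  # A[n][i][j]: mode-n factor matrix
        total += term
    return total


def predict_kruskal(B, A, entry):
    """Same prediction without forming the core: linear in the order N."""
    N = len(A)
    R = len(B[0][0])
    total = 0.0
    for r in range(R):
        term = 1.0
        for n in range(N):
            row = A[n][entry[n]]
            # Inner product of the factor row with column r of B[n].
            term *= sum(row[j] * B[n][j][r] for j in range(len(row)))
        total += term
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    N, I, J, R = 3, 4, 3, 2  # order, mode size, core size, Kruskal rank
    A = [[[rng.uniform(-1, 1) for _ in range(J)] for _ in range(I)]
         for _ in range(N)]
    B = [[[rng.uniform(-1, 1) for _ in range(R)] for _ in range(J)]
         for _ in range(N)]
    entry = (0, 1, 2)
    dense = predict_dense(kruskal_core(B, J), A, entry)
    fast = predict_kruskal(B, A, entry)
    print(abs(dense - fast) < 1e-9)
```

Because each sampled nonzero entry can be predicted independently this way, the per-entry work is a natural unit to evaluate over a small random sample, which is the setting part (2) describes. The sampling and gradient update themselves are not sketched here.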

research
12/07/2020

SGD_Tucker: A Novel Stochastic Optimization Strategy for Parallel Sparse Tucker Decomposition

Sparse Tucker Decomposition (STD) algorithms learn a core tensor and a g...
research
05/03/2017

cuTT: A High-Performance Tensor Transpose Library for CUDA Compatible GPUs

We introduce the CUDA Tensor Transpose (cuTT) library that implements hi...
research
05/14/2019

Optimizing the Linear Fascicle Evaluation Algorithm for Multi-Core and Many-Core Systems

Sparse matrix-vector multiplication (SpMV) operations are commonly used ...
research
04/10/2023

Mixed-Precision Random Projection for RandNLA on Tensor Cores

Random projection can reduce the dimension of data while capturing its s...
research
01/29/2022

Efficient, Out-of-Memory Sparse MTTKRP on Massively Parallel Architectures

Tensor decomposition (TD) is an important method for extracting latent i...
research
11/23/2021

Locality Sensitive Hash Aggregated Nonlinear Neighbourhood Matrix Factorization for Online Sparse Big Data Analysis

Matrix factorization (MF) can extract the low-rank features and integrat...
research
01/03/2022

Squeeze: Efficient Compact Fractals for Tensor Core GPUs

This work presents Squeeze, an efficient compact fractal processing sche...
