Batched Sparse Matrix Multiplication for Accelerating Graph Convolutional Networks

03/27/2019
by Yusuke Nagasaka, et al.

Graph Convolutional Networks (GCNs) have recently attracted much attention in bioinformatics and chemoinformatics as a state-of-the-art, high-accuracy machine learning approach. GCNs perform convolutional operations along graph structures, and GPUs are used to handle the large volume of operations, including sparse-dense matrix multiplication (SpMM), that arises when the graph structure is expressed as an adjacency matrix in a sparse matrix format. However, an SpMM operation on a small graph, where the number of nodes is in the tens or hundreds, can hardly exploit the high parallelism or compute power of a GPU. SpMM therefore becomes a bottleneck of training and inference in GCN applications. To improve the performance of GCN applications, we propose a new SpMM algorithm designed for small sparse matrices, together with Batched SpMM, which exploits the high parallelism of the GPU by processing multiple SpMM operations in a single CUDA kernel. To the best of our knowledge, this is the first work on a batched approach for SpMM. We evaluated the performance of a GCN application on TSUBAME3.0, which is equipped with NVIDIA Tesla P100 GPUs, and our batched approach shows speedups of up to 1.59x in training and 1.37x in inference.
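The batching idea can be illustrated with a small CPU-side sketch. This is not the authors' CUDA kernel: it assumes the CSR sparse format (the abstract only says "sparse matrix format"), and the names `spmm_csr` and `batched_spmm_csr` are hypothetical. The point it shows is that the rows of all small matrices are gathered into one flat work list, which stands in for the grid of a single kernel launch, instead of issuing one launch per matrix.

```python
def spmm_csr(row_ptr, col_idx, vals, dense):
    """Multiply one CSR sparse matrix by a dense matrix (list of rows)."""
    n_out = len(dense[0])
    out = [[0.0] * n_out for _ in range(len(row_ptr) - 1)]
    for r in range(len(row_ptr) - 1):
        for k in range(row_ptr[r], row_ptr[r + 1]):
            c, v = col_idx[k], vals[k]
            for j in range(n_out):
                out[r][j] += v * dense[c][j]
    return out


def batched_spmm_csr(sparse_batch, dense_batch):
    """Process every (matrix, row) pair of the batch in one flat loop.

    Assumption: the flat work list models the grid of a single kernel
    launch; on a GPU each work item would map to a group of threads.
    """
    work = [(m, r)
            for m, (row_ptr, _, _) in enumerate(sparse_batch)
            for r in range(len(row_ptr) - 1)]
    outs = [[[0.0] * len(d[0]) for _ in range(len(s[0]) - 1)]
            for s, d in zip(sparse_batch, dense_batch)]
    for m, r in work:
        row_ptr, col_idx, vals = sparse_batch[m]
        dense = dense_batch[m]
        for k in range(row_ptr[r], row_ptr[r + 1]):
            c, v = col_idx[k], vals[k]
            for j in range(len(dense[0])):
                outs[m][r][j] += v * dense[c][j]
    return outs


# Two tiny graphs as CSR adjacency matrices (row_ptr, col_idx, vals).
a0 = ([0, 1, 2], [1, 0], [1.0, 1.0])                      # 2 nodes, one edge
a1 = ([0, 2, 3, 4], [1, 2, 0, 0], [1.0, 1.0, 1.0, 1.0])   # 3-node star
x0 = [[1.0, 2.0], [3.0, 4.0]]                             # node features
x1 = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
results = batched_spmm_csr([a0, a1], [x0, x1])
```

The batched call returns the same products as calling `spmm_csr` once per graph. On a GPU, the benefit would come from filling the device with work from many small matrices at once, which a single small SpMM cannot do.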
