Blocking and sparsity for optimization of convolution calculation algorithm on GPUs

09/22/2019
by   Weizhi Xu, et al.
0

Convolution neural network (CNN) plays a paramount role in machine learning, which has made significant contributions, such as medical image classification, natural language processing, and recommender system. The success convolution neural network achieved excellent performance with fast execution time. Due to the convolution operation dominate the total operation time of Convolution neural network. In this paper, we propose a novel convolution method of Graphic Processing Units (GPUs), which reduce the convolution operation time and improve the execution speed approximately 2X than the state of the art convolution algorithm. Our work based on the observation is that the sparsity of the input feature map of convolution operation is relatively large, and the zero value of the feature map is redundancy for convolution result. Therefore, we skip the zero value calculation and improve the speed by compressing the feature map. Besides, the shape of the feature map for the deep network is small, and the number of threads is limited. Therefore, for a limited number of threads, it is necessary to reduce the amount of calculation to increase the calculation speed. Our algorithm has a good effect on the convolution operation of the feature map of the deep network with large sparsity and small size. In this work, our contributions can be summarized as follows: 1) A novel store format for hight-sparsity feature map. 2) A novel convolution algorithm based on block compression and Shared memory is proposed. 3) A feature map data-set for convolution algorithm optimization. 4) We performed a single-layer convolution comparison experiment with CuDNN for different models, and it is best to achieve 3.5X speedup. We also implemented the algorithm on the VGG-19 model, which can achieve 1.3X∼2.9X speedup in deep convolution operation, and the entire network can achieve 2.3X speedup.

READ FULL TEXT
research
09/22/2019

Performance optimization of convolution calculation by blocking and sparsity on GPU

Convolution neural network (CNN) plays a paramount role in machine learn...
research
09/06/2019

HNMTP Conv: Optimize Convolution Algorithm for Single-Image Convolution Neural Network Inference on Mobile GPUs

Convolution neural networks are widely used for mobile applications. How...
research
09/06/2019

ILP-M Conv: Optimize Convolution Algorithm for Single-Image Convolution Neural Network Inference on Mobile GPUs

Convolution neural networks are widely used for mobile applications. How...
research
07/23/2018

PCNNA: A Photonic Convolutional Neural Network Accelerator

Convolutional Neural Networks (CNN) have been the centerpiece of many ap...
research
01/07/2019

GASL: Guided Attention for Sparsity Learning in Deep Neural Networks

The main goal of network pruning is imposing sparsity on the neural netw...
research
04/06/2023

Tensor Slicing and Optimization for Multicore NPUs

Although code generation for Convolution Neural Network (CNN) models has...
research
10/17/2022

Approximating Continuous Convolutions for Deep Network Compression

We present ApproxConv, a novel method for compressing the layers of a co...

Please sign up or login with your details

Forgot password? Click here to reset