High Performance Convolution Using Sparsity and Patterns for Inference in Deep Convolutional Neural Networks

04/16/2021
by   hossam-amer, et al.

Deploying deep Convolutional Neural Networks (CNNs) is impacted by their memory footprint and speed requirements, which mainly come from convolution. Widely used convolution algorithms, im2col and MEC, produce a lowered matrix from an activation map by redundantly storing the map's elements included at horizontal and/or vertical kernel overlappings, without considering the sparsity of the map. Using the sparsity of the map, this paper proposes two new convolution algorithms, dubbed Compressed Pattern Overlap (CPO) and Compressed Pattern Sets (CPS), that simultaneously decrease the memory footprint and increase the inference speed while preserving the accuracy. CPO recognizes non-zero elements (NZEs) at horizontal and vertical overlappings in the activation maps. CPS further improves the memory savings of CPO by compressing the index positions of neighboring NZEs. In both algorithms, channels/regions of the activation maps with all zeros are skipped. Then, CPO/CPS performs convolution via Sparse Matrix-Vector Multiplication (SpMv) done on their sparse representations. Experimental results conducted on CPUs show that average per-layer time savings reach up to 63% with respect to im2col. In some layers, our average per-layer CPO/CPS time savings are better by 28% compared to the implementation of MEC. For a given CNN's inference, we offline select, for each convolution layer, the best convolution algorithm in terms of time among CPO, CPS, and im2col. Our algorithms were selected for up to 56% of the non-pointwise convolutional layers. Our offline selections yield CNN inference time savings of up to 9%.
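The abstract describes turning the non-zero elements (NZEs) of an activation map into a sparse lowered matrix and then performing convolution as a single SpMv. The sketch below illustrates that general idea for one channel, one kernel, and stride 1; it is not the paper's CPO/CPS storage format, and the function name spmv_conv2d, the SciPy CSR representation, and the zero-channel check are assumptions made only for illustration.

```python
# Minimal sketch (assumed, not the authors' CPO/CPS format): convolution as
# SpMv over a sparse lowered matrix built only from the map's NZEs.
import numpy as np
from scipy.sparse import csr_matrix

def spmv_conv2d(activation, kernel):
    H, W = activation.shape
    kh, kw = kernel.shape
    out_h, out_w = H - kh + 1, W - kw + 1

    # Skip all-zero channels, analogous to the zero skipping described above.
    if not activation.any():
        return np.zeros((out_h, out_w))

    # Each NZE at (r, c) is covered by several overlapping kernel positions;
    # emit one sparse entry per overlap instead of the redundant dense rows
    # that an im2col lowering would store.
    data, rows, cols = [], [], []
    for r, c in zip(*np.nonzero(activation)):
        for i in range(max(0, r - kh + 1), min(r, out_h - 1) + 1):
            for j in range(max(0, c - kw + 1), min(c, out_w - 1) + 1):
                rows.append(i * out_w + j)            # output position
                cols.append((r - i) * kw + (c - j))   # kernel tap index
                data.append(activation[r, c])
    lowered = csr_matrix((data, (rows, cols)), shape=(out_h * out_w, kh * kw))

    # Convolution reduces to one Sparse Matrix-Vector Multiplication.
    return (lowered @ kernel.ravel()).reshape(out_h, out_w)

# Example: a highly sparse 6x6 map convolved with a 3x3 all-ones kernel.
fmap = np.zeros((6, 6))
fmap[1, 2], fmap[3, 4] = 1.0, 2.0
print(spmv_conv2d(fmap, np.ones((3, 3))))
```

The storage and compute scale with the number of NZEs rather than with the full lowered matrix, which is the effect CPO/CPS exploit; the paper's actual formats additionally compress the index positions of neighboring NZEs.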

Related research

12/10/2018 · Reliable Identification of Redundant Kernels for Convolutional Neural Network Compression
To compress deep convolutional neural networks (CNNs) with large memory ...

12/09/2019 · An Empirical Study on Position of the Batch Normalization Layer in Convolutional Neural Networks
In this paper, we have studied how the training of the convolutional neu...

01/07/2018 · SBNet: Sparse Blocks Network for Fast Inference
Conventional deep convolutional neural networks (CNNs) apply convolution...

12/03/2014 · Memory Bounded Deep Convolutional Networks
In this work, we investigate the use of sparsity-inducing regularizers d...

01/18/2017 · Parsimonious Inference on Convolutional Neural Networks: Learning and applying on-line kernel activation rules
A new, radical CNN design approach is presented in this paper, consideri...

02/21/2017 · The Power of Sparsity in Convolutional Neural Networks
Deep convolutional networks are well-known for their high computational ...

01/31/2018 · Inference, Learning and Attention Mechanisms that Exploit and Preserve Sparsity in Convolutional Networks
While CNNs naturally lend themselves to densely sampled data, and sophis...
