Cross-filter compression for CNN inference acceleration

05/18/2020
by Fuyuan Lyu, et al.

Convolutional neural networks demonstrate great capability on multiple tasks, such as image classification and many others. However, substantial resources are required to train a network, so much effort has been made to accelerate neural networks by reducing the precision of weights, activations, and gradients. These filter-wise quantization methods, however, have a natural upper limit determined by the kernel size, and as small kernels have become popular, that limit has decreased further. To address this issue, we propose a new cross-filter compression method that can provide ∼32× memory savings and a 122× speed-up in convolution operations. In our method, all convolution filters are quantized to a given bit width, and spatially adjacent filters share the same scaling factor. Our compression method, based separately on Binary-Weight and XNOR-Net, is evaluated on the CIFAR-10 and ImageNet datasets with widely used network structures such as ResNet and VGG, and shows tolerable accuracy loss compared to state-of-the-art quantization methods.
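The core of the method, 1-bit filters plus a scaling factor shared across a group of spatially adjacent filters, can be sketched in a few lines. The sketch below is illustrative only: the function names, the group size of 4, and the XNOR-Net-style scale alpha = mean(|W|) are assumptions on our part, not details taken from the paper.

```python
import numpy as np

def cross_filter_binarize(weights, group_size=4):
    """Binarize conv filters with one scaling factor shared per group of
    adjacent filters (a sketch of the cross-filter idea; group_size and
    the mean-of-absolute-values scale rule are assumed, not from the paper).

    weights: float array of shape (out_channels, in_channels, k, k).
    Returns (signs, scales): a {-1, +1} tensor and one scale per group.
    """
    out_channels = weights.shape[0]
    assert out_channels % group_size == 0
    signs = np.sign(weights).astype(np.int8)
    signs[signs == 0] = 1  # map exact zeros to +1 so every weight fits in 1 bit
    # One scaling factor per group of adjacent filters, instead of per filter.
    grouped = np.abs(weights).reshape(out_channels // group_size, -1)
    scales = grouped.mean(axis=1)
    return signs, scales

def dequantize(signs, scales, group_size=4):
    """Reconstruct the approximation W ≈ alpha * sign(W) for each group."""
    approx = signs.astype(np.float32)
    for g, alpha in enumerate(scales):
        approx[g * group_size:(g + 1) * group_size] *= alpha
    return approx

# Example: 8 filters of shape 3x3x3, one shared scale per group of 4.
w = np.random.randn(8, 3, 3, 3).astype(np.float32)
signs, scales = cross_filter_binarize(w, group_size=4)
w_hat = dequantize(signs, scales, group_size=4)
print("relative error:", np.linalg.norm(w - w_hat) / np.linalg.norm(w))
```

Storing 1 bit per weight instead of 32 gives the ∼32× memory saving quoted in the abstract, and sharing each scaling factor across a group amortizes the per-filter overhead that ordinarily caps filter-wise quantization at the kernel size.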

