Compressing Deep Convolutional Neural Networks by Stacking Low-dimensional Binary Convolution Filters

10/06/2020
by   Weichao Lan, et al.

Deep Convolutional Neural Networks (CNNs) have been successfully applied to many real-life problems. However, the huge memory cost of deep CNN models poses a great challenge for deploying them on memory-constrained devices (e.g., mobile phones). One popular way to reduce the memory cost of a deep CNN model is to train a binary CNN, where the weights in the convolution filters are either 1 or -1 and each weight can therefore be stored efficiently using a single bit. However, the compression ratio of existing binary CNN models is upper bounded by about 32. To address this limitation, we propose a novel method to compress deep CNN models by stacking low-dimensional binary convolution filters. Our method approximates a standard convolution filter by selecting and stacking filters from a set of low-dimensional binary convolution filters. This set is shared across all filters in a given convolution layer, so our method achieves a much larger compression ratio than binary CNN models. To train the proposed model, we show theoretically that it is equivalent to selecting and stacking the intermediate feature maps generated by the low-dimensional binary filters; the model can therefore be trained efficiently using the split-transform-merge strategy. We also provide a detailed analysis of the memory and computation cost of our model during inference. We compared the proposed method with five other popular model compression techniques on two benchmark datasets. Our experimental results demonstrate that the proposed method achieves a much higher compression ratio than existing methods while maintaining comparable accuracy.
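The core idea in the abstract, approximating a full convolution filter by stacking entries from a small shared set of low-dimensional binary filters, can be sketched in a few lines of numpy. This is an illustrative assumption of how such a representation might look (all names, dimensions, and the index-based storage scheme are ours, not the paper's code), meant only to show why the compression ratio can exceed the 1-bit-per-weight bound of binary CNNs.

```python
import numpy as np

# Illustrative sketch (not the paper's implementation): a standard
# k x k x C filter is approximated by stacking, along the channel axis,
# filters chosen from a small shared set of low-dimensional binary filters.
rng = np.random.default_rng(0)

k, C = 3, 64          # full filter: k x k spatial extent, C input channels
d = 8                 # each binary basis filter covers only d channels
m = 16                # size of the shared set of binary filters (per layer)

# Shared set of low-dimensional binary (+1/-1) filters.
basis = rng.choice([-1.0, 1.0], size=(m, k, k, d))

# A full filter is then represented only by the indices of the stacked
# basis filters: C // d selections, each an integer index into the set.
selection = rng.integers(0, m, size=C // d)

# Reconstruct the approximated full filter by stacking along channels.
approx_filter = np.concatenate([basis[i] for i in selection], axis=-1)
assert approx_filter.shape == (k, k, C)

# Storage per filter: (C // d) indices of log2(m) bits each, versus
# 32 bits per weight for a float filter or 1 bit for a binary filter.
float_bits = k * k * C * 32          # 18432 bits
binary_bits = k * k * C              # 576 bits (ratio 32, the binary bound)
stacked_bits = (C // d) * int(np.log2(m))   # 32 bits
print(float_bits / stacked_bits)     # compression ratio vs. float weights
```

Because each filter is stored as a handful of small integer indices (plus the one shared basis per layer), the per-filter cost drops below one bit per weight, which is how the method can beat the ~32x ceiling of plain binary CNNs.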

