Compact representations of convolutional neural networks via weight pruning and quantization

08/28/2021
by   Giosuè Cataldo Marinò, et al.

The state-of-the-art performance for several real-world problems is currently reached by convolutional neural networks (CNNs). Such learning models exploit recent results in the field of deep learning, typically leading to highly performing, yet very large neural networks with (at least) millions of parameters. As a result, deploying such models is not possible when only small amounts of RAM are available, or in general within resource-limited platforms, and strategies to compress CNNs have thus become of paramount importance. In this paper we propose a novel lossless storage format for CNNs based on source coding, leveraging both weight pruning and quantization. We theoretically derive space upper bounds for the proposed structures, showing their relationship with both the sparsity and the quantization levels of the weight matrices. Both compression rates and execution times have been tested against reference methods for matrix compression, and an empirical evaluation of state-of-the-art quantization schemes based on weight sharing is also discussed, to assess their impact on performance when applied to both convolutional and fully connected layers. On four benchmarks for classification and regression problems, and comparing to the baseline pre-trained uncompressed network, we achieved a reduction of space occupancy up to 0.6% for fully connected layers and 5.44% for convolutional layers, while remaining as competitive as the baseline.
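To make the pipeline concrete, below is a minimal sketch (not the authors' implementation) of the two compression steps the paper combines: magnitude-based weight pruning, weight-sharing quantization via 1-D k-means, and a rough CSR-style size estimate for the resulting sparse, quantized matrix. The sparsity level, codebook size, and index widths are illustrative assumptions, not values or formats from the paper.

import numpy as np

def prune(W: np.ndarray, sparsity: float = 0.9) -> np.ndarray:
    # Zero out the smallest-magnitude weights until `sparsity` of them are zero.
    thresh = np.quantile(np.abs(W), sparsity)
    return np.where(np.abs(W) >= thresh, W, 0.0)

def quantize(W: np.ndarray, k: int = 16, iters: int = 20):
    # Weight sharing via 1-D k-means: map each nonzero weight to one of k centroids.
    nz = W[W != 0]
    centroids = np.linspace(nz.min(), nz.max(), k)  # initialize over the value range
    for _ in range(iters):
        labels = np.argmin(np.abs(nz[:, None] - centroids[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = nz[labels == j].mean()
    Wq = W.copy()
    Wq[W != 0] = centroids[labels]
    return Wq, centroids

def csr_size_bytes(W: np.ndarray, k: int) -> int:
    # Rough CSR-style estimate (an illustrative baseline encoding, not the
    # paper's format): ceil(log2 k) bits per nonzero codebook index, plus
    # 32-bit column indices and row pointers.
    nnz = int(np.count_nonzero(W))
    value_bits = max(1, int(np.ceil(np.log2(k))))
    return (nnz * value_bits + 7) // 8 + 4 * nnz + 4 * (W.shape[0] + 1)

W = np.random.randn(512, 512).astype(np.float32)
Wp = prune(W, sparsity=0.9)
Wq, codebook = quantize(Wp, k=16)
print(f"dense: {W.nbytes} B, pruned+quantized (CSR estimate): {csr_size_bytes(Wq, 16)} B")

Storing only a few bits per surviving weight (a codebook index) plus the sparse-structure overhead is what lets pruning and quantization compound, and it is this interplay between sparsity and quantization level that the paper's space upper bounds formalize.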
