A Framework for Fast and Efficient Neural Network Compression

11/30/2018
by Hyeji Kim, et al.

Network compression reduces the computational complexity and memory consumption of deep neural networks by reducing the number of parameters. In SVD-based network compression, the right rank needs to be decided for every layer of the network. In this paper, we propose an efficient method for obtaining the rank configuration of the whole network. Unlike previous methods, which consider each layer separately, our method considers the whole network when choosing the rank configuration. We propose novel accuracy metrics to represent the relationship between accuracy and complexity for a given neural network. We use these metrics in a non-iterative fashion to obtain a rank configuration that satisfies the constraints on FLOPs and memory while maintaining sufficient accuracy. Experiments show that our method provides a better trade-off between accuracy and computational complexity/memory consumption, while performing compression at a much higher speed. For VGG-16, our method reduces the FLOPs by 25% compared to the baseline, while requiring only 3 minutes on a CPU to search for the rank configuration. Previously, similar results were achieved in 4 hours with 8 GPUs. The proposed method can also be used for lossless compression of neural networks. The better accuracy-complexity trade-off, together with the extremely fast speed of our method, makes it well suited for neural network compression.
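To make the idea concrete, below is a minimal Python/NumPy sketch (not the authors' implementation) of SVD-based compression applied to a single fully connected layer. The rank r here is a hypothetical, hand-picked value; the paper's contribution is choosing such ranks jointly for every layer of the network under FLOPs and memory constraints.

```python
# Minimal sketch of SVD-based low-rank compression for one layer.
# NOT the paper's rank-selection method: the rank r below is a
# hypothetical choice made by hand for illustration only.
import numpy as np

def compress_fc_layer(W, r):
    """Factor an (out x in) weight matrix into two rank-r factors.

    Replacing y = W @ x with y = A @ (B @ x) cuts parameters (and
    multiply-accumulates) from out*in to r*(out + in).
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :r] * s[:r]   # (out x r), singular values folded into A
    B = Vt[:r, :]          # (r x in)
    return A, B

# Toy usage: a 512 x 1024 layer truncated to rank 64.
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 1024))
A, B = compress_fc_layer(W, r=64)

x = rng.standard_normal(1024)
err = np.linalg.norm(W @ x - A @ (B @ x)) / np.linalg.norm(W @ x)
print(f"relative error: {err:.3f}, params: {W.size} -> {A.size + B.size}")
```

With r=64, the 524,288 original parameters shrink to 98,304 at the cost of some approximation error; the same factorization applies to the unrolled weight matrices of convolutional layers, which is where the per-layer rank choice that the paper automates becomes critical.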

Related research

Data-Driven Low-Rank Neural Network Compression (07/13/2021)
Despite many modern applications of Deep Neural Networks (DNNs), the lar...

Computational Complexity of Detecting Proximity to Losslessly Compressible Neural Network Parameters (06/05/2023)
To better understand complexity in neural networks, we theoretically inv...

Convolutional neural networks compression with low rank and sparse tensor decompositions (06/11/2020)
Convolutional neural networks show outstanding results in a variety of c...

A Partial Regularization Method for Network Compression (09/03/2020)
Deep Neural Networks have achieved remarkable success relying on the dev...

Neural Network Layer Algebra: A Framework to Measure Capacity and Compression in Deep Learning (07/02/2021)
We present a new framework to measure the intrinsic properties of (deep)...

Hybrid Binary Networks: Optimizing for Accuracy, Efficiency and Memory (04/11/2018)
Binarization is an extreme network compression approach that provides la...

Automatic Rank Selection for High-Speed Convolutional Neural Network (06/28/2018)
Low-rank decomposition plays a central role in accelerating convolutiona...
