How Compact?: Assessing Compactness of Representations through Layer-Wise Pruning

01/09/2019
by Hyun-Joo Jung, et al.

Various forms of representations may arise in the many layers embedded in deep neural networks (DNNs). Of these, where can we find the most compact representation? We propose to use a pruning framework to answer this question: how compactly can each layer be compressed without losing performance? Most existing DNN compression methods do not consider the relative compressibility of individual layers: they either uniformly apply a single target sparsity to all layers or adapt layer sparsity using heuristics and additional training. We propose a principled method that automatically determines the sparsity of individual layers from the importance of each layer. To do this, we use a metric that measures the importance of each layer based on its layer-wise capacity. Given the trained model and the total target sparsity, we first evaluate the importance of each layer and then compute the layer-wise sparsity from the evaluated importance. The proposed method can be applied to any DNN architecture and can be combined with any pruning method that takes the total target sparsity as a parameter. To validate the proposed method, we carried out an image classification task with two types of DNN architectures on two benchmark datasets and used three pruning methods for compression. For the VGG-16 model with weight pruning on the ImageNet dataset, we achieved up to a 75% improvement over the baseline under the same total target sparsity. Furthermore, we analyzed where the maximum compression can occur in the network. This kind of analysis can help us identify the most compact representation within a deep neural network.
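The abstract only outlines the pipeline (importance per layer, then per-layer sparsity, then any pruning method); the exact capacity-based metric and allocation rule are not given here. The following Python sketch therefore only illustrates the general idea under stated assumptions: a hypothetical inverse-importance allocation rule, placeholder importance scores, and simple magnitude pruning. The function names `allocate_layer_sparsity` and `magnitude_prune` are illustrative, not the authors' API.

```python
import torch
import torch.nn as nn


def allocate_layer_sparsity(importance, num_params, total_sparsity):
    """Distribute a global pruning budget across layers.

    importance:     layer name -> importance score (higher = prune less);
                    a stand-in for the paper's capacity-based metric.
    num_params:     layer name -> number of weights in that layer.
    total_sparsity: fraction of all weights to remove overall, e.g. 0.8.
    """
    budget = total_sparsity * sum(num_params.values())  # weights to prune in total
    # Assumed allocation rule: per-layer sparsity inversely proportional to
    # importance, rescaled so the layer budgets add up to the global budget.
    inv = {name: num_params[name] / importance[name] for name in importance}
    scale = budget / sum(inv.values())
    return {name: min(scale / importance[name], 0.99) for name in importance}


def magnitude_prune(model, layer_sparsity):
    """Zero out the smallest-magnitude weights of each listed layer in place."""
    for name, module in model.named_modules():
        if name in layer_sparsity and hasattr(module, "weight"):
            w = module.weight.data
            k = int(layer_sparsity[name] * w.numel())
            if k > 0:
                threshold = w.abs().flatten().kthvalue(k).values
                w.mul_((w.abs() > threshold).float())


if __name__ == "__main__":
    model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10))
    linear_names = [n for n, m in model.named_modules() if isinstance(m, nn.Linear)]
    modules = dict(model.named_modules())
    num_params = {n: modules[n].weight.numel() for n in linear_names}
    # Placeholder importance scores; the paper derives these from layer capacity.
    importance = {linear_names[0]: 1.2, linear_names[1]: 1.0}
    layer_sparsity = allocate_layer_sparsity(importance, num_params, total_sparsity=0.8)
    magnitude_prune(model, layer_sparsity)
    print(layer_sparsity)
```

In practice, the per-layer sparsities produced this way would be handed to whichever pruning method is in use; plain magnitude pruning above merely stands in for the three pruning methods evaluated in the paper.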


