HRel: Filter Pruning based on High Relevance between Activation Maps and Class Labels

02/22/2022
by CH Sarvani, et al.

This paper proposes a filter pruning method based on the Information Bottleneck theory that uses a statistical measure called Mutual Information (MI). The MI between filters and class labels, also called Relevance, is computed using the filters' activation maps and the annotations. Filters having High Relevance (HRel) are considered more important. Consequently, the least important filters, which have lower Mutual Information with the class labels, are pruned. Unlike existing MI-based pruning methods, the proposed method determines the significance of the filters purely based on the relationship between their activation maps and the class labels. Architectures such as LeNet-5, VGG-16, ResNet-56, ResNet-110, and ResNet-50 are used to demonstrate the efficacy of the proposed pruning method on the MNIST, CIFAR-10, and ImageNet datasets. The proposed method achieves state-of-the-art pruning results for the LeNet-5, VGG-16, ResNet-56, ResNet-110, and ResNet-50 architectures. In the experiments, we prune 97.98%, 84.85%, 76.89%, 76.95%, and 63.99% of the Floating Point Operations (FLOPs) from LeNet-5, VGG-16, ResNet-56, ResNet-110, and ResNet-50, respectively. The proposed HRel pruning method outperforms recent state-of-the-art filter pruning methods. Even after drastically pruning the filters in the convolutional layers of LeNet-5 (from 20 and 50 to 2 and 3, respectively), only a small accuracy drop of 0.52% is observed. Notably, for VGG-16, 94.98% of the parameters are removed, with a drop of only 0.36% in top-1 accuracy. ResNet-50 shows a 1.17% drop in top-5 accuracy after pruning 66.42% of its FLOPs. In addition to pruning, the Information Plane dynamics of the Information Bottleneck theory are analyzed for various Convolutional Neural Network architectures, together with the effect of pruning on those dynamics.
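To make the relevance-based ranking concrete, the sketch below scores the filters of one convolutional layer by the mutual information between their activation maps and the class labels, then selects the least relevant filters as pruning candidates. This is a minimal illustration under stated assumptions, not the authors' implementation: each activation map is summarized by its spatial mean before MI estimation (a simplification of the paper's treatment of the activation maps), scikit-learn's mutual_info_classif is used as the MI estimator, and the helper names filter_relevance and least_relevant_filters are hypothetical.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif


def filter_relevance(activations, labels):
    """Estimate each filter's relevance as MI with the class labels.

    activations: array of shape (n_samples, n_filters, H, W),
        the activation maps of one convolutional layer.
    labels: integer class labels of shape (n_samples,).
    Returns an array of shape (n_filters,) of MI estimates.
    """
    # Summarize each filter's activation map by its spatial mean
    # (an assumed simplification; the paper works with the maps directly).
    features = activations.mean(axis=(2, 3))  # (n_samples, n_filters)
    return mutual_info_classif(features, labels)


def least_relevant_filters(activations, labels, n_prune):
    """Return indices of the n_prune filters with the lowest relevance."""
    relevance = filter_relevance(activations, labels)
    return np.argsort(relevance)[:n_prune]


# Toy usage: 256 samples, 16 filters, 8x8 activation maps, 10 classes.
rng = np.random.default_rng(0)
acts = rng.normal(size=(256, 16, 8, 8))
y = rng.integers(0, 10, size=256)
print(least_relevant_filters(acts, y, n_prune=4))
```

In a typical iterative pruning loop, the filters returned by least_relevant_filters would be removed from the layer and the network fine-tuned before moving to the next layer.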
