Shapley Value as Principled Metric for Structured Network Pruning

06/02/2020
by   Marco Ancona, et al.
Structured pruning is a well-known technique for reducing the storage size and inference cost of neural networks. The usual pruning pipeline consists of ranking a network's internal filters and activations by their contribution to network performance, removing the units with the lowest contribution, and fine-tuning the network to reduce the harm caused by pruning. Recent results have shown that random pruning performs on par with other ranking metrics, given enough fine-tuning resources. In this work, we show that this does not hold in a low-data regime, where fine-tuning is either impossible or ineffective. In this case, reducing the harm caused by pruning becomes crucial to retaining the network's performance. First, we analyze the problem of estimating the contribution of hidden units with tools from cooperative game theory and propose Shapley values as a principled ranking metric for this task. We compare Shapley values with several alternatives proposed in the literature and discuss why they are theoretically preferable. Finally, we compare all ranking metrics in the challenging scenario of low-data pruning and demonstrate that Shapley values outperform the other heuristics.
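To make the idea concrete, the Shapley value of a unit is its contribution to network performance averaged over all orders in which units could be added, and it is typically approximated by sampling random permutations. The sketch below is not the authors' implementation; `value_fn` and the additive `toy_accuracy` are hypothetical stand-ins for evaluating a network with only a subset of units active.

```python
import numpy as np

def shapley_values(value_fn, n_units, n_samples=200, seed=0):
    """Monte Carlo estimate of Shapley values for n_units players.

    value_fn(kept_units) must return the performance (e.g. accuracy)
    of the network when only the units in `kept_units` are active.
    """
    rng = np.random.default_rng(seed)
    phi = np.zeros(n_units)
    for _ in range(n_samples):
        perm = rng.permutation(n_units)
        kept = []
        prev = value_fn(kept)
        for u in perm:
            # Marginal contribution of unit u given the units added so far.
            kept.append(int(u))
            cur = value_fn(kept)
            phi[u] += cur - prev
            prev = cur
    return phi / n_samples

# Hypothetical stand-in for network accuracy: each unit contributes
# an independent, additive amount (0.5, 0.3, 0.1).
weights = np.array([0.5, 0.3, 0.1])

def toy_accuracy(kept_units):
    return float(weights[list(kept_units)].sum())

phi = shapley_values(toy_accuracy, n_units=3)
ranking = np.argsort(phi)  # prune lowest-contribution units first
```

For this additive toy game the marginal contribution of a unit is the same in every permutation, so the estimate recovers the weights exactly and `ranking` puts unit 2 first in line for pruning; for a real network, `value_fn` would require a forward pass over a validation set per evaluated subset, which is why sampling-based approximation is needed.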


Related research:

- 03/09/2022 — Data-Efficient Structured Pruning via Submodular Optimization
- 04/25/2022 — Fine-tuning Pruned Networks with Linear Over-parameterization
- 07/08/2021 — Weight Reparametrization for Budget-Aware Network Pruning
- 02/18/2019 — Speeding up convolutional networks pruning with coarse ranking
- 03/20/2023 — Greedy Pruning with Group Lasso Provably Generalizes for Matrix Sensing and Neural Networks with Quadratic Activations
- 09/10/2021 — Block Pruning For Faster Transformers
- 10/01/2021 — Learning Compact Representations of Neural Networks using DiscriminAtive Masking (DAM)
