Max-Affine Spline Insights Into Deep Network Pruning

01/07/2021
by Randall Balestriero, et al.

In this paper, we study the importance of pruning in Deep Networks (DNs) and motivate it by the current absence of data-aware weight initialization. Current DN initializations, which focus primarily on maintaining the first-order statistics of the feature maps through depth, force practitioners to overparametrize a model in order to reach high performance. This overparametrization can then be pruned a posteriori, leading to a phenomenon known as "winning tickets". However, the pruning literature still relies on empirical investigations and lacks a theoretical understanding of (1) how pruning affects the decision boundary, (2) how to interpret pruning, (3) how to design principled pruning techniques, and (4) how to theoretically study pruning. To tackle these questions, we propose to employ recent advances in the theoretical analysis of Continuous Piecewise Affine (CPA) DNs. From this viewpoint, we can study the DNs' input-space partition to detect the early-bird (EB) phenomenon, guide practitioners by identifying when to stop the first training phase, provide interpretability into current pruning techniques, and develop a principled pruning criterion towards efficient DN training. Finally, we conduct extensive experiments showing the effectiveness of the proposed spline pruning criterion over state-of-the-art pruning methods, in both layerwise and global pruning.
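To make the CPA viewpoint concrete, below is a minimal NumPy sketch of the underlying idea: a ReLU network is a continuous piecewise affine operator, so each input falls into an affine region identified by its binary activation pattern, and one can track how the induced input-space partition stabilizes across training checkpoints. The toy MLP, the probe batch, and the Hamming-distance heuristic here are illustrative assumptions, not the paper's exact early-bird criterion.

```python
# Minimal sketch (illustrative, not the paper's exact method): track the
# input-space partition of a ReLU MLP via activation patterns, and measure
# how much the partition moves between two training checkpoints.
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(dims):
    """He-style initialization for a small ReLU MLP: list of (W, b) pairs."""
    return [(rng.standard_normal((m, n)) * np.sqrt(2.0 / n), np.zeros(m))
            for n, m in zip(dims[:-1], dims[1:])]

def activation_pattern(params, x):
    """Binary code of ReLU on/off states: identifies x's affine region."""
    code = []
    h = x
    for W, b in params[:-1]:          # hidden layers only
        pre = W @ h + b
        code.append(pre > 0)
        h = np.maximum(pre, 0.0)
    return np.concatenate(code)

def partition_distance(p_old, p_new, xs):
    """Mean Hamming distance between the activation patterns induced by two
    checkpoints, estimated on a batch of probe points xs."""
    d = [np.mean(activation_pattern(p_old, x) != activation_pattern(p_new, x))
         for x in xs]
    return float(np.mean(d))

params_t0 = init_mlp([2, 16, 16, 1])
# Stand-in for one training step: a small perturbation of the weights.
params_t1 = [(W + 0.01 * rng.standard_normal(W.shape), b) for W, b in params_t0]
probes = rng.standard_normal((128, 2))
print("partition distance:", partition_distance(params_t0, params_t1, probes))
# A small, plateauing distance suggests the partition has stabilized,
# which is when pruning could begin under an early-bird-style policy.
```

Probing activation patterns on a fixed batch approximates partition similarity without enumerating the (exponentially many) affine regions, which is why pattern-based distances are a practical proxy for partition movement.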


Related research:

10/22/2021 · When to Prune? A Policy towards Early Structural Pruning
Pruning enables appealing reductions in network memory footprint and tim...

08/28/2021 · Layer-wise Model Pruning based on Mutual Information
The proposed pruning strategy offers merits over weight-based pruning te...

10/28/2021 · An Operator Theoretic Perspective on Pruning Deep Neural Networks
The discovery of sparse subnetworks that are able to perform as well as ...

07/30/2020 · Growing Efficient Deep Networks by Structured Continuous Sparsification
We develop an approach to training deep networks while dynamically adjus...

09/29/2022 · Batch Normalization Explained
A critically important, ubiquitous, and yet poorly understood ingredient...

10/17/2022 · Principled Pruning of Bayesian Neural Networks through Variational Free Energy Minimization
Bayesian model reduction provides an efficient approach for comparing th...

05/17/2018 · A Spline Theory of Deep Networks (Extended Version)
We build a rigorous bridge between deep networks (DNs) and approximation...