SOSP: Efficiently Capturing Global Correlations by Second-Order Structured Pruning

10/19/2021
by Manuel Nonnenmacher, et al.

Pruning neural networks reduces inference time and memory costs. On standard hardware, these benefits are especially prominent when coarse-grained structures, such as feature maps, are pruned. We devise two novel saliency-based methods for second-order structured pruning (SOSP) which include correlations among all structures and layers. Our main method, SOSP-H, employs an innovative second-order approximation that enables saliency evaluations via fast Hessian-vector products. SOSP-H thereby scales like a first-order method despite taking the full Hessian into account. We validate SOSP-H by comparing it to our second method, SOSP-I, which uses a well-established Hessian approximation, and to numerous state-of-the-art methods. While SOSP-H performs on par with or better than these methods in terms of accuracy, it has clear advantages in scalability and efficiency. This allowed us to scale SOSP-H to large-scale vision tasks, even though it captures correlations across all layers of the network. To underscore the global nature of our pruning methods, we evaluate their performance not only by removing structures from a pretrained network, but also by detecting architectural bottlenecks. We show that our algorithms allow us to systematically reveal architectural bottlenecks, which we then remove to further increase the accuracy of the networks.
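The scalability claim rests on the standard double-backpropagation trick for Hessian-vector products: two gradient passes yield H·v without ever materializing the Hessian. Below is a minimal PyTorch sketch of this idea applied to structure saliencies. It illustrates the technique as described in the abstract, not the authors' implementation; the function names, the mask-based encoding of structures, and the exact saliency formula (first-order term plus half of a cross-correlating second-order term) are assumed here.

```python
import torch

def hvp(loss, params, vec):
    """Hessian-vector product H @ vec via double backpropagation.

    Costs roughly one extra backward pass, so it scales like a
    first-order method even though it uses full second-order info.
    """
    grads = torch.autograd.grad(loss, params, create_graph=True)
    # Inner product <grad, vec>, then differentiate again to get H @ vec.
    gv = sum((g * v).sum() for g, v in zip(grads, vec))
    return torch.autograd.grad(gv, params)

def structure_saliencies(loss, params, structures):
    """Hypothetical sketch: second-order saliencies from a single HVP.

    `structures` maps a structure id to one 0/1 mask per parameter
    tensor, selecting the weights removed when that structure is pruned.
    Pruning structure s corresponds to the perturbation delta_s = -theta_s.
    """
    grads = torch.autograd.grad(loss, params, create_graph=True)
    # Joint perturbation that prunes every structure at once; using it in
    # the quadratic term is what couples structures across all layers.
    delta = [-p.detach() for p in params]
    gv = sum((g * d).sum() for g, d in zip(grads, delta))
    h_delta = torch.autograd.grad(gv, params)  # one HVP for all structures
    saliencies = {}
    for s, masks in structures.items():
        lin = sum((g.detach() * -p.detach() * m).sum()
                  for g, p, m in zip(grads, params, masks))
        quad = sum((hd * -p.detach() * m).sum()
                   for hd, p, m in zip(h_delta, params, masks))
        # Taylor estimate of the loss change from removing structure s.
        saliencies[s] = float((lin + 0.5 * quad).abs())
    return saliencies
```

Because `h_delta` is computed once for the joint perturbation, scoring each additional structure costs only masked elementwise reductions rather than another Hessian evaluation, which is the kind of saving that lets a full-Hessian method scale to large vision networks.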


Related research

WoodFisher: Efficient second-order approximations for model compression (04/29/2020)
Second-order information, in the form of Hessian- or Inverse-Hessian-vec...

Efficient Matrix-Free Approximations of Second-Order Information, with Applications to Pruning and Optimization (07/07/2021)
Efficiently approximating local curvature information of the loss functi...

Neural Nets with a Newton Conjugate Gradient Method on Multiple GPUs (08/03/2022)
Training deep neural networks consumes increasing computational resource...

Hessian-Aware Pruning and Optimal Neural Implant (01/22/2021)
Pruning is an effective method to reduce the memory footprint and FLOPs ...

Revisiting Loss Modelling for Unstructured Pruning (06/22/2020)
By removing parameters from deep neural networks, unstructured pruning m...

ES-Based Jacobian Enables Faster Bilevel Optimization (10/13/2021)
Bilevel optimization (BO) has arisen as a powerful tool for solving many...
