Weight-dependent Gates for Network Pruning

07/04/2020
by   Yun Li, et al.

In this paper, we propose a simple and effective network pruning framework that introduces novel weight-dependent gates (W-Gates) to prune filters adaptively. We argue that the pruning decision should depend on the convolutional weights; in other words, it should be a learnable function of the filter weights. We therefore construct a Filter Gates Learning Module (FGL) that learns from the convolutional weights and produces binary W-Gates to prune or keep filters automatically. To prune the network under a hardware constraint, we train a Latency Predict Net (LPNet) to estimate the hardware latency of candidate pruned networks. Based on the proposed LPNet, we can optimize the W-Gates and the pruning ratio of each layer under a latency constraint. The whole framework is differentiable and can be optimized by gradient-based methods to obtain a compact network with a better trade-off between accuracy and efficiency. We demonstrate the effectiveness of our method on ResNet34, ResNet50, and MobileNet V2, achieving up to 1.33/1.28/1.1 higher Top-1 accuracy with lower hardware latency on ImageNet. Compared with state-of-the-art pruning methods, our method achieves superior performance.
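To make the idea concrete, below is a minimal PyTorch sketch of the two ingredients the abstract describes: binary gates computed as a learnable function of each filter's weights, and a latency-constrained training objective driven by a latency predictor. All names (WGates, BinarizeSTE, the lpnet callable, the hinge-style latency penalty, and the hyperparameters) are illustrative assumptions, not the authors' released implementation.

```python
# Sketch of weight-dependent gates with a latency-constrained loss.
# Assumptions: gates are binarized with a straight-through estimator,
# and `lpnet` is a pretrained, differentiable latency predictor that
# maps per-layer kept-filter counts to an estimated hardware latency.
import torch
import torch.nn as nn


class BinarizeSTE(torch.autograd.Function):
    """Binarize gate scores to {0, 1}; pass gradients straight through."""

    @staticmethod
    def forward(ctx, scores):
        return (scores > 0).float()

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: treat the hard threshold as identity.
        return grad_output


class WGates(nn.Module):
    """Learn a binary gate per filter as a function of the conv weights."""

    def __init__(self, kernel_numel, hidden=16):
        super().__init__()
        # Small MLP mapping each filter's flattened weights to a scalar score.
        self.score_net = nn.Sequential(
            nn.Linear(kernel_numel, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, 1),
        )

    def forward(self, conv_weight):
        # conv_weight: (out_channels, in_channels, kH, kW)
        flat = conv_weight.flatten(start_dim=1)    # one row per filter
        scores = self.score_net(flat).squeeze(-1)  # (out_channels,)
        return BinarizeSTE.apply(scores)           # binary W-Gates


def latency_constrained_loss(task_loss, gates_per_layer, lpnet,
                             target_latency, lam=0.1):
    """Add a penalty when predicted latency exceeds the budget."""
    kept = torch.stack([g.sum() for g in gates_per_layer])  # filters kept per layer
    predicted = lpnet(kept)                                 # differentiable latency estimate
    return task_loss + lam * torch.relu(predicted - target_latency)
```

In use, the gate vector would mask a convolution's output channels, e.g. `out = conv(x) * gates.view(1, -1, 1, 1)`. The straight-through estimator is what keeps the whole pipeline differentiable: gradients flow through the hard binarization back into the gate-scoring MLP, so the gates, and hence each layer's pruning ratio, can be trained jointly with the task loss under the latency budget.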


