BLK-REW: A Unified Block-based DNN Pruning Framework using Reweighted Regularization Method

01/23/2020
by Xiaolong Ma, et al.

Accelerating DNN execution on various resource-limited computing platforms has been a long-standing problem. Prior works apply ℓ1-based group lasso or dynamic regularization such as ADMM to perform structured pruning on DNN models so that parallel computing architectures can be leveraged. However, both the pruning dimensions and the pruning methods lack universality, which leads to degraded performance and limited applicability. To solve this problem, we propose a new block-based pruning framework that comprises a general and flexible structured pruning dimension as well as a powerful and efficient reweighted regularization method. Our framework is universal: it can be applied to both CNNs and RNNs, providing complete support for the two major kinds of computation-intensive layers (i.e., CONV and FC layers). To cover all aspects of the pruning-for-acceleration task, we also integrate compiler-based code optimization into our framework, enabling real-time DNN inference. To the best of our knowledge, this is the first weight pruning framework to achieve universal coverage of both CNNs and RNNs with real-time mobile acceleration and no accuracy compromise.
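The core idea, reweighted regularization applied to weight blocks, can be illustrated with a short sketch. The snippet below is a minimal PyTorch illustration, not the paper's implementation: the block shape (8x4), the epsilon constant, the update schedule, and the helper names (block_partition, block_norms, update_alphas, reweighted_block_penalty) are assumptions made for the example. It shows the typical reweighted pattern: each block's Frobenius norm is penalized with a coefficient that is periodically reset to the inverse of the block's current norm, so already-small blocks are driven toward zero while large (important) blocks are penalized less.

```python
# Minimal sketch of block-based reweighted regularization (illustrative only;
# block shape, eps, and the update schedule are assumptions, not the paper's values).
import torch


def block_partition(weight, block_rows=8, block_cols=4):
    """Split a 2-D weight matrix (an FC layer, or a CONV kernel flattened to 2-D)
    into non-overlapping blocks of shape (block_rows, block_cols)."""
    rows, cols = weight.shape
    # Truncate to a multiple of the block size to keep the sketch simple.
    r, c = rows - rows % block_rows, cols - cols % block_cols
    blocks = weight[:r, :c].reshape(r // block_rows, block_rows,
                                    c // block_cols, block_cols)
    return blocks.permute(0, 2, 1, 3)  # shape: (nR, nC, block_rows, block_cols)


def block_norms(weight, block_rows=8, block_cols=4):
    """Frobenius norm of every block (clamped to avoid a zero-gradient singularity)."""
    blocks = block_partition(weight, block_rows, block_cols)
    return blocks.pow(2).sum(dim=(2, 3)).clamp_min(1e-12).sqrt()


def update_alphas(weight, eps=1e-3, **block_shape):
    """Reweighting step: alpha_ij = 1 / (||W_ij||_F + eps). Small blocks receive a
    large penalty coefficient; important blocks receive a small one."""
    with torch.no_grad():
        return 1.0 / (block_norms(weight, **block_shape) + eps)


def reweighted_block_penalty(weight, alphas, **block_shape):
    """Regularization term: sum_ij alpha_ij * ||W_ij||_F."""
    return (alphas * block_norms(weight, **block_shape)).sum()


# Hypothetical usage inside a training loop, for one FC weight matrix `fc.weight`:
#   alphas = update_alphas(fc.weight)                       # refresh every few epochs
#   loss = task_loss + lam * reweighted_block_penalty(fc.weight, alphas)
#   loss.backward(); optimizer.step()
# After training, blocks whose norms fall below a threshold are removed entirely,
# yielding block-level structured sparsity that compiler-based code optimization can exploit.
```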

Related research

01/20/2020 · An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices
Weight pruning has been widely acknowledged as a straightforward and eff...

01/30/2023 · DepGraph: Towards Any Structural Pruning
Structural pruning enables model acceleration by removing structurally-g...

11/20/2018 · Structured Pruning for Efficient ConvNets via Incremental Regularization
Parameter pruning is a promising approach for CNN compression and accele...

11/02/2022 · SIMD-size aware weight regularization for fast neural vocoding on CPU
This paper proposes weight regularization for a faster neural vocoder. P...

08/27/2019 · Tiny but Accurate: A Pruned, Quantized and Optimized Memristor Crossbar Framework for Ultra Efficient DNN Implementation
The state-of-art DNN structures involve intensive computation and high m...

07/07/2021 · Immunization of Pruning Attack in DNN Watermarking Using Constant Weight Code
To ensure protection of the intellectual property rights of DNN models, ...

07/04/2022 · CPrune: Compiler-Informed Model Pruning for Efficient Target-Aware DNN Execution
Mobile devices run deep learning models for various purposes, such as im...
