An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices

01/20/2020
by   Xiaolong Ma, et al.

Weight pruning is widely acknowledged as a straightforward and effective way to eliminate redundancy in Deep Neural Networks (DNNs) and thereby accelerate them on various platforms. However, most pruning techniques are essentially trade-offs between model accuracy and regularity, which leads to impaired inference accuracy and limited on-device acceleration. To solve this problem, we introduce a new sparsity dimension, pattern-based sparsity, which comprises pattern sparsity and connectivity sparsity and is both highly accurate and hardware friendly. With carefully designed patterns, the proposed pruning unprecedentedly and consistently achieves accuracy enhancement and better feature extraction ability across different DNN structures and datasets. Our pattern-aware pruning framework also performs pattern library extraction, pattern selection, pattern and connectivity pruning, and weight training simultaneously. Pattern-based sparsity fits naturally into compiler optimization for highly efficient DNN execution on mobile platforms. To the best of our knowledge, this is the first time that mobile devices achieve real-time inference for large-scale DNN models, thanks to the unique spatial property of pattern-based sparsity and the code generation capability of compilers.
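To make the two components of pattern-based sparsity concrete, here is a minimal sketch in plain Python. It is not the authors' framework: the abstract describes pattern library extraction, pattern selection, pruning, and weight training happening simultaneously, whereas this sketch swaps in a simple one-shot, magnitude-based rule. The 3x3 kernel size, the choice of four kept weights per kernel, the masks in `PATTERNS`, and the function names are all illustrative assumptions.

```python
# Hypothetical pattern library (assumption): each mask keeps 4 of the 9
# weights of a 3x3 convolution kernel. The abstract does not list the masks.
PATTERNS = [
    ((0, 1, 0), (1, 1, 1), (0, 0, 0)),
    ((0, 1, 0), (1, 1, 0), (0, 1, 0)),
    ((0, 0, 0), (1, 1, 1), (0, 1, 0)),
    ((0, 1, 0), (0, 1, 1), (0, 1, 0)),
]


def energy(kernel, mask=None):
    """Sum of squared weights, restricted to mask positions if a mask is given."""
    total = 0.0
    for r in range(3):
        for c in range(3):
            if mask is None or mask[r][c]:
                total += kernel[r][c] ** 2
    return total


def apply_pattern(kernel, patterns=PATTERNS):
    """Pattern sparsity: keep the mask that preserves the most weight energy.

    Returns the pruned kernel and the index of the chosen pattern.
    """
    best = max(range(len(patterns)), key=lambda i: energy(kernel, patterns[i]))
    mask = patterns[best]
    pruned = [[kernel[r][c] if mask[r][c] else 0.0 for c in range(3)]
              for r in range(3)]
    return pruned, best


def connectivity_prune(kernels, keep):
    """Connectivity sparsity: zero out whole kernels, keeping the `keep`
    kernels with the highest energy (a magnitude heuristic, assumed here)."""
    ranked = sorted(range(len(kernels)),
                    key=lambda i: energy(kernels[i]), reverse=True)
    kept = set(ranked[:keep])
    return [k if i in kept else [[0.0] * 3 for _ in range(3)]
            for i, k in enumerate(kernels)]
```

Calling `connectivity_prune(kernels, keep)` first and then `apply_pattern` on each surviving kernel yields a layer in which every remaining kernel has one of a small set of known shapes. That regularity is the kind of spatial property the abstract says a compiler can exploit when generating code.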

research
09/06/2019

PCONV: The Missing but Desirable Sparsity in DNN Weight Pruning for Real-time Execution on Mobile Devices

Model compression techniques on Deep Neural Network (DNN) have been wide...
research
01/23/2020

BLK-REW: A Unified Block-based DNN Pruning Framework using Reweighted Regularization Method

Accelerating DNN execution on various resource-limited computing platfor...
research
04/22/2020

Towards Real-Time DNN Inference on Mobile Platforms with Model Pruning and Compiler Optimization

High-end mobile platforms rapidly serve as primary computing devices for...
research
07/20/2020

Achieving Real-Time Execution of 3D Convolutional Neural Networks on Mobile Devices

Mobile devices are becoming an important carrier for deep learning tasks...
research
07/04/2022

CPrune: Compiler-Informed Model Pruning for Efficient Target-Aware DNN Execution

Mobile devices run deep learning models for various purposes, such as im...
research
05/31/2021

1×N Block Pattern for Network Sparsity

Though network sparsity emerges as a promising direction to overcome the...
research
05/02/2019

26ms Inference Time for ResNet-50: Towards Real-Time Execution of all DNNs on Smartphone

With the rapid emergence of a spectrum of high-end mobile devices, many ...
