DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks

06/02/2022
by Yonggan Fu, et al.

Efficient deep neural network (DNN) models equipped with compact operators (e.g., depthwise convolutions) have shown great potential in reducing DNNs' theoretical complexity (e.g., the total number of weights/operations) while maintaining decent model accuracy. However, existing efficient DNNs are still limited in fulfilling their promise of boosting real-hardware efficiency, due to the low hardware utilization of their commonly adopted compact operators. In this work, we open up a new compression paradigm for developing real-hardware efficient DNNs, leading to boosted hardware efficiency while maintaining model accuracy. Interestingly, we observe that although some DNN layers' activation functions help DNNs' training optimization and achievable accuracy, they can be properly removed after training without compromising the model accuracy. Inspired by this observation, we propose a framework dubbed DepthShrinker, which develops hardware-friendly compact networks by shrinking the basic building blocks of existing efficient DNNs, which feature irregular computation patterns, into dense ones with much improved hardware utilization and thus real-hardware efficiency. Excitingly, our DepthShrinker framework delivers hardware-friendly compact networks that outperform both state-of-the-art efficient DNNs and compression techniques, e.g., a 3.06% higher accuracy and 1.53× higher throughput on a Tesla V100 over the SOTA channel-wise pruning method MetaPruning. Our code is available at: https://github.com/RICE-EIC/DepthShrinker.
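To illustrate the core principle the abstract describes, the following is a minimal PyTorch sketch, not the authors' implementation: once the activation function between two convolutions is removed after training, the two linear operators compose into a single equivalent convolution that runs as one dense, hardware-friendly kernel. The sketch shows the simplest case of two 1x1 convolutions (channel sizes are arbitrary, chosen here for illustration); DepthShrinker applies the same idea to whole building blocks of efficient DNNs, such as inverted residual blocks.

```python
# Hedged sketch: fuse two consecutive 1x1 convolutions whose intermediate
# activation has been removed into one dense 1x1 convolution.
import torch
import torch.nn as nn

c_in, c_mid, c_out = 16, 64, 32  # illustrative channel sizes
expand = nn.Conv2d(c_in, c_mid, kernel_size=1, bias=False)
project = nn.Conv2d(c_mid, c_out, kernel_size=1, bias=False)

# A 1x1 conv is a per-pixel linear map, so the composition is a matrix product:
# W_fused = W_project @ W_expand.
w_expand = expand.weight.squeeze(-1).squeeze(-1)    # shape (c_mid, c_in)
w_project = project.weight.squeeze(-1).squeeze(-1)  # shape (c_out, c_mid)

fused = nn.Conv2d(c_in, c_out, kernel_size=1, bias=False)
fused.weight.data = (w_project @ w_expand).unsqueeze(-1).unsqueeze(-1)

# Verify the fused layer matches the original two-layer (activation-free) stack.
x = torch.randn(1, c_in, 8, 8)
with torch.no_grad():
    assert torch.allclose(project(expand(x)), fused(x), atol=1e-5)
```

The fused layer has the same input/output behavior but replaces two memory-bound, low-utilization operators with one dense kernel, which is the source of the real-hardware speedups reported above.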

Related research

10/24/2022  NASA: Neural Architecture Search and Acceleration for Hardware Inspired Hybrid Networks
Multiplication is arguably the most cost-dominant operation in modern de...

02/24/2016  SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size
Recent research on deep neural networks has focused primarily on improvi...

01/25/2021  CPT: Efficient Deep Neural Network Training via Cyclic Precision
Low-precision deep neural network (DNN) training has gained tremendous a...

05/30/2018  MPDCompress - Matrix Permutation Decomposition Algorithm for Deep Neural Network Compression
Deep neural networks (DNNs) have become the state-of-the-art technique f...

08/17/2022  Restructurable Activation Networks
Is it possible to restructure the non-linear activation functions in a d...

03/02/2020  Sparsity Meets Robustness: Channel Pruning for the Feynman-Kac Formalism Principled Robust Deep Neural Nets
Deep neural nets (DNNs) compression is crucial for adaptation to mobile ...

02/20/2020  Neural Network Compression Framework for fast model inference
In this work we present a new framework for neural networks compression ...
