ADC: Automated Deep Compression and Acceleration with Reinforcement Learning

02/10/2018
by Yihui He, et al.

Model compression is an effective technique for deploying neural network models on mobile devices, which have limited computation resources and a tight power budget. However, conventional model compression techniques use hand-crafted features and require domain experts to explore a large design space that trades off model size, speed, and accuracy; this process is usually sub-optimal and time-consuming. In this paper, we propose Automated Deep Compression (ADC), which leverages reinforcement learning to efficiently sample the design space and greatly improve model compression quality. We achieve state-of-the-art model compression results in a fully automated way, without human effort. Under a 4x FLOPs reduction, we achieve 2.7% better accuracy than the hand-crafted model compression method for VGG-16 on ImageNet. We apply this automated, push-the-button compression pipeline to MobileNet and achieve a 2x reduction in FLOPs, with a speedup of 1.49x on a Titan Xp and 1.65x on an Android phone (Samsung Galaxy S7), and negligible loss of accuracy.
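To make the "FLOPs reduction" figures above concrete, the following is a minimal sketch of how per-layer channel-keep ratios translate into a network-wide reduction in multiply-accumulate operations. It is not the ADC method itself: the abstract does not specify the compression mechanism, so channel pruning is an assumption here, and the `ConvLayer` type, the function names, and the toy layer sizes are all hypothetical. In an automated pipeline, a search procedure such as a reinforcement learning agent would propose the `keep` ratios; here they are fixed by hand.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ConvLayer:
    """Hypothetical description of one convolutional layer."""
    c_in: int    # input channels
    c_out: int   # output channels
    kernel: int  # square kernel size
    h_out: int   # output feature-map height
    w_out: int   # output feature-map width


def conv_macs(layer: ConvLayer, keep_in: float = 1.0, keep_out: float = 1.0) -> int:
    """Multiply-accumulates of a conv layer after keeping a fraction of its channels."""
    c_in = max(1, round(layer.c_in * keep_in))
    c_out = max(1, round(layer.c_out * keep_out))
    return layer.h_out * layer.w_out * c_in * c_out * layer.kernel ** 2


def network_macs(layers: List[ConvLayer], keep: List[float]) -> int:
    """Total MACs when keep[i] is the fraction of output channels kept in layer i.

    Assumes a plain sequential network, so pruning the outputs of layer i
    also shrinks the inputs of layer i + 1 by the same fraction.
    """
    total = 0
    keep_in = 1.0  # the network input (e.g. RGB image) is never pruned
    for layer, keep_out in zip(layers, keep):
        total += conv_macs(layer, keep_in, keep_out)
        keep_in = keep_out
    return total


# Toy three-layer network (sizes are illustrative, not from the paper).
layers = [
    ConvLayer(c_in=3, c_out=64, kernel=3, h_out=32, w_out=32),
    ConvLayer(c_in=64, c_out=128, kernel=3, h_out=16, w_out=16),
    ConvLayer(c_in=128, c_out=128, kernel=3, h_out=16, w_out=16),
]

dense = network_macs(layers, [1.0, 1.0, 1.0])
pruned = network_macs(layers, [0.5, 0.5, 1.0])
print(f"MAC reduction: {dense / pruned:.2f}x")
```

Because a layer's cost scales with the product of its input and output channel counts, halving the channels of two adjacent layers cuts the second layer's cost to roughly a quarter. This coupling between layers is one reason the design space is hard to explore by hand.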

research
07/08/2018

Auto Deep Compression by Reinforcement Learning Based Actor-Critic Structure

Model-based compression is an effective, facilitating, and expanded mode...
research
11/21/2018

HAQ: Hardware-Aware Automated Quantization

Model quantization is a widely used technique to compress and accelerate...
research
01/28/2021

AdaSpring: Context-adaptive and Runtime-evolutionary Deep Model Compression for Mobile Applications

There are many deep learning (e.g., DNN) powered mobile and wearable app...
research
02/09/2022

Exploring Structural Sparsity in Neural Image Compression

Neural image compression have reached or out-performed traditional metho...
research
09/22/2021

High-dimensional Bayesian Optimization for CNN Auto Pruning with Clustering and Rollback

Pruning has been widely used to slim convolutional neural network (CNN) ...
research
11/10/2020

Neural Network Compression Via Sparse Optimization

The compression of deep neural networks (DNNs) to reduce inference cost ...
research
07/08/2019

ShrinkML: End-to-End ASR Model Compression Using Reinforcement Learning

End-to-end automatic speech recognition (ASR) models are increasingly la...