Learning Accurate Performance Predictors for Ultrafast Automated Model Compression

04/13/2023
by Ziwei Wang, et al.

In this paper, we propose SeerNet, an ultrafast automated model compression framework for flexible network deployment. Conventional non-differentiable methods discretely search for the desired compression policy based on the accuracy of exhaustively trained lightweight models, while existing differentiable methods optimize an extremely large supernet to obtain the compressed model for deployment; both incur heavy computational cost due to the complex compression policy search and evaluation process. In contrast, we obtain optimal efficient networks by directly optimizing the compression policy with an accurate performance predictor, achieving ultrafast automated model compression under various computational cost constraints without complex policy search and evaluation. Specifically, we first train the performance predictor on the accuracy of uncertain compression policies actively selected by efficient evolutionary search, so that informative supervision is provided to learn an accurate performance predictor at acceptable cost. We then follow the gradient that maximizes the predicted performance under a barrier complexity constraint to acquire the desired compression policy ultrafast, where adaptive update step sizes with momentum enhance the optimality of the acquired pruning and quantization strategy. Compared with state-of-the-art automated model compression methods, experimental results on image classification and object detection show that our method achieves competitive accuracy-complexity trade-offs with a significant reduction in search cost.
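The abstract outlines but does not detail the predictor-guided search, so below is a minimal, self-contained sketch of that idea: gradient ascent on a performance predictor under a log-barrier complexity constraint, with momentum updates. Every name here (`search_policy`, `predictor`, `complexity`) and the finite-difference gradient are illustrative assumptions, not the paper's implementation, which would instead backpropagate through the learned predictor.

```python
import numpy as np

# Hypothetical sketch of the gradient-based policy search described in the
# abstract. `predictor` maps a continuous pruning/quantization policy to
# predicted accuracy; `complexity` maps it to a cost (e.g., BitOPs). The
# finite-difference gradient stands in for backpropagating through a learned
# differentiable predictor; none of this is the authors' implementation.

def barrier(policy, budget, complexity, t=10.0):
    """Log-barrier penalty keeping predicted complexity under the budget."""
    slack = budget - complexity(policy)
    if slack > 1e-8:
        return -np.log(slack) / t
    return 10.0 * (1e-8 - slack)  # grows with the violation, steering back inside

def search_policy(predictor, complexity, dim, budget,
                  steps=200, lr=0.05, beta=0.9, eps=1e-4):
    """Maximize predicted performance minus the barrier penalty,
    using momentum on a finite-difference gradient estimate."""
    policy = np.full(dim, 0.5)          # one continuous knob per layer
    velocity = np.zeros(dim)
    for _ in range(steps):
        base = predictor(policy) - barrier(policy, budget, complexity)
        grad = np.zeros(dim)
        for i in range(dim):            # forward-difference gradient
            probe = policy.copy()
            probe[i] += eps
            grad[i] = (predictor(probe)
                       - barrier(probe, budget, complexity) - base) / eps
        velocity = beta * velocity + lr * grad   # momentum update
        policy = np.clip(policy + velocity, 0.0, 1.0)
    return policy

# Toy usage: accuracy peaks at 0.7 per layer, but the budget forces a trade-off.
predictor = lambda p: float(-np.sum((p - 0.7) ** 2))
complexity = lambda p: float(np.sum(p))
print(search_policy(predictor, complexity, dim=4, budget=2.4))
```

The log barrier keeps the iterate strictly inside the complexity budget during the search, which is one plausible reading of the "barrier complexity constraint" in the abstract; the momentum term corresponds to the adaptive update step sizes the authors mention.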

