Multi-Dimensional Model Compression of Vision Transformer

12/31/2021
by   Zejiang Hou, et al.

Vision transformers (ViT) have recently attracted considerable attention, but their huge computational cost remains an obstacle to practical deployment. Previous ViT pruning methods tend to prune the model along a single dimension, which can force excessive reduction in that dimension and lead to sub-optimal model quality. In contrast, we advocate a multi-dimensional ViT compression paradigm and propose to jointly exploit redundancy in the attention head, neuron, and sequence dimensions. We first propose a pruning criterion based on statistical dependence that generalizes across dimensions for identifying deleterious components. Moreover, we cast multi-dimensional compression as an optimization problem: learning the pruning policy across the three dimensions that maximizes the compressed model's accuracy under a computational budget. The problem is solved by our adapted Gaussian process search with expected improvement. Experimental results show that our method effectively reduces the computational cost of various ViT models. For example, it reduces FLOPs by 40% without top-1 accuracy loss for DeiT and T2T-ViT models, outperforming previous state-of-the-art methods.
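
To make the search step more concrete, the following is a minimal sketch of the standard expected improvement (EI) acquisition function for a maximization problem, applied to a handful of candidate pruning policies. It is not the authors' adapted search: the function names, the candidate dictionary layout, the `flops_ratio` field, the budget filter, and all numeric values are hypothetical placeholders chosen only for illustration. The posterior mean and standard deviation are assumed to come from a Gaussian process fitted elsewhere.

```python
import math


def expected_improvement(mu, sigma, best_so_far):
    """Standard EI for maximization, given a GP posterior mean and std."""
    if sigma <= 0.0:
        # No posterior uncertainty: improvement is deterministic.
        return max(mu - best_so_far, 0.0)
    z = (mu - best_so_far) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # standard normal CDF
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal PDF
    return (mu - best_so_far) * cdf + sigma * pdf


def pick_next_policy(candidates, best_so_far, flops_budget):
    """Return the within-budget candidate with the highest EI, or None."""
    feasible = [c for c in candidates if c["flops_ratio"] <= flops_budget]
    if not feasible:
        return None
    return max(
        feasible,
        key=lambda c: expected_improvement(c["mu"], c["sigma"], best_so_far),
    )


# Hypothetical candidates: pruning ratios for (heads, neurons, tokens),
# an assumed remaining-FLOPs ratio, and made-up GP posterior values.
candidates = [
    {"policy": (0.2, 0.3, 0.1), "flops_ratio": 0.60, "mu": 0.70, "sigma": 0.00},
    {"policy": (0.1, 0.4, 0.2), "flops_ratio": 0.58, "mu": 0.69, "sigma": 0.05},
    {"policy": (0.0, 0.1, 0.1), "flops_ratio": 0.85, "mu": 0.75, "sigma": 0.02},
]

chosen = pick_next_policy(candidates, best_so_far=0.70, flops_budget=0.60)
```

In this sketch the third candidate is excluded because it exceeds the assumed budget, and the second is preferred over the first because its posterior uncertainty leaves room for improvement over the best accuracy seen so far. A real search would refit the Gaussian process after evaluating each chosen policy.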

research
05/26/2023

COMCAT: Towards Efficient Compression and Customization of Attention-Based Vision Models

Attention-based vision models, such as Vision Transformer (ViT) and its ...
research
04/13/2023

Learning Accurate Performance Predictors for Ultrafast Automated Model Compression

In this paper, we propose an ultrafast automated model compression frame...
research
01/31/2021

AACP: Model Compression by Accurate and Automatic Channel Pruning

Channel pruning is formulated as a neural architecture search (NAS) prob...
research
05/29/2023

DiffRate : Differentiable Compression Rate for Efficient Vision Transformers

Token compression aims to speed up large-scale vision transformers (e.g....
research
11/30/2021

A Unified Pruning Framework for Vision Transformers

Recently, vision transformer (ViT) and its variants have achieved promis...
research
08/14/2020

AntiDote: Attention-based Dynamic Optimization for Neural Network Runtime Efficiency

Convolutional Neural Networks (CNNs) achieved great cognitive performanc...
research
05/18/2020

Joint Multi-Dimension Pruning

We present joint multi-dimension pruning (named as JointPruning), a new ...
