1 Introduction
Pretrained transformer-based models [13] have recently achieved state-of-the-art performance on a variety of natural language processing (NLP) tasks, such as sequence tagging and sentence classification. Among them, BERT models [3], built on the transformer architecture [13], have drawn even more attention because of their strong performance and generality. However, the memory and computing consumption of these models is prohibitive: even the relatively small versions of BERT (e.g., BERT-base) contain more than 100 million parameters. This over-parameterized
characteristic makes it challenging to deploy BERT models on devices with constrained resources, such as smartphones and robots. Therefore, compressing these models is an important demand in industry.
One popular and efficient method for model compression is quantization. To reduce model size, quantization represents the parameters of the model with fewer bits instead of the original 32 bits. With proper hardware, quantization can significantly reduce the memory footprint while accelerating inference. Many works have focused on quantizing models in the computer vision area
[8, 18, 17, 5, 4, 15], while much less has been done in NLP [12, 9, 1, 2, 10]. Pilot works on transformer quantization include [1, 2, 10], which successfully quantized transformer models to 8 or 4 bits while maintaining comparable performance. Moreover, to the best of our knowledge, there are only two published works focusing on BERT quantization [16, 11]. [16] applied 8-bit fixed-precision linear quantization to BERT models and achieved a compression ratio of 4 with little accuracy drop. [11] improved quantization performance with group-wise mixed-precision linear quantization based on the Hessian matrix of the parameter tensors.
However, for the underlying quantization scheme, most of the above transformer quantization works, and especially the BERT quantization works, use linear clustering, which is a rudimentary clustering method. Although it is fast and easy to apply, the quantized results cannot represent the original data distribution well. As a result, [16] only manages to quantize BERT to 8 bits. Although the other BERT quantization work [11] achieves much higher compression ratios without upgrading the quantization scheme, the group-wise method it develops is rather time-consuming and increases latency significantly. Although it is generally believed that replacing linear clustering with a better clustering method can improve the performance of quantized models, the effect of upgrading the quantization scheme is rather underestimated. Therefore, in this paper, we explore the effect of simply upgrading the quantization scheme from linear clustering to k-means clustering and compare the performance of the two schemes. Furthermore, to see the effect on other pretrained language models, we also compare the two quantization schemes on ALBERT models [7], an improved version of BERT.
In summary, we apply k-means and linear quantization to BERT and ALBERT and test their performance on the GLUE benchmark. Through this, we verify that simply upgrading the quantization scheme can yield large performance gains and that simple k-means clustering has great potential as a BERT quantization scheme. Moreover, we show that the number of k-means iterations plays an important role in k-means quantization. Through further comparison, we discover that ALBERT is less robust than BERT in terms of quantization, as parameter sharing has reduced the redundancy of the parameters.
2 Background: BERT and ALBERT
In this section, we briefly introduce the architectures of BERT and ALBERT models and specify the versions of the models used in our experiments.
2.1 BERT
BERT models [3] are a special kind of pretrained transformer-based network.
They mainly consist of embedding layers, encoder blocks, and output layers.
There is no decoder block in BERT models. Each encoder block contains one self-attention layer (which includes three parallel linear layers corresponding to query, key, and value) and three feed-forward layers (each including one linear layer).
For each self-attention layer, BERT utilizes the multi-head technique to further improve its performance.
For the $i$-th self-attention head, there are three weight matrices $W_i^Q$, $W_i^K$, and $W_i^V \in \mathbb{R}^{d \times d_h}$, where $d_h = d / N_h$ ($N_h$ is the number of heads in each self-attention layer and $d$ is the hidden size). Let $X$ denote the input of the corresponding self-attention layer. The output of the $i$-th self-attention head is then calculated as:

$$\mathrm{head}_i = \mathrm{softmax}\!\left(\frac{(X W_i^Q)(X W_i^K)^{\top}}{\sqrt{d_h}}\right) X W_i^V. \quad (1)$$
Then, for each self-attention layer, the outputs of all its self-attention heads are concatenated sequentially to generate the output of the corresponding layer.
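To make Eq. (1) concrete, the following PyTorch sketch shows how one self-attention head and the subsequent concatenation could be computed; the function names (`attention_head`, `multi_head_self_attention`) and tensor shapes are illustrative assumptions rather than BERT's actual implementation.

```python
import torch
import torch.nn.functional as F

def attention_head(X, W_q, W_k, W_v):
    """One self-attention head following Eq. (1).

    X:            (seq_len, d_model) input of the self-attention layer
    W_q/W_k/W_v:  (d_model, d_head) projections, d_head = d_model / num_heads
    """
    Q, K, V = X @ W_q, X @ W_k, X @ W_v                       # (seq_len, d_head) each
    scores = Q @ K.transpose(-2, -1) / (K.shape[-1] ** 0.5)   # scaled dot-product scores
    return F.softmax(scores, dim=-1) @ V                       # (seq_len, d_head)

def multi_head_self_attention(X, head_weights):
    """Concatenate the outputs of all heads along the feature dimension."""
    return torch.cat([attention_head(X, *w) for w in head_weights], dim=-1)
```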
Specifically, in our work, we use the bert-base-uncased version of BERT models, which has 12 encoder blocks and 12 heads in each self-attention layer, to carry out the following experiments.
2.2 ALBERT
Compared to BERT, ALBERT introduces three main improvements. First, ALBERT models decompose the embedding parameters into the product of two smaller matrices.
Second, they adopt cross-layer parameter sharing to improve parameter efficiency. These two improvements significantly reduce the total number of parameters and make the model more efficient. Moreover, parameter sharing can also stabilize the network parameters.
Third, they replace the next-sentence prediction (NSP) loss with a sentence-order prediction (SOP) loss during pretraining. This makes the models focus on modeling inter-sentence
coherence instead of topic prediction and improves performance on multi-sentence encoding tasks.
Specifically, in this paper, we use the albert-base-v2 version of ALBERT models, which also has 12 encoder blocks (with all parameters shared across layers) and 12 heads in each self-attention layer.
3 Methodology
In this section, we first introduce the quantization process used in our experiments (Section 3.1), then explain the two quantization schemes in detail (Sections 3.2 and 3.3).
3.1 Overview
To compare linear and k-means quantization schemes on pretrained transformer-based models, we test the performance of quantized models on different downstream tasks. Specifically, for each chosen task, the following experiments are carried out sequentially: fine-tuning the pretrained models (BERT and ALBERT) on the downstream task; quantizing the task-specific model; fine-tuning the quantized model. Then the performance of the resulting model is tested on the validation set of each chosen task.
To avoid the effect of other tricks, we simply apply the two quantization schemes (linear and k-means) following a fixed-precision quantization strategy without any additional tricks. We quantize all the weights of the embedding layers and the fully connected layers (except the classification layer). After quantization, each weight vector is represented by a corresponding cluster index vector and a centroid value vector, and each parameter of the weight vector is replaced with the centroid of the cluster to which it belongs.
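As an illustration of this representation, the NumPy sketch below (the function names are our own, not part of the paper's code) converts a weight vector and a given cluster assignment into the cluster index vector plus centroid value vector described above, and reconstructs the quantized weights from them.

```python
import numpy as np

def to_index_centroid_form(weights, assignments, num_clusters):
    """Represent a weight vector as (cluster index vector, centroid value vector).

    `assignments[i]` is the cluster of `weights[i]`; each centroid is the mean
    of the parameters assigned to that cluster (empty clusters default to 0).
    """
    centroids = np.array(
        [weights[assignments == c].mean() if np.any(assignments == c) else 0.0
         for c in range(num_clusters)], dtype=np.float32)
    # For b-bit quantization there are 2**b clusters, so each index needs only
    # b bits instead of the 32 bits of the original float parameter.
    return assignments.astype(np.uint8), centroids

def reconstruct(indices, centroids):
    """Replace every parameter by the centroid of its cluster."""
    return centroids[indices]
```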
After the model is quantized, we further fine-tune it on the corresponding downstream task while keeping it quantized. For the forward pass, we reconstruct each quantized layer from its cluster index vector and centroid value vector. For the backward pass, we update the remaining parameters normally and update the quantized parameters by training the centroid vectors. More specifically, the gradient of each entry in a centroid vector is calculated as the average of the gradients of the parameters that belong to the corresponding cluster. The centroid value vectors are then updated by the same backpropagation method.
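A minimal sketch of this centroid update (our own illustration, assuming the per-parameter gradients of a quantized layer are already available as `weight_grad`):

```python
import numpy as np

def centroid_gradients(weight_grad, indices, num_clusters):
    """Gradient of each centroid = average of the gradients of the parameters
    assigned to that cluster."""
    grads = np.zeros(num_clusters, dtype=np.float32)
    for c in range(num_clusters):
        mask = (indices == c)
        if mask.any():
            grads[c] = weight_grad[mask].mean()
    return grads

# A plain SGD-style step on the centroid value vector would then be:
#   centroids -= learning_rate * centroid_gradients(weight_grad, indices, len(centroids))
# while the (frozen) cluster index vector stays unchanged.
```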
3.2 Linear Quantization
Suppose that we need to quantize a vector $\mathbf{w}$ to $b$ bits ($b$-bit quantization). We first search for its minimum value $w_{\min}$ and maximum value $w_{\max}$. The range $[w_{\min}, w_{\max}]$ is then divided into $2^{b}$ clusters with width

$$\Delta = \frac{w_{\max} - w_{\min}}{2^{b}}. \quad (2)$$

Define the function $g$ as

$$g(w) = \min\!\left(\left\lfloor \frac{w - w_{\min}}{\Delta} \right\rfloor,\; 2^{b}-1\right), \quad (3)$$

whose value lies between $0$ and $2^{b}-1$, so that each parameter $w_{i}$ belongs to the $g(w_{i})$-th cluster. Each $w_{i}$ is then replaced with the centroid of the $g(w_{i})$-th cluster, i.e., the average of all parameters belonging to it. Therefore, the quantization function is

$$Q(w_{i}) = \frac{\sum_{j} w_{j}\,\mathbb{1}[g(w_{j}) = g(w_{i})]}{\sum_{j} \mathbb{1}[g(w_{j}) = g(w_{i})]}, \quad (4)$$

where $\mathbb{1}[\cdot]$ equals $1$ when the statement is true, and $0$ otherwise.
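A small NumPy sketch of Eqs. (2)-(4) (our own illustration; the variable names are not from the paper):

```python
import numpy as np

def linear_quantize(w, num_bits):
    """Linear quantization following Eqs. (2)-(4): uniform bins over [min, max],
    each parameter replaced by the mean of its bin."""
    w = np.asarray(w, dtype=np.float32)
    n_clusters = 2 ** num_bits
    w_min, w_max = float(w.min()), float(w.max())
    width = max((w_max - w_min) / n_clusters, 1e-12)   # Eq. (2); guard against zero width
    indices = np.clip((w - w_min) // width, 0, n_clusters - 1).astype(np.int64)  # Eq. (3)
    centroids = np.zeros(n_clusters, dtype=np.float32)
    for c in range(n_clusters):
        members = w[indices == c]
        if members.size:
            centroids[c] = members.mean()               # Eq. (4): bin average
    return indices, centroids

# Example: quantize a random vector to 3 bits (8 clusters).
# idx, cent = linear_quantize(np.random.randn(1000), num_bits=3)
# quantized_w = cent[idx]
```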
3.3 K-Means Quantization
Suppose that we need to quantize a vector $\mathbf{w}$ to $b$ bits ($b$-bit quantization). For k-means quantization, we leverage k-means clustering with k-means++ initialization to partition the vector into $2^{b}$ clusters.
We first use the k-means++ initialization method to initialize the centroid $c_j$ of each cluster $j$ ($j = 1, \dots, 2^{b}$). Then, each parameter of $\mathbf{w}$ is assigned to its nearest cluster. After all the parameters in $\mathbf{w}$ have been assigned, each centroid is updated as the average of all the parameters belonging to its cluster. The reassignment and centroid-update steps are repeated until convergence or until the maximum number of iterations is reached. The k-means++ initialization proceeds as follows: first, choose a random parameter from the vector as the first centroid; then, assign each remaining parameter a probability of becoming the next centroid according to its distance from the nearest existing centroid, and choose the next centroid based on these probabilities; finally, repeat the probability assignment and centroid selection until all the centroids are generated. To limit the efficiency drop caused by upgrading the quantization scheme, we set the maximum number of k-means iterations to only 3. After k-means clustering is finished, we use the resulting label vector as the cluster index vector and the resulting centroids as the corresponding centroid value vector. Each parameter is then replaced by the centroid of the cluster to which it belongs.
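The procedure above can be sketched as follows (our own NumPy illustration, not the paper's implementation; note that standard k-means++ samples each new centroid with probability proportional to the squared distance to the nearest existing centroid):

```python
import numpy as np

def kmeans_quantize(w, num_bits, max_iter=3, seed=0):
    """K-means quantization with k-means++ initialization; `max_iter` is capped
    at 3 in the paper to limit the extra clustering cost."""
    rng = np.random.default_rng(seed)
    w = np.asarray(w, dtype=np.float32)
    k = 2 ** num_bits

    # k-means++ initialization.
    centroids = [w[rng.integers(len(w))]]
    for _ in range(k - 1):
        d2 = np.min((w[:, None] - np.asarray(centroids)[None, :]) ** 2,
                    axis=1).astype(np.float64)
        probs = d2 / d2.sum() if d2.sum() > 0 else None   # None -> uniform sampling
        centroids.append(w[rng.choice(len(w), p=probs)])
    centroids = np.array(centroids, dtype=np.float32)

    # Lloyd iterations: reassign parameters, then recompute centroids.
    for _ in range(max_iter):
        indices = np.argmin(np.abs(w[:, None] - centroids[None, :]), axis=1)
        for c in range(k):
            members = w[indices == c]
            if members.size:
                centroids[c] = members.mean()
    indices = np.argmin(np.abs(w[:, None] - centroids[None, :]), axis=1)
    return indices.astype(np.int64), centroids
```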
4 Experiments
In this section, we first introduce the dataset used in our experiments (Section 4.1), then explain the experimental details of our experiments on BERT and ALBERT (Section 4.2), and finally present the results and the corresponding discussion (Section 4.3).
4.1 Dataset
We test the performance of our quantized models on the General Language Understanding Evaluation (GLUE) benchmark [14], which contains natural language understanding tasks including question answering, sentiment analysis, and textual entailment. Specifically, we use 8 tasks (QNLI, CoLA, RTE, SST-2, MRPC, STS-B, MNLI, and QQP) to test the performance of the different quantization schemes. The evaluation metrics for each task are as follows: Matthews correlation coefficient (mcc) for CoLA; accuracy (acc) for QNLI, RTE, SST-2, and MNLI; accuracy (acc) and F1 score for MRPC and QQP; Pearson and Spearman correlation coefficients (corr) for STS-B. We follow the default split of the dataset. The datasets are available for download here:
https://gluebenchmark.com/tasks.

Table 1: Results of linear quantization for BERT on the GLUE dev sets.

#bits  QNLI  CoLA  RTE  SST-2  MRPC  STS-B  MNLI-m/mm  QQP  average
32 bits  91.7  59.2  72.2  93.1  86.3/90.4  89.7  85.0/84.8  91.6/88.8  83.7
5 bits  88.5  48.4  69.3  89.6  83.8/88.7  88.7  79.8/80.4  88.9/85.3  79.7
4 bits  81.8  19.9  57.0  81.4  75.7/84.5  84.9  71.4/71.9  80.8/75.9  69.4
3 bits  61.3  11.9  56.3  78.9  70.8/81.9  68.6  59.6/61.6  76.5/71.1  60.6
2 bits  60.7  6.6  55.2  77.9  69.6/81.4  47.4  49.6/50.8  74.2/63.2  54.7
1 bit  59.5  0  54.9  77.5  69.9/81.4  37.8  47.3/48.8  74.3/63.3  52.2
Table 2: Results of k-means quantization for BERT on the GLUE dev sets.

#bits  QNLI  CoLA  RTE  SST-2  MRPC  STS-B  MNLI-m/mm  QQP  average
32 bits  91.7  59.2  72.2  93.1  86.3/90.4  89.7  85.0/84.8  91.6/88.8  83.7
5 bits  91.5  60.2  70.8  94.0  87.3/91.0  89.6  84.7/84.9  91.7/88.8  83.9
4 bits  91.7  57.4  70.8  93.6  87.0/91.0  89.6  84.8/84.8  91.6/88.7  83.5
3 bits  91.3  56.9  70.0  93.1  86.0/90.2  89.4  84.4/84.1  91.2/88.1  82.9
2 bits  89.5  50.2  66.1  91.3  84.6/89.2  88.3  81.6/81.9  90.3/87.0  80.4
1 bit  62.2  13.7  54.5  83.0  70.8/81.7  52.2  62.0/62.6  77.1/65.9  59.8
Table 3: Results of linear quantization for ALBERT on the GLUE dev sets.

#bits  QNLI  CoLA  RTE  SST-2  MRPC  STS-B  MNLI-m/mm  QQP  average
32 bits  91.5  58.9  81.6  92.8  90.2/93.1  90.9  84.9/85.1  90.8/87.7  85.2
5 bits  60.1  0  53.1  74.8  68.4/81.2  39.9  43.6/45.6  72.6/65.8  50.9
4 bits  52.3  0  52.7  50.9  68.4/81.2  6.8  35.5/35.2  67.9/56.5  41.1
3 bits  51.4  0  54.2  54.9  68.4/81.2  16.7  35.5/35.4  68.2/56.7  42.7
2 bits  54.0  0  52.7  50.9  68.4/81.2  18.8  35.4/35.3  67.5/53.2  42.6
1 bit  54.3  0  55.6  50.9  68.4/81.2  9.7  35.5/35.3  67.3/52.5  41.9
Table 4: Results of k-means quantization for ALBERT on the GLUE dev sets.

#bits  QNLI  CoLA  RTE  SST-2  MRPC  STS-B  MNLI-m/mm  QQP  average
32 bits  91.5  58.9  81.6  92.8  90.2/93.1  90.9  84.9/85.1  90.8/87.7  85.2
5 bits  91.0  55.9  78.3  92.7  90.7/93.4  90.8  84.2/85.1  90.3/87.1  84.3
4 bits  90.1  48.9  75.5  87.0  84.8/89.3  75.8  82.1/83.1  89.2/85.5  79.6
3 bits  63.5  4.6  53.8  76.5  68.1/80.8  77.7  63.7/65.8  82.9/77.9  61.8
2 bits  61.4  0  59.9  71.6  70.8/82.2  20.4  45.0/45.6  72.7/61.5  49.7
1 bit  50.6  0  56.0  52.2  68.4/81.2  6.3  35.4/35.2  69.8/58.8  41.5
4.2 Experimental Setup
Before quantization, the bert-base-uncased version of BERT models is fine-tuned on the 8 tasks using the Adam optimizer [6] and a linear schedule with a learning rate of 5e-5. As for ALBERT models, we first fine-tune the albert-base-v2 model on QNLI, CoLA, SST-2, MNLI, and QQP, and then further fine-tune on RTE, MRPC, and STS-B starting from the MNLI checkpoint (following the same process as [7]). We use the Adam optimizer and a linear schedule to fine-tune ALBERT, and the learning rate for each task is searched over {1e-5, 2e-5, 3e-5, 4e-5, 5e-5}.
After quantization, we further fine-tune the quantized models on the corresponding tasks. In particular, the learning rates of the quantized layers are multiplied by 10 (i.e., 5e-4 for all the quantized BERT models) while those of the other layers remain the same.
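A minimal PyTorch sketch of such per-group learning rates might look as follows (our own illustration; the "centroid" parameter-name filter and the helper name `build_optimizer` are assumptions, not the actual training code):

```python
import torch

def build_optimizer(model, total_steps, base_lr=5e-5):
    """Adam + linear schedule with a 10x learning rate for quantized layers.

    Assumes the trainable centroid vectors of quantized layers can be
    identified by name (here, any parameter whose name contains "centroid").
    """
    quantized = [p for n, p in model.named_parameters() if "centroid" in n]
    others = [p for n, p in model.named_parameters() if "centroid" not in n]
    optimizer = torch.optim.Adam([
        {"params": quantized, "lr": base_lr * 10},  # e.g. 5e-4 for quantized layers
        {"params": others, "lr": base_lr},          # 5e-5 for the rest
    ])
    scheduler = torch.optim.lr_scheduler.LambdaLR(
        optimizer, lambda step: max(0.0, 1.0 - step / total_steps))  # linear decay
    return optimizer, scheduler
```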
4.3 Experimental Results and Discussion
We mainly focus on 1- to 5-bit fixed-precision quantization. The results of linear and k-means quantization for BERT are shown in Table 1 and Table 2 respectively, and a further comparison between the average scores of the two sets of experiments is shown in Figure 1. Similarly, the results and comparison for ALBERT are shown in Table 3, Table 4, and Figure 2 respectively.
4.3.1 BERT
The improvements brought by quantization scheme upgrading. As shown in Table 1, Table 2, and Figure 1, although the models perform worse at lower bit-widths no matter which quantization scheme is used, the models quantized with k-means quantization perform significantly better than those using linear quantization at every bit setting, across all 8 tasks and on average. On the average of the 8 tasks, merely by upgrading the quantization scheme from linear to k-means, the performance degradation compared to the full-precision model drops from (38.8, 34.7, 27.6, 17.1, 4.8) to (28.6, 3.94, 0.9, 0.3, 0.2) for 1- to 5-bit quantization respectively. This shows that great performance improvements can be achieved by upgrading the quantization scheme alone, which indicates that the room for improvement in the quantization scheme itself is much underestimated. To further illustrate this, we repeated several experiments using the group-wise linear quantization scheme developed by [11], which improves on linear quantization and achieves much higher performance than simple linear quantization. The results are shown in Table 5. Compared to group-wise linear quantization, simple k-means quantization achieves higher or comparable performance while saving a huge amount of time.¹

¹ In group-wise quantization, each matrix is partitioned into different groups and each group is quantized separately. For the forward pass, the model needs to reconstruct each quantized group of every layer instead of reconstructing the entire weight matrix of each quantized layer directly, which explains why group-wise quantization is quite time-consuming (a short sketch after Table 5 illustrates this reconstruction overhead). Specifically, in our group-wise quantization experiments, we partition each matrix into 128 groups.
Table 5: Comparison between simple k-means quantization and group-wise linear quantization [11] for BERT.

Model  RTE  MRPC  acceleration
3 bits k-means  70.0  86.0/90.2  22
3 bits group-wise  72.6  84.8/89.6  -
2 bits k-means  66.1  84.6/89.2  16
2 bits group-wise  58.5  72.3/81.1  -
1 bit k-means  54.5  70.8/81.7  10
1 bit group-wise  53.1  70.6/81.4  -
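To make the reconstruction overhead described in the footnote concrete, the following NumPy sketch (our own, with hypothetical names) contrasts the single gather of per-matrix reconstruction with the per-group gathers of group-wise reconstruction:

```python
import numpy as np

def reconstruct_per_matrix(indices, centroids):
    """Plain quantization: one lookup rebuilds the whole weight matrix."""
    return centroids[indices]

def reconstruct_groupwise(group_indices, group_centroids):
    """Group-wise quantization: one lookup per group (e.g. 128 per matrix),
    so the forward pass performs many small gathers per layer, which is the
    main source of the extra latency."""
    return np.concatenate([c[i] for i, c in zip(group_indices, group_centroids)])
```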
The potential of k-means quantization. As shown in Table 2, the model can be compressed well simply using k-means quantization with a fixed-precision strategy, and the quantized models still perform well even at some particularly low bit settings. For instance, on RTE, the model quantized to 3 bits with k-means quantization only suffers a performance degradation of 2.16. For most tasks, including QNLI, SST-2, MRPC, STS-B, MNLI, and QQP, the performance of the quantized models only shows a significant drop at the 1-bit setting. It is worth noting that these results were achieved by simple k-means quantization with a maximum of only 3 iterations and without any tricks, which indicates the great development potential of k-means quantization.
4.3.2 ALBERT
Generally speaking, the two main conclusions drawn from the BERT experiments still hold, as shown in Table 3, Table 4, and Figure 2: we again see great improvements brought by upgrading the quantization scheme and great potential in k-means quantization. However, there are some abnormal results that are worth discussing.
Table 6: Results of 1-bit k-means quantization for ALBERT with different numbers of k-means iterations.

Iterations  QNLI  MRPC  STS-B
3  50.56  68.38/81.22  6.29
5  50.63  68.38/81.22  6.93
10  60.63  68.87/81.30  13.76
20  60.19  69.85/81.83  11.10
The influence of the number of k-means iterations. The first set of abnormal results comes from the 1-bit quantization of QNLI, MRPC, and STS-B. While k-means normally outperforms linear quantization, these results violate this pattern. We believe this is because the distribution of the parameters is so complicated that 3 iterations of k-means cannot fit it well. To validate this hypothesis and further explore the influence of iterations, we repeated the experiments with these abnormal results while extending the number of iterations to 5, 10, and 20. The corresponding results are shown in Table 6. With more iterations, the accuracy of k-means quantization increases and surpasses that of linear quantization. However, overfitting may become troublesome, as the performance decreases for QNLI and STS-B when the number of iterations increases from 10 to 20. Therefore, in k-means quantization, the number of k-means iterations is also an important hyperparameter that needs to be tuned carefully.
The special numbers for CoLA and MRPC. Another set of abnormal results comes from the linear quantization of CoLA and MRPC, which are binary classification tasks. We find that the quantized models output "1" all the time after being fine-tuned. The two constant scores (0 mcc for CoLA and 68.4/81.2 for MRPC in Table 3) are therefore determined solely by the label distribution of the dev sets. In other words, after the model is quantized to 1-5 bits with linear quantization, it almost loses its functionality and becomes difficult to train on these two tasks. Moreover, we ran further experiments at higher bit settings on the two tasks and found that, starting from 6 bits, the results of the quantized models no longer collapse to these two values.
The comparison between BERT and ALBERT. Moreover, we compare the performance of k-means quantization for BERT and ALBERT; the results are shown in Figure 3 and Figure 4. Compared with BERT, which retains 96.1% of its original performance after 2-bit k-means quantization, ALBERT is much less robust to quantization (in our work, robustness to quantization means the ability to be quantized to low bit-width while maintaining high performance). The performance of ALBERT falls to 93.4% and 72.5% after 4-bit and 3-bit k-means quantization respectively. Considering that the major improvement of ALBERT over BERT is parameter sharing, and that quantization can also be viewed as intra-layer parameter sharing, we speculate that parameter sharing and quantization have similar effects, i.e., the redundant information removed by parameter sharing and by quantization partially overlaps. Moreover, through parameter sharing, ALBERT has already removed a great amount of redundant information compared to BERT (the total number of parameters falls from 108M to 12M). Therefore, further applying quantization to ALBERT easily damages useful information, and the robustness of ALBERT to quantization is rather low.

However, from another point of view, parameter sharing already significantly reduces the number of parameters and can thus also be considered a model compression method. Moreover, considering that the performance of full-precision ALBERT is better than that of 4-bit and 3-bit BERT models, which occupy a similar amount of GPU memory, parameter sharing can even achieve better compression performance than simple quantization. However, as a compression method, parameter sharing has a non-negligible drawback: it only reduces memory consumption, whereas most other compression methods reduce both memory consumption and computation (i.e., inference time).
5 Conclusion
In this paper, we compare k-means and linear quantization on BERT and ALBERT models and reach three main conclusions. First, we find that the models quantized with k-means significantly outperform those using linear quantization: great performance improvements can be achieved by simply upgrading the quantization scheme. Second, the model can be compressed to relatively low bit-width using only k-means quantization, even with a simple fixed-precision strategy and without any tricks, which indicates the great development potential of k-means quantization. Third, the number of k-means iterations plays an important role in the performance of quantized models and should be chosen carefully. Besides, by comparing the results of k-means quantization for BERT and ALBERT, we discover that ALBERT is much less robust to quantization than BERT. This indicates that parameter sharing and quantization have some effects in common; therefore, further applying quantization to models with extensive parameter sharing easily damages useful information and thus leads to a significant performance drop.
5.0.1 Acknowledgement
We thank the anonymous reviewers for their thoughtful comments. This work has been supported by the National Key Research and Development Program of China (Grant No. 2017YFB1002102) and Shanghai Jiao Tong University Scientific and Technological Innovation Funds (YG2020YQ01).
References
[1] (2019) Efficient 8-bit quantization of transformer neural machine language translation model. arXiv preprint arXiv:1906.00532.
[2] (2019) Transformers.zip: compressing transformers with pruning and quantization. Technical report, Stanford University, Stanford, California.
[3] (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In NAACL, pp. 4171–4186.
[4] (2019) HAWQ: Hessian aware quantization of neural networks with mixed-precision. In ICCV, pp. 293–302.
[5] (2016) Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding. In ICLR.
[6] (2015) Adam: a method for stochastic optimization. In ICLR.
[7] (2019) ALBERT: a lite BERT for self-supervised learning of language representations. In ICLR.
[8] (2017) Towards accurate binary convolutional neural network. In NIPS, pp. 345–353.
[9] (2019) Highly efficient neural network language model compression using soft binarization training. In 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), pp. 62–69.
[10] (2019) Fully quantized transformer for machine translation. arXiv preprint arXiv:1910.10485.
[11] (2020) Q-BERT: Hessian based ultra low precision quantization of BERT. In AAAI.
[12] (2018) Structured word embedding for low memory neural network language model. In INTERSPEECH, pp. 1254–1258.
[13] (2017) Attention is all you need. In NIPS, pp. 5998–6008.
[14] (2019) GLUE: a multi-task benchmark and analysis platform for natural language understanding. In ICLR.
[15] (2019) HAQ: hardware-aware automated quantization with mixed precision. In CVPR, pp. 8612–8620.
[16] (2019) Q8BERT: quantized 8-bit BERT. In NIPS EMC Workshop.
[17] (2016) DoReFa-Net: training low bitwidth convolutional neural networks with low bitwidth gradients. arXiv preprint arXiv:1606.06160.
[18] (2017) Trained ternary quantization. In ICLR.