ELBERT: Fast ALBERT with Confidence-Window Based Early Exit

07/01/2021
by Keli Xie, et al.

Despite their great success in the Natural Language Processing (NLP) area, large pre-trained language models like BERT are not well-suited for resource-constrained or real-time applications owing to their large number of parameters and slow inference speed. Recently, compressing and accelerating BERT have become important topics. By incorporating a parameter-sharing strategy, ALBERT greatly reduces the number of parameters while achieving competitive performance. Nevertheless, ALBERT still suffers from long inference time. In this work, we propose ELBERT, which significantly improves the average inference speed over ALBERT through the proposed confidence-window based early exit mechanism, without introducing additional parameters or extra training overhead. Experimental results show that ELBERT achieves an adaptive inference speedup varying from 2x to 10x with negligible accuracy degradation compared to ALBERT on various datasets. Besides, ELBERT achieves higher accuracy than existing early exit methods used for accelerating BERT under the same computation cost. Furthermore, to shed light on how the early exit mechanism works, we also visualize its decision-making process in ELBERT.
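The abstract describes the idea but not the implementation details. As a rough illustration only, the sketch below shows one way a confidence-window based early exit could be wired into an ALBERT-style model, where a single shared transformer layer is applied repeatedly and one shared classifier is reused at every depth, so no extra parameters are added. The names (`shared_layer`, `classifier`, `embed`) and the specific exit rule (top-class probability above a threshold with an unchanged prediction over a window of consecutive layers) are assumptions for illustration, not the paper's exact criterion.

```python
import torch
import torch.nn.functional as F

def early_exit_inference(embed, shared_layer, classifier,
                         num_layers=12, window=2, threshold=0.9):
    """Illustrative confidence-window early exit for an ALBERT-style model.

    `shared_layer` is the parameter-shared transformer block applied at every
    depth; `classifier` is a single output head reused after each pass.
    The exit rule here is a simplified assumption, not ELBERT's exact one.
    """
    hidden = embed                 # (batch, seq_len, hidden_size)
    history = []                   # (top_prob, top_class) recorded per layer
    for depth in range(num_layers):
        hidden = shared_layer(hidden)        # one more pass through the shared block
        logits = classifier(hidden[:, 0])    # classify on the [CLS] position
        probs = F.softmax(logits, dim=-1)
        top_prob, top_class = probs.max(dim=-1)
        history.append((top_prob.item(), top_class.item()))

        # Exit once the prediction has stayed confident and stable
        # over the last `window` layers.
        recent = history[-window:]
        if (len(recent) == window
                and all(p >= threshold for p, _ in recent)
                and len({c for _, c in recent}) == 1):
            return recent[-1][1], depth + 1  # prediction, layers actually used

    return history[-1][1], num_layers        # no early exit: full depth used
```

Because the exit depth is chosen per input, easy examples leave after a few layers while hard ones run deeper, which is what yields the adaptive 2x to 10x speedup reported in the abstract.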


Related research

11/16/2022 · Fast and Accurate FSA System Using ELBERT: An Efficient and Lightweight BERT
As an application of Natural Language Processing (NLP) techniques, finan...

03/25/2022 · MKQ-BERT: Quantized BERT with 4-bits Weights and Activations
Recently, pre-trained Transformer based language models, such as BERT, h...

03/16/2023 · SmartBERT: A Promotion of Dynamic Early Exiting Mechanism for Accelerating BERT Inference
Dynamic early exiting has been proven to improve the inference speed of ...

11/27/2020 · CoRe: An Efficient Coarse-refined Training Framework for BERT
In recent years, BERT has made significant breakthroughs on many natural...

10/30/2021 · Magic Pyramid: Accelerating Inference with Early Exiting and Token Pruning
Pre-training and then fine-tuning large language models is commonly used...

05/28/2021 · Accelerating BERT Inference for Sequence Labeling via Early-Exit
Both performance and efficiency are crucial factors for sequence labelin...

04/05/2020 · FastBERT: a Self-distilling BERT with Adaptive Inference Time
Pre-trained language models like BERT have proven to be highly performan...
