BERT Loses Patience: Fast and Robust Inference with Early Exit

06/07/2020
by   Wangchunshu Zhou, et al.
0

In this paper, we propose Patience-based Early Exit, a straightforward yet effective inference method that can be used as a plug-and-play technique to simultaneously improve the efficiency and robustness of a pretrained language model (PLM). To achieve this, our approach couples an internal-classifier with each layer of a PLM and dynamically stops inference when the intermediate predictions of the internal classifiers do not change for a pre-defined number of steps. Our approach improves inference efficiency as it allows the model to make a prediction with fewer layers. Meanwhile, experimental results with an ALBERT model show that our method can improve the accuracy and robustness of the model by preventing it from overthinking and exploiting multiple classifiers for prediction, yielding a better accuracy-speed trade-off compared to existing early exit methods.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/16/2023

SmartBERT: A Promotion of Dynamic Early Exiting Mechanism for Accelerating BERT Inference

Dynamic early exiting has been proven to improve the inference speed of ...
research
05/28/2021

Early Exiting with Ensemble Internal Classifiers

As a simple technique to accelerate inference of large-scale pre-trained...
research
06/04/2023

Finding the SWEET Spot: Analysis and Improvement of Adaptive Inference in Low Resource Settings

Adaptive inference is a simple method for reducing inference costs. The ...
research
01/28/2023

Anticipate, Ensemble and Prune: Improving Convolutional Neural Networks via Aggregated Early Exits

Today, artificial neural networks are the state of the art for solving a...
research
01/11/2023

Dynamic Data Assimilation of MPAS-O and the Global Drifter Dataset

In this study, we propose a new method for combining in situ buoy measur...
research
03/25/2019

Active Learning of Spin Network Models

Complex networks can be modeled as a probabilistic graphical model, wher...
research
06/09/2021

Zero Time Waste: Recycling Predictions in Early Exit Neural Networks

The problem of reducing processing time of large deep learning models is...

Please sign up or login with your details

Forgot password? Click here to reset