Accelerating Pre-trained Language Models via Calibrated Cascade

12/29/2020
by   Lei Li, et al.
Dynamic early exiting aims to accelerate pre-trained language models' (PLMs) inference by exiting at a shallow layer without passing through the entire model. In this paper, we analyze the working mechanism of dynamic early exiting and find that it cannot achieve a satisfactory trade-off between inference speed and performance. On one hand, the PLMs' representations in shallow layers are not sufficient for accurate prediction. On the other hand, the internal off-ramps cannot provide reliable exiting decisions. To remedy this, we instead propose CascadeBERT, which dynamically selects a proper-sized, complete model in a cascading manner. To obtain more reliable model selection, we further devise a difficulty-aware objective, encouraging the model's output class probability to reflect the real difficulty of each instance. Extensive experimental results demonstrate the superiority of our proposal over strong baselines for PLM acceleration, including both dynamic early exiting and knowledge distillation methods.
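The cascading idea described above can be sketched in a few lines. This is a minimal illustrative sketch, not the paper's implementation: it assumes each model in the cascade is a callable returning class logits, and it uses the maximum softmax probability as the confidence signal that decides whether a smaller complete model's prediction is trusted or the input is passed on to a larger one.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cascade_predict(models, x, threshold=0.9):
    """Cascade inference sketch.

    models: callables ordered small -> large, each mapping an input
            to class logits (hypothetical interface).
    Returns the predicted class index; smaller models answer only
    when their max class probability clears `threshold`.
    """
    for model in models[:-1]:
        probs = softmax(model(x))
        conf = max(probs)
        if conf >= threshold:  # confident enough: stop early
            return probs.index(conf)
    # The largest model always produces a final answer.
    probs = softmax(models[-1](x))
    return probs.index(max(probs))
```

The difficulty-aware objective in the paper then trains the models so that this confidence score correlates with instance difficulty, making the threshold decision more reliable.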

