Embedding Recycling for Language Models

07/11/2022
by   Jon Saad-Falcon, et al.

Training and inference with large neural models is expensive. However, for many application domains, while new tasks and models arise frequently, the underlying documents being modeled remain mostly unchanged. We study how to decrease computational cost in such settings through embedding recycling (ER): re-using activations from previous model runs when performing training or inference. In contrast to prior work focusing on freezing small classification heads for finetuning, which often leads to notable drops in performance, we propose caching an intermediate layer's output from a pretrained model and finetuning the remaining layers for new tasks. We show that our method provides a 100% speedup during training and a 55-87% speedup for inference, and has negligible impacts on accuracy for text classification and entity recognition tasks in the scientific domain. For general-domain question answering tasks, ER offers a similar speedup and lowers accuracy by a small amount. Finally, we identify several open challenges and future directions for ER.
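The caching idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the layers are toy elementwise functions standing in for transformer layers, and every name here (`RecyclingModel`, `make_layer`, `featurize`, `cache_depth`) is hypothetical. It only shows the mechanism: the output of the frozen lower layers is stored once per document and reused, while the upper layers can change from task to task.

```python
import hashlib
from typing import Callable, Dict, List, Sequence

Vector = List[float]
Layer = Callable[[Vector], Vector]


def make_layer(scale: float, shift: float) -> Layer:
    """Toy stand-in for a network layer: elementwise affine map plus ReLU."""
    def layer(x: Vector) -> Vector:
        return [max(0.0, scale * v + shift) for v in x]
    return layer


def featurize(doc: str) -> Vector:
    """Hypothetical input featurizer: maps the first 8 characters to floats."""
    return [float(ord(ch) % 7) for ch in doc[:8]]


def full_forward(layers: Sequence[Layer], doc: str) -> Vector:
    """Reference forward pass that runs every layer with no caching."""
    x = featurize(doc)
    for layer in layers:
        x = layer(x)
    return x


class RecyclingModel:
    """Caches the output of the first `cache_depth` (frozen) layers per document."""

    def __init__(self, layers: Sequence[Layer], cache_depth: int) -> None:
        self.frozen = list(layers[:cache_depth])     # never updated, so cacheable
        self.trainable = list(layers[cache_depth:])  # may differ per task
        self.cache: Dict[str, Vector] = {}
        self.frozen_layer_calls = 0                  # counts work done below the cache

    @staticmethod
    def _key(doc: str) -> str:
        # Content hash, so identical documents share one cache entry.
        return hashlib.sha256(doc.encode("utf-8")).hexdigest()

    def recycled_activations(self, doc: str) -> Vector:
        key = self._key(doc)
        if key not in self.cache:
            x = featurize(doc)
            for layer in self.frozen:
                x = layer(x)
                self.frozen_layer_calls += 1
            self.cache[key] = x
        return list(self.cache[key])  # copy so callers cannot mutate the cache

    def forward(self, doc: str) -> Vector:
        x = self.recycled_activations(doc)
        for layer in self.trainable:
            x = layer(x)
        return x


if __name__ == "__main__":
    layers = [make_layer(1.0, 0.5), make_layer(0.5, -1.0),
              make_layer(2.0, 0.0), make_layer(1.0, 1.0)]
    model = RecyclingModel(layers, cache_depth=2)
    doc = "embedding recycling"
    # Second call reuses the cached activations for the same document.
    print(model.forward(doc) == model.forward(doc))
    print("frozen layer calls:", model.frozen_layer_calls)
```

In this sketch the saving comes from skipping the frozen layers on every repeat visit to a document, which is the setting the abstract describes: the documents stay fixed while tasks and upper layers change. Cached activations are only valid as long as the frozen layers themselves are not modified.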

research
04/19/2019

Unifying Question Answering and Text Classification via Span Extraction

Even as pre-trained language encoders such as BERT are shared across man...
research
09/07/2018

Convolutional Neural Network: Text Classification Model for Open Domain Question Answering System

Recently machine learning is being applied to almost every data domain o...
research
09/22/2021

K-AID: Enhancing Pre-trained Language Models with Domain Knowledge for Question Answering

Knowledge enhanced pre-trained language models (K-PLMs) are shown to be ...
research
12/02/2019

EduBERT: Pretrained Deep Language Models for Learning Analytics

The use of large pretrained neural networks to create contextualized wor...
research
03/09/2022

Pretrained Domain-Specific Language Model for General Information Retrieval Tasks in the AEC Domain

As an essential task for the architecture, engineering, and construction...
research
08/19/2023

Open, Closed, or Small Language Models for Text Classification?

Recent advancements in large language models have demonstrated remarkabl...
