General Purpose Text Embeddings from Pre-trained Language Models for Scalable Inference

04/29/2020
by Jingfei Du, et al.

The state of the art on many NLP tasks is currently achieved by large pre-trained language models, which require considerable computation. We explore a setting where many different predictions are made on a single piece of text. In that case, some of the computational cost during inference can be amortized over the different tasks using a shared text encoder. We compare approaches for training such an encoder and show that encoders pre-trained over multiple tasks generalize well to unseen tasks. We also compare ways of extracting fixed- and limited-size representations from this encoder, including different ways of pooling features extracted from multiple layers or positions. Our best approach compares favorably to knowledge distillation, achieving higher accuracy and lower computational cost once the system handles around seven tasks. Further, we show that through binary quantization we can reduce the size of the extracted representations by a factor of 16, making it feasible to store them for later use. The resulting method offers a compelling solution for using large-scale pre-trained models at a fraction of the computational cost when multiple tasks are performed on the same text.
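As a concrete illustration of the two ideas above, the minimal sketch below computes a fixed-size embedding from a shared pre-trained encoder by mean-pooling token features from its last four layers, then binarizes the result for cheap storage. Everything here is an assumption made for illustration, not the authors' released code: `roberta-base` is just a convenient encoder, and mean-pooling over the last four layers is one of many pooling variants of the kind the paper compares.

```python
# Minimal sketch (not the paper's implementation): a shared encoder runs
# once per text, and the pooled embedding can then be reused by many
# task-specific heads. Model name and layer choice are assumptions.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "roberta-base"  # assumption: any pre-trained encoder works here
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
encoder.eval()

@torch.no_grad()
def embed(text: str, layers=(-4, -3, -2, -1)) -> np.ndarray:
    """Encode once; mean-pool tokens within each chosen layer, then
    average across layers to get a single fixed-size vector."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    hidden_states = encoder(**inputs).hidden_states  # tuple: one per layer
    mask = inputs["attention_mask"].unsqueeze(-1)    # ignore padding tokens
    pooled = [(hidden_states[i] * mask).sum(1) / mask.sum(1) for i in layers]
    return torch.stack(pooled).mean(0).squeeze(0).numpy()

def binarize(vec: np.ndarray) -> np.ndarray:
    """Keep only the sign of each dimension, packed 8 bits per byte.
    Relative to fp16 storage this is a 16x size reduction, matching
    the factor quoted in the abstract."""
    return np.packbits(vec > 0)

vec = embed("Shared encoders amortize inference cost across tasks.")
bits = binarize(vec)
print(vec.shape, vec.dtype, "->", bits.shape, bits.dtype)
# (768,) float32 -> (96,) uint8
```

On the arithmetic: a 768-dimensional embedding stored in fp16 takes 1,536 bytes, while its sign bits pack into 96 bytes, which is the factor-of-16 saving the abstract describes.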


Related research

05/22/2023 · Learning Easily Updated General Purpose Text Representations with Adaptable Task-Specific Prefixes
Many real-world applications require making multiple predictions from th...

06/01/2023 · Can Large Pre-trained Models Help Vision Models on Perception Tasks?
The recent upsurge in pre-trained large models (e.g. GPT-4) has swept ac...

05/04/2022 · RecipeSnap – a lightweight image-to-recipe model
In this paper we want to address the problem of automation for recogniti...

10/20/2021 · EBJR: Energy-Based Joint Reasoning for Adaptive Inference
State-of-the-art deep learning models have achieved significant performa...

03/01/2022 · E-LANG: Energy-Based Joint Inferencing of Super and Swift Language Models
Building huge and highly capable language models has been a trend in the...

08/20/2023 · Scaled-up Discovery of Latent Concepts in Deep NLP Models
Pre-trained language models (pLMs) learn intricate patterns and contextu...

06/29/2023 · Surveying (Dis)Parities and Concerns of Compute Hungry NLP Research
Many recent improvements in NLP stem from the development and use of lar...
