Farewell to Aimless Large-scale Pretraining: Influential Subset Selection for Language Model

05/22/2023
by Xiao Wang, et al.

Pretrained language models have achieved remarkable success in various natural language processing tasks. However, pretraining has recently shifted toward larger models and larger data, which has resulted in significant computational and energy costs. In this paper, we propose Influential Subset Selection (ISS) for language models, which explicitly utilizes end-task knowledge to select a tiny subset of the pretraining corpus. Specifically, ISS selects the samples that will provide the most positive influence on the performance of the end task. Furthermore, we design a gradient-matching-based influence estimation method, which can drastically reduce the computation time of influence. With only 0.45% of the computational cost, ISS outperforms pretrained models (e.g., RoBERTa) on eight datasets covering four domains.
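
The abstract does not give implementation details, but the core idea it describes — scoring each candidate pretraining sample by how well its gradient agrees with the end-task gradient, then keeping only the best-scoring subset — can be sketched as follows. This is a minimal illustration assuming a PyTorch model; the helper callables `pretraining_loss` and `end_task_loss` and the `fraction` knob are hypothetical placeholders, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def influence_scores(model, pretrain_batches, end_task_batch,
                     pretraining_loss, end_task_loss):
    """Score pretraining samples by gradient matching with the end task.

    Intuition (a sketch, not the paper's exact estimator): a sample whose
    pretraining-loss gradient points in the same direction as the end-task
    gradient is assumed to have a positive influence on the end task.
    """
    params = [p for p in model.parameters() if p.requires_grad]

    # Gradient of the end-task loss, flattened into one vector.
    task_grads = torch.autograd.grad(end_task_loss(model, end_task_batch), params)
    task_vec = torch.cat([g.reshape(-1) for g in task_grads])

    scores = []
    for batch in pretrain_batches:  # one candidate sample (or micro-batch) each
        sample_grads = torch.autograd.grad(pretraining_loss(model, batch), params)
        sample_vec = torch.cat([g.reshape(-1) for g in sample_grads])
        # Cosine similarity between gradients as the matching score.
        scores.append(F.cosine_similarity(sample_vec, task_vec, dim=0).item())
    return scores

def select_subset(scores, fraction=0.01):
    """Keep the indices of the highest-scoring fraction of the corpus."""
    k = max(1, int(len(scores) * fraction))
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
```

In this sketch the selected indices would then be used to build the reduced pretraining corpus; the paper's contribution is making this influence estimation cheap enough to apply at corpus scale via gradient matching.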

