Pretrained Language Model Embryology: The Birth of ALBERT

10/06/2020
by Cheng-Han Chiang, et al.

While the behaviors of pretrained language models (LMs) have been thoroughly examined, what happens during pretraining has rarely been studied. We therefore investigate the developmental process from a set of randomly initialized parameters to a totipotent language model, which we refer to as the embryology of a pretrained language model. Our results show that ALBERT learns to reconstruct and predict tokens of different parts of speech (POS) at different speeds during pretraining. We also find that linguistic knowledge and world knowledge do not generally improve as pretraining proceeds, nor does performance on downstream tasks. These findings suggest that the knowledge of a pretrained model varies over the course of pretraining, and that more pretraining steps do not necessarily give a model more comprehensive knowledge. We provide source code and pretrained models to reproduce our results at https://github.com/d223302/albert-embryology.
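To make the probing setup concrete, here is a minimal sketch of how one might measure per-POS masked-token recovery across pretraining checkpoints. It is not the authors' released code: the checkpoint names are hypothetical placeholders for intermediate checkpoints saved during pretraining, Hugging Face's albert-base-v2 stands in for the tokenizer, and the POS tags are hand-assigned for a toy sentence (the paper probes a tagged corpus).

```python
# A minimal probing sketch, not the authors' released code: mask each word of a
# POS-tagged sentence and check whether a given pretraining checkpoint recovers
# it. Checkpoint paths below are hypothetical placeholders.
import torch
from transformers import AlbertTokenizer, AlbertForMaskedLM

tokenizer = AlbertTokenizer.from_pretrained("albert-base-v2")

# A toy sentence with hand-assigned Penn Treebank POS tags; a real probe would
# use a full POS-tagged corpus instead.
words    = ["The", "cat", "sat", "on", "the", "mat", "."]
pos_tags = ["DT",  "NN",  "VBD", "IN", "DT",  "NN",  "."]

def pos_accuracy(checkpoint_path):
    """Per-POS accuracy of single-token masked-word recovery for one checkpoint."""
    model = AlbertForMaskedLM.from_pretrained(checkpoint_path).eval()
    stats = {}  # tag -> (correct, total)
    for i, (word, tag) in enumerate(zip(words, pos_tags)):
        masked = words.copy()
        masked[i] = tokenizer.mask_token
        inputs = tokenizer(" ".join(masked), return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        # Position of the (single) mask token in the input sequence.
        mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
        pred = tokenizer.decode([logits[0, mask_pos].argmax().item()]).strip()
        correct, total = stats.get(tag, (0, 0))
        stats[tag] = (correct + int(pred.lower() == word.lower()), total + 1)
    return {tag: c / t for tag, (c, t) in stats.items()}

# Hypothetical directories holding checkpoints from different pretraining steps.
for ckpt in ["albert-ckpt-10k", "albert-ckpt-100k", "albert-ckpt-500k"]:
    print(ckpt, pos_accuracy(ckpt))
```

Sweeping such a probe over checkpoints saved throughout pretraining is what lets one compare how quickly tokens of different POS categories become recoverable.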

