Curriculum learning for language modeling

08/04/2021
by Daniel Campos, et al.

Language models like ELMo and BERT have provided robust representations of natural language, which serve as the language understanding component for a diverse range of downstream tasks. While language models have proven transformational for the natural language processing community, they have also proven expensive, energy-intensive, and challenging to train. Curriculum learning is a method that employs a structured training regime, and it has been leveraged in computer vision and machine translation to improve model training speed and model performance. In this work, we explore the effect of curriculum learning on language model pretraining using various linguistically motivated curricula and evaluate transfer performance on the GLUE Benchmark. Despite a broad variety of training methodologies and experiments, we do not find compelling evidence that curriculum learning methods improve language model training.
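
To make the idea of a structured training regime concrete, here is a minimal sketch of one possible curriculum. It is not the method evaluated in the paper: the function names `curriculum_pools` and `word_count`, the use of sentence length as the difficulty score, and the linear stage schedule are all assumptions chosen for illustration.

```python
from typing import Callable, List


def curriculum_pools(
    sentences: List[str],
    difficulty: Callable[[str], float],
    num_stages: int,
) -> List[List[str]]:
    """Return one training pool per stage, easiest examples first.

    Stage k (1-based) exposes the easiest k/num_stages fraction of the
    corpus, so the final stage covers every sentence.
    """
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    # Sort once by the (hypothetical) difficulty score, easiest first.
    ordered = sorted(sentences, key=difficulty)
    pools = []
    for stage in range(1, num_stages + 1):
        # Linear pacing: the pool grows by an equal share at each stage.
        cutoff = max(1, len(ordered) * stage // num_stages)
        pools.append(ordered[:cutoff])
    return pools


def word_count(sentence: str) -> float:
    """Assumed difficulty proxy: longer sentences count as harder."""
    return float(len(sentence.split()))


corpus = ["a b c d", "a", "a b c", "a b"]
pools = curriculum_pools(corpus, word_count, num_stages=2)
```

In this sketch a trainer would sample batches from `pools[0]` first and move to later pools as training progresses. A baseline without a curriculum would instead sample from the full corpus at every step.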


research
04/04/2023

Unsupervised Improvement of Factual Knowledge in Language Models

Masked language modeling (MLM) plays a key role in pretraining large lan...
research
04/13/2022

Curriculum: A Broad-Coverage Benchmark for Linguistic Phenomena in Natural Language Understanding

In the age of large transformer language models, linguistic evaluation p...
research
12/15/2022

Efficient Pre-training of Masked Language Model via Concept-based Curriculum Masking

Masked language modeling (MLM) has been widely used for pre-training eff...
research
03/29/2021

Retraining DistilBERT for a Voice Shopping Assistant by Using Universal Dependencies

In this work, we retrained the distilled BERT language model for Walmart...
research
01/18/2021

Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models

In this work, we explore joint energy-based model (EBM) training during ...
research
10/25/2020

A Comprehensive Survey on Curriculum Learning

Curriculum learning (CL) is a training strategy that trains a machine le...
research
12/28/2022

Cramming: Training a Language Model on a Single GPU in One Day

Recent trends in language modeling have focused on increasing performanc...
