Efficient Pre-training of Masked Language Model via Concept-based Curriculum Masking

12/15/2022
by Mingyu Lee et al.

Masked language modeling (MLM) has been widely used for pre-training effective bidirectional representations but incurs substantial training costs. In this paper, we propose a novel concept-based curriculum masking (CCM) method to efficiently pre-train a language model. CCM differs from existing curriculum learning approaches in two key ways, designed to reflect the nature of MLM. First, we introduce a carefully designed linguistic difficulty criterion that evaluates the MLM difficulty of each token. Second, we construct a curriculum that gradually masks words related to previously masked words, retrieved from a knowledge graph. Experimental results show that CCM significantly improves pre-training efficiency: the model trained with CCM achieves performance comparable to the original BERT on the General Language Understanding Evaluation (GLUE) benchmark at half the training cost.
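To make the curriculum-construction idea concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than the paper's actual implementation: the toy KNOWLEDGE_GRAPH dictionary, the seed words, the build_curriculum and curriculum_mask helpers, and the 15% masking rate are all stand-ins. The sketch assumes the stage-k maskable vocabulary grows by one knowledge-graph hop per stage, and that MLM masking is restricted to the current stage's vocabulary.

```python
import random

# Toy knowledge graph: word -> related words. The paper retrieves such
# relations from a real knowledge graph; this dict is a stand-in.
KNOWLEDGE_GRAPH = {
    "dog":    ["animal", "bark"],
    "cat":    ["animal", "meow"],
    "animal": ["creature"],
    "bark":   ["sound"],
}

def build_curriculum(seed_words, graph, num_stages):
    """Stage k's maskable vocabulary = the seeds plus all words reachable
    within k hops in the knowledge graph (breadth-first expansion)."""
    seen = set(seed_words)
    frontier = set(seed_words)
    stages = []
    for _ in range(num_stages):
        stages.append(set(seen))
        frontier = {v for u in frontier
                    for v in graph.get(u, ()) if v not in seen}
        seen |= frontier
    return stages

def curriculum_mask(tokens, maskable, mask_prob=0.15, mask_token="[MASK]"):
    """Apply MLM-style random masking, but only to tokens that the
    current curriculum stage allows to be masked."""
    return [mask_token if t in maskable and random.random() < mask_prob else t
            for t in tokens]

# Usage: early stages mask only seed-related words; later stages widen.
stages = build_curriculum(["dog", "cat"], KNOWLEDGE_GRAPH, num_stages=3)
tokens = "the dog and the cat are animal friends".split()
print(curriculum_mask(tokens, stages[1], mask_prob=0.5))
```

In this sketch, stage 0 can mask only "dog" and "cat", while stage 1 additionally allows their one-hop neighbors ("animal", "bark", "meow"), mirroring the gradual expansion of masked words described in the abstract.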
