A Comprehensive Exploration of Pre-training Language Models

06/22/2021
by Tong Guo, et al.

Recently, the development of pre-trained language models has brought natural language processing (NLP) tasks to a new state of the art. In this paper we explore the efficiency of various pre-trained language models. We pre-train a set of transformer-based models with the same amount of text and the same number of training steps. The experimental results show that the largest improvement over the original BERT comes from adding an RNN layer to capture more contextual information for the transformer-encoder layers.
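The modification highlighted in the abstract, feeding recurrent context into the transformer-encoder stack, can be illustrated with a minimal sketch. The placement of the RNN in front of the encoder, the class name RNNAugmentedEncoder, and all hyperparameters below are illustrative assumptions and are not taken from the paper itself.

import torch
import torch.nn as nn

class RNNAugmentedEncoder(nn.Module):
    """Toy encoder: a bidirectional LSTM adds recurrent context to token
    embeddings before a stack of transformer-encoder layers.
    Placement and sizes are assumptions for illustration only."""

    def __init__(self, vocab_size=30522, d_model=768, n_heads=12, n_layers=12):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Bidirectional LSTM; hidden size d_model // 2 so the concatenated
        # forward/backward states again have width d_model.
        self.rnn = nn.LSTM(d_model, d_model // 2, batch_first=True,
                           bidirectional=True)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)

    def forward(self, input_ids, attention_mask=None):
        x = self.embed(input_ids)          # (batch, seq, d_model)
        x, _ = self.rnn(x)                 # inject recurrent contextual information
        pad_mask = None
        if attention_mask is not None:
            pad_mask = attention_mask == 0  # True where tokens are padding
        return self.encoder(x, src_key_padding_mask=pad_mask)

if __name__ == "__main__":
    model = RNNAugmentedEncoder()
    ids = torch.randint(0, 30522, (2, 16))
    print(model(ids).shape)  # torch.Size([2, 16, 768])

The LSTM hidden size is set to half the model dimension so that the concatenated forward and backward states match the width the transformer-encoder layers expect; the actual configuration used in the paper may differ.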

Related research

11/01/2021  Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey
Large, pre-trained transformer-based language models such as BERT have d...

04/30/2020  SegaBERT: Pre-training of Segment-aware BERT for Language Understanding
Pre-trained language models have achieved state-of-the-art results in va...

04/20/2019  Language Models with Transformers
The Transformer architecture is superior to RNN-based models in computat...

04/19/2021  Probing for Bridging Inference in Transformer Language Models
We probe pre-trained transformer language models for bridging inference....

10/26/2020  Accelerating Training of Transformer-Based Language Models with Progressive Layer Dropping
Recently, Transformer-based language models have demonstrated remarkable...

03/27/2023  Typhoon: Towards an Effective Task-Specific Masking Strategy for Pre-trained Language Models
Through exploiting a high level of parallelism enabled by graphics proce...

07/07/2020  Evaluating German Transformer Language Models with Syntactic Agreement Tests
Pre-trained transformer language models (TLMs) have recently refashioned...
