A Comprehensive Exploration of Pre-training Language Models

06/22/2021
by Tong Guo, et al.

Recently, the development of pre-trained language models has brought natural language processing (NLP) tasks to a new state of the art. In this paper we explore the efficiency of various pre-trained language models. We pre-train a list of transformer-based models with the same amount of text and the same number of training steps. The experimental results show that the largest improvement over the original BERT comes from adding an RNN layer to capture more contextual information for the transformer-encoder layers.
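To make the architectural change concrete, the sketch below shows one way an RNN layer could sit in front of a transformer-encoder stack. This is a minimal illustration assuming a bidirectional LSTM and BERT-base-like hyperparameters; the exact RNN variant, layer sizes, and class name `RNNAugmentedEncoder` are assumptions for exposition, not details taken from the paper.

```python
import torch
import torch.nn as nn

class RNNAugmentedEncoder(nn.Module):
    """Illustrative BERT-style encoder with an RNN layer inserted
    before the transformer-encoder stack (hyperparameters are assumed)."""

    def __init__(self, vocab_size=30522, hidden=768, n_layers=12, n_heads=12):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        # RNN layer to capture additional contextual information;
        # a bidirectional LSTM is assumed here, with output projected
        # back to the hidden size via the two directions (hidden // 2 each).
        self.rnn = nn.LSTM(hidden, hidden // 2, batch_first=True, bidirectional=True)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=hidden, nhead=n_heads, batch_first=True
        )
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)

    def forward(self, input_ids):
        x = self.embed(input_ids)      # (batch, seq_len, hidden)
        x, _ = self.rnn(x)             # contextualize token embeddings with the RNN
        return self.transformer(x)     # feed into the transformer-encoder layers


# Usage example with a dummy batch of token ids
model = RNNAugmentedEncoder()
tokens = torch.randint(0, 30522, (2, 16))
out = model(tokens)                    # shape: (2, 16, 768)
```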
