DeepAI AI Chat
Log In Sign Up

StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding

by   Wei Wang, et al.

Recently, the pre-trained language model, BERT (Devlin et al.(2018)Devlin, Chang, Lee, and Toutanova), has attracted a lot of attention in natural language understanding (NLU), and achieved state-of-the-art accuracy in various NLU tasks, such as sentiment classification, natural language inference, semantic textual similarity and question answering. Inspired by the linearization exploration work of Elman (Elman(1990)), we extend BERT to a new model, StructBERT, by incorporating language structures into pretraining. Specifically, we pre-train StructBERT with two auxiliary tasks to make the most of the sequential order of words and sentences, which leverage language structures at the word and sentence levels, respectively. As a result, the new model is adapted to different levels of language understanding required by downstream tasks. The StructBERT with structural pre-training gives surprisingly good empirical results on a variety of downstream tasks, including pushing the state-of-the-art on the GLUE benchmark to 84.5 (with Top 1 achievement on the Leaderboard at the time of paper submission), the F1 score on SQuAD v1.1 question answering to 93.0, the accuracy on SNLI to 91.7.


page 1

page 2

page 3

page 4


Unified Language Model Pre-training for Natural Language Understanding and Generation

This paper presents a new Unified pre-trained Language Model (UniLM) tha...

SPBERT: An Efficient Pre-training BERT on SPARQL Queries for Question Answering over Knowledge Graphs

In this paper, we propose SPBERT, a transformer-based language model pre...

Rethinking Relational Encoding in Language Model: Pre-Training for General Sequences

Language model pre-training (LMPT) has achieved remarkable results in na...

lamBERT: Language and Action Learning Using Multimodal BERT

Recently, the bidirectional encoder representations from transformers (B...

E-BERT: A Phrase and Product Knowledge Enhanced Language Model for E-commerce

Pre-trained language models such as BERT have achieved great success in ...