Data Annealing for Informal Language Understanding Tasks

04/24/2020
by Jing Gu et al.
There is a large performance gap between formal and informal language understanding tasks. Recent pre-trained models that improved performance on formal language understanding tasks have not achieved comparable results on informal language. We propose a data annealing transfer learning procedure to bridge this performance gap on informal natural language understanding tasks, making it possible to effectively apply a pre-trained model such as BERT to informal language. In our data annealing procedure, the training set initially consists mainly of formal text; the proportion of informal text is then gradually increased over the course of training. The procedure is model-independent and can be applied to various tasks. We validate its effectiveness in extensive experiments: when BERT is trained with our procedure, it outperforms state-of-the-art models on three common informal language tasks.
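The core idea of the annealing schedule (start with mostly formal text, then gradually raise the share of informal text) can be sketched as follows. The paper does not specify the exact schedule here, so this sketch assumes a simple linear ramp; the function name, batch size, and schedule are illustrative, not the authors' implementation.

```python
import random


def annealed_batch(formal_data, informal_data, step, total_steps,
                   batch_size=8, rng=None):
    """Sample a training batch whose informal-text proportion grows with `step`.

    Hypothetical sketch: uses a linear ramp from 0 (all formal) at step 0
    to 1 (all informal) at `total_steps`; the paper's actual schedule may differ.
    """
    rng = rng or random.Random(0)
    # Fraction of the batch drawn from the informal corpus at this step.
    p_informal = min(1.0, step / total_steps)
    batch = []
    for _ in range(batch_size):
        pool = informal_data if rng.random() < p_informal else formal_data
        batch.append(rng.choice(pool))
    return batch
```

Early in training every batch comes from the formal corpus, so the pre-trained model adapts gently; by the end, batches are drawn entirely from the informal target data.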
