Language Model Pre-training on True Negatives

12/01/2022
by Zhuosheng Zhang, et al.

Discriminative pre-trained language models (PLMs) learn to distinguish original texts from intentionally corrupted ones. Treating the former as positive samples and the latter as negative samples, such PLMs can be trained effectively to produce contextualized representations. However, this style of training depends heavily on the quality of the automatically constructed samples. Existing PLMs treat all corrupted texts as equally negative without any examination, so the resulting models inevitably suffer from a false negative issue: training is carried out on pseudo-negative data, which reduces both the efficiency and the robustness of the resulting PLMs. In this work, after formally defining the long-ignored false negative issue in discriminative PLMs, we design enhanced pre-training methods that counteract false negative predictions and encourage pre-training language models on true negatives by correcting the harmful gradient updates caused by false negative predictions. Experimental results on the GLUE and SQuAD benchmarks show that our counter-false-negative pre-training methods bring better performance together with stronger robustness.
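To make the idea concrete, consider an ELECTRA-style replaced-token-detection setup, where a discriminator labels each token as original or corrupted. Below is a minimal PyTorch sketch of one plausible counter-false-negative correction: corrupted positions that the discriminator nevertheless scores as "original" with very high confidence are treated as suspected false negatives and excluded from the gradient update. The function name, the `fn_threshold` parameter, and the drop-the-loss rule are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def rtd_loss_with_fn_correction(disc_logits, labels, fn_threshold=0.9):
    """Replaced-token-detection loss that skips suspected false negatives.

    disc_logits:  (batch, seq_len) discriminator scores; higher = "original".
    labels:       (batch, seq_len) 1 for original tokens, 0 for corrupted ones.
    fn_threshold: corrupted positions scored as "original" with probability
                  above this value are treated as suspected false negatives.
    """
    probs_original = torch.sigmoid(disc_logits)

    # Positions labeled "corrupted" that the discriminator is nevertheless
    # very confident are original text: candidate false negatives.
    suspected_fn = (labels == 0) & (probs_original > fn_threshold)

    per_token_loss = F.binary_cross_entropy_with_logits(
        disc_logits, labels.float(), reduction="none")

    # Zero out the loss (and hence the gradient) on suspected false negatives,
    # so the parameter update is driven by true negatives only.
    per_token_loss = per_token_loss.masked_fill(suspected_fn, 0.0)

    return per_token_loss.mean()
```

Whether suspected false negatives are dropped, down-weighted, or relabeled as positives is a design choice; this sketch simply removes their contribution to the update.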

