
GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator

12/20/2022
by   Jian Yang, et al.
Beihang University
Microsoft
The University of Hong Kong

Pre-trained models have achieved remarkable success in natural language processing (NLP). However, existing pre-training methods underutilize the benefits of language understanding for generation. Inspired by Generative Adversarial Networks (GANs), we propose a GAN-style model for encoder-decoder pre-training that introduces an auxiliary discriminator, unifying language understanding and generation in a single model. Our model, named GanLM, is trained with two pre-training objectives: replaced token detection and replaced token denoising. Specifically, given a masked source sentence, the generator outputs a distribution over target tokens, and the discriminator predicts whether each token sampled from that distribution is incorrect. The target sentence is then corrupted with the misclassified tokens to construct a noisy previous context, from which the model generates the gold sentence. Together, the two tasks improve both language understanding and generation by selectively using the denoising data. Extensive experiments on language generation benchmarks show that GanLM, with its strong language understanding capability, outperforms various strong pre-trained language models (PLMs) and achieves state-of-the-art performance.
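To make the two objectives concrete, here is a minimal, hypothetical sketch in PyTorch. It is not the authors' implementation: the Generator, Discriminator, and training_step names, the GRU stand-ins for the paper's encoder-decoder layers, and the unweighted loss sum are all illustrative assumptions, and for brevity the noisy context is built from all sampled tokens rather than only the discriminator-misclassified ones described in the abstract.

```python
# Toy sketch of GanLM-style replaced token detection + replaced token denoising.
# All module names and shapes are illustrative assumptions, not the paper's API.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, DIM, PAD = 1000, 64, 0

class Generator(nn.Module):
    """Encoder-decoder that predicts target tokens from a masked source."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, DIM, padding_idx=PAD)
        self.enc = nn.GRU(DIM, DIM, batch_first=True)
        self.dec = nn.GRU(DIM, DIM, batch_first=True)
        self.out = nn.Linear(DIM, VOCAB)

    def forward(self, src, tgt_in):
        _, h = self.enc(self.emb(src))            # encode the masked source
        dec_out, _ = self.dec(self.emb(tgt_in), h)
        return self.out(dec_out)                   # logits over the vocabulary

class Discriminator(nn.Module):
    """Per-token binary head: original token or replaced token?"""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, DIM, padding_idx=PAD)
        self.rnn = nn.GRU(DIM, DIM, batch_first=True)
        self.cls = nn.Linear(DIM, 1)

    def forward(self, tokens):
        h, _ = self.rnn(self.emb(tokens))
        return self.cls(h).squeeze(-1)             # replacement logits per position

def training_step(gen, disc, src_masked, tgt_in, tgt_gold):
    # 1) Replaced token detection: sample tokens from the generator's output
    #    distribution and train the discriminator to flag the replacements.
    logits = gen(src_masked, tgt_in)
    sampled = torch.distributions.Categorical(logits=logits).sample()
    replaced = (sampled != tgt_gold).float()       # 1 where the sample is wrong
    det_loss = F.binary_cross_entropy_with_logits(disc(sampled), replaced)

    # 2) Replaced token denoising: keep the (possibly wrong) sampled tokens as a
    #    noisy previous context and ask the generator to recover the gold sentence.
    noisy_in = torch.cat([tgt_in[:, :1], sampled[:, :-1]], dim=1)
    den_logits = gen(src_masked, noisy_in)
    den_loss = F.cross_entropy(den_logits.transpose(1, 2), tgt_gold, ignore_index=PAD)

    # Standard MLE loss on the clean context, plus the two auxiliary losses.
    mle_loss = F.cross_entropy(logits.transpose(1, 2), tgt_gold, ignore_index=PAD)
    return mle_loss + det_loss + den_loss

# Toy usage: random batch of 2 sentences of length 8 (teacher-forcing shift omitted).
src = torch.randint(1, VOCAB, (2, 8))
tgt = torch.randint(1, VOCAB, (2, 8))
loss = training_step(Generator(), Discriminator(), src, tgt, tgt)
```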

