Improving Text Generation with Student-Forcing Optimal Transport

10/12/2020
by Guoyin Wang, et al.

Neural language models are often trained with maximum likelihood estimation (MLE), where the next word is predicted conditioned on the ground-truth prefix tokens (teacher forcing). During testing, however, the model is instead conditioned on its own previously generated tokens (student forcing), resulting in what is termed exposure bias. To reduce this gap between training and testing, we propose using optimal transport (OT) to match the sequences generated in these two modes. An extension is further proposed to improve the OT learning, based on the structural and contextual information of the text sequences. The effectiveness of the proposed method is validated on machine translation, text summarization, and text generation tasks.
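
To make the idea concrete, the following is a minimal sketch (not the authors' released code) of how an entropy-regularised OT distance, computed with Sinkhorn iterations, could be used to match the embeddings of a student-forced (free-running) decode against the ground-truth tokens. The model interface, embedding lookup, and the weight ot_weight in the usage comment are illustrative assumptions, not names from the paper.

# Minimal sketch of a student-forcing OT loss (illustrative, PyTorch).
import torch
import torch.nn.functional as F

def sinkhorn_ot_distance(x, y, epsilon=0.1, n_iters=50):
    """Entropy-regularised OT distance between two token-embedding sets.

    x: (n, d) embeddings of student-forced (model-generated) tokens
    y: (m, d) embeddings of the ground-truth tokens
    """
    # Cosine cost between every generated token and every reference token.
    x, y = F.normalize(x, dim=-1), F.normalize(y, dim=-1)
    cost = 1.0 - x @ y.t()                                  # (n, m)

    n, m = cost.shape
    mu = torch.full((n,), 1.0 / n, device=cost.device)      # uniform source weights
    nu = torch.full((m,), 1.0 / m, device=cost.device)      # uniform target weights

    # Sinkhorn iterations in log space for numerical stability.
    log_K = -cost / epsilon
    log_u, log_v = torch.zeros_like(mu), torch.zeros_like(nu)
    for _ in range(n_iters):
        log_u = mu.log() - torch.logsumexp(log_K + log_v[None, :], dim=1)
        log_v = nu.log() - torch.logsumexp(log_K + log_u[:, None], dim=0)

    transport = torch.exp(log_u[:, None] + log_K + log_v[None, :])  # transport plan
    return (transport * cost).sum()

# Hypothetical per-sequence training step combining MLE with the OT term:
#   logits = model(src, tgt_in)                          # teacher-forced logits
#   sampled_ids = model.decode(src)                      # student-forced decode
#   mle_loss = F.cross_entropy(logits.flatten(0, 1), tgt_out.flatten())
#   ot_loss = sinkhorn_ot_distance(embedding(sampled_ids), embedding(tgt_out))
#   loss = mle_loss + ot_weight * ot_loss

In practice the student-forced side is usually made differentiable, for example by feeding soft, probability-weighted embeddings rather than hard sampled token ids, so that the OT term can back-propagate into the decoder.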


