Straight to the Gradient: Learning to Use Novel Tokens for Neural Text Generation

06/14/2021
by   Xiang Lin, et al.

Advanced large-scale neural language models have led to significant success in many language generation tasks. However, the most commonly used training objective, Maximum Likelihood Estimation (MLE), has been shown to be problematic: the trained model tends to prefer dull and repetitive phrases. In this work, we introduce ScaleGrad, a modification made directly to the gradient of the loss function, to remedy the degeneration issue of the standard MLE objective. By directly manipulating the gradient information, ScaleGrad makes the model learn to use novel tokens. Empirical results show the effectiveness of our method not only in open-ended generation but also in directed generation tasks. Owing to its architectural simplicity, our method can serve as a general training objective applicable to most neural text generation tasks.
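The abstract does not spell out how the gradient is modified, so the following is only a minimal sketch of one way such a scheme could look, not the authors' implementation. It assumes that a "novel" token is one that has not yet appeared in the context, and that the softmax probabilities of novel tokens are scaled by a factor `gamma` in (0, 1] and renormalized before the negative log-likelihood is taken. The names `rescaled_probs`, `novel_token_loss`, and `gamma` are hypothetical and chosen for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rescaled_probs(logits, novel_mask, gamma):
    """Scale the probability of each novel token by gamma, then renormalize.

    Assumption (not stated in the abstract): gamma lies in (0, 1], and
    gamma = 1 leaves the distribution unchanged.
    """
    probs = softmax(logits)
    scaled = [gamma * p if novel else p for p, novel in zip(probs, novel_mask)]
    total = sum(scaled)
    return [s / total for s in scaled]


def novel_token_loss(logits, target, context_ids, gamma=0.5):
    """Negative log-likelihood of `target` under the rescaled distribution.

    A token id counts as novel here if it is absent from `context_ids`
    (an illustrative definition, assumed for this sketch).
    """
    novel_mask = [i not in context_ids for i in range(len(logits))]
    probs = rescaled_probs(logits, novel_mask, gamma)
    return -math.log(probs[target])


# Uniform logits over a 4-token vocabulary; tokens 0 and 1 were already used.
logits = [0.0, 0.0, 0.0, 0.0]
loss_mle = novel_token_loss(logits, target=2, context_ids={0, 1}, gamma=1.0)
loss_scaled = novel_token_loss(logits, target=2, context_ids={0, 1}, gamma=0.5)
print(loss_scaled > loss_mle)  # the novel target incurs a larger loss
```

Under these assumptions, a novel ground-truth token receives a lower rescaled probability than under plain MLE, so its loss is larger and training pushes harder toward it; setting `gamma = 1.0` recovers the standard MLE loss.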

research
03/15/2018

Neural Text Generation: Past, Present and Beyond

This paper presents a systematic survey on recent development of neural ...
research
10/12/2020

Improving Text Generation with Student-Forcing Optimal Transport

Neural language models are often trained with maximum likelihood estimat...
research
05/01/2020

POINTER: Constrained Text Generation via Insertion-based Generative Pre-training

Large-scale pre-trained language models, such as BERT and GPT-2, have ac...
research
12/28/2020

Neural Text Generation with Artificial Negative Examples

Neural text generation models conditioning on given input (e.g. machine ...
research
06/06/2022

Learning to Break the Loop: Analyzing and Mitigating Repetitions for Neural Text Generation

While large-scale neural language models, such as GPT2 and BART, have ac...
research
02/26/2023

Tailoring Language Generation Models under Total Variation Distance

The standard paradigm of neural language generation adopts maximum likel...
research
09/09/2021

Graphine: A Dataset for Graph-aware Terminology Definition Generation

Precisely defining the terminology is the first step in scientific commu...
