Learning to Break the Loop: Analyzing and Mitigating Repetitions for Neural Text Generation

06/06/2022
by   Jin Xu, et al.

While large-scale neural language models, such as GPT2 and BART, have achieved impressive results on various text generation tasks, they tend to get stuck in undesirable sentence-level loops under maximization-based decoding algorithms (e.g., greedy search). This phenomenon is counter-intuitive, since consecutive sentence-level repetitions are rare in human corpora (e.g., 0.02% in Wikitext-103). To investigate the underlying reasons for generating consecutive sentence-level repetitions, we study the relationship between the probabilities of repetitive tokens and their previous repetitions in the context. Through quantitative experiments, we find that 1) language models tend to repeat the previous sentence; 2) sentence-level repetition has a self-reinforcement effect: the more times a sentence is repeated in the context, the higher the probability of generating it again; 3) sentences with higher initial probabilities usually exhibit a stronger self-reinforcement effect. Motivated by these findings, we propose DITTO (PseuDo-RepetITion PenalizaTiOn), a simple and effective training method in which the model learns to penalize the probabilities of sentence-level repetitions on pseudo repetitive data. Although the method is motivated by mitigating repetition, experiments show that DITTO not only reduces repetition without sacrificing perplexity but also improves generation quality. Extensive experiments on open-ended text generation (Wikitext-103) and text summarization (CNN/DailyMail) demonstrate the generality and effectiveness of our method.
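
To make the idea behind DITTO concrete, here is a minimal PyTorch sketch of pseudo-repetition penalization, assuming a standard causal language model that maps token ids to per-position logits. The sentence-repeating helper, the specific loss form, the `decay` factor, and the toy model are illustrative assumptions rather than the exact objective from the paper; they only show how one might build pseudo repetitive data and train a model to damp, rather than amplify, the probability of each successive repetition.

```python
# Hypothetical sketch of pseudo-repetition penalization (not the exact DITTO loss).
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


def build_pseudo_repetitive_sequence(sentence_ids: torch.Tensor, n_repeats: int) -> torch.Tensor:
    """Repeat one sentence n_repeats times to form a pseudo repetitive sequence."""
    return sentence_ids.repeat(n_repeats)


def repetition_penalization_loss(model: nn.Module,
                                 sequence: torch.Tensor,
                                 sent_len: int,
                                 decay: float = 0.5) -> torch.Tensor:
    """Penalize the self-reinforcement effect on a pseudo repetitive sequence.

    For each repetition n >= 1, compare the log-probability the model assigns to
    every token with the log-probability of the same token in repetition n-1,
    and pull it towards `decay` times the previous value instead of letting it
    grow with each repetition.
    """
    logits = model(sequence.unsqueeze(0)).squeeze(0)                  # (seq_len, vocab)
    log_probs = F.log_softmax(logits, dim=-1)
    # Log-probability of each actual next token given its prefix.
    token_logp = log_probs[:-1].gather(1, sequence[1:].unsqueeze(1)).squeeze(1)

    n_repeats = sequence.numel() // sent_len
    loss = torch.zeros((), dtype=token_logp.dtype, device=token_logp.device)
    for n in range(1, n_repeats):
        curr = token_logp[n * sent_len : (n + 1) * sent_len - 1]
        prev = token_logp[(n - 1) * sent_len : n * sent_len - 1].detach()
        target = prev + math.log(decay)                               # shrink, do not amplify
        loss = loss + F.mse_loss(curr, target)
    return loss / max(n_repeats - 1, 1)


# Toy usage with a random stand-in "language model".
vocab_size, sent_len, n_repeats = 100, 8, 4
toy_lm = nn.Sequential(nn.Embedding(vocab_size, 32), nn.Linear(32, vocab_size))
sentence = torch.randint(0, vocab_size, (sent_len,))
pseudo_seq = build_pseudo_repetitive_sequence(sentence, n_repeats)
loss = repetition_penalization_loss(toy_lm, pseudo_seq, sent_len)
loss.backward()
```

The previous repetition's log-probabilities are detached so the model is only pushed to lower the probability of the current repetition, which mirrors the intuition that the self-reinforcement loop should be broken rather than merely redistributed.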

Related research

- Mitigating the Learning Bias towards Repetition by Self-Contrastive Training for Open-Ended Generation (07/04/2023): Despite the huge progress in myriad generation tasks, pretrained languag...
- From Solving a Problem Boldly to Cutting the Gordian Knot: Idiomatic Text Generation (04/13/2021): We study a new application for text generation – idiomatic sentence gene...
- Factuality Enhanced Language Models for Open-Ended Text Generation (06/09/2022): Pretrained language models (LMs) are susceptible to generate text with n...
- Straight to the Gradient: Learning to Use Novel Tokens for Neural Text Generation (06/14/2021): Advanced large-scale neural language models have led to significant succ...
- On-the-Fly Attention Modularization for Neural Generation (01/02/2021): Despite considerable advancements with deep neural language models (LMs)...
- Search and Learning for Unsupervised Text Generation (09/18/2023): With the advances of deep learning techniques, text generation is attrac...
- Learning Sparse Prototypes for Text Generation (06/29/2020): Prototype-driven text generation uses non-parametric models that first c...
