Sparse Text Generation

04/06/2020
by Pedro Henrique Martins, et al.

Current state-of-the-art text generators build on powerful language models such as GPT-2, which have impressive performance. However, to avoid degenerate text, they require sampling from a modified softmax, via temperature parameters or ad-hoc truncation techniques, as in top-k or nucleus sampling. This creates a mismatch between training and testing conditions. In this paper, we use the recently introduced entmax transformation to train and sample from a natively sparse language model, avoiding this mismatch. The result is a text generator with favorable performance in terms of fluency and consistency, fewer repetitions, and n-gram diversity closer to human text. In order to evaluate our model, we propose three new metrics that are tailored for comparing sparse or truncated distributions: ϵ-perplexity, sparsemax score, and Jensen-Shannon divergence. Human-evaluated experiments in story completion and dialogue generation show that entmax sampling leads to more engaging and coherent stories and conversations.
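To make the training/testing mismatch concrete, the sketch below contrasts the two families of decoders the abstract mentions. It is a minimal illustration, assuming PyTorch and the open-source `entmax` package (github.com/deep-spin/entmax); the helper names top_k_probs, nucleus_probs, and entmax_probs are ours for illustration, not from the paper.

import torch
from entmax import entmax15  # 1.5-entmax: maps logits to a sparse probability vector

def top_k_probs(logits, k=10):
    # Truncate the softmax to the k most probable tokens, then renormalize.
    topk = torch.topk(logits, k)
    probs = torch.zeros_like(logits)
    probs[topk.indices] = torch.softmax(topk.values, dim=-1)
    return probs

def nucleus_probs(logits, p=0.9):
    # Keep the smallest set of tokens whose cumulative probability reaches p.
    probs = torch.softmax(logits, dim=-1)
    sorted_probs, sorted_idx = torch.sort(probs, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    keep = cumulative - sorted_probs < p  # always keeps the top token
    truncated = torch.zeros_like(probs)
    truncated[sorted_idx[keep]] = sorted_probs[keep]
    return truncated / truncated.sum()

def entmax_probs(logits):
    # Entmax zeroes low-scoring tokens natively, so the same transformation
    # can be used at training time and at sampling time.
    return entmax15(logits, dim=-1)

logits = torch.randn(50_257)  # e.g. a GPT-2-sized vocabulary
for dist in (top_k_probs(logits), nucleus_probs(logits), entmax_probs(logits)):
    token = torch.multinomial(dist, num_samples=1)
    print(int((dist > 0).sum()), int(token))

Top-k and nucleus sampling renormalize a truncated version of a distribution the model was never trained to produce; entmax sampling draws from the same sparse distribution the model was trained with, which is the mismatch the paper removes.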
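Standard perplexity is undefined whenever a sparse or truncated model assigns zero probability to a reference token, which motivates the paper's ϵ-perplexity. The exact formulation is given in the paper; the sketch below shows one natural reading, in which every token probability is smoothed by ϵ and renormalized so the score stays finite.

import math
import torch

def eps_perplexity(step_probs, gold_ids, eps=1e-4):
    # step_probs: (num_steps, vocab_size) rows of possibly sparse probabilities.
    # gold_ids: (num_steps,) indices of the reference tokens.
    vocab_size = step_probs.size(-1)
    gold_p = step_probs[torch.arange(step_probs.size(0)), gold_ids]
    smoothed = (gold_p + eps) / (1.0 + eps * vocab_size)  # add eps, renormalize
    return math.exp(-smoothed.log().mean().item())

As ϵ approaches zero this recovers ordinary perplexity whenever every reference token has nonzero probability, while a model that zeroes a reference token is penalized heavily but finitely.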

Related research

04/20/2022 · Event Transition Planning for Open-ended Text Generation
Open-ended text generation tasks, such as dialogue generation and story ...

06/09/2022 · Factuality Enhanced Language Models for Open-Ended Text Generation
Pretrained language models (LMs) are susceptible to generate text with n...

05/04/2020 · Distributional Discrepancy: A Metric for Unconditional Text Generation
The goal of unconditional text generation is training a model with real ...

10/20/2018 · Hierarchical Text Generation using an Outline
Many challenges in natural language processing require generating text, ...

04/22/2019 · The Curious Case of Neural Text Degeneration
Despite considerable advancements with deep neural language models, the ...

05/22/2023 · A Frustratingly Simple Decoding Method for Neural Text Generation
We introduce a frustratingly simple, super efficient and surprisingly ef...
