Reverse Engineering Configurations of Neural Text Generation Models

04/13/2020
by   Yi Tay, et al.

This paper seeks to develop a deeper understanding of the fundamental properties of neural text generation models. The study of artifacts that emerge in machine-generated text as a result of modeling choices is a nascent research area. To date, the extent and degree to which these artifacts surface in generated text have not been well studied. In the spirit of better understanding generative text models and their artifacts, we propose the new task of distinguishing which of several variants of a given model generated a piece of text, and we conduct an extensive suite of diagnostic tests to observe whether modeling choices (e.g., sampling methods, top-k probabilities, model architectures, etc.) leave detectable artifacts in the text they generate. Our key finding, which is backed by a rigorous set of experiments, is that such artifacts are present and that different modeling choices can be inferred by observing the generated text alone. This suggests that neural text generators may be more sensitive to various modeling choices than previously thought.
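To make the proposed task concrete, here is a toy sketch of how a sampling choice such as top-k can leave a trace in generated text. It is not the paper's method or data: the vocabulary, the Zipf-like weights, the candidate values of k, and the helper names (`generate`, `max_rank`, `predict_k`) are all assumptions made purely for illustration. The "generator" samples from a fixed unigram distribution truncated to its k most probable tokens, and the "detector" guesses k from the text alone.

```python
import random

# Hypothetical toy vocabulary with Zipf-like weights (an assumption, not from the paper).
VOCAB = [f"w{i}" for i in range(50)]
WEIGHTS = [1.0 / (i + 1) for i in range(50)]


def generate(k, length, rng):
    """Sample `length` tokens using only the k most probable tokens (top-k truncation)."""
    return rng.choices(VOCAB[:k], weights=WEIGHTS[:k], k=length)


def max_rank(tokens):
    """Feature: the highest 0-based vocabulary rank that appears in the text."""
    return max(int(t[1:]) for t in tokens)


def predict_k(tokens, candidates):
    """Guess the smallest candidate k that is consistent with every observed token."""
    needed = max_rank(tokens) + 1
    for k in sorted(candidates):
        if needed <= k:
            return k
    return max(candidates)


rng = random.Random(0)
candidates = [5, 20, 50]  # hypothetical model variants that differ only in k
trials = [(k, generate(k, 200, rng)) for k in candidates for _ in range(20)]
accuracy = sum(predict_k(toks, candidates) == k for k, toks in trials) / len(trials)
print(f"accuracy: {accuracy:.2f}")
```

In this toy setting the artifact is trivially visible, because tokens outside the top k never appear. Real generators condition on context and real detectors would need learned features, so this only illustrates the shape of the task: text in, configuration label out.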


