Assessing the Stylistic Properties of Neurally Generated Text in Authorship Attribution

08/18/2017
by E. Manjavacas, et al.

Recent applications of neural language models have led to an increased interest in the automatic generation of natural language. However impressive the results, the evaluation of neurally generated text has so far remained rather informal and anecdotal. Here, we present an attempt at the systematic assessment of one aspect of the quality of neurally generated text: its ability to reproduce authorial writing styles. Using established models for authorship attribution, we empirically assess the stylistic qualities of neurally generated text. In comparison to conventional language models, neural models generate fuzzier text that is relatively harder to attribute correctly. Nevertheless, our results also suggest that neurally generated text offers more valuable perspectives for the augmentation of training data.

research
05/31/2021

Language Model Evaluation Beyond Perplexity

We propose an alternate approach to quantifying how well language models...
research
10/17/2022

RARR: Researching and Revising What Language Models Say, Using Language Models

Language models (LMs) now excel at many tasks such as few-shot learning,...
research
10/11/2018

Sequence-to-Sequence Models for Data-to-Text Natural Language Generation: Word- vs. Character-based Processing and Output Diversity

We present a comparison of word-based and character-based sequence-to-se...
research
03/18/2022

Are You Robert or RoBERTa? Deceiving Online Authorship Attribution Models Using Neural Text Generators

Recently, there has been a rise in the development of powerful pre-train...
research
11/18/2021

How much do language models copy from their training data? Evaluating linguistic novelty in text generation using RAVEN

Current language models can generate high-quality text. Are they simply ...
research
12/23/2021

Measuring Attribution in Natural Language Generation Models

With recent improvements in natural language generation (NLG) models for...
research
10/28/2021

BERTian Poetics: Constrained Composition with Masked LMs

Masked language models have recently been interpreted as energy-based se...
