Same Author or Just Same Topic? Towards Content-Independent Style Representations

04/11/2022
by Anna Wegmann, et al.

Linguistic style is an integral component of language. Recent work on style representations has increasingly used training objectives from authorship verification (AV): Do two texts have the same author? The assumption underlying the AV training task (same author approximates same writing style) enables self-supervised and, thus, extensive training. However, good performance on the AV task does not ensure good "general-purpose" style representations. For example, because the same author might typically write about certain topics, representations trained on AV might encode content information in addition to style. We introduce a variation of the AV training task that controls for content using conversation or domain labels. We evaluate whether known style dimensions are represented and preferred over content information through an original variation of the recently proposed STEL framework. We find that representations trained by controlling for conversation are better at representing style independently of content than representations trained with domain control or with no content control.
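
The idea of controlling for content in an AV-style task can be illustrated with a small sketch. The abstract does not spell out the sampling procedure, so the following is only one possible reading under stated assumptions: different-author pairs are kept only when both texts share a conversation label, so that a shared topic cannot by itself signal a shared author. The `Utterance` type, the `av_pairs` function, and the toy data are hypothetical and not taken from the paper.

```python
from itertools import combinations
from typing import NamedTuple


class Utterance(NamedTuple):
    author: str
    conversation: str  # hypothetical conversation/thread label used as a content proxy
    text: str


def av_pairs(utterances, control_conversation=False):
    """Build (text_a, text_b, same_author) pairs for an AV-style objective.

    Assumption (not confirmed by the abstract): with control_conversation=True,
    a different-author pair is kept only if both texts come from the same
    conversation, so negatives roughly share content with each other.
    """
    pairs = []
    for a, b in combinations(utterances, 2):
        same_author = a.author == b.author
        if (control_conversation and not same_author
                and a.conversation != b.conversation):
            continue  # drop negatives that differ in content as well as author
        pairs.append((a.text, b.text, same_author))
    return pairs


# Toy data, invented for illustration only.
data = [
    Utterance("ann", "c1", "honestly, no idea tbh"),
    Utterance("bob", "c1", "I do not believe that is correct."),
    Utterance("ann", "c2", "lol that movie was great tbh"),
    Utterance("bob", "c2", "The film was, in my view, excellent."),
]

uncontrolled = av_pairs(data)
controlled = av_pairs(data, control_conversation=True)
```

In this sketch, the controlled setting discards the cross-conversation negatives, which leaves only different-author pairs that discuss the same thing. A model trained on such pairs would, in principle, get less help from topic cues when telling authors apart. A domain label could be substituted for the conversation label in the same way.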


research
08/22/2023

Can Authorship Representation Learning Capture Stylistic Features?

Automatically disentangling an author's style from the content of their ...
research
09/07/2021

Idiosyncratic but not Arbitrary: Learning Idiolects in Online Registers Reveals Distinctive yet Consistent Individual Styles

An individual's variation in writing style is often a function of both s...
research
05/22/2023

Learning Interpretable Style Embeddings via Prompting LLMs

Style representation learning builds content-independent representations...
research
01/14/2020

Adversarial Disentanglement with Grouped Observations

We consider the disentanglement of the representations of the relevant a...
research
05/29/2020

The Importance of Suppressing Domain Style in Authorship Analysis

The prerequisite of many approaches to authorship analysis is a represen...
research
09/10/2021

Does It Capture STEL? A Modular, Similarity-based Linguistic Style Evaluation Framework

Style is an integral part of natural language. However, evaluation metho...
research
02/24/2019

Text Analysis in Adversarial Settings: Does Deception Leave a Stylistic Trace?

Textual deception constitutes a major problem for online security. Many ...
