Text Analysis in Adversarial Settings: Does Deception Leave a Stylistic Trace?

02/24/2019
by Tommi Gröndahl, et al.

Textual deception constitutes a major problem for online security. Many studies have argued that deceptiveness leaves traces in writing style, which could be detected using text classification techniques. By conducting an extensive literature review of existing empirical work, we demonstrate that while certain linguistic features have been indicative of deception in certain corpora, they fail to generalize across divergent semantic domains. We suggest that deceptiveness as such leaves no content-invariant stylistic trace, and textual similarity measures provide superior means of classifying texts as potentially deceptive. Additionally, we discuss forms of deception beyond semantic content, focusing on hiding author identity by writing style obfuscation. Surveying the literature on both author identification and obfuscation techniques, we conclude that current style transformation methods fail to achieve reliable obfuscation while simultaneously ensuring semantic faithfulness to the original text. We propose that future work in style transformation should pay particular attention to disallowing semantically drastic changes.
