Authorship attribution via network motifs identification

07/23/2016
by Vanessa Queiroz Marinho, et al.

Concepts and methods of complex networks can be used to analyse texts at their different complexity levels. Examples of natural language processing (NLP) tasks studied via topological analysis of networks are keyword identification, automatic extractive summarization and authorship attribution. Even though a myriad of network measurements have been applied to the authorship attribution problem, the use of motifs for text analysis has been restricted to a few works. The goal of this paper is to apply the concept of motifs, that is, recurrent interconnection patterns, to the authorship attribution task. The absolute frequencies of all thirteen directed motifs with three nodes were extracted from the co-occurrence networks and used as classification features. The effectiveness of these features was verified with four machine learning methods. The results show that motifs are able to distinguish the writing style of different authors. In our best scenario, 57.5% of the texts were correctly classified. The chance baseline for this problem is 12.5%. We also found that function words play an important role in these recurrent patterns. Taken together, our findings suggest that motifs should be further explored in other related linguistic tasks.
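The feature extraction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the co-occurrence networks were built, so the sketch assumes that each word is linked by a directed edge to the word that immediately follows it, and it skips any preprocessing. The function names (`cooccurrence_edges`, `canonical_form`, `all_motif_classes`, `motif_counts`) are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations, permutations


def cooccurrence_edges(tokens):
    """Directed edges from each word to the next one (assumed network model)."""
    return {(a, b) for a, b in zip(tokens, tokens[1:]) if a != b}


def canonical_form(triple, edges):
    """Identify the isomorphism class of the subgraph induced by three nodes.

    The six possible directed edges are encoded as booleans for every node
    ordering, and the largest encoding is used as the class label.
    """
    best = None
    for perm in permutations(triple):
        code = tuple(
            (perm[i], perm[j]) in edges
            for i in range(3) for j in range(3) if i != j
        )
        if best is None or code > best:
            best = code
    return best


def all_motif_classes():
    """Enumerate every connected directed pattern on three nodes."""
    nodes = (0, 1, 2)
    pairs = [(i, j) for i in nodes for j in nodes if i != j]
    classes = set()
    for mask in range(1, 64):
        edges = {p for k, p in enumerate(pairs) if (mask >> k) & 1}
        # Three nodes are connected when at least two node pairs are linked.
        if len({frozenset(e) for e in edges}) >= 2:
            classes.add(canonical_form(nodes, edges))
    return sorted(classes)


def motif_counts(tokens):
    """Absolute frequency of each three-node motif in the text's network."""
    edges = cooccurrence_edges(tokens)
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    counts = Counter()
    seen = set()
    # Every connected triple has a node adjacent to both of the others.
    for centre, nbrs in neighbours.items():
        for u, v in combinations(sorted(nbrs), 2):
            triple = frozenset((centre, u, v))
            if triple not in seen:
                seen.add(triple)
                counts[canonical_form(tuple(triple), edges)] += 1
    return counts


tokens = "the cat sat on the mat".split()
counts = motif_counts(tokens)
# Fixed-length feature vector with one entry per motif class.
features = [counts[c] for c in all_motif_classes()]
```

A vector such as `features`, computed per text, could then be passed to any standard classifier. The enumeration in `all_motif_classes` is exhaustive over the 64 possible edge configurations, which is affordable for three nodes but would not scale to larger motifs.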


research
07/13/2016

A Supervised Authorship Attribution Framework for Bengali Language

Authorship Attribution is a long-standing problem in Natural Language Pr...
research
05/11/2017

On the role of words in the network structure of texts: application to authorship attribution

Well-established automatic analyses of texts mainly consider frequencies...
research
11/15/2018

Characterizing Design Patterns of EHR-Driven Phenotype Extraction Algorithms

The automatic development of phenotype algorithms from Electronic Health...
research
05/29/2017

On the "Calligraphy" of Books

Authorship attribution is a natural language processing task that has be...
research
08/15/2022

Reproduction and Replication of an Adversarial Stylometry Experiment

Maintaining anonymity while communicating using natural language remains...
research
07/25/2018

Who is the director of this movie? Automatic style recognition based on shot features

We show how low-level formal features, such as shot duration, meant as l...
