Concentric network symmetry grasps authors' styles in word adjacency networks

04/09/2015
by   Diego R. Amancio, et al.

Several characteristics of written texts have been inferred from statistical analyses of networked models. Even though many network measurements have been adapted to study textual properties at several levels of complexity, some textual aspects have been disregarded. In this paper, we study the symmetry of word adjacency networks, a well-known representation of text as a graph. A statistical analysis of the symmetry distribution performed on several novels showed that most words do not display symmetric patterns of connectivity. More specifically, the merged symmetry displayed a distribution similar to the ubiquitous power-law distribution. Our experiments also revealed that the studied metrics do not correlate with other traditional network measurements, such as the degree or betweenness centrality. The effectiveness of the symmetry measurements was verified in the authorship attribution task. Interestingly, we found that specific authors prefer particular types of symmetric motifs. As a consequence, the authorship of books could be accurately identified in 82.5% of cases. Because the proposed measurements for text analysis are complementary to the traditional approach, they can be used to improve the characterization of text networks, which might be useful for related applications, such as those relying on the identification of topical words and information retrieval.
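
To make the notion of a word adjacency network concrete, the following is a minimal sketch rather than the authors' implementation. It links every pair of words that appear next to each other in a text and reports each word's degree, one of the traditional measurements mentioned above. The function names, the whitespace tokenization, the punctuation stripping, and the choice of an undirected graph are all assumptions made for illustration; the paper's own preprocessing and its symmetry measurements are not reproduced here.

```python
from collections import defaultdict


def build_word_adjacency_network(text):
    """Return an undirected graph as {word: set of neighbouring words}.

    Assumption: tokens are lowercased, whitespace-separated words with
    surrounding punctuation stripped. The paper may preprocess differently.
    """
    tokens = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    tokens = [w for w in tokens if w]  # drop tokens that were pure punctuation

    graph = defaultdict(set)
    # Connect each word to the word that immediately follows it.
    for left, right in zip(tokens, tokens[1:]):
        if left != right:  # skip self-loops from repeated words
            graph[left].add(right)
            graph[right].add(left)
    return dict(graph)


def degree(graph):
    """Number of distinct neighbours of each word."""
    return {word: len(neighbours) for word, neighbours in graph.items()}


network = build_word_adjacency_network(
    "the cat saw the dog and the dog saw the cat"
)
degrees = degree(network)
```

In this toy sentence the word "the" ends up adjacent to every other word, which illustrates why frequent function words tend to become hubs in such networks.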

