Word frequency-rank relationship in tagged texts

02/07/2021
by A. Chacoma, et al.

We analyze the frequency-rank relationship in sub-vocabularies corresponding to three different grammatical classes (nouns, verbs, and others) in a collection of literary works in English, whose words have been automatically tagged according to their grammatical role. Comparing against a null hypothesis that assumes the words belonging to each class are uniformly distributed across the frequency-ranked vocabulary of the whole work, we find statistically significant differences between the three classes. These results suggest that frequency-rank relationships may reflect linguistic features associated with grammatical function.
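The null hypothesis described above can be illustrated with a small shuffling experiment. The following is only a minimal sketch, not the authors' procedure: it assumes the text is available as (word, class) pairs, assigns each word its most frequent class, and uses the mean rank of a class as the test statistic. The function names, the toy data, and the choice of statistic are all hypothetical.

```python
import random
from collections import Counter

def rank_vocabulary(tagged_tokens):
    """Return the vocabulary sorted by decreasing frequency, plus one class per word."""
    counts = Counter(word for word, _ in tagged_tokens)
    class_counts = {}
    for word, cls in tagged_tokens:
        class_counts.setdefault(word, Counter())[cls] += 1
    # Ties in frequency are broken alphabetically so the ranking is deterministic.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    # Simplifying assumption: each word gets the class it is tagged with most often.
    classes = [class_counts[w].most_common(1)[0][0] for w in ranked]
    return ranked, classes

def mean_rank(classes, cls):
    """Mean rank (1 = most frequent word) of the words belonging to cls."""
    ranks = [i + 1 for i, c in enumerate(classes) if c == cls]
    return sum(ranks) / len(ranks)

def null_mean_ranks(classes, cls, n_shuffles=1000, seed=0):
    """Mean ranks of cls when class labels are placed uniformly at random over the ranking."""
    rng = random.Random(seed)
    shuffled = list(classes)
    samples = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        samples.append(mean_rank(shuffled, cls))
    return samples

# Toy data, invented purely for illustration.
toy_tokens = (
    [("the", "other")] * 5
    + [("cat", "noun")] * 3
    + [("runs", "verb")] * 2
    + [("dog", "noun")]
)
toy_words, toy_classes = rank_vocabulary(toy_tokens)
observed = mean_rank(toy_classes, "noun")
null_samples = null_mean_ranks(toy_classes, "noun", n_shuffles=200)
# Fraction of shuffles whose mean rank is at least as large as the observed one.
fraction_ge = sum(s >= observed for s in null_samples) / len(null_samples)
```

Comparing the observed statistic with the distribution of shuffled values gives a rough indication of whether a class sits higher or lower in the ranking than uniform placement would predict. A real analysis would need a much larger text and a statistic chosen to match the paper's actual test.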

research
05/02/2022

A Two Parameters Equation for Word Rank-Frequency Relation

Let f (·) be the absolute frequency of words and r be the rank of words ...
research
05/09/2022

Approaches to the classification of complex systems: Words, texts, and more

The Chapter starts with introductory information about quantitative ling...
research
01/07/2020

Heaps' law and Heaps functions in tagged texts: Evidences of their linguistic relevance

We study the relationship between vocabulary size and text length in a c...
research
10/05/2015

Stochastic model for phonemes uncovers an author-dependency of their usage

We study rank-frequency relations for phonemes, the minimal units that s...
research
04/09/2020

Two halves of a meaningful text are statistically different

Which statistical features distinguish a meaningful text (possibly writt...
research
04/05/2016

Mental Lexicon Growth Modelling Reveals the Multiplexity of the English Language

In this work we extend previous analyses of linguistic networks by adopt...
research
05/05/2020

Self-organizing Pattern in Multilayer Network for Words and Syllables

One of the ultimate goals for linguists is to find universal properties ...
