Co-occurrence of the Benford-like and Zipf Laws Arising from the Texts Representing Human and Artificial Languages

03/06/2018
by Evgeny Shulzinger, et al.

We demonstrate that large texts representing human languages (English, Russian, Ukrainian) and artificial languages (C++, Java) display quantitative patterns characterized by the Benford-like and Zipf laws. Under the Zipf law, the frequency of a word is inversely proportional to its rank, whereas the total number of occurrences of each word in the text generates an uneven, Benford-like distribution of leading digits. Excluding the most frequent words substantially improves the correlation of the actual textual data with the Zipfian distribution, whereas the Benford-like distribution of leading digits (arising from the total number of occurrences of each word) is insensitive to the same elimination procedure. The absolute values of the slopes of the double-logarithmic plots for the artificial languages (C++, Java) are markedly larger than those for the human languages.
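The two measurements described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' procedure: the tokenizer, the function names (`word_counts`, `zipf_slope`, `leading_digit_shares`, `benford_expected`), the `skip_top` parameter, and the choice to keep the original ranks after dropping the most frequent words are all hypothetical. The classical Benford shares log10(1 + 1/d) are included only as a reference curve; the abstract speaks of a "Benford-like" distribution, so an exact match should not be assumed.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Count lowercase word tokens. The letters-only tokenizer is an assumption."""
    return Counter(re.findall(r"[^\W\d_]+", text.lower()))


def zipf_slope(counts, skip_top=0):
    """Least-squares slope of log(count) against log(rank).

    skip_top drops the most frequent words; the remaining words keep
    their original ranks (an assumption, not taken from the paper).
    Needs at least two remaining words.
    """
    ranked = list(enumerate(sorted(counts.values(), reverse=True), start=1))
    ranked = ranked[skip_top:]
    xs = [math.log(rank) for rank, _ in ranked]
    ys = [math.log(count) for _, count in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def leading_digit_shares(counts):
    """Share of words whose occurrence count starts with each digit 1..9."""
    digits = Counter(int(str(count)[0]) for count in counts.values())
    total = sum(digits.values())
    return {d: digits[d] / total for d in range(1, 10)}


def benford_expected():
    """Classical Benford shares, given for comparison only."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}
```

For a corpus loaded into a string `text`, `abs(zipf_slope(word_counts(text)))` gives the modulus of the slope of the double-logarithmic plot, and rerunning with a nonzero `skip_top` shows how the fit changes once the most frequent words are excluded. The leading-digit shares can be compared side by side with `benford_expected()`.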

research
11/18/2019

Universal and non-universal text statistics: Clustering coefficient for language identification

In this work we analyze statistical properties of 91 relatively small te...
research
06/30/2021

Zipf's laws of meaning in Catalan

In his pioneering research, G. K. Zipf formulated a couple of statistica...
research
12/16/2014

Scaling laws in human speech, decreasing emergence of new words and a generalized model

Human language, as a typical complex system, its organization and evolut...
research
03/01/2015

Variation of word frequencies in Russian literary texts

We study the variation of word frequencies in Russian literary texts. Ou...
research
02/07/2022

L^2-Betti numbers and computability of reals

We study the computability degree of real numbers arising as L^2-Betti n...
research
11/23/2018

Rank-frequency distribution of natural languages: a difference of probabilities approach

The time variation of the rank k of words for six Indo-European language...
research
06/04/2019

Optimal coding and the origins of Zipfian laws

The problem of compression in standard information theory consists of as...
