The Dependence of Frequency Distributions on Multiple Meanings of Words, Codes and Signs

09/28/2017
by   Xiaoyong Yan, et al.

The dependence of word-frequency distributions on the multiple meanings of words in a text is investigated by deleting letters. When words are coded with fewer letters, the number of meanings per coded word increases. This increase is measured and used as input to a predictive theory. For a text written in English, the word-frequency distribution is broad and fat-tailed, whereas if the words are represented only by their first letter the distribution becomes exponential. Both distributions are well predicted by the theory, as is the whole sequence obtained by successively representing the words by their first L=6,5,4,3,2,1 letters. Texts written in Chinese characters are compared with the same texts written in letter codes, and the similarity of the corresponding frequency distributions is interpreted as a consequence of the multiple meanings of Chinese characters. This further implies that the difference in shape between the word-frequency distribution of an English text written in letters and that of a Chinese text written in Chinese characters is due to the coding and not to the language per se.
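
The letter-deletion coding described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, and "meanings per coded word" is approximated here, as an assumption, by the number of distinct original words that share the same first L letters. The predictive theory itself is not reproduced.

```python
from collections import Counter, defaultdict


def code_words(words, L):
    """Represent each word by its first L letters (shorter words are kept whole)."""
    return [w[:L] for w in words]


def meanings_per_code(words, L):
    """Average number of distinct original words per L-letter code.

    Assumption: distinct words collapsing onto one code stand in for
    the 'multiple meanings' of that coded word.
    """
    groups = defaultdict(set)
    for w in words:
        groups[w[:L]].add(w)
    return sum(len(members) for members in groups.values()) / len(groups)


def frequency_distribution(tokens):
    """Map k -> number of distinct tokens that occur exactly k times."""
    occurrences = Counter(tokens)
    return Counter(occurrences.values())


if __name__ == "__main__":
    # Toy input; a real study would use a full text.
    words = "the cat and the car and the dog sat on the mat".split()
    for L in (6, 5, 4, 3, 2, 1):
        coded = code_words(words, L)
        print(L, meanings_per_code(words, L), dict(frequency_distribution(coded)))
```

As L decreases, more distinct words collapse onto the same code, so the average returned by `meanings_per_code` can only stay equal or grow. That growth is the quantity the abstract says is measured and fed into the theory.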

research
10/24/2007

The predictability of letters in written english

We show that the predictability of letters in written English texts depe...
research
05/26/2020

The 'Letter' Distribution in the Chinese Language

Corpus-based statistical analysis plays a significant role in linguistic...
research
07/11/2017

On the letter frequencies and entropy of written Marathi

We carry out a comprehensive analysis of letter frequencies in contempor...
research
04/17/2021

Customized determination of stop words using Random Matrix Theory approach

The distances between words calculated in word units are studied and com...
research
03/01/2015

Variation of word frequencies in Russian literary texts

We study the variation of word frequencies in Russian literary texts. Ou...
research
01/15/2021

Motion-Based Handwriting Recognition and Word Reconstruction

In this project, we leverage a trained single-letter classifier to predi...
research
05/02/2018

Robustness of sentence length measures in written texts

Hidden structural patterns in written texts have been subject of conside...
