Estimation of the Frequency of Occurrence of Italian Phonemes in Text

by Javi Arango, et al.

The purpose of this project was to derive a reliable estimate of the frequency of occurrence of the 30 phonemes of the Italian language, plus the geminated counterparts of its consonants, based on four selected written texts. Since no comparable dataset was found in the previous literature, the present analysis may serve as a reference for future studies. Four textual sources were considered: Come si fa una tesi di laurea: le materie umanistiche by Umberto Eco, I promessi sposi by Alessandro Manzoni, a recent article in Corriere della Sera (a popular Italian daily newspaper), and In altre parole by Jhumpa Lahiri. The sources were chosen to represent varied genres, subject matter, time periods, and writing styles. The results, which also included an analysis of variance, showed that, for all four sources, the frequencies of occurrence reached relatively stable values after about 6,000 phonemes (approx. 1,250 words), varying by <0.025 for each single source and as an average across sources.
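The stabilization claim above can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes the text has already been transcribed into a sequence of phoneme symbols, and the function names, the checkpoint spacing `step`, and the tolerance `tol` (including its unit) are hypothetical choices made only for illustration.

```python
from collections import Counter
from typing import Dict, List, Optional, Sequence, Tuple


def running_frequencies(
    phonemes: Sequence[str], step: int
) -> List[Tuple[int, Dict[str, float]]]:
    """Relative frequency of each symbol after every `step` phonemes.

    A final checkpoint is added at the end of the sequence if its
    length is not a multiple of `step`.
    """
    counts: Counter = Counter()
    checkpoints: List[Tuple[int, Dict[str, float]]] = []
    total = len(phonemes)
    for i, symbol in enumerate(phonemes, start=1):
        counts[symbol] += 1
        if i % step == 0 or i == total:
            checkpoints.append((i, {s: c / i for s, c in counts.items()}))
    return checkpoints


def stabilization_point(
    phonemes: Sequence[str], step: int, tol: float
) -> Optional[int]:
    """Smallest checkpoint size from which every later checkpoint-to-checkpoint
    change in any relative frequency stays below `tol` (hypothetical criterion).

    Returns None if the frequencies never settle within the sequence.
    """
    checkpoints = running_frequencies(phonemes, step)
    stable_from: Optional[int] = None
    for (n_prev, f_prev), (_, f_cur) in zip(checkpoints, checkpoints[1:]):
        symbols = set(f_prev) | set(f_cur)
        change = max(abs(f_cur.get(s, 0.0) - f_prev.get(s, 0.0)) for s in symbols)
        if change < tol:
            if stable_from is None:
                stable_from = n_prev
        else:
            # A later jump resets the candidate stabilization point.
            stable_from = None
    return stable_from
```

Calling `stabilization_point(transcribed_text, step=500, tol=0.00025)` on a phoneme sequence would return the sample size after which the estimates stop moving under this criterion. Both argument values here are placeholders, not figures taken from the study.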




