Comparing morphological complexity of Spanish, Otomi and Nahuatl

08/13/2018
by Ximena Gutierrez-Vasques, et al.

We use two small parallel corpora to compare the morphological complexity of Spanish, Otomi and Nahuatl. These languages belong to different linguistic families, and the latter two are low-resourced. We take into account two quantitative criteria: on the one hand, the distribution of types over tokens in a corpus; on the other, perplexity and entropy as indicators of word structure predictability. We show that a language can be complex in terms of how many different morphological word forms it can produce, yet less complex in terms of the predictability of the internal structure of its words.
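The two criteria named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy sentence, and the choice of a character-level unigram model are assumptions made here for illustration only, and the paper's own models and sub-word units may differ.

```python
import math
from collections import Counter


def type_token_ratio(tokens):
    """Distinct word forms (types) divided by the total number of tokens."""
    if not tokens:
        raise ValueError("token list must not be empty")
    return len(set(tokens)) / len(tokens)


def unigram_entropy(symbols):
    """Shannon entropy in bits per symbol under a unigram model (an assumption)."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def perplexity(entropy_bits):
    """Perplexity corresponding to an entropy given in bits."""
    return 2 ** entropy_bits


# Hypothetical toy data, not taken from the corpora used in the paper.
toy_tokens = "the cat saw the dog and the dog saw the cat".split()
toy_chars = [ch for word in toy_tokens for ch in word]

ttr = type_token_ratio(toy_tokens)
h = unigram_entropy(toy_chars)
print(f"TTR = {ttr:.3f}, entropy = {h:.3f} bits/char, perplexity = {perplexity(h):.3f}")
```

A higher type-token ratio indicates more distinct word forms per running word, while lower entropy and perplexity over sub-word symbols indicate more predictable word-internal structure; the two measures need not rank languages in the same order.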


