Information-Theoretic Characterization of Vowel Harmony: A Cross-Linguistic Study on Word Lists

08/09/2023
by Julius Steuer, et al.

We present a cross-linguistic study that aims to quantify vowel harmony using data-driven computational modeling. Concretely, we define an information-theoretic measure of harmonicity based on the predictability of vowels in a natural language lexicon, which we estimate using phoneme-level language models (PLMs). Prior quantitative studies have relied heavily on inflected word forms in the analysis of vowel harmony. We instead train our models on cross-linguistically comparable lemma forms with little or no inflection, which enables us to cover more under-studied languages. Training data for our PLMs consist of word lists with a maximum of 1000 entries per language. Although these data are substantially smaller than previously used corpora, our experiments demonstrate that the neural PLMs capture vowel harmony patterns in a set of languages that exhibit this phenomenon. Our work also demonstrates that word lists are a valuable resource for typological research and offers new possibilities for future studies on low-resource, under-studied languages.
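
To make the idea of a predictability-based harmonicity measure more concrete, the following is a minimal sketch under stated assumptions. It is not the authors' neural PLM or their exact measure: it swaps in a smoothed count-based bigram model over the vowel tier of each word and reports how many bits of surprisal are saved, on average, by knowing the previous vowel. The vowel inventory `VOWELS`, the function names, and the smoothing constant `alpha` are all hypothetical choices for illustration.

```python
import math
from collections import Counter

# Hypothetical vowel inventory for toy orthographic data (an assumption,
# not taken from the paper).
VOWELS = set("aeiouyäöü")


def vowel_tier(word):
    """Return the sequence of vowels in a word, ignoring consonants."""
    return [ch for ch in word if ch in VOWELS]


def harmony_gain(words, alpha=1.0):
    """Average surprisal reduction (in bits) for a vowel when the previous
    vowel in the same word is known, using add-alpha smoothed counts.

    This is a count-based stand-in for a phoneme-level language model.
    """
    unigrams = Counter()   # vowel counts
    bigrams = Counter()    # (previous vowel, current vowel) counts
    contexts = Counter()   # counts of vowels that are followed by a vowel
    for word in words:
        tier = vowel_tier(word)
        unigrams.update(tier)
        for prev, cur in zip(tier, tier[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1

    n_pairs = sum(bigrams.values())
    if n_pairs == 0:
        return 0.0  # no word has two vowels, so nothing can be measured

    n_types = len(unigrams)
    total = sum(unigrams.values())
    gain = 0.0
    for (prev, cur), count in bigrams.items():
        p_uni = (unigrams[cur] + alpha) / (total + alpha * n_types)
        p_cond = (count + alpha) / (contexts[prev] + alpha * n_types)
        # Surprisal without context minus surprisal with context.
        gain += count * (math.log2(p_cond) - math.log2(p_uni))
    return gain / n_pairs


if __name__ == "__main__":
    harmonic = ["kaka", "kiki", "papa", "pipi"]  # vowels agree within a word
    mixed = ["kaki", "kika", "papa", "pipi"]     # vowels combine freely
    print(harmony_gain(harmonic), harmony_gain(mixed))
```

On such toy lists, a lexicon whose vowels agree within words should yield a larger gain than one whose vowels combine freely. A neural PLM trained on phoneme sequences would replace the count-based probabilities here.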


research
04/13/2021

Finding Concept-specific Biases in Form–Meaning Associations

This work presents an information-theoretic operationalisation of cross-...
research
02/20/2019

Phoneme Level Language Models for Sequence Based Low Resource ASR

Building multilingual and crosslingual models help bring different langu...
research
12/30/2021

Utilizing Wordnets for Cognate Detection among Indian Languages

Automatic Cognate Detection (ACD) is a challenging task which has been u...
research
06/10/2018

Are All Languages Equally Hard to Language-Model?

For general modeling methods applied to diverse languages, a natural que...
research
04/01/2021

Mining Wikidata for Name Resources for African Languages

This work supports further development of language technology for the la...
research
06/27/2019

Morphological Irregularity Correlates with Frequency

We present a study of morphological irregularity. Following recent work,...
research
02/17/2018

Global-scale phylogenetic linguistic inference from lexical resources

Automatic phylogenetic inference plays an increasingly important role in...
