Phonotactic Complexity and its Trade-offs

05/07/2020
by Tiago Pimentel, et al.

We present methods for calculating a measure of phonotactic complexity, bits per phoneme, that permits a straightforward cross-linguistic comparison. Given a word, represented as a sequence of phonemic segments such as symbols in the International Phonetic Alphabet, and a statistical model trained on a sample of word types from the language, we can approximately measure bits per phoneme using the negative log-probability of that word under the model. This simple measure allows us to compare entropy across languages, giving insight into how complex a language's phonotactics are. Using a collection of 1016 basic concept words across 106 languages, we demonstrate a very strong negative correlation of -0.74 between bits per phoneme and the average length of words.
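
The measure described above can be illustrated with a minimal sketch. This is not the authors' model: as a stand-in it assumes a toy phoneme bigram model with add-one smoothing, and the names `train_bigram`, `bits_per_phoneme`, `BOS`, and `EOS` are hypothetical. It also assumes the negative log-probability is divided by the number of predicted symbols (the phonemes plus an end-of-word symbol); other normalisations are possible.

```python
import math
from collections import Counter

# Hypothetical boundary symbols, not taken from the paper.
BOS, EOS = "<s>", "</s>"


def train_bigram(words):
    """Count phoneme bigrams over word types.

    Each word is a sequence of phoneme symbols, e.g. ["k", "a", "t"].
    """
    bigrams, contexts, vocab = Counter(), Counter(), {EOS}
    for word in words:
        seq = [BOS] + list(word) + [EOS]
        vocab.update(word)
        for prev, cur in zip(seq, seq[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts, vocab


def bits_per_phoneme(word, model):
    """Negative log2-probability of `word`, averaged over predicted symbols.

    Assumes every phoneme in `word` occurred in the training sample;
    add-one smoothing only covers unseen bigrams, not unseen phonemes.
    """
    bigrams, contexts, vocab = model
    seq = [BOS] + list(word) + [EOS]
    total_bits = 0.0
    for prev, cur in zip(seq, seq[1:]):
        # Add-one (Laplace) smoothed conditional probability p(cur | prev).
        p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + len(vocab))
        total_bits -= math.log2(p)
    # Assumption: normalise by the phonemes plus the end-of-word symbol.
    return total_bits / (len(word) + 1)


# Made-up toy lexicon, for illustration only.
lexicon = [["k", "a", "t"], ["t", "a", "k"], ["k", "a"]]
model = train_bigram(lexicon)
print(bits_per_phoneme(["k", "a", "t"], model))
```

Averaging this quantity over a language's word list gives one number per language, which is what makes the cross-linguistic comparison with average word length possible.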
