Word forms - not just their lengths - are optimized for efficient communication

03/06/2017
by Stephan C. Meylan, et al.

The inverse relationship between the length of a word and the frequency of its use, first identified by G.K. Zipf in 1935, is a classic empirical law that holds across a wide range of human languages. We demonstrate that length is one aspect of a much more general property of words: how distinctive they are with respect to other words in a language. Distinctiveness plays a critical role in recognizing words in fluent speech, in that it reflects the strength of potential competitors when selecting the best candidate for an ambiguous signal. Phonological information content, a measure of a word's string probability under a statistical model of a language's sound or character sequences, concisely captures distinctiveness. Examining large-scale corpora from 13 languages, we find that distinctiveness significantly outperforms word length as a predictor of frequency. This finding provides evidence that listeners' processing constraints shape fine-grained aspects of word forms across languages.
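To make the notion of phonological information content concrete, the following is a minimal sketch that scores a word by its negative log probability under a character-sequence model. It is an illustration under stated assumptions, not the authors' implementation: the character bigram model, the add-one smoothing, the boundary markers `<` and `>`, and the function names `train_bigram` and `information_content` are all hypothetical choices made here, and the toy lexicon stands in for a real corpus.

```python
import math
from collections import Counter


def train_bigram(words):
    """Count character bigrams over a word list (assumed model: bigrams).

    "<" and ">" are hypothetical start and end markers added to each word.
    """
    context = Counter()   # how often each symbol appears as a left context
    bigram = Counter()    # how often each (previous, current) pair appears
    outcomes = set()      # symbols that can follow a context, including ">"
    for word in words:
        chars = ["<"] + list(word) + [">"]
        outcomes.update(chars[1:])
        for prev, cur in zip(chars, chars[1:]):
            context[prev] += 1
            bigram[(prev, cur)] += 1
    return context, bigram, len(outcomes)


def information_content(word, model):
    """Return -log2 P(word) in bits under the add-one smoothed bigram model."""
    context, bigram, n_outcomes = model
    chars = ["<"] + list(word) + [">"]
    bits = 0.0
    for prev, cur in zip(chars, chars[1:]):
        # Add-one smoothing (an assumption) keeps unseen pairs from
        # receiving zero probability.
        p = (bigram[(prev, cur)] + 1) / (context[prev] + n_outcomes)
        bits -= math.log2(p)
    return bits


# Toy lexicon, for illustration only.
lexicon = ["the", "then", "them", "that", "this", "zebra"]
model = train_bigram(lexicon)
for candidate in ["the", "zxq"]:
    print(candidate, round(information_content(candidate, model), 2))
```

Both candidate strings have three characters, yet the one built from common character sequences receives fewer bits than the one built from rare sequences. This is the sense in which a distinctiveness measure of this kind can separate words that word length alone treats as identical.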


