Trimming Phonetic Alignments Improves the Inference of Sound Correspondence Patterns from Multilingual Wordlists

03/31/2023
by Frederic Blum, et al.

Sound correspondence patterns form the basis of cognate detection and phonological reconstruction in historical language comparison. Methods for the automatic inference of correspondence patterns from phonetically aligned cognate sets have been proposed, but their application to multilingual wordlists requires extremely well-annotated datasets. Since annotation is tedious and time-consuming, it would be desirable to find ways to improve aligned cognate data automatically. Taking inspiration from trimming techniques in evolutionary biology, which improve alignments by excluding problematic sites, we propose a workflow that trims phonetic alignments in comparative linguistics prior to the inference of correspondence patterns. Testing these techniques on a large standardized collection of ten datasets with expert annotations from different language families, we find that the best trimming technique substantially improves the overall consistency of the alignments. The results show a clear increase in the proportion of frequent correspondence patterns and of words exhibiting regular cognate relations.
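To make the idea of excluding problematic sites concrete, the following is a minimal sketch of one possible trimming rule: drop every alignment column in which the share of gap symbols exceeds a threshold. The abstract does not specify the trimming criteria, so the function name `trim_alignment`, the gap symbol `-`, the default threshold of 0.5, and the toy alignment are all assumptions made for illustration, not the authors' implementation.

```python
def trim_alignment(alignment, gap="-", threshold=0.5):
    """Drop columns whose proportion of gap symbols exceeds `threshold`.

    `alignment` is a list of equally long rows (one row per word),
    each row a list of sound segments or gap symbols.
    Hypothetical helper; the criterion is an illustrative assumption.
    """
    if not alignment:
        return []
    n_rows = len(alignment)
    # zip(*alignment) iterates over the columns (sites) of the alignment.
    keep = [
        i
        for i, column in enumerate(zip(*alignment))
        if column.count(gap) / n_rows <= threshold
    ]
    return [[row[i] for i in keep] for row in alignment]


# Invented toy alignment of four words; not taken from any dataset.
alignment = [
    ["h", "a", "n", "d", "-"],
    ["h", "a", "n", "t", "-"],
    ["h", "a", "n", "d", "-"],
    ["-", "a", "n", "-", "a"],
]
trimmed = trim_alignment(alignment)
for row in trimmed:
    print(" ".join(row))
```

In this toy case only the last column, which is a gap in three of the four rows, is removed; the remaining columns are the ones from which correspondence patterns would then be inferred.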


research
04/10/2022

A New Framework for Fast Automated Phonological Reconstruction Using Trimmed Alignments and Sound Correspondence Patterns

Computational approaches in historical linguistics have been increasingl...
research
03/27/2021

Supersense and Sensibility: Proxy Tasks for Semantic Annotation of Prepositions

Prepositional supersense annotation is time-consuming and requires exper...
research
02/16/2017

Fast and unsupervised methods for multilingual cognate clustering

In this paper we explore the use of unsupervised methods for detecting c...
research
04/14/2002

Belief Revision and Rational Inference

The (extended) AGM postulates for belief revision seem to deal with the ...
research
04/15/2018

Are Automatic Methods for Cognate Detection Good Enough for Phylogenetic Reconstruction in Historical Linguistics?

We evaluate the performance of state-of-the-art algorithms for automatic...
research
10/06/2011

A Comparison of Different Machine Transliteration Models

Machine transliteration is a method for automatically converting words i...
research
02/01/2023

Inference of Partial Colexifications from Multilingual Wordlists

The past years have seen a drastic rise in studies devoted to the invest...
