Neural Unsupervised Reconstruction of Protolanguage Word Forms

11/16/2022
by Andre He, et al.

We present a state-of-the-art neural approach to the unsupervised reconstruction of ancient word forms. Previous work in this domain used expectation-maximization to predict simple phonological changes between ancient word forms and their cognates in modern languages. We extend this work with neural models that can capture more complicated phonological and morphological changes. At the same time, we preserve the inductive biases from classical methods by building monotonic alignment constraints into the model and deliberately underfitting during the maximization step. We evaluate our performance on the task of reconstructing Latin from a dataset of cognates across five Romance languages, achieving a notable reduction in edit distance from the target word forms compared to previous methods.
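The evaluation described above scores reconstructions by their edit distance to the target Latin word forms. A minimal sketch of that metric is the standard Levenshtein distance via dynamic programming; this is an illustration only, not the authors' evaluation code, and the word pair below is a hypothetical example:

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of insertions,
    deletions, and substitutions turning string a into string b."""
    m, n = len(a), len(b)
    # prev[j] holds the distance between a[:i-1] and b[:j]
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution
        prev = curr
    return prev[n]

# Hypothetical example: a reconstructed form vs. an attested Latin target.
print(edit_distance("kentum", "centum"))  # -> 1 (one substitution)
```

Averaging this distance over all held-out cognate sets gives the kind of aggregate score on which the paper reports its improvement over previous methods.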

Related research

- Unsupervised Lemmatization as Embeddings-Based Word Clustering (08/22/2019): We focus on the task of unsupervised lemmatization, i.e. grouping togeth...
- Morphological Inflection Generation with Hard Monotonic Attention (11/04/2016): We present a neural model for morphological inflection generation which ...
- Ab Antiquo: Proto-language Reconstruction with RNNs (08/07/2019): Historical linguists have identified regularities in the process of hist...
- Neural Metaphor Detection in Context (08/29/2018): We present end-to-end neural models for detecting metaphorical word use ...
- Unsupervised Disambiguation of Syncretism in Inflected Lexicons (06/10/2018): Lexical ambiguity makes it difficult to compute various useful statistic...
- Neural Baselines for Word Alignment (09/28/2020): Word alignments identify translational correspondences between words in ...
- Minimal Supervision for Morphological Inflection (04/17/2021): Neural models for the various flavours of morphological inflection tasks...
