The Paradigm Discovery Problem

05/04/2020
by Alexander Erdmann, et al.

This work treats the paradigm discovery problem (PDP), the task of learning an inflectional morphological system from unannotated sentences. We formalize the PDP and develop evaluation metrics for judging systems. Using currently available resources, we construct datasets for the task. We also devise a heuristic benchmark for the PDP and report empirical results on five diverse languages. Our benchmark system first uses word embeddings and string similarity to cluster forms by cell and by paradigm. We then bootstrap a neural transducer on the clustered data to predict the words that realize the empty paradigm slots. An error analysis of our system suggests that clustering by cell across different inflection classes is the most pressing challenge for future work. Our code and data are available for public use.
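To make the idea of clustering forms by paradigm more concrete, here is a minimal sketch that groups word forms using string similarity alone. It is not the benchmark system described above: it ignores word embeddings, does not cluster by cell, and involves no neural transducer. The function names, the greedy single-link strategy, the 0.7 threshold, and the toy English word list are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1] based on matching character blocks."""
    return SequenceMatcher(None, a, b).ratio()


def cluster_forms(forms, threshold=0.7):
    """Greedy single-link clustering (illustrative assumption).

    Each form joins the first existing cluster containing a member that is
    at least `threshold` similar to it; otherwise it starts a new cluster.
    """
    clusters = []
    for form in forms:
        for cluster in clusters:
            if any(similarity(form, member) >= threshold for member in cluster):
                cluster.append(form)
                break
        else:
            # No sufficiently similar cluster was found.
            clusters.append([form])
    return clusters


# Hypothetical toy input: unannotated word forms from two English verbs.
forms = ["walk", "walks", "walked", "walking", "jump", "jumps", "jumped"]
candidate_paradigms = cluster_forms(forms)
print(candidate_paradigms)
```

On this toy list, the forms of "walk" and the forms of "jump" end up in separate groups. A purely string-based heuristic like this would fail on suppletive or otherwise irregular forms (for example "go" and "went"), which is one reason a system might combine string similarity with distributional evidence such as word embeddings.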


