Don't Forget the Long Tail! A Comprehensive Analysis of Morphological Generalization in Bilingual Lexicon Induction

09/06/2019
by Paula Czarnowska, et al.

Because word frequencies in a language follow a Zipfian distribution, human translators routinely have to translate rare inflections of words. When translating from Spanish, a good translator would have no problem identifying the proper translation of a statistically rare inflection such as habláramos. Note that the lexeme itself, hablar, is relatively common. In this work, we investigate whether state-of-the-art bilingual lexicon inducers are capable of learning this kind of generalization. We introduce 40 morphologically complete dictionaries in 10 languages and evaluate three state-of-the-art models on the task of translating less frequent morphological forms. We demonstrate that the performance of these models drops considerably when they are evaluated on infrequent morphological inflections, and then show that adding a simple morphological constraint at training time improves performance, showing that bilingual lexicon inducers can benefit from better encoding of morphology.
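
To make the evaluation idea concrete, here is a minimal sketch of scoring a lexicon inducer separately on frequent and infrequent source forms. It is not the authors' code: it assumes that translation is done by cosine nearest-neighbour retrieval in a shared embedding space, and the function names, the Spanish-Italian word pairs, the vectors, the frequency ranks, and the bin edge are all hypothetical values chosen for illustration.

```python
from collections import defaultdict
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def translate(word, src_emb, tgt_emb):
    """Return the target word closest to `word` in the shared space."""
    query = src_emb[word]
    return max(tgt_emb, key=lambda t: cosine(query, tgt_emb[t]))


def accuracy_by_frequency_bin(gold, src_emb, tgt_emb, src_rank, bin_edges):
    """Precision@1 per frequency bin of the source word.

    `src_rank` maps a word to its frequency rank (1 = most frequent);
    `bin_edges` is an ascending list of upper rank bounds.
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for src_word, gold_targets in gold.items():
        rank = src_rank[src_word]
        # First bin whose upper edge covers the rank, else the overflow bin.
        label = next((f"<={e}" for e in bin_edges if rank <= e),
                     f">{bin_edges[-1]}")
        totals[label] += 1
        if translate(src_word, src_emb, tgt_emb) in gold_targets:
            hits[label] += 1
    return {label: hits[label] / totals[label] for label in totals}


# Toy data (made up): a frequent lemma and a rare inflection of it.
src_emb = {"hablar": [1.0, 0.0], "habláramos": [0.7, 0.7]}
tgt_emb = {
    "parlare": [0.9, 0.1],
    "parlassimo": [0.1, 0.9],
    "parliamo": [0.8, 0.5],  # distractor: another form of the same verb
}
gold = {"hablar": {"parlare"}, "habláramos": {"parlassimo"}}
src_rank = {"hablar": 500, "habláramos": 250000}

result = accuracy_by_frequency_bin(gold, src_emb, tgt_emb, src_rank, [10000])
print(result)
```

On this toy data the frequent lemma is retrieved correctly, while the rare inflection is mapped to a different form of the same verb, so accuracy is perfect in the frequent bin and zero in the rare one. Reporting accuracy per bin, rather than a single aggregate score, is what exposes this kind of gap.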


