Can a Transformer Pass the Wug Test? Tuning Copying Bias in Neural Morphological Inflection Models

04/13/2021
by Ling Liu, et al.

Deep learning sequence models have been successfully applied to the task of morphological inflection. The results of the SIGMORPHON shared tasks over the past several years indicate that such models can perform well, but only if the training data cover a large number of distinct lemmata, or if the lemmata inflected at test time have also been seen in training, as has indeed largely been the case in these tasks. Surprisingly, standard models such as the Transformer almost completely fail to generalize inflection patterns when asked to inflect previously unseen lemmata, i.e. under "wug test"-like circumstances. While established data augmentation techniques can alleviate this shortcoming by introducing a copying bias through hallucinated synthetic word forms built from the alphabet of the language at hand, we show that, to be more effective, the hallucination process needs to pay attention to substrings of syllable-like length rather than individual characters or stems. We report a significant performance improvement with our substring-based hallucination model over previous data hallucination methods when training and test data do not overlap in their lemmata.
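To make the idea of substring-based hallucination concrete, the following is a minimal sketch, not the paper's actual implementation: it assumes an aligned (lemma, inflected form) pair whose shared stem region is already known, and rewrites a syllable-length chunk of that stem with random characters from the language's alphabet so that the inflection pattern (affixes and any stem-external material) is preserved. The function name `hallucinate_pair` and its parameters are hypothetical.

```python
import random

def hallucinate_pair(lemma, form, stem, alphabet, min_len=2, max_len=4, rng=random):
    """Return a synthetic (lemma, form) pair with part of the shared stem rewritten.

    Hypothetical sketch of syllable-length substring hallucination: instead of
    swapping single characters, a chunk of 2-4 characters inside the shared stem
    is replaced by random characters drawn from the language's alphabet.
    """
    if len(stem) < min_len or stem not in lemma or stem not in form:
        return lemma, form  # nothing safe to rewrite

    # Pick a syllable-like substring of the stem to replace.
    span = rng.randint(min_len, min(max_len, len(stem)))
    start = rng.randint(0, len(stem) - span)
    new_chunk = "".join(rng.choice(alphabet) for _ in range(span))
    new_stem = stem[:start] + new_chunk + stem[start + span:]

    # Apply the same rewrite to lemma and inflected form so the inflection
    # pattern around the stem is carried over to the hallucinated pair.
    return lemma.replace(stem, new_stem, 1), form.replace(stem, new_stem, 1)

# Example with a hypothetical English past-tense pair:
# hallucinate_pair("walk", "walked", stem="walk",
#                  alphabet="abcdefghijklmnopqrstuvwxyz")
```

In this sketch the copying bias comes from the fact that the rewritten stem appears unchanged in both the input and the output of the synthetic pair, so the model learns to copy unfamiliar stems while still applying the affixation pattern.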
