Better Character Language Modeling Through Morphology

06/03/2019
by Terra Blevins, et al.
University of Washington

We incorporate morphological supervision into character language models (CLMs) via multitasking and show that this addition improves bits-per-character (BPC) performance across 24 languages, even when the morphology data and language modeling data are disjoint. Analyzing the CLMs shows that inflected words benefit more from explicitly modeling morphology than uninflected words, and that morphological supervision improves performance even as the amount of language modeling data grows. We then transfer morphological supervision across languages to improve language modeling performance in the low-resource setting.
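The abstract reports results in bits-per-character (BPC), which is conventionally the average negative base-2 log probability a model assigns to each character of held-out text. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `bits_per_character` and the sample probabilities are hypothetical, chosen only for illustration.

```python
import math


def bits_per_character(char_probs):
    """Return the average negative log2 probability per character.

    `char_probs` is assumed to hold the probability the model assigned
    to each gold character in the evaluation text (hypothetical input).
    """
    if not char_probs:
        raise ValueError("need at least one character probability")
    return -sum(math.log2(p) for p in char_probs) / len(char_probs)


# Illustrative input only: a model that spreads its mass uniformly
# over four candidate characters at every position.
uniform_probs = [0.25] * 8
print(bits_per_character(uniform_probs))  # 2.0
```

Lower BPC means the model assigns higher probability to the observed characters, so an improvement in BPC corresponds to a decrease in this value.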

08/28/2018

What do character-level models learn about morphology? The case of dependency parsing

When parsing morphologically-rich languages with neural models, it is be...
12/11/2020

Morphology Matters: A Multilingual Language Modeling Analysis

Prior studies in multilingual language modeling (e.g., Cotterell et al.,...
04/26/2017

From Characters to Words to in Between: Do We Capture Morphology?

Words can be represented by composing the representations of subword uni...
04/23/2018

On the Diachronic Stability of Irregularity in Inflectional Morphology

Many languages' inflectional morphological systems are replete with irre...
05/26/2023

An Investigation of Noise in Morphological Inflection

With a growing focus on morphological inflection systems for languages w...
