From Characters to Words to in Between: Do We Capture Morphology?

04/26/2017
by Clara Vania, et al.

Words can be represented by composing the representations of subword units such as word segments, characters, and/or character n-grams. While such representations are effective and may capture the morphological regularities of words, they have not been systematically compared, and it is not understood how they interact with different morphological typologies. On a language modeling task, we present experiments that systematically vary (1) the basic unit of representation, (2) the composition of these representations, and (3) the morphological typology of the language modeled. Our results extend previous findings that character representations are effective across typologies, and we find that a previously unstudied combination of character trigram representations composed with bi-LSTMs outperforms most others. But we also find room for improvement: none of the character-level models match the predictive accuracy of a model with access to true morphological analyses, even when learned from an order of magnitude more data.
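To make the notion of a "basic unit of representation" concrete, the following is a minimal sketch of how a word might be split into characters or character trigrams before any composition step. The function name `char_ngrams` and the boundary markers `^` and `$` are illustrative assumptions, not details taken from the abstract, and the composition function itself (for example a bi-LSTM over the resulting units) is not shown.

```python
from typing import List


def char_ngrams(word: str, n: int = 3, left: str = "^", right: str = "$") -> List[str]:
    """Split a word into overlapping character n-grams.

    The boundary markers are an assumption made for this sketch; they let
    the units distinguish prefixes and suffixes from word-internal strings.
    """
    padded = f"{left}{word}{right}"
    if len(padded) < n:
        # Word too short to yield a full n-gram: keep it as a single unit.
        return [padded]
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


if __name__ == "__main__":
    # Character trigrams (n=3) versus single characters (n=1).
    print(char_ngrams("cats", n=3))
    print(char_ngrams("cats", n=1))
```

Each unit would then be mapped to a vector, and the vectors combined into a single word representation by whichever composition function is being compared.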


research
06/03/2019

Better Character Language Modeling Through Morphology

We incorporate morphological supervision into character language models ...
research
06/08/2016

A Joint Model for Word Embedding and Word Morphology

This paper presents a joint model for performing unsupervised morphologi...
research
01/10/2022

Morphological Analysis of Japanese Hiragana Sentences using the BI-LSTM CRF Model

This study proposes a method to develop neural models of the morphologic...
research
07/08/2015

What Your Username Says About You

Usernames are ubiquitous on the Internet, and they are often suggestive ...
research
08/31/2018

Indications that character language models learn English morpho-syntactic units and regularities

Character language models have access to surface morphological patterns,...
research
05/30/2018

Character-Level Models versus Morphology in Semantic Role Labeling

Character-level models have become a popular approach specially for thei...
research
03/12/2019

Character Eyes: Seeing Language through Character-Level Taggers

Character-level models have been used extensively in recent years in NLP...
