MorphNet: A sequence-to-sequence model that combines morphological analysis and disambiguation

05/21/2018
by Erenay Dayanık, et al.

We introduce MorphNet, a single model that combines morphological analysis and disambiguation. Traditionally, analysis of morphologically complex languages has been performed in two stages: (i) a morphological analyzer based on finite-state transducers produces all possible morphological analyses of a word, and (ii) a statistical disambiguation model picks the correct analysis for each word based on its context. MorphNet uses a sequence-to-sequence recurrent neural network to combine analysis and disambiguation. We show that, when trained on text labeled with correct morphological analyses, MorphNet obtains state-of-the-art or comparable results on nine different datasets in seven different languages.
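
To make the traditional two-stage pipeline concrete, here is a minimal toy sketch in Python. It is not MorphNet and not the authors' implementation: the lexicon, the tag format, the bigram scores, and the function names `analyze`, `pos_of`, and `disambiguate` are all hypothetical, chosen only for illustration. A dictionary lookup stands in for the finite-state analyzer, and a greedy tag-bigram scorer stands in for the statistical disambiguation model.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon standing in for a finite-state analyzer.
# Each surface form maps to every analysis the analyzer would emit.
TOY_LEXICON: Dict[str, List[str]] = {
    "the": ["the+Det"],
    "dogs": ["dog+Noun+Pl", "dog+Verb+Pres+3sg"],
    "bark": ["bark+Noun+Sg", "bark+Verb+Pres"],
}

# Hypothetical tag-bigram scores standing in for a statistical model.
TOY_BIGRAM_SCORE: Dict[Tuple[str, str], float] = {
    ("<s>", "Det"): 1.0,
    ("Det", "Noun"): 1.0,
    ("Det", "Verb"): 0.1,
    ("Noun", "Verb"): 1.0,
    ("Noun", "Noun"): 0.3,
}


def analyze(word: str) -> List[str]:
    """Stage (i): return all candidate analyses of a word, ignoring context."""
    return TOY_LEXICON.get(word.lower(), [word + "+Unknown"])


def pos_of(analysis: str) -> str:
    """Extract the part-of-speech tag, the second '+'-separated field."""
    return analysis.split("+")[1]


def disambiguate(words: List[str]) -> List[str]:
    """Stage (ii): greedily pick one analysis per word using the previous tag."""
    prev_tag = "<s>"
    chosen: List[str] = []
    for word in words:
        candidates = analyze(word)
        # Unseen tag pairs score 0.0; ties keep the first candidate.
        best = max(
            candidates,
            key=lambda a: TOY_BIGRAM_SCORE.get((prev_tag, pos_of(a)), 0.0),
        )
        chosen.append(best)
        prev_tag = pos_of(best)
    return chosen


if __name__ == "__main__":
    print(disambiguate(["the", "dogs", "bark"]))
```

In this sketch the two stages are separate components with separate resources, which is the arrangement the abstract contrasts with a single sequence-to-sequence model that produces the analysis directly from the word and its context.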


research
12/18/2015

Morphological Inflection Generation Using Character Sequence to Sequence Learning

Morphological inflection generation is the task of generating the inflec...
research
02/03/2019

Universal Lemmatizer: A Sequence to Sequence Model for Lemmatizing Universal Dependencies Treebanks

In this paper we present a novel lemmatization method based on a sequenc...
research
02/17/2018

Building a Word Segmenter for Sanskrit Overnight

There is an abundance of digitised texts available in Sanskrit. However,...
research
04/01/2021

Do RNN States Encode Abstract Phonological Processes?

Sequence-to-sequence models have delivered impressive results in word fo...
research
06/12/2017

SU-RUG at the CoNLL-SIGMORPHON 2017 shared task: Morphological Inflection with Attentional Sequence-to-Sequence Models

This paper describes the Stockholm University/University of Groningen (S...
research
07/12/2020

Neural disambiguation of lemma and part of speech in morphologically rich languages

We consider the problem of disambiguating the lemma and part of speech o...
research
10/11/2019

Neural Generation for Czech: Data and Baselines

We present the first dataset targeted at end-to-end NLG in Czech in the ...
