Morphological Disambiguation from Stemming Data

11/11/2020
by   Antoine Nzeyimana, et al.

Morphological analysis and disambiguation is an important task and a crucial preprocessing step in natural language processing of morphologically rich languages. Kinyarwanda, a morphologically rich language, currently lacks tools for automated morphological analysis. While linguistically curated finite-state tools can be easily developed for morphological analysis, the morphological richness of the language allows many ambiguous analyses to be produced, requiring effective disambiguation. In this paper, we propose learning to morphologically disambiguate Kinyarwanda verbal forms from a new stemming dataset collected through crowd-sourcing. Using feature engineering and a feed-forward neural network-based classifier, we achieve about 89% non-contextualized disambiguation accuracy. Our experiments reveal that inflectional properties of stems and morpheme association rules are the most discriminative features for disambiguation.
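The disambiguation step described in the abstract can be pictured as scoring each candidate analysis of a verbal form and keeping the highest-scoring one. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the feature names, the network size, the hand-set weights, and the candidate labels are all hypothetical and chosen purely for illustration.

```python
import math
from typing import Dict, List

# Hypothetical feature names; the paper's actual feature set is not given here.
FEATURES = ["stem_inflection_score", "morpheme_rule_match", "stem_length"]

# Hand-set illustrative weights for one hidden layer (not trained values).
W_HIDDEN = [
    [1.0, 0.5, 0.0],
    [0.0, 1.0, 0.1],
]
B_HIDDEN = [0.0, 0.0]
W_OUT = [1.0, 1.0]
B_OUT = -1.0


def score(features: Dict[str, float]) -> float:
    """Feed-forward pass: ReLU hidden layer, then a sigmoid output in (0, 1)."""
    x = [features[name] for name in FEATURES]
    hidden = [
        max(0.0, sum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(W_HIDDEN, B_HIDDEN)
    ]
    logit = sum(w * h for w, h in zip(W_OUT, hidden)) + B_OUT
    return 1.0 / (1.0 + math.exp(-logit))


def disambiguate(candidates: List[dict]) -> dict:
    """Return the candidate analysis with the highest classifier score."""
    return max(candidates, key=lambda c: score(c["features"]))


# Two placeholder analyses of the same surface form (labels are hypothetical).
candidates = [
    {
        "analysis": "candidate_A",
        "features": {
            "stem_inflection_score": 0.9,
            "morpheme_rule_match": 1.0,
            "stem_length": 3.0,
        },
    },
    {
        "analysis": "candidate_B",
        "features": {
            "stem_inflection_score": 0.2,
            "morpheme_rule_match": 0.0,
            "stem_length": 5.0,
        },
    },
]

best = disambiguate(candidates)
print(best["analysis"])
```

Because each candidate is scored from its own features alone, with no information about neighboring words, this matches the non-contextualized setting the abstract refers to.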


research
02/13/2017

A Morphology-aware Network for Morphological Disambiguation

Agglutinative languages such as Turkish, Finnish and Hungarian require m...
research
07/23/2017

Rule-Based Spanish Morphological Analyzer Built From Spell Checking Lexicon

Preprocessing tools for automated text analysis have become more widely ...
research
03/25/2017

Morphological Analysis for the Maltese Language: The Challenges of a Hybrid System

Maltese is a morphologically rich language with a hybrid morphological s...
research
07/12/2020

Neural disambiguation of lemma and part of speech in morphologically rich languages

We consider the problem of disambiguating the lemma and part of speech o...
research
04/06/2020

A Systematic Analysis of Morphological Content in BERT Models for Multiple Languages

This work describes experiments which probe the hidden representations o...
research
10/24/2020

A Benchmark Corpus and Neural Approach for Sanskrit Derivative Nouns Analysis

This paper presents first benchmark corpus of Sanskrit Pratyaya (suffix)...
research
09/14/2021

Hunspell for Sorani Kurdish Spell Checking and Morphological Analysis

Spell checking and morphological analysis are two fundamental tasks in t...
