Nefnir: A high accuracy lemmatizer for Icelandic

Lemmatization, finding the basic morphological form of a word in a corpus, is an important step in many natural language processing tasks when working with morphologically rich languages. We describe and evaluate Nefnir, a new open source lemmatizer for Icelandic. Nefnir uses suffix substitution rules, derived from a large morphological database, to lemmatize tagged text. Evaluation shows that for correctly tagged text, Nefnir obtains an accuracy of 99.55%, and for text tagged with a PoS tagger, the accuracy obtained is 96.88%.
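To make the idea of tag-conditioned suffix substitution concrete, here is a minimal sketch in Python. It is not Nefnir's implementation or API: the rule table, the tag names, and the `lemmatize` function are all hypothetical, and the "longest matching suffix wins" policy is an assumption chosen for illustration. A real system would derive its rules from a morphological database rather than list them by hand.

```python
from typing import Dict, List, Tuple

# A rule is (suffix_to_strip, suffix_to_append).
Rule = Tuple[str, str]

# Hypothetical toy rule table keyed by a simplified PoS tag.
# These tags and rules are illustrative assumptions, not Nefnir's data.
RULES: Dict[str, List[Rule]] = {
    "noun-nom-pl": [("ar", "ur")],                     # hestar -> hestur
    "noun-dat-pl": [("um", "ur"), ("örnum", "arn")],   # hestum -> hestur, börnum -> barn
}


def lemmatize(word: str, tag: str) -> str:
    """Apply the longest matching suffix rule for the given tag.

    Returns the word unchanged when no rule for the tag matches.
    """
    candidates = [(old, new) for old, new in RULES.get(tag, []) if word.endswith(old)]
    if not candidates:
        return word
    # Assumption: prefer the most specific (longest) suffix.
    old, new = max(candidates, key=lambda rule: len(rule[0]))
    return word[: len(word) - len(old)] + new


if __name__ == "__main__":
    for form, tag in [("hestar", "noun-nom-pl"),
                      ("hestum", "noun-dat-pl"),
                      ("börnum", "noun-dat-pl")]:
        print(form, tag, lemmatize(form, tag))
```

Because the rules are selected by tag, the quality of the lemma depends on the quality of the tag, which is consistent with the abstract reporting a lower accuracy for automatically tagged text than for correctly tagged text.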


