Feature-Rich Part-of-speech Tagging for Morphologically Complex Languages: Application to Bulgarian

11/26/2019
by Georgi Georgiev, et al.

We present experiments with part-of-speech tagging for Bulgarian, a Slavic language with rich inflectional and derivational morphology. Unlike most previous work, which has used a small number of grammatical categories, we work with 680 morpho-syntactic tags. We combine a large morphological lexicon with prior linguistic knowledge and guided learning from a POS-annotated corpus, achieving an accuracy of 97.98%, which is a significant improvement over the state-of-the-art for Bulgarian.


