Turkish PoS Tagging by Reducing Sparsity with Morpheme Tags in Small Datasets

03/09/2017
by Burcu Can, et al.

Sparsity is one of the major problems in natural language processing. It is even more severe in agglutinative languages, which are heavily inflected. We deal with sparsity in Turkish by adopting morphological features for part-of-speech (PoS) tagging. We learn inflectional and derivational morpheme tags in Turkish with conditional random fields (CRFs), and we employ these morpheme tags in PoS tagging with hidden Markov models (HMMs) to mitigate sparsity. Results show that using morpheme tags in PoS tagging helps alleviate the sparsity in emission probabilities. Our model outperforms other HMM-based PoS tagging models for small training datasets in Turkish. We obtain an accuracy of 94.1 tagging and 89.2
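To make the idea of morpheme-tag emissions concrete, here is a minimal Python sketch. It is not the authors' model. It assumes that the morpheme tags of each word are already available (the abstract says they are learned with a CRF), that the emission probability of a word factorizes as a product over its morpheme tags, and that add-alpha smoothing is used. The tag names (`NounRoot`, `Plural`, `Loc`, and so on), the toy corpus, and the function names are all hypothetical.

```python
import math
from collections import defaultdict

def train(tagged_sentences, alpha=1.0):
    """Count PoS transitions and morpheme-tag emissions.
    Each token is (tuple_of_morpheme_tags, pos_tag)."""
    trans = defaultdict(lambda: defaultdict(float))
    emit = defaultdict(lambda: defaultdict(float))
    tags, morphs = set(), set()
    for sent in tagged_sentences:
        prev = "<s>"
        for morph_tags, pos in sent:
            trans[prev][pos] += 1
            for m in morph_tags:
                emit[pos][m] += 1  # emission is over morpheme tags, not word forms
                morphs.add(m)
            tags.add(pos)
            prev = pos
    return trans, emit, sorted(tags), morphs, alpha

def log_prob(table, given, outcome, n_outcomes, alpha):
    """Add-alpha smoothed log P(outcome | given)."""
    total = sum(table[given].values())
    return math.log((table[given][outcome] + alpha) / (total + alpha * n_outcomes))

def viterbi(model, sentence):
    """Most likely PoS sequence for a sentence given as morpheme-tag tuples."""
    trans, emit, tags, morphs, alpha = model
    n_tags, n_morphs = len(tags), len(morphs) + 1  # +1 slot for unseen morpheme tags

    def emission(pos, morph_tags):
        # Assumed factorization: product over the word's morpheme tags.
        return sum(log_prob(emit, pos, m, n_morphs, alpha) for m in morph_tags)

    best = {t: log_prob(trans, "<s>", t, n_tags, alpha) + emission(t, sentence[0])
            for t in tags}
    back = []
    for morph_tags in sentence[1:]:
        new_best, pointers = {}, {}
        for t in tags:
            prev = max(tags, key=lambda p: best[p] + log_prob(trans, p, t, n_tags, alpha))
            new_best[t] = (best[prev] + log_prob(trans, prev, t, n_tags, alpha)
                           + emission(t, morph_tags))
            pointers[t] = prev
        best = new_best
        back.append(pointers)
    path = [max(tags, key=lambda t: best[t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

# Hypothetical toy corpus: three short sentences.
toy_corpus = [
    [(("NounRoot", "Plural"), "Noun"), (("VerbRoot", "Past"), "Verb")],
    [(("NounRoot", "Loc"), "Noun"), (("VerbRoot", "Prog"), "Verb")],
    [(("AdjRoot",), "Adj"), (("NounRoot",), "Noun"), (("VerbRoot", "Past"), "Verb")],
]
model = train(toy_corpus)

# The combination (NounRoot, Plural, Loc) never occurs in training,
# but each of its morpheme tags does, so the emission is still informative.
print(viterbi(model, [("NounRoot", "Plural", "Loc"), ("VerbRoot", "Prog")]))
```

Because emissions are defined over a small inventory of morpheme tags instead of an open set of surface forms, a word form that never appeared in training can still receive a meaningful emission score, which is the kind of sparsity reduction the abstract describes.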


research
05/24/2017

Joint PoS Tagging and Stemming for Agglutinative Languages

The number of word forms in agglutinative languages is theoretically inf...
research
01/10/2018

Unsupervised Part-of-Speech Induction

Part-of-Speech (POS) tagging is an old and fundamental task in natural l...
research
08/04/2020

Reliable Part-of-Speech Tagging of Historical Corpora through Set-Valued Prediction

Syntactic annotation of corpora in the form of part-of-speech (POS) tags...
research
06/11/2020

RTEX: A novel methodology for Ranking, Tagging, and Explanatory diagnostic captioning of radiography exams

This paper introduces RTEx, a novel methodology for a) ranking radiograp...
research
05/21/2020

Hidden Markov Chains, Entropic Forward-Backward, and Part-Of-Speech Tagging

The ability to take into account the characteristics - also called featu...
research
01/11/2017

Decoding with Finite-State Transducers on GPUs

Weighted finite automata and transducers (including hidden Markov models...
research
04/05/2019

Diversified Hidden Markov Models for Sequential Labeling

Labeling of sequential data is a prevalent meta-problem for a wide range...
