Improving Part-of-Speech Tagging for NLP Pipelines

by Vishaal Jatav, et al.

This paper presents the results of applying sentence-level, linguistics-based rules to improve part-of-speech tagging. It is well known that the performance of complex NLP systems degrades when any preliminary stage is less than perfect: errors in the early stages of a pipeline snowball into its end performance. We have created a set of sentence-level, linguistics-based rules that adjust the part-of-speech tags produced by state-of-the-art taggers. Comparison with state-of-the-art taggers on widely used benchmarks demonstrates significant improvements in tagging accuracy and, consequently, in the quality and accuracy of NLP systems.
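
To make the idea of rule-based tag adjustment concrete, the following is a minimal sketch of a post-correction pass that runs over a tagger's output one sentence at a time. The abstract does not list the actual rules, so both rules here (`noun_after_determiner`, `verb_after_modal`), the helper `apply_rules`, and the use of Penn Treebank tags are illustrative assumptions, not the authors' method.

```python
from typing import Callable, List, Tuple

# A sentence is a list of (token, tag) pairs. Penn Treebank tags are assumed.
TaggedSentence = List[Tuple[str, str]]
Rule = Callable[[TaggedSentence], TaggedSentence]


def noun_after_determiner(sent: TaggedSentence) -> TaggedSentence:
    """Hypothetical rule: a base-form verb directly after a determiner is retagged as a noun."""
    out = list(sent)
    for i in range(1, len(out)):
        word, tag = out[i]
        if out[i - 1][1] == "DT" and tag in {"VB", "VBP"}:
            out[i] = (word, "NN")
    return out


def verb_after_modal(sent: TaggedSentence) -> TaggedSentence:
    """Hypothetical rule: a singular noun directly after a modal is retagged as a base-form verb."""
    out = list(sent)
    for i in range(1, len(out)):
        word, tag = out[i]
        if out[i - 1][1] == "MD" and tag == "NN":
            out[i] = (word, "VB")
    return out


def apply_rules(sent: TaggedSentence, rules: List[Rule]) -> TaggedSentence:
    """Apply each correction rule in order; the input sentence is left unmodified."""
    for rule in rules:
        sent = rule(sent)
    return sent


# Illustrative rule set only; a real system would have many more rules.
RULES: List[Rule] = [noun_after_determiner, verb_after_modal]

# Made-up tagger output containing one plausible error ("run" tagged as a verb).
tagger_output = [("The", "DT"), ("run", "VB"), ("was", "VBD"), ("long", "JJ"), (".", ".")]
corrected = apply_rules(tagger_output, RULES)
```

Because each rule is a plain function from a tagged sentence to a tagged sentence, rules of this kind can be layered on top of any tagger without retraining it, which is the general shape of the approach the abstract describes.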



