ThamizhiUDp: A Dependency Parser for Tamil

by Kengatharaiyer Sarveswaran, et al.

This paper describes how we developed ThamizhiUDp, a neural dependency parser that provides a complete pipeline for dependency parsing of Tamil text using the Universal Dependencies formalism. We considered each phase of the dependency parsing pipeline and identified tools and resources for each phase to improve accuracy and to tackle data scarcity. ThamizhiUDp uses Stanza for tokenisation and lemmatisation, ThamizhiPOSt and ThamizhiMorph for generating Part of Speech (POS) and morphological annotations, and uuparser with multilingual training for dependency parsing. ThamizhiPOSt is our POS tagger, which is based on Stanza and trained on the Amrita POS-tagged corpus. It is the current state of the art in Tamil POS tagging, with an F1 score of 93.27. Our morphological analyser, ThamizhiMorph, is a rule-based system with very good coverage of Tamil. Our dependency parser, ThamizhiUDp, was trained using multilingual data. It achieves a Labelled Attachment Score (LAS) of 62.39, 4 points higher than the current best achieved for Tamil dependency parsing. We therefore show that breaking up the dependency parsing pipeline to accommodate existing tools and resources is a viable approach for low-resource languages.
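For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a Labelled Attachment Score is commonly computed: the percentage of tokens whose predicted head and dependency label both match the gold annotation. It is not the evaluation script used in the paper. The function name `attachment_scores` and the toy four-token sentence are hypothetical, and the sketch assumes that gold and predicted analyses share the same tokenisation.

```python
from typing import List, Tuple

# One arc per token: (index of the head token, dependency label).
# Head index 0 denotes the artificial root.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], predicted: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) as percentages.

    Assumption: both lists describe the same tokens in the same order,
    i.e. tokenisation is identical in gold and predicted analyses.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must contain the same number of tokens")
    if not gold:
        raise ValueError("cannot score an empty sentence")

    total = len(gold)
    # UAS: only the head has to match.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    # LAS: head and label both have to match.
    labelled_hits = sum(g == p for g, p in zip(gold, predicted))
    return 100 * head_hits / total, 100 * labelled_hits / total


# Hypothetical toy sentence with four tokens (illustrative data only).
gold = [(2, "nsubj"), (0, "root"), (4, "amod"), (2, "obj")]
predicted = [(2, "nsubj"), (0, "root"), (2, "amod"), (2, "obl")]

uas, las = attachment_scores(gold, predicted)
print(f"UAS = {uas:.2f}, LAS = {las:.2f}")
```

In the toy example, the third token has the wrong head and the fourth has the right head but the wrong label, so three of four heads are correct while only two of four arcs are fully correct. Shared-task evaluation scripts additionally align tokens when the system tokenisation differs from the gold one, which this sketch does not attempt.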



