Enhancements to the BOUN Treebank Reflecting the Agglutinative Nature of Turkish

07/24/2022
by Büşra Marşan, et al.

In this study, we offer linguistically motivated solutions to three representational gaps in the BOUN Treebank: the lack of representation of null morphemes, of highly productive derivational processes, and of syncretic morphemes of Turkish. The solutions do not diverge from the Universal Dependencies (UD) framework. To address these gaps, we introduce new annotation conventions that split certain lemmas and use the MISC (miscellaneous) field of the UD format to denote derivation. We test the representational capabilities of the re-annotated treebank with an LSTM-based dependency parser and introduce an updated version of the BoAT tool.
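As a rough illustration of what recording derivation in the MISC field could look like, the sketch below parses a single CoNLL-U token line and reads key-value pairs from its last column. The ten-column layout and the `Key=Value|Key=Value` syntax follow the CoNLL-U format used by UD; the `Deriv` key, its value, and the example token are assumptions made up for this sketch and are not taken from the treebank's actual conventions.

```python
# The ten tab-separated columns of a CoNLL-U token line.
FIELDS = ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"]


def parse_misc(misc: str) -> dict:
    """Parse a MISC column ("Key=Value|Key=Value" or "_") into a dict."""
    if misc == "_":
        return {}
    pairs = {}
    for item in misc.split("|"):
        key, sep, value = item.partition("=")
        pairs[key] = value if sep else ""
    return pairs


def parse_token(line: str) -> dict:
    """Split one CoNLL-U token line into named columns."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(cols)}")
    token = dict(zip(FIELDS, cols))
    token["misc"] = parse_misc(token["misc"])
    return token


# Hypothetical token: "gözlük" (glasses), derived from "göz" (eye).
# The "Deriv" key and its value format are illustrative assumptions only.
example = "\t".join([
    "1", "gözlük", "gözlük", "NOUN", "_",
    "Case=Nom|Number=Sing", "0", "root", "_", "Deriv=göz+lük",
])

token = parse_token(example)
derivation = token["misc"].get("Deriv")
```

Keeping derivational information in MISC leaves the other nine columns untouched, so tools that ignore MISC can in principle still read the file.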


