Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection

04/22/2020
by   Joakim Nivre, et al.
0

Universal Dependencies is an open community effort to create cross-linguistically consistent treebank annotation for many languages within a dependency-based lexicalist framework. The annotation consists in a linguistically motivated word segmentation; a morphological layer comprising lemmas, universal part-of-speech tags, and standardized morphological features; and a syntactic layer focusing on syntactic relations between predicates, arguments and modifiers. In this paper, we describe version 2 of the guidelines (UD v2), discuss the major changes from UD v1 to UD v2, and give an overview of the currently available treebanks for 90 languages.

READ FULL TEXT

page 1

page 2

page 3

page 4

04/26/2022

Developing Universal Dependency Treebanks for Magahi and Braj

In this paper, we discuss the development of treebanks for two low-resou...
06/24/2019

On the Definition of Japanese Word

The annotation guidelines for Universal Dependencies (UD) stipulate that...
05/04/2019

Contextualization of Morphological Inflection

Critical to natural language generation is the production of correctly i...
12/16/2015

Morpho-syntactic Lexicon Generation Using Graph-based Semi-supervised Learning

Morpho-syntactic lexicons provide information about the morphological an...
11/03/2020

Treebanking User-Generated Content: a UD Based Overview of Guidelines, Corpora and Unified Recommendations

This article presents a discussion on the main linguistic phenomena whic...
11/24/2021

For the Purpose of Curry: A UD Treebank for Ashokan Prakrit

We present the first linguistically annotated treebank of Ashokan Prakri...
07/24/2022

Enhancements to the BOUN Treebank Reflecting the Agglutinative Nature of Turkish

In this study, we aim to offer linguistically motivated solutions to res...