Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection

by Joakim Nivre, et al.

Universal Dependencies is an open community effort to create cross-linguistically consistent treebank annotation for many languages within a dependency-based lexicalist framework. The annotation consists in a linguistically motivated word segmentation; a morphological layer comprising lemmas, universal part-of-speech tags, and standardized morphological features; and a syntactic layer focusing on syntactic relations between predicates, arguments and modifiers. In this paper, we describe version 2 of the guidelines (UD v2), discuss the major changes from UD v1 to UD v2, and give an overview of the currently available treebanks for 90 languages.
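
The three layers named in the abstract (segmentation into words, the morphological layer, and the syntactic layer) can be illustrated with a small sketch. It assumes the CoNLL-U exchange format, in which UD treebanks are distributed as ten tab-separated columns per word; the abstract itself does not mention the format. The sample sentence, its annotation, and the names `Token`, `parse_feats`, and `parse_conllu` are hypothetical and are not taken from the paper or from any released treebank.

```python
from dataclasses import dataclass

# Hand-written illustrative annotation (an assumption, not treebank data).
# Columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC.
ROWS = [
    ("1", "The", "the", "DET", "_", "Definite=Def|PronType=Art", "2", "det", "_", "_"),
    ("2", "dog", "dog", "NOUN", "_", "Number=Sing", "3", "nsubj", "_", "_"),
    ("3", "barks", "bark", "VERB", "_",
     "Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin", "0", "root", "_", "_"),
    ("4", ".", ".", "PUNCT", "_", "_", "3", "punct", "_", "_"),
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)


@dataclass
class Token:
    id: int        # position of the word in the sentence (segmentation)
    form: str      # surface form
    lemma: str     # morphological layer: lemma
    upos: str      # morphological layer: universal part-of-speech tag
    feats: dict    # morphological layer: standardized features
    head: int      # syntactic layer: id of the governing word, 0 for the root
    deprel: str    # syntactic layer: relation to the head


def parse_feats(field: str) -> dict:
    """Turn 'Number=Sing|Person=3' into a dict; '_' means no features."""
    if field == "_":
        return {}
    return dict(pair.split("=", 1) for pair in field.split("|"))


def parse_conllu(text: str) -> list:
    """Parse the basic word lines of a single sentence."""
    tokens = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        tokens.append(Token(
            id=int(cols[0]), form=cols[1], lemma=cols[2], upos=cols[3],
            feats=parse_feats(cols[5]), head=int(cols[6]), deprel=cols[7],
        ))
    return tokens


tokens = parse_conllu(SAMPLE)
for tok in tokens:
    print(tok.id, tok.form, tok.upos, tok.head, tok.deprel)
```

Keeping the features as a dictionary and the head as an integer index makes it straightforward to follow a relation from a modifier or argument to its governing word, which is the kind of predicate, argument and modifier structure the syntactic layer describes.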




Developing Universal Dependency Treebanks for Magahi and Braj

In this paper, we discuss the development of treebanks for two low-resou...

On the Definition of Japanese Word

The annotation guidelines for Universal Dependencies (UD) stipulate that...

Contextualization of Morphological Inflection

Critical to natural language generation is the production of correctly i...

Morpho-syntactic Lexicon Generation Using Graph-based Semi-supervised Learning

Morpho-syntactic lexicons provide information about the morphological an...

Treebanking User-Generated Content: a UD Based Overview of Guidelines, Corpora and Unified Recommendations

This article presents a discussion on the main linguistic phenomena whic...

For the Purpose of Curry: A UD Treebank for Ashokan Prakrit

We present the first linguistically annotated treebank of Ashokan Prakri...

Enhancements to the BOUN Treebank Reflecting the Agglutinative Nature of Turkish

In this study, we aim to offer linguistically motivated solutions to res...