For the Purpose of Curry: A UD Treebank for Ashokan Prakrit

11/24/2021
by   Adam Farris, et al.

We present the first linguistically annotated treebank of Ashokan Prakrit, an early Middle Indo-Aryan dialect continuum attested through Emperor Ashoka Maurya's 3rd-century BCE rock and pillar edicts. For annotation, we used the multilingual Universal Dependencies (UD) formalism, following recent UD work on Sanskrit and other Indo-Aryan languages. We discuss several linguistic features that posed difficulties in annotation: regnal names and other nominal compounds, "proto-ergative" participial constructions, and possible grammaticalizations evidenced by sandhi (phonological assimilation across morpheme boundaries). We ultimately plan to annotate all attested Ashokan texts, in service of two larger goals: improving UD coverage of different diachronic stages of Indo-Aryan and studying language change in Indo-Aryan using computational methods.


