Cross-Lingual Constituency Parsing for Middle High German: A Delexicalized Approach

08/09/2023 ∙ by Ercong Nie, et al.

Constituency parsing plays a fundamental role in advancing natural language processing (NLP) tasks. However, training an automatic syntactic analysis system for ancient languages solely on annotated parse data is a formidable task, because building treebanks for such languages demands extensive linguistic expertise, which leads to a scarcity of available resources. To overcome this hurdle, cross-lingual transfer techniques, which require minimal or even no annotated data for low-resource target languages, offer a promising solution. In this study, we focus on building a constituency parser for Middle High German (MHG) under realistic conditions, where no annotated MHG treebank is available for training. In our approach, we leverage the linguistic continuity and structural similarity between MHG and Modern German (MG), along with the abundance of MG treebank resources. Specifically, by employing the delexicalization method, we train a constituency parser on MG parse datasets and perform cross-lingual transfer to MHG parsing. Our delexicalized constituency parser demonstrates remarkable performance on the MHG test set, achieving an F1-score of 67.3 and outperforming the zero-shot cross-lingual baseline by a margin of 28.6 points. These results underscore the practicality and potential of automatic syntactic analysis for other ancient languages that face challenges similar to those of MHG.
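To make the delexicalization idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes trees in a simple bracketed format and replaces each word with its part-of-speech tag, so that an MG tree and an MHG tree with the same structure become identical parser input. The function name `delexicalize`, the example sentences, and the STTS-style tags (`ART`, `NN`, `VVFIN`) are illustrative assumptions; the abstract does not state which tag set or tree format is used.

```python
import re

# A preterminal looks like "(TAG word)": a tag followed by one token,
# with no nested brackets inside.
_PRETERMINAL = re.compile(r"\(([^\s()]+)\s+([^\s()]+)\)")


def delexicalize(tree: str) -> str:
    """Replace every word in a bracketed tree with its POS tag.

    Hypothetical helper for illustration only.
    """
    return _PRETERMINAL.sub(lambda m: f"({m.group(1)} {m.group(1)})", tree)


# Illustrative trees (assumed examples, not taken from any treebank).
mg_tree = "(S (NP (ART der) (NN König)) (VP (VVFIN sprach)))"
mhg_tree = "(S (NP (ART der) (NN künic)) (VP (VVFIN sprach)))"

print(delexicalize(mg_tree))
print(delexicalize(mg_tree) == delexicalize(mhg_tree))
```

Because the lexical items are removed, a parser trained on delexicalized MG trees never sees vocabulary that differs between the two language stages; it relies only on the tag sequence, which is the property that makes zero-shot transfer plausible.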


research
∙ 09/26/2022

Meta-Learning a Cross-lingual Manifold for Semantic Parsing

Localizing a semantic parser to support new languages requires effective...
research
∙ 05/24/2022

Universal Dependency Treebank for Odia Language

This paper presents the first publicly available treebank of Odia, a mor...
research
∙ 04/16/2020

Towards Instance-Level Parser Selection for Cross-Lingual Transfer of Dependency Parsers

Current methods of cross-lingual parser transfer focus on predicting the...
research
∙ 05/07/2020

Fine-Grained Analysis of Cross-Linguistic Syntactic Divergences

The patterns in which the syntax of different languages converges and di...
research
∙ 12/14/2021

Maximum Bayes Smatch Ensemble Distillation for AMR Parsing

AMR parsing has experienced an unprecedented increase in performance in...
research
∙ 03/24/2020

Cross-Lingual Adaptation Using Universal Dependencies

We describe a cross-lingual adaptation method based on syntactic parse t...
research
∙ 10/17/2022

Improving Low-Resource Cross-lingual Parsing with Expected Statistic Regularization

We present Expected Statistic Regularization (ESR), a novel regularizati...
