Improving Low-Resource Cross-lingual Parsing with Expected Statistic Regularization

10/17/2022
by Thomas Effland, et al.

We present Expected Statistic Regularization (ESR), a novel regularization technique that uses low-order multi-task structural statistics to shape model distributions for semi-supervised learning on low-resource datasets. We study ESR in the context of cross-lingual transfer for syntactic analysis (POS tagging and labeled dependency parsing) and present several classes of low-order statistic functions that bear on model behavior. Experimentally, we evaluate the proposed statistics with ESR for unsupervised transfer on 5 diverse target languages and show that all statistics, when estimated accurately, yield improvements in both POS and LAS, with the best statistic improving POS by +7.0 and LAS by +8.5 on average. We also present semi-supervised transfer and learning curve experiments showing that ESR provides significant gains over strong cross-lingual-transfer-plus-fine-tuning baselines when only modest amounts of labeled data are available. These results indicate that ESR is a promising approach that complements model transfer for cross-lingual parsing.
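To make the idea concrete, here is a minimal sketch of a regularizer in this spirit, using the simplest kind of low-order statistic: the expected proportion of each POS tag over a batch of unlabeled tokens. The abstract does not specify the loss form, so everything here is an assumption for illustration: the function names (`expected_tag_proportions`, `esr_penalty`), the squared-distance penalty, and the `margin` parameter are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Turn raw tag scores for one token into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def expected_tag_proportions(token_scores):
    """Average the model's per-token tag marginals over unlabeled tokens.

    token_scores: list of per-token score lists, one score per tag.
    Returns the expected share of each tag under the model.
    """
    num_tags = len(token_scores[0])
    totals = [0.0] * num_tags
    for scores in token_scores:
        for k, p in enumerate(softmax(scores)):
            totals[k] += p
    n = len(token_scores)
    return [t / n for t in totals]


def esr_penalty(token_scores, target, margin=0.0):
    """Hypothetical penalty: squared distance between the model's expected
    statistic and a target statistic, ignoring deviations within `margin`.
    (Assumed form; the paper's actual loss may differ.)
    """
    expected = expected_tag_proportions(token_scores)
    return sum(
        max(0.0, abs(e - t) - margin) ** 2
        for e, t in zip(expected, target)
    )


# Usage: two unlabeled tokens, two tags, a model that is maximally unsure.
scores = [[0.0, 0.0], [0.0, 0.0]]
penalty = esr_penalty(scores, target=[1.0, 0.0])
```

In a semi-supervised setup, a term like this would be added to the supervised loss and computed on unlabeled target-language text, with the target statistic estimated from whatever source is available. The quality of that estimate matters: the abstract reports gains for statistics that are estimated accurately.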


research
06/06/2019

Cross-Lingual Syntactic Transfer through Unsupervised Adaptation of Invertible Projections

Cross-lingual transfer is an effective way to build syntactic analysis t...
research
04/10/2021

Meta-learning for fast cross-lingual adaptation in dependency parsing

Meta-learning, or learning to learn, is a technique that can help to ove...
research
01/06/2017

Cross-Lingual Dependency Parsing with Late Decoding for Truly Low-Resource Languages

In cross-lingual dependency annotation projection, information is often ...
research
01/27/2021

PPT: Parsimonious Parser Transfer for Unsupervised Cross-Lingual Adaptation

Cross-lingual transfer is a leading technique for parsing low-resource l...
research
08/09/2023

Cross-Lingual Constituency Parsing for Middle High German: A Delexicalized Approach

Constituency parsing plays a fundamental role in advancing natural langu...
research
01/27/2022

Systematic Investigation of Strategies Tailored for Low-Resource Settings for Sanskrit Dependency Parsing

Existing state of the art approaches for Sanskrit Dependency Parsing (SD...
research
06/11/2018

Part-of-Speech Tagging on an Endangered Language: a Parallel Griko-Italian Resource

Most work on part-of-speech (POS) tagging is focused on high resource la...
