Unsupervised Cross-Lingual Transfer of Structured Predictors without Source Data

10/08/2021
by   Kemal Kurniawan, et al.
0

Providing technologies to communities or domains where training data is scarce or protected e.g., for privacy reasons, is becoming increasingly important. To that end, we generalise methods for unsupervised transfer from multiple input models for structured prediction. We show that the means of aggregating over the input models is critical, and that multiplying marginal probabilities of substructures to obtain high-probability structures for distant supervision is substantially better than taking the union of such structures over the input models, as done in prior work. Testing on 18 languages, we demonstrate that the method works in a cross-lingual setting, considering both dependency parsing and part-of-speech structured prediction problems. Our analyses show that the proposed method produces less noisy labels for the distant supervision.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
01/27/2021

PPT: Parsimonious Parser Transfer for Unsupervised Cross-Lingual Adaptation

Cross-lingual transfer is a leading technique for parsing low-resource l...
research
06/06/2019

Cross-Lingual Syntactic Transfer through Unsupervised Adaptation of Invertible Projections

Cross-lingual transfer is an effective way to build syntactic analysis t...
research
11/01/2018

Near or Far, Wide Range Zero-Shot Cross-Lingual Dependency Parsing

Cross-lingual transfer is the major means toleverage knowledge from high...
research
01/06/2017

Cross-Lingual Dependency Parsing with Late Decoding for Truly Low-Resource Languages

In cross-lingual dependency annotation projection, information is often ...
research
05/31/2022

Don't Forget Cheap Training Signals Before Building Unsupervised Bilingual Word Embeddings

Bilingual Word Embeddings (BWEs) are one of the cornerstones of cross-li...
research
08/29/2017

Navigating the Data Lake with Datamaran: Automatically Extracting Structure from Log Datasets

Organizations routinely accumulate semi-structured log datasets generate...

Please sign up or login with your details

Forgot password? Click here to reset