Combining Pretrained High-Resource Embeddings and Subword Representations for Low-Resource Languages

03/09/2020
by Machel Reid, et al.

Current Natural Language Processing (NLP) techniques need large amounts of data, and the shortage of such data is especially acute for African languages, most of which are considered low-resource. To help circumvent this issue, we explore techniques that exploit the qualities of morphologically rich languages (MRLs) while leveraging pretrained word vectors from well-resourced languages. In our exploration, we show that a meta-embedding approach combining both pretrained and morphologically informed word embeddings performs best in the downstream task of Xhosa-English translation.
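
The abstract does not say how the two kinds of embeddings are combined, so the following is only a minimal sketch of one common meta-embedding scheme: L2-normalize each source vector and concatenate them. The character n-gram averaging used for the morphologically informed vector, and every name here (`char_ngrams`, `subword_vector`, `meta_embed`, `ngram_table`), are illustrative assumptions rather than the authors' method. The pretrained vector is assumed to be already available for the word.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def char_ngrams(word, n_min=3, n_max=4):
    """Character n-grams of a word padded with boundary markers '<' and '>'."""
    padded = f"<{word}>"
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]


def subword_vector(word, ngram_table, dim):
    """Average the vectors of the word's known n-grams (hypothetical table)."""
    grams = [g for g in char_ngrams(word) if g in ngram_table]
    if not grams:
        return [0.0] * dim
    return [sum(ngram_table[g][k] for g in grams) / len(grams) for k in range(dim)]


def meta_embed(pretrained_vec, subword_vec):
    """Assumed combination: concatenate the two L2-normalized vectors."""
    return l2_normalize(pretrained_vec) + l2_normalize(subword_vec)
```

Concatenation keeps both information sources intact at the cost of a larger vector. Averaging is another common choice, but it requires both sources to share the same dimensionality.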

research
10/14/2019

Mapping Supervised Bilingual Word Embeddings from English to low-resource languages

It is very challenging to work with low-resource languages due to the in...
research
09/16/2021

Revisiting Tri-training of Dependency Parsers

We compare two orthogonal semi-supervised learning techniques, namely tr...
research
07/13/2018

Low-Resource Text Classification using Domain-Adversarial Learning

Deep learning techniques have recently shown to be successful in many na...
research
05/09/2018

Learning Word Embeddings for Low-resource Languages by PU Learning

Word embedding is a key component in many downstream applications in pro...
research
09/20/2018

Predicting Argumenthood of English Preposition Phrases

Distinguishing between core and non-core dependents (i.e., arguments and...
research
10/22/2020

Investigating the True Performance of Transformers in Low-Resource Languages: A Case Study in Automatic Corpus Creation

Transformers represent the state-of-the-art in Natural Language Processi...
research
06/09/2022

Predicting Embedding Reliability in Low-Resource Settings Using Corpus Similarity Measures

This paper simulates a low-resource setting across 17 languages in order...
