Mapping Supervised Bilingual Word Embeddings from English to low-resource languages

10/14/2019
by   Sourav Dutta, et al.

Working with low-resource languages is challenging because so little data is available for them. Using a dictionary to map independently trained word embeddings into a shared vector space has proved very useful for learning bilingual embeddings. Here we map the embeddings of English words to the embeddings of their translations in low-resource languages such as Estonian, Slovenian, Slovak, and Hungarian, using a supervised learning approach. We report accuracy scores for several retrieval strategies. The scores show that challenging Natural Language Processing tasks such as machine translation can be approached for these languages, provided that at least some reliable bilingual data is available. We also conclude that an unsupervised approach based on monolingual text data is a viable path, and one better suited to low-resource languages.
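The abstract does not say which mapping method or retrieval strategies were used, so the following is only a minimal sketch of one common supervised setup, not the authors' implementation. It assumes an orthogonal Procrustes mapping learned from dictionary pairs and cosine nearest-neighbor retrieval. The function names (`learn_mapping`, `retrieve`) and the synthetic data are hypothetical; real use would load pretrained English and target-language embeddings instead.

```python
import numpy as np

def learn_mapping(X, Y):
    """Orthogonal Procrustes: the orthogonal W minimizing ||X W - Y||_F.

    X and Y hold source and target embeddings of dictionary pairs, one pair per row.
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

def normalize(M):
    """Scale each row to unit length so dot products become cosine similarities."""
    return M / np.linalg.norm(M, axis=1, keepdims=True)

def retrieve(src_vecs, tgt_vecs, W):
    """For each mapped source vector, return the index of the most similar target vector."""
    sims = normalize(src_vecs @ W) @ normalize(tgt_vecs).T
    return sims.argmax(axis=1)

# Synthetic stand-in for real embeddings (assumption): the "target language"
# space is an exact rotation Q of the "English" space.
rng = np.random.default_rng(0)
dim, n_words, n_train = 5, 20, 15
X = rng.normal(size=(n_words, dim))
Q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
Y = X @ Q

# Learn the mapping from the first 15 dictionary pairs, evaluate on the held-out 5.
W = learn_mapping(X[:n_train], Y[:n_train])
pred = retrieve(X[n_train:], Y, W)
accuracy = np.mean(pred == np.arange(n_train, n_words))
print(f"held-out retrieval accuracy: {accuracy:.2f}")
```

Real embedding spaces are only approximately related by a rotation, so accuracy on actual data would be lower than on this toy example and would depend on the retrieval strategy chosen.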


research
03/09/2020

Combining Pretrained High-Resource Embeddings and Subword Representations for Low-Resource Languages

The contrast between the need for large amounts of data for current Natu...
research
06/22/2020

Dirichlet-Smoothed Word Embeddings for Low-Resource Settings

Nowadays, classical count-based word embeddings using positive pointwise...
research
03/11/2020

Visual Grounding in Video for Unsupervised Word Translation

There are thousands of actively spoken languages on Earth, but a single ...
research
11/19/2017

Intelligent Word Embeddings of Free-Text Radiology Reports

Radiology reports are a rich resource for advancing deep learning applic...
research
08/25/2023

Media of Langue

This paper aims to archive the materials behind "Media of Langue" by Gok...
research
09/16/2021

Revisiting Tri-training of Dependency Parsers

We compare two orthogonal semi-supervised learning techniques, namely tr...
research
10/18/2022

RAPO: An Adaptive Ranking Paradigm for Bilingual Lexicon Induction

Bilingual lexicon induction induces the word translations by aligning in...
