Unsupervised Bilingual Lexicon Induction Across Writing Systems

01/31/2020
by Parker Riley, et al.

Recent embedding-based methods in unsupervised bilingual lexicon induction have shown good results, but generally have not leveraged orthographic (spelling) information, which can be helpful for pairs of related languages. This work augments a state-of-the-art method with orthographic features, and extends prior work in this space by proposing methods that can learn and utilize orthographic correspondences even between languages with different scripts. We demonstrate this by experimenting on three language pairs with different scripts and varying degrees of lexical similarity.
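To make the idea of an orthographic feature concrete, the sketch below scores two words by normalized edit distance and optionally accepts a table of character correspondences across scripts. It is only an illustration under stated assumptions, not the method of this work: the function names, the `CYRILLIC_TO_LATIN` table, and the Russian-English example are all hypothetical, and the table is written by hand, whereas the abstract describes correspondences that are learned.

```python
def edit_distance(a, b, sub_cost=None):
    """Levenshtein distance with optional per-pair substitution costs.

    sub_cost maps (source_char, target_char) to a cost in [0, 1].
    Pairs missing from the table cost 1.0; identical characters cost 0.0.
    """
    sub_cost = sub_cost or {}
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = [float(j) for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [float(i)]
        for j, cb in enumerate(b, start=1):
            cost = 0.0 if ca == cb else sub_cost.get((ca, cb), 1.0)
            curr.append(min(prev[j] + 1.0,        # delete ca
                            curr[j - 1] + 1.0,    # insert cb
                            prev[j - 1] + cost))  # substitute ca -> cb
        prev = curr
    return prev[-1]


def orthographic_similarity(a, b, sub_cost=None):
    """Map edit distance to a similarity in [0, 1]; 1.0 means identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b, sub_cost) / longest


# Hypothetical, hand-written correspondence table (Cyrillic -> Latin).
CYRILLIC_TO_LATIN = {
    ("п", "p"): 0.0,
    ("а", "a"): 0.0,
    ("р", "r"): 0.0,
    ("к", "k"): 0.0,
}

# Without correspondences, words in different scripts share no characters.
plain = orthographic_similarity("парк", "park")
# With correspondences, the same pair is recognized as a spelling match.
mapped = orthographic_similarity("парк", "park", CYRILLIC_TO_LATIN)
```

With plain edit distance the cross-script pair scores 0.0, which shows why spelling information does not transfer across writing systems unless character correspondences are available. A score like this could in principle be combined with embedding similarity, but how the two are combined here is not stated in the abstract.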


