Unsupervised Bilingual Lexicon Induction Across Writing Systems

01/31/2020
by   Parker Riley, et al.

Recent embedding-based methods in unsupervised bilingual lexicon induction have shown good results, but generally have not leveraged orthographic (spelling) information, which can be helpful for pairs of related languages. This work augments a state-of-the-art method with orthographic features, and extends prior work in this space by proposing methods that can learn and utilize orthographic correspondences even between languages with different scripts. We demonstrate this by experimenting on three language pairs with different scripts and varying degrees of lexical similarity.

03/29/2021
Unsupervised Machine Translation On Dravidian Languages
Unsupervised neural machine translation (UNMT) is beneficial especially ...

05/26/2021
Word Embedding Transformation for Robust Unsupervised Bilingual Lexicon Induction
Great progress has been made in unsupervised bilingual lexicon induction...

09/03/2019
Duality Regularization for Unsupervised Bilingual Lexicon Induction
Unsupervised bilingual lexicon induction naturally exhibits duality, whi...

01/13/2015
Annotating Cognates and Etymological Origin in Turkic Languages
Turkic languages exhibit extensive and diverse etymological relationship...

11/30/2020
A Simple and Effective Approach to Robust Unsupervised Bilingual Dictionary Induction
Unsupervised Bilingual Dictionary Induction methods based on the initial...

10/24/2020
Clustering Contextualized Representations of Text for Unsupervised Syntax Induction
We explore clustering of contextualized text representations for two uns...

04/30/2015
Detecting and ordering adjectival scalemates
This paper presents a pattern-based method that can be used to infer adj...