Unsupervised Alignment of Distributional Word Embeddings

03/09/2022
by   Aissatou Diallo, et al.
0

Cross-domain alignment play a key roles in tasks ranging from machine translation to transfer learning. Recently, purely unsupervised methods operating on monolingual embeddings have successfully been used to infer a bilingual lexicon without relying on supervision. However, current state-of-the art methods only focus on point vectors although distributional embeddings have proven to embed richer semantic information when representing words. In this paper, we propose stochastic optimization approach for aligning probabilistic embeddings. Finally, we evaluate our method on the problem of unsupervised word translation, by aligning word embeddings trained on monolingual data. We show that the proposed approach achieves good performance on the bilingual lexicon induction task across several language pairs and performs better than the point-vector based approach.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
08/31/2018

Gromov-Wasserstein Alignment of Word Embedding Spaces

Cross-lingual or cross-domain correspondences play key roles in tasks ra...
research
05/29/2018

Unsupervised Alignment of Embeddings with Wasserstein Procrustes

We consider the task of aligning two sets of points in high dimension, w...
research
09/06/2022

Monolingual alignment of word senses and definitions in lexicographical resources

The focus of this thesis is broadly on the alignment of lexicographical ...
research
11/15/2017

An Unsupervised Approach for Mapping between Vector Spaces

We present a language independent, unsupervised approach for transformin...
research
04/18/2016

Clustering Comparable Corpora of Russian and Ukrainian Academic Texts: Word Embeddings and Semantic Fingerprints

We present our experience in applying distributional semantics (neural w...
research
01/01/2021

Bilingual Lexicon Induction via Unsupervised Bitext Construction and Word Alignment

Bilingual lexicons map words in one language to their translations in an...
research
11/01/2018

Learning Unsupervised Word Mapping by Maximizing Mean Discrepancy

Cross-lingual word embeddings aim to capture common linguistic regularit...

Please sign up or login with your details

Forgot password? Click here to reset