Weakly-Supervised Concept-based Adversarial Learning for Cross-lingual Word Embeddings

04/20/2019
by Haozhou Wang, et al.

Distributed representations of words, which map each word to a continuous vector, have proven useful for capturing important linguistic information, both within a single language and across different languages. Recent unsupervised adversarial approaches have shown that it is possible to learn a mapping matrix that aligns two sets of monolingual word embeddings without high-quality parallel data such as a dictionary or a sentence-aligned corpus. However, without post-refinement, the preliminary mapping produced by these methods performs poorly, leading to weak results for typologically distant languages. In this paper, we propose a weakly-supervised adversarial training method to overcome this limitation, based on the intuition that mapping across languages is better done at the concept level than at the word level. Our concept-based adversarial training method improves on previous unsupervised adversarial methods for most languages, especially for typologically distant language pairs.
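The abstract above centers on learning a mapping matrix that aligns two monolingual embedding spaces. As a minimal, hypothetical sketch (not the paper's weakly-supervised method), the classic orthogonal Procrustes solution illustrates how such a linear map can be fit once some aligned vector pairs are available; a step of this kind is commonly used as the post-refinement the abstract refers to. The dimensions and synthetic embeddings here are arbitrary assumptions for illustration.

```python
import numpy as np

# Toy illustration: given aligned vector pairs X (source space) and
# Y (target space), orthogonal Procrustes finds the rotation W that
# minimizes ||XW - Y||_F, i.e. the best orthogonal mapping matrix.
rng = np.random.default_rng(0)
d = 50                                              # embedding dimension (arbitrary)
W_true = np.linalg.qr(rng.normal(size=(d, d)))[0]   # hidden "true" rotation
X = rng.normal(size=(200, d))                       # synthetic source embeddings
Y = X @ W_true                                      # corresponding target embeddings

# Closed-form solution: W = U V^T, where U S V^T is the SVD of X^T Y.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

# With noise-free pairs, the hidden rotation is recovered exactly.
print(np.allclose(X @ W, Y, atol=1e-8))
```

In practice the aligned pairs come from a seed dictionary or, in the unsupervised setting the abstract discusses, from a preliminary adversarially induced mapping; real embeddings are noisy, so recovery is approximate rather than exact.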


Related research

10/16/2020 - Multi-Adversarial Learning for Cross-Lingual Word Embeddings
Generative adversarial networks (GANs) have succeeded in inducing cross-...

04/04/2019 - Revisiting Adversarial Autoencoder for Unsupervised Word Translation with Cycle Consistency and Improved Training
Adversarial training has shown impressive success in learning bilingual ...

07/06/2019 - Weakly-supervised Knowledge Graph Alignment with Adversarial Learning
This paper studies aligning knowledge graphs from different sources or l...

05/29/2018 - Unsupervised Alignment of Embeddings with Wasserstein Procrustes
We consider the task of aligning two sets of points in high dimension, w...

04/20/2018 - Improving Supervised Bilingual Mapping of Word Embeddings
Continuous word representations, learned on different languages, can be ...

05/31/2022 - Don't Forget Cheap Training Signals Before Building Unsupervised Bilingual Word Embeddings
Bilingual Word Embeddings (BWEs) are one of the cornerstones of cross-li...

06/25/2018 - Mapping Unparalleled Clinical Professional and Consumer Languages with Embedding Alignment
Mapping and translating professional but arcane clinical jargons to cons...
