Robust Unsupervised Cross-Lingual Word Embedding using Domain Flow Interpolation

10/07/2022
by   Liping Tang, et al.

This paper investigates an unsupervised approach to deriving a universal cross-lingual word embedding space in which words with similar meanings from different languages lie close to one another. Previous adversarial approaches have shown promising results in inducing cross-lingual word embeddings without parallel data, but their training is unstable for distant language pairs. Instead of mapping the source language space directly to the target language space, we propose bridging the two smoothly through a sequence of intermediate spaces. Each intermediate space can be viewed as a pseudo-language space and is introduced via simple linear interpolation. The approach is modeled on domain flow in computer vision, but uses a modified objective function. Experiments on intrinsic Bilingual Dictionary Induction tasks show that the proposed approach improves the robustness of adversarial models while achieving comparable or even better precision. Further experiments on the downstream task of Cross-Lingual Natural Language Inference show that the proposed model achieves significant performance improvements for distant language pairs compared to state-of-the-art adversarial and non-adversarial models.
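
The idea of reaching a target space through linearly interpolated intermediate spaces can be illustrated with a small sketch. The abstract does not say which quantities are interpolated, so this example assumes, purely for illustration, that the intermediate map is a blend of the identity and a given linear map W, controlled by a value z in [0, 1]. The function names and the toy matrix are hypothetical and are not taken from the paper. The adversarial training and the modified objective are not shown.

```python
# Illustrative sketch only: assumes the intermediate map is
# M(z) = (1 - z) * I + z * W, which the abstract does not confirm.

def interpolate_mapping(W, z):
    """Blend the identity map with the square matrix W using weight z."""
    if not 0.0 <= z <= 1.0:
        raise ValueError("z must lie in [0, 1]")
    d = len(W)
    return [
        [(1.0 - z) * (1.0 if i == j else 0.0) + z * W[i][j] for j in range(d)]
        for i in range(d)
    ]

def apply_mapping(M, x):
    """Multiply matrix M by vector x."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def intermediate_vectors(x, W, num_steps):
    """Return (z, M(z) x) for num_steps evenly spaced z values from 0 to 1."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    path = []
    for k in range(num_steps):
        z = k / (num_steps - 1)
        path.append((z, apply_mapping(interpolate_mapping(W, z), x)))
    return path

# Toy example: W is a 90-degree rotation standing in for a learned mapping.
W = [[0.0, -1.0], [1.0, 0.0]]
x = [1.0, 0.0]
path = intermediate_vectors(x, W, num_steps=3)
for z, v in path:
    print(z, v)
```

At z = 0 the vector stays in the source space, at z = 1 it is fully mapped by W, and values in between correspond to the pseudo-language spaces described above under this assumed form of interpolation.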


