Filtered Inner Product Projection for Multilingual Embedding Alignment

06/05/2020
by Vin Sachidananda, et al.
Due to widespread interest in machine translation and transfer learning, there are numerous algorithms for mapping multiple embeddings to a shared representation space. Recently, these algorithms have been studied in the setting of bilingual dictionary induction where one seeks to align the embeddings of a source and a target language such that translated word pairs lie close to one another in a common representation space. In this paper, we propose a method, Filtered Inner Product Projection (FIPP), for mapping embeddings to a common representation space and evaluate FIPP in the context of bilingual dictionary induction. As semantic shifts are pervasive across languages and domains, FIPP first identifies the common geometric structure in both embeddings and then, only on the common structure, aligns the Gram matrices of these embeddings. Unlike previous approaches, FIPP is applicable even when the source and target embeddings are of differing dimensionalities. We show that our approach outperforms existing methods on the MUSE dataset for various language pairs. Furthermore, FIPP provides computational benefits both in ease of implementation and scalability.
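
The two-step idea in the abstract (find the structure both embeddings share, then compare Gram matrices only there) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function name `filtered_gram_gap`, the threshold `eps`, and the specific filtering rule (keep entries whose inner products differ by at most `eps`) are assumptions made for illustration, and the random vectors stand in for real dictionary-aligned embeddings. The sketch only measures the filtered Gram-matrix mismatch; it does not perform the projection that reduces it.

```python
import numpy as np

def filtered_gram_gap(X_src, X_tgt, eps=0.1):
    """Compare the Gram matrices of two embeddings on a filtered set of entries.

    X_src: (n, d_src) array, X_tgt: (n, d_tgt) array. Row i of each is assumed
    to hold the two words of a translation pair. d_src and d_tgt may differ,
    because both Gram matrices are n x n regardless of embedding dimension.
    """
    G_src = X_src @ X_src.T  # pairwise inner products, source language
    G_tgt = X_tgt @ X_tgt.T  # pairwise inner products, target language

    # Hypothetical filter: treat an entry as "common structure" when the two
    # inner products agree to within eps.
    mask = np.abs(G_src - G_tgt) <= eps

    # Squared Frobenius mismatch restricted to the retained entries.
    gap = float(np.sum(((G_src - G_tgt) * mask) ** 2))
    return mask, gap

# Toy data: 5 translation pairs, 4-dim source and 3-dim target embeddings.
rng = np.random.default_rng(0)
X_src = rng.normal(size=(5, 4))
X_tgt = rng.normal(size=(5, 3))
X_src /= np.linalg.norm(X_src, axis=1, keepdims=True)  # unit-normalize rows
X_tgt /= np.linalg.norm(X_tgt, axis=1, keepdims=True)

mask, gap = filtered_gram_gap(X_src, X_tgt)
print(mask.sum(), "of", mask.size, "Gram entries retained; filtered gap =", gap)
```

Because the comparison happens between n x n Gram matrices rather than between the embedding matrices themselves, nothing in the sketch requires the source and target dimensions to match, which is the property the abstract highlights.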

research
08/27/2018

Learning Multilingual Word Embeddings in a Latent Metric Space: A Geometric Approach

We propose a novel geometric approach for learning bilingual mappings gi...
research
08/04/2016

Using Word Embeddings for Query Translation for Hindi to English Cross Language Information Retrieval

Cross-Language Information Retrieval (CLIR) has become an important prob...
research
06/07/2019

Shared-Private Bilingual Word Embeddings for Neural Machine Translation

Word embedding is central to neural machine translation (NMT), which has...
research
07/19/2021

Cross-Lingual BERT Contextual Embedding Space Mapping with Isotropic and Isometric Conditions

Typically, a linearly orthogonal transformation mapping is learned by al...
research
08/25/2023

Media of Langue

This paper aims to archive the materials behind "Media of Langue" by Gok...
